UNIFIED FRAMEWORK FOR FINANCIAL DATA TABLE STRUCTURE RECOGNITION AND PARSING
15.02.2024 12:21
[1. Information systems and technologies]
Authors: Pavlo Prokhorov, PhD student, Yuriy Fedkovych Chernivtsi National University, Chernivtsi; Oleh Pavliuchenko, PhD student, Yuriy Fedkovych Chernivtsi National University, Chernivtsi; Dmytro Hanzhelo, PhD student, Yuriy Fedkovych Chernivtsi National University, Chernivtsi; Yuri Dobrovolsky, Dr. Tech. Sciences, Professor, Department of Computer Systems Software, Yuriy Fedkovych Chernivtsi National University, Chernivtsi
The growing digitalization of business in recent decades has naturally increased the amount of financial data produced, as well as the number and complexity of the technologies used to collect, process and generate it. At the same time, the emergence of global markets has created a need to process data on the many companies that publish financial statements. Although the content and general format of financial reporting documents such as Balance Sheets or Income Statements are regulated by international standards (IFRS [1], US GAAP, etc.), the actual form in which such data is presented and stored can vary significantly between organizations. By storage format, financial reporting data can be divided into the following groups:
1. Structured data in general-purpose formats that are convenient to process but carry no financial context and were not designed for financial reporting (CSV, spreadsheets, JSON).
2. Purpose-built structured formats that carry reporting-specific metadata (XBRL).
3. Structured, weakly structured and unstructured data stored as image files or printable documents (JPEG, PNG, RAW, PDF).
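As a rough illustration of this grouping, a report file could be routed to one of the three groups by its extension. This is a minimal sketch: the names `FormatGroup` and `classify_report` and the exact extension lists are assumptions made for illustration and are not part of the proposed framework. A real system would likely also inspect the file content, since, for example, a PDF may contain either embedded text or scanned images.

```python
from enum import Enum
from pathlib import Path


class FormatGroup(Enum):
    """The three storage-format groups listed above."""
    STRUCTURED = 1       # group 1: CSV, spreadsheets, JSON
    METADATA_RICH = 2    # group 2: XBRL
    IMAGE_OR_PRINT = 3   # group 3: image files and printable documents


# Hypothetical extension map; the extension lists are illustrative only.
_EXTENSION_GROUPS = {
    ".csv": FormatGroup.STRUCTURED,
    ".xls": FormatGroup.STRUCTURED,
    ".xlsx": FormatGroup.STRUCTURED,
    ".json": FormatGroup.STRUCTURED,
    ".xbrl": FormatGroup.METADATA_RICH,
    ".jpg": FormatGroup.IMAGE_OR_PRINT,
    ".jpeg": FormatGroup.IMAGE_OR_PRINT,
    ".png": FormatGroup.IMAGE_OR_PRINT,
    ".raw": FormatGroup.IMAGE_OR_PRINT,
    ".pdf": FormatGroup.IMAGE_OR_PRINT,
}


def classify_report(path: str) -> FormatGroup:
    """Return the format group of a report file based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in _EXTENSION_GROUPS:
        raise ValueError(f"unsupported report format: {suffix!r}")
    return _EXTENSION_GROUPS[suffix]
```

Files in groups 1 and 2 can then be passed to ordinary parsers, while files in group 3 require table recognition.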
The third group can be further divided into data that standard OCR methods can recognize (Fig. 1) and data for which recognition is difficult because the table geometry is distorted (Fig. 2).
Fig. 1. Weakly structured table example.
Fig. 2. Distorted weakly structured table example.
Thus, if the format in which reports are stored or distributed is not known in advance, a system is needed that supports a wide range of such formats. The approach we propose assumes that any standard financial report can be represented as a structured table. For groups 1 and 2, most high-level programming languages offer mature libraries that make it relatively easy to retrieve and process such data.
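For example, group 1 data can be brought into a common table form with standard-library tools alone. The sketch below uses Python's `csv` and `json` modules. The function names, the `Table` alias, and the assumption that a JSON report is a list of flat objects sharing the same keys are illustrative choices, not requirements of the framework.

```python
import csv
import io
import json
from typing import List

# A table is represented here as a list of rows, each row a list of cell strings.
Table = List[List[str]]


def table_from_csv(text: str) -> Table:
    """Parse CSV text into a table; the first row is kept as the header."""
    return [row for row in csv.reader(io.StringIO(text))]


def table_from_json(text: str) -> Table:
    """Parse a JSON array of flat objects into a table.

    Assumes every object uses the keys of the first one; missing
    values become empty cells.
    """
    records = json.loads(text)
    if not records:
        return []
    header = list(records[0].keys())
    rows = [[str(record.get(key, "")) for key in header] for record in records]
    return [header] + rows
```

Both functions return the same row-and-column representation, so later processing does not need to know which source format a report came from.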
To extract and process data from files of the third group, we propose combining two neural network architectures that, applied sequentially, perform table structure recognition and parsing. The general problem of table recognition was considered in [2], which also gives a classification of the most common table formats and a solution based on Graph Neural Networks and OCR. The problems of recognizing and processing tables obtained by photographing or scanning, where the geometric structure of the table may be distorted, were described in [3].
Since our approach must support both cases described above, we split structure recognition and data extraction into two consecutive stages:
1. Restoring and/or forming the table borders.
2. Recognizing and parsing the structure of the restored table.
For the first stage we suggest an approach based on the DQ-DETR architecture [4]. This approach recognizes the rows, columns and borders of distorted and/or unstructured tables through separation line regression. For the second stage we suggest an approach based on Graph Neural Networks and OCR, which performs table data recognition and parsing [5]. The architecture of the proposed framework is shown in Fig. 3.
Fig. 3. Unified framework architecture.
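The sequential composition of the two stages can be sketched as a small pipeline in which each stage is passed in as a function. This is only an outline of the control flow under stated assumptions: `TableGrid`, `parse_table_image` and the two stand-in stages are hypothetical names, the grid is simplified to straight separator lines, and the stand-ins do not implement DQ-DETR [4] or the model from [5]. In a real system they would be replaced by the trained networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = Sequence[Sequence[int]]   # grayscale pixels, row-major (assumed layout)
Table = List[List[str]]           # rows of cell texts


@dataclass
class TableGrid:
    """Simplified output of stage 1: straight separator lines."""
    row_lines: List[float]  # y-coordinates of horizontal separators
    col_lines: List[float]  # x-coordinates of vertical separators


def parse_table_image(
    image: Image,
    restore_borders: Callable[[Image], TableGrid],
    recognize_cells: Callable[[Image, TableGrid], Table],
) -> Table:
    """Run stage 1 (border restoration), then stage 2 (recognition and parsing)."""
    grid = restore_borders(image)
    return recognize_cells(image, grid)


# Stand-in stages so the sketch runs end to end; they are NOT real models.
def fake_restore_borders(image: Image) -> TableGrid:
    height, width = len(image), len(image[0])
    # Pretend the table is a 2 x 2 grid covering the whole image.
    return TableGrid(
        row_lines=[0.0, height / 2, float(height)],
        col_lines=[0.0, width / 2, float(width)],
    )


def fake_recognize_cells(image: Image, grid: TableGrid) -> Table:
    n_rows = len(grid.row_lines) - 1
    n_cols = len(grid.col_lines) - 1
    # A real stage 2 would run OCR on each cell; empty strings are returned here.
    return [["" for _ in range(n_cols)] for _ in range(n_rows)]
```

Passing the stages as parameters keeps the pipeline independent of any particular model, so either stage can be swapped or skipped (for instance, when the table borders are already intact) without changing the surrounding code.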
References
1. Robert J. Kirk. IFRS: A Quick Reference Guide. 2008. DOI: https://doi.org/10.1016/B978-1-85617-545-6.X0001-0
2. Shah Rukh Qasim, Hassan Mahmood, Faisal Shafait. Rethinking Table Recognition using Graph Neural Networks. arXiv preprint arXiv:1905.13391v2, 2019. DOI: https://doi.org/10.48550/arXiv.1905.13391
3. Rujiao Long et al. Parsing Table Structures in the Wild. arXiv preprint arXiv:2109.02199v1, 2021. DOI: https://doi.org/10.48550/arXiv.2109.02199
4. Jiawei Wang, Weihong Lin, Chixiang Ma, Mingze Li, Zheng Sun, Lei Sun, Qiang Huo. Robust table structure recognition with dynamic queries enhanced detection transformer. Pattern Recognition, Volume 144, December 2023, 109817. DOI: https://doi.org/10.1016/j.patcog.2023.109817
5. Xiao-Hui Li, Fei Yin, He-Sen Dai, Cheng-Lin Liu. Table Structure Recognition and Form Parsing by End-to-End Object Detection and Relation Parsing. Pattern Recognition, Volume 132, December 2022, 108946. DOI: https://doi.org/10.1016/j.patcog.2022.108946