survey
Just Accepted

Deep Learning for Table Detection and Structure Recognition: A Survey

Online AM: 10 April 2024

Abstract

Tables are everywhere, from scientific journals, papers, websites, and newspapers to the items we buy at the supermarket. Detecting them is therefore essential to automatically understanding the content of a document. Table detection performance has improved substantially thanks to the rapid development of deep learning networks. This survey aims to provide an in-depth understanding of the major developments in table detection, offer insight into the different methodologies, and present a systematic taxonomy of the approaches. We also analyze both classic and new applications in the field, and we organize the datasets and source code of existing models to give the reader a compass for this vast literature. Finally, we discuss architectures that combine object detection and table structure recognition methods into an effective and efficient system, and we outline development trends to help readers keep up with state-of-the-art algorithms and future research. We have also set up a public GitHub repository, which we will update with the most recent publications, open data, and source code. The repository is available at https://github.com/abdoelsayed2016/table-detection-structure-recognition.

  126. Mark Traquair, Ertugrul Kara, Burak Kantarci, and Shahzad Khan. 2019. Deep learning for the detection of tabular information from electronic component datasheets. In 2019 IEEE Symposium on Computers and Communications (ISCC). IEEE, 1–6.Google ScholarGoogle ScholarCross RefCross Ref
  127. Scott Tupaj, Zhongwen Shi, C Hwa Chang, and Hassan Alam. 1996. Extracting tabular information from text files. EECS Department, Tufts University, Medford, USA 1 (1996).Google ScholarGoogle Scholar
  128. Yalin Wang and Jianying Hu. 2002. A machine learning based approach for table detection on the web. In Proceedings of the 11th international conference on World Wide Web. 242–250.Google ScholarGoogle ScholarDigital LibraryDigital Library
  129. Yalin Wangt, Ihsin T Phillipst, and Robert Haralick. 2001. Automatic table ground truth generation and a background-analysis-based table structure extraction method. In Proceedings of Sixth International Conference on Document Analysis and Recognition. IEEE, 528–532.Google ScholarGoogle Scholar
  130. Shengkai Wu, Jinrong Yang, Xinggang Wang, and Xiaoping Li. 2019. Iou-balanced loss functions for single-stage object detection. arXiv preprint arXiv:1908.05641(2019).Google ScholarGoogle Scholar
  131. Bin Xiao, Murat Simsek, Burak Kantarci, and Ala Abu Alkheir. 2022. Table Structure Recognition with Conditional Attention. arXiv preprint arXiv:2203.03819(2022).Google ScholarGoogle Scholar
  132. Bin Xiao, Murat Simsek, Burak Kantarci, and Ala Abu Alkheir. 2023. Revisiting Table Detection Datasets for Visually Rich Documents. arXiv preprint arXiv:2305.04833(2023).Google ScholarGoogle Scholar
  133. Wen Xu, Julian Jang-Jaccard, Amardeep Singh, Yuanyuan Wei, and Fariza Sabrina. 2021. Improving performance of autoencoder-based network anomaly detection on nsl-kdd dataset. IEEE Access 9(2021), 140136–140146.Google ScholarGoogle ScholarCross RefCross Ref
  134. Wenyuan Xue, Qingyong Li, and Dacheng Tao. 2019. ReS2TIM: Reconstruct syntactic structures from table images. In 2019 International Conference on Document Analysis and Recognition (ICDAR). IEEE, 749–755.Google ScholarGoogle ScholarCross RefCross Ref
  135. Fan Yang, Lei Hu, Xinwu Liu, Shuangping Huang, and Zhenghui Gu. 2023. A large-scale dataset for end-to-end table recognition in the wild. Scientific Data 10, 1 (2023), 110.Google ScholarGoogle ScholarCross RefCross Ref
  136. Jing Yang and Guanci Yang. 2018. Modified convolutional neural network based on dropout and the stochastic gradient descent optimizer. Algorithms 11, 3 (2018), 28.Google ScholarGoogle ScholarCross RefCross Ref
  137. Tom Young, Devamanyu Hazarika, Soujanya Poria, and Erik Cambria. 2018. Recent trends in deep learning based natural language processing. ieee Computational intelligenCe magazine 13, 3 (2018), 55–75.Google ScholarGoogle Scholar
  138. Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S Huang. 2019. Free-form image inpainting with gated convolution. In Proceedings of the IEEE/CVF international conference on computer vision. 4471–4480.Google ScholarGoogle ScholarCross RefCross Ref
  139. Richard Zanibbi, Dorothea Blostein, and James R Cordy. 2004. A survey of table recognition. Document Analysis and Recognition 7, 1 (2004), 1–16.Google ScholarGoogle ScholarDigital LibraryDigital Library
  140. Daqian Zhang, Ruibin Mao, Runting Guo, Yang Jiang, and Jing Zhu. 2022. YOLO-table: disclosure document table detection with involution. International Journal on Document Analysis and Recognition (IJDAR) (2022), 1–14.Google ScholarGoogle Scholar
  141. Xi-wen Zhang, Michael R Lyu, and Guo-zhong Dai. 2007. Extraction and segmentation of tables from Chinese ink documents based on a matrix model. Pattern recognition 40, 7 (2007), 1855–1867.Google ScholarGoogle Scholar
  142. Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, and Björn Schuller. 2018. Deep learning for environmentally robust speech recognition: An overview of recent developments. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 5(2018), 1–28.Google ScholarGoogle ScholarDigital LibraryDigital Library
  143. Zhenrong Zhang, Jianshu Zhang, Jun Du, and Fengren Wang. 2022. Split, embed and merge: An accurate table structure recognizer. Pattern Recognition 126(2022), 108565.Google ScholarGoogle ScholarDigital LibraryDigital Library
  144. Xinyi Zheng, Doug Burdick, Lucian Popa, Peter Zhong, and Nancy Xin Ru Wang. 2021. Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context. Winter Conference for Applications in Computer Vision (WACV) (2021).Google ScholarGoogle Scholar
  145. Xinyi Zheng, Douglas Burdick, Lucian Popa, Xu Zhong, and Nancy Xin Ru Wang. 2021. Global table extractor (gte): A framework for joint table identification and cell structure recognition using visual context. In Proceedings of the IEEE/CVF winter conference on applications of computer vision. 697–706.Google ScholarGoogle ScholarCross RefCross Ref
  146. Xu Zhong, Elaheh ShafieiBavani, and Antonio Jimeno Yepes. 2020. Image-based table recognition: data, model, and evaluation. In European Conference on Computer Vision. Springer, 564–580.Google ScholarGoogle ScholarDigital LibraryDigital Library
  147. Yajun Zou and Jinwen Ma. 2020. A deep semantic segmentation model for image-based table structure recognition. In 2020 15th IEEE International Conference on Signal Processing (ICSP), Vol.  1. IEEE, 274–280.Google ScholarGoogle ScholarCross RefCross Ref
  148. Arthur Zucker, Younes Belkada, Hanh Vu, and Van Nam Nguyen. 2021. ClusTi: Clustering method for table structure recognition in scanned images. Mobile Networks and Applications 26, 4 (2021), 1765–1776.Google ScholarGoogle ScholarDigital LibraryDigital Library

      • Published in

        ACM Computing Surveys (Just Accepted)
        ISSN:0360-0300
        EISSN:1557-7341

        Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Online AM: 10 April 2024
        • Accepted: 2 April 2024
        • Revised: 11 February 2024
        • Received: 13 December 2022

        Qualifiers

        • survey