This page (revision-61) was last changed on 19-Sep-2022 11:15 by Arnab Ghosh Chowdhury

This page was created on 10-May-2022 15:30 by Arnab Ghosh Chowdhury

Page revision history

Version Date Modified Size Author Changes ... Change note
61 19-Sep-2022 11:15 5 KB Arnab Ghosh Chowdhury to previous

Difference between versions

At line 16 added one line
- Linux OS (operating system)\\
At line 24 changed one line
\\- Files from GitHub (https://github.com/cslab-hub/MatrixDataExtractor)
\\- Code from GitHub ([https://github.com/cslab-hub/MatrixDataExtractor])
At line 41 changed one line
\\
\\ - Extract textual and tabular information from PDF documents.
\\ - ⚠️ For a brief overview of the tool, we recommend opening and saving the presentation before proceeding: [Data Extractor/Di-Plast_MDE_UI.pdf]
At line 43 removed one line
At line 45 changed one line
\\
\\ - Open-source table detection tools often fail to extract tabular information from PDF documents reliably, because they cannot cover every possible document template and table template. This diversity of templates has driven the growth of document table detection based on computer vision and transfer learning. This tool helps you extract textual and tabular data (as Excel files) from your domain-specific dataset. The extracted data can then be used with Big Data technologies and Natural Language Processing (NLP).
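As a rough illustration of the last step described above (storing extracted table rows in a spreadsheet-readable file), here is a minimal Python sketch. It is not code from MatrixDataExtractor. The function names `save_table_rows` and `load_table_rows` and the sample rows are hypothetical. It writes CSV rather than Excel so that it needs only the standard library, on the assumption that table detection and cell extraction have already produced a list of rows.

```python
import csv
from pathlib import Path


def save_table_rows(rows, out_path):
    """Write extracted table rows (header row first) to a CSV file.

    Hypothetical helper: `rows` is assumed to be a list of lists of
    strings produced by some earlier table-extraction step.
    """
    out_path = Path(out_path)
    # newline="" lets the csv module control line endings itself.
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return out_path


def load_table_rows(path):
    """Read the rows back, e.g. for downstream NLP or data pipelines."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


if __name__ == "__main__":
    # Made-up sample data standing in for one extracted table.
    sample = [
        ["Property", "Value", "Unit"],
        ["Density", "0.95", "g/cm3"],
        ["Melt flow rate", "2.1", "g/10 min"],
    ]
    path = save_table_rows(sample, "extracted_table.csv")
    print(load_table_rows(path))
```

CSV files open directly in Excel and most data tools. Writing native `.xlsx` files would need a third-party library, which this sketch deliberately avoids.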
At line 49 changed 2 lines
\\ - The tool can be accessed through the following link: [https://share.streamlit.io/cslab-hub/data_validation/main/main.py]
\\- Get the code/installation files from GitHub [https://cslab-hub-data-validation-main-bx6ggw.streamlitapp.com/] and start using the app by browsing through the pages.
\\ - Get the code from GitHub [https://github.com/cslab-hub/MatrixDataExtractor], copy it to your computer, prepare your annotated dataset, build or request the table detection model weights and model description file, and start using the tool.\\
At line 53 removed 4 lines
Get the code from GitHub [https://github.com/cslab-hub/MatrixDataExtractor], copy
the code to your computer, prepare your annotated dataset, and start using it.\\