Converting scanned documents or PDF files to Excel, Google Sheets, CSV and other spreadsheet formats usually involves one of two scenarios:
- Individual documents or reports that have well-defined rows and columns
- Large batches of documents with complex tables or separate fields combined to make a table
Convert PDF to Excel Spreadsheet or Google Sheet
Simple Tables and Reports
The first scenario can be handled with desktop OCR applications such as ABBYY FineReader, Readiris and Kofax OmniPage. These applications convert standard table data to individual spreadsheets. The output is one Excel or CSV file per document, which usually needs some clean-up to remove text on the page that isn't part of the table. What they don't do well is validate data or append multiple documents to a single spreadsheet to build a dataset.
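To illustrate the clean-up and appending step that desktop OCR tools leave to you, here is a minimal Python sketch. It assumes each document has already been exported as CSV text and that you know the expected column header; the function name `append_tables` and the sample data are hypothetical, not part of any OCR product.

```python
import csv
import io


def append_tables(csv_texts, header):
    """Merge per-document CSV exports into one list of rows.

    Rows whose column count does not match the header are treated as
    stray text (titles, page footers) and dropped. Repeated header rows
    are also dropped so the header appears only once.
    """
    merged = [header]
    for text in csv_texts:
        for row in csv.reader(io.StringIO(text)):
            row = [cell.strip() for cell in row]
            if len(row) != len(header):
                continue  # extra text that is not part of the table
            if row == header:
                continue  # header repeated in each document
            merged.append(row)
    return merged


# Hypothetical exports from two scanned documents.
doc_a = "Quarterly Report\nItem,Qty,Price\nWidget,2,9.50\n"
doc_b = "Item,Qty,Price\nGadget,1,20.00\nPage 1 of 1\n"

rows = append_tables([doc_a, doc_b], ["Item", "Qty", "Price"])
```

Filtering on column count is a deliberately crude rule. Real documents usually need per-column validation as well, such as checking that a quantity is numeric.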
Complex Tables or Multiple Field Regions
The second scenario uses data capture software that identifies common data elements across multiple documents and maps them to columns in your Excel, CSV or database output. In these projects, zonal OCR is the most basic approach: it reads several data points from fixed regions of each document and exports them as a single row of data.
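The idea behind zonal OCR can be sketched in a few lines of Python. This example assumes an OCR engine has already returned words with pixel coordinates; the zone coordinates, field names and sample words are all hypothetical, and a real data capture product would handle skew, scaling and template matching that this sketch ignores.

```python
import csv
import io

# Hypothetical zones: field name -> (left, top, right, bottom) in pixels.
ZONES = {
    "invoice_no": (400, 40, 600, 70),
    "date": (400, 80, 600, 110),
    "total": (400, 700, 600, 730),
}


def in_zone(word, zone):
    """Return True if the word's anchor point lies inside the zone."""
    left, top, right, bottom = zone
    return left <= word["x"] <= right and top <= word["y"] <= bottom


def capture_row(words, zones=ZONES):
    """Build one output row: the text found inside each zone."""
    row = {}
    for field, zone in zones.items():
        hits = [w for w in words if in_zone(w, zone)]
        hits.sort(key=lambda w: (w["y"], w["x"]))  # reading order
        row[field] = " ".join(w["text"] for w in hits)
    return row


def rows_to_csv(rows, zones=ZONES):
    """Serialize captured rows to CSV text, one row per document."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(zones), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical OCR output for one page; "ACME" falls outside every zone.
page = [
    {"text": "INV-1001", "x": 420, "y": 50},
    {"text": "2024-01-15", "x": 420, "y": 90},
    {"text": "ACME", "x": 50, "y": 50},
    {"text": "$125.00", "x": 450, "y": 710},
]

row = capture_row(page)
csv_text = rows_to_csv([row])
```

Each document processed this way contributes one row, so a batch of documents builds up a single dataset rather than one spreadsheet per file.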
Complex documents that include header/detail data, multiple tables, nested tables or tables with overlapping columns can all be captured and converted to structured data like XML, JSON or relational database tables.
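As a rough illustration of header/detail data, the Python sketch below represents one captured document as a nested structure, serializes it to JSON, and shows how the same data flattens into spreadsheet rows. The field names (`invoice_no`, `customer`, `sku`, `qty`) and the helper functions are hypothetical examples, not a fixed schema.

```python
import json


def to_header_detail(header_fields, line_items):
    """Combine document-level fields and table rows into one record."""
    return {
        "header": dict(header_fields),
        "lines": [dict(item) for item in line_items],
    }


def flatten(record):
    """Repeat the header values on every line item, giving flat rows
    suitable for a spreadsheet or a single database table."""
    return [{**record["header"], **line} for line in record["lines"]]


# Hypothetical capture result for one invoice.
invoice = to_header_detail(
    {"invoice_no": "INV-1001", "customer": "ACME"},
    [{"sku": "W-1", "qty": 2}, {"sku": "G-7", "qty": 1}],
)

as_json = json.dumps(invoice, indent=2)  # nested output for XML/JSON-style targets
flat_rows = flatten(invoice)             # one row per line item
```

The nested form keeps header values in one place, which suits JSON, XML or a pair of related database tables. The flattened form repeats them on each row, which is what a single spreadsheet requires.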
Data Capture Expertise
ScanStore has been exclusively focused on OCR data capture and forms processing solutions for over 20 years.
Let our experts help you with your OCR project. Use the contact form or online chat in the sidebar for a consultation.