Table Extraction

With DOC², you can extract tables from PDF files. This is done with the "Line Items" functionality, which extracts tables from all types of documents (invoices, contracts, forms, medical prescriptions, etc.).

To extract data from tables, follow the steps below:

1. Log in to your account with the email and password that you were given.

2. On your Dashboard, import a document.

3. Click on the document to open it.

4. Scroll down the extracted fields page to the Line Items section.

5. Click on the table icon inside the textbox.

This opens the table extraction view.

If the document contains only very simple tables, DOC² detects and extracts them automatically.

In practice, tables in documents are often much more complex and vary widely in formatting and layout. For example, text may extend across several columns, or a single line item may contain several lines of text, as is the case with long item descriptions.
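To illustrate why wrapped item descriptions make tables harder to read automatically, here is a minimal Python sketch of one simple heuristic: a row with an empty position column is treated as the continuation of the line item above it. This is not how DOC² works internally; the function name `merge_wrapped_rows` and the column names `pos`, `description`, and `amount` are hypothetical and chosen only for this example.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_wrapped_rows(
    rows: List[Row],
    key_column: str = "pos",           # hypothetical column that is filled on every real line item
    text_column: str = "description",  # hypothetical column whose text may wrap onto extra lines
) -> List[Row]:
    """Fold continuation rows (empty key column) into the preceding line item."""
    merged: List[Row] = []
    for row in rows:
        is_continuation = not row.get(key_column, "").strip()
        if is_continuation and merged:
            # Append the wrapped text to the description of the previous line item.
            extra = row.get(text_column, "").strip()
            if extra:
                previous = merged[-1].get(text_column, "")
                merged[-1][text_column] = (previous + " " + extra).strip()
        else:
            # A row with a position value (or the very first row) starts a new line item.
            merged.append(dict(row))
    return merged


# Raw rows as they might appear when a description wraps onto a second line.
raw_rows = [
    {"pos": "1", "description": "Steel bracket,", "amount": "12.50"},
    {"pos": "", "description": "galvanized, 40 mm", "amount": ""},
    {"pos": "2", "description": "Mounting screws", "amount": "3.20"},
]

line_items = merge_wrapped_rows(raw_rows)
for item in line_items:
    print(item)
```

A heuristic like this breaks down as soon as a table has no column that is reliably filled on every line item, which is one reason complex tables usually need to be trained rather than detected by fixed rules.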

This is where DOC² and its table extraction functionality come into play. There are several ways to train the table extraction so that it achieves the best possible result, even with demanding tables.

In the following sections, you will learn how to train a table manually and which functions are available for this.