
Scan Documents

This page documents the Totalum document scanner.

With Totalum scan documents you can scan your own documents and extract JSON data from them.

Totalum scan documents works only with images and PDF files. It can scan an image or PDF in any layout and extract the data you describe, using machine learning to do so.

You can use the TotalumSdk for JavaScript or call the Totalum API directly. If you choose the TotalumSdk, check the previous installation step to see how to install it.

1. Upload the file to Totalum

To scan a document, first you need to upload the file to Totalum. You can do this using the Totalum API directly or using the Totalum SDK.

1.1 Transform the file to a blob

If you are not using JavaScript, do this step in your own language: search for how to turn a file into a blob (binary data) in that language, or ask ChatGPT to translate the following examples.

How you obtain the blob depends on your platform. Here are some examples:

From a file input (Frontend)

    const fileInput = document.getElementById('fileInput');
    const fileBlob = fileInput.files[0];

From a file in storage (Backend)

    const fs = require('fs');
    const yourFilePath = 'your_file_path'; // example: /user/your_file.pdf
    const fileBlob = fs.readFileSync(yourFilePath);

From a remote file (Backend)

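    const axios = require('axios');

    // responseType 'stream' makes response.data a readable stream in Node.js
    const response = await axios.get('your_file_url', { responseType: 'stream' });
    const fileBlob = response.data;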

From a base64 string (Frontend/Backend)

    // MIME type of your file; only used for the browser Blob (example value)
    const fileType = 'application/pdf';

    let fileBlob;

    // Environment check: Node.js or browser
    if (typeof process === 'object' && process.version) {
      // Node.js: decode the base64 string straight into a Buffer,
      // which can be used for the upload like a Blob in the browser
      fileBlob = Buffer.from(base64String, 'base64');
    } else {
      // Browser: decode base64 to bytes, then wrap the bytes in a Blob
      const binaryStr = atob(base64String);
      const bytes = new Uint8Array(binaryStr.length);
      for (let i = 0; i < binaryStr.length; i++) {
        bytes[i] = binaryStr.charCodeAt(i);
      }
      fileBlob = new Blob([bytes], { type: fileType });
    }

1.2 Upload the file to Totalum

Using Totalum SDK


    const FormData = require('form-data'); // only needed in Node.js

    // replace 'your_file_name' with the name of your file and .png with its extension
    const fileName = 'your_file_name.png';
    const file = fileBlob; // the blob created in the previous step
    const formData = new FormData();
    formData.append('file', file, fileName);
    const result = await totalumClient.files.uploadFile(formData);
    const fileNameId = result.data.data; // identifier used later to scan the file

Using Totalum API


    const FormData = require('form-data'); // only needed in Node.js

    // replace 'your_file_name' with the name of your file and .png with its extension
    const fileName = 'your_file_name.png';
    const file = fileBlob; // the blob created in the previous step
    const formData = new FormData();
    formData.append('file', file, fileName);
    const result = await axios.post('https://api.totalum.app/api/v1/files/upload', formData, {
      headers: {
        'Content-Type': 'multipart/form-data',
        'api-key': 'your api key here', // replace 'your api key here' with your api key
      }
    });
    const fileNameId = result.data.data; // identifier used later to scan the file

2. Scan the document

Using Totalum SDK


    // previous steps go here

    const result = await totalumClient.files.uploadFile(formData);
    const fileNameId = result.data.data;

    // these properties are just an example; replace them with the properties you want to extract from your document
    const properties = {
      "name": {
        "type": "string",
        "description": "the name visible in the top of document"
      },
      "born": {
        "type": "string",
        "format": "date",
        "description": "The date of birth of the person in the document"
      },
      "total": {
        "type": "number",
        "description": "the total amount that the person has to pay in the document"
      },
      "currency": {
        "type": "string",
        "enum": ["EUR", "USD", "GBP", "OTHER"],
        "description": "the currency of the total amount in the document, set to 'OTHER' if the currency is not in the list"
      }
    };
    const scanResult = await totalumClient.files.scanDocument(fileNameId, properties);
    const scanData = scanResult.data.data;
    console.log(scanData);

Using Totalum API


    // previous steps go here

    const result = await axios.post('https://api.totalum.app/api/v1/files/upload', formData, {
      headers: {
        'Content-Type': 'multipart/form-data',
        'api-key': 'your api key here', // replace 'your api key here' with your api key
      }
    });
    const fileNameId = result.data.data;

    const properties = {
      // see the previous example
    };

    const scanResult = await axios.post('https://api.totalum.app/api/v1/files/scan-document', {
      "fileName": fileNameId,
      "properties": properties
    }, {
      headers: {
        'Content-Type': 'application/json',
        'api-key': 'your api key here', // replace 'your api key here' with your api key
      }
    });
    const scanData = scanResult.data.data;
    console.log(scanData);
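
A failed request (for example, one with a rejected API key) surfaces as a thrown axios error. The following is a minimal sketch, not part of the Totalum SDK, of a wrapper that adds the HTTP status to that error. The function name `scanDocument` and the injected `http` parameter (any axios-like client with a `post` method, such as axios itself) are assumptions made for illustration; the URL, headers, and body are the ones shown above.

```javascript
// Hypothetical wrapper; not part of the Totalum SDK.
const SCAN_URL = 'https://api.totalum.app/api/v1/files/scan-document';

// `http` is any axios-like client exposing post(url, body, config).
async function scanDocument(http, apiKey, fileNameId, properties) {
  try {
    const response = await http.post(
      SCAN_URL,
      { fileName: fileNameId, properties },
      { headers: { 'Content-Type': 'application/json', 'api-key': apiKey } }
    );
    return response.data.data; // the extracted data
  } catch (error) {
    // axios attaches the HTTP response, when there is one, to error.response
    const status = error.response ? error.response.status : 'no response';
    throw new Error(`Scan failed (${status}): ${error.message}`);
  }
}
```

With axios installed, a call would look like `await scanDocument(axios, 'your api key here', fileNameId, properties)`.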

2.1 Properties

The properties object is a JSON object, written in the JSON Schema format, that describes the data you want to extract from the document. Its keys are the names of the properties you want to extract, and its values are objects that describe each property's type and other details.

2.1.1 How it works

    const properties = {
      "propertyName": {
        "type": "property type",
        "description": "property description"
      },
      // you can add more properties here
      // etc...
    };

propertyName is the name of the property you want to extract from the document. You can choose any name.

type is the type of the property you want to extract. It can be "string", "number", "array" or "object". An enum is declared as a "string" with an additional "enum" list of allowed values.

description describes the property you want to extract. You can write anything here; the description helps the machine learning model find the data in the document.

Get a string from the document

    "someString": { // replace 'someString' with the name you need
      "type": "string",
      "description": "your description"
    }

Get a date from the document

    "someDate": { // replace 'someDate' with the name you need
      "type": "string",
      "format": "date",
      "description": "your description"
    }

Get a number from the document

    "someNumber": { // replace 'someNumber' with the name you need
      "type": "number",
      "description": "your description"
    }

Get an enum from the document

    "someEnum": { // replace 'someEnum' with the name you need
      "type": "string",
      "enum": ["EUR", "USD", "GBP", "OTHER"], // replace with the values you need
      "description": "your description"
    }

Get an array of strings from the document

    "someArray": { // replace 'someArray' with the name you need
      "type": "array",
      "description": "your description",
      "items": {
        "type": "string",
        "description": "your description"
      }
    }

Get an array of objects from the document

    "someArray": { // replace 'someArray' with the name you need
      "type": "array",
      "description": "your description",
      "items": {
        "type": "object",
        "properties": {
          "rowName": { // replace 'rowName' with the name you need
            "type": "string",
            "description": "your description"
          }
        }
      }
    }
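
A malformed properties object is easy to produce by hand. As a minimal sketch, the hypothetical helper below (`validateProperties` is not part of the Totalum SDK) checks the shape described above before a scan is sent: every property has a `type` and a `description`, an array declares `items`, and an `enum`, when present, is a non-empty list. Whether the API itself requires a description is an assumption; the check simply follows the examples above.

```javascript
// Hypothetical client-side check; not part of the Totalum SDK.
function validateProperties(properties) {
  const problems = [];
  for (const [name, spec] of Object.entries(properties)) {
    if (spec === null || typeof spec !== 'object') {
      problems.push(`${name}: definition must be an object`);
      continue;
    }
    if (typeof spec.type !== 'string') {
      problems.push(`${name}: missing "type"`);
    }
    if (typeof spec.description !== 'string') {
      problems.push(`${name}: missing "description"`);
    }
    if (spec.type === 'array' && (spec.items === null || typeof spec.items !== 'object')) {
      problems.push(`${name}: an array needs "items"`);
    }
    if ('enum' in spec && (!Array.isArray(spec.enum) || spec.enum.length === 0)) {
      problems.push(`${name}: "enum" must be a non-empty array`);
    }
  }
  return problems;
}

const found = validateProperties({
  total: { type: 'number', description: 'the total amount to pay' },
  lines: { type: 'array', description: 'the invoice lines' }, // "items" is missing
});
console.log(found); // one entry, reporting the missing "items" on "lines"
```

An empty array means the definition passed every check and can be sent as is.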

2.2 Properties examples

Remember: you can scan any kind of document. You only need to describe the properties you want to extract from it.

Scan a passport (Pasaporte)

    const properties = {
      "name": {
        "type": "string",
        "description": "the name of the person in the document"
      },
      "nationality": {
        "type": "string",
        "description": "the nationality of the person in the document"
      },
      "passportNumber": {
        "type": "string",
        "description": "the passport number of the person in the document"
      },
      "birthDate": {
        "type": "string",
        "format": "date",
        "description": "the birth date of the person in the document"
      },
      "expirationDate": {
        "type": "string",
        "format": "date",
        "description": "the expiration date of the passport in the document"
      }
    };

Scan a college degree (Título Universitario)

    const properties = {
      "name": {
        "type": "string",
        "description": "the name of the person in the document"
      },
      "degree": {
        "type": "string",
        "description": "the degree of the person in the document"
      },
      "university": {
        "type": "string",
        "description": "the university of the person in the document"
      },
      "graduationDate": {
        "type": "string",
        "format": "date",
        "description": "the graduation date of the person in the document"
      },
      "registrationNumber": {
        "type": "string",
        "description": "the registration number of the person in the document"
      }
    };
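
The building blocks from section 2.1.1 can also be combined in one definition. The following hypothetical invoice example (the property names and descriptions are invented for illustration, not taken from Totalum) mixes a string, a date, an enum, and an array of objects for the line items.

```javascript
// Hypothetical example: the property names are illustrative only.
const invoiceProperties = {
  invoiceNumber: {
    type: 'string',
    description: 'the invoice number printed on the document',
  },
  issueDate: {
    type: 'string',
    format: 'date',
    description: 'the date the invoice was issued',
  },
  currency: {
    type: 'string',
    enum: ['EUR', 'USD', 'GBP', 'OTHER'],
    description: "the currency of the invoice, set to 'OTHER' if it is not in the list",
  },
  lines: {
    type: 'array',
    description: 'one entry per line item in the invoice',
    items: {
      type: 'object',
      properties: {
        concept: { type: 'string', description: 'the description of the line item' },
        amount: { type: 'number', description: 'the amount charged for the line item' },
      },
    },
  },
};
```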

2.3 Configuration options

model

Models available: scanum, scanum-eye-pro.

scanum is the default model: a general-purpose model for documents that is suited to extracting a lot of data. scanum-eye can detect colors and shapes and understand the image. scanum-eye-pro does the same as scanum-eye with a greater capacity to understand the image, and it is more expensive.

scanum-eye-pro works only with images. It is a good choice for a complex image with many shapes and colors, or when the text is hard to extract (it is not in a straight line or is of poor quality).

scanum works with images and PDF files. It is a general model that handles many kinds of documents and is a good choice for a simple document with text in straight lines. It is also better for extracting large amounts of text.
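
The constraints above can be turned into a small rule. As a sketch, the hypothetical helper `chooseModel` (not part of the Totalum SDK) falls back to scanum for PDF files, because scanum-eye-pro works only with images, and otherwise uses scanum-eye-pro only when the caller asks for it. Detecting a PDF by its file extension is a simplifying assumption.

```javascript
// Hypothetical helper; not part of the Totalum SDK.
function chooseModel(fileName, preferEyePro = false) {
  // Assumption: a ".pdf" extension is enough to recognise a PDF file.
  const isPdf = fileName.toLowerCase().endsWith('.pdf');
  if (isPdf) {
    return 'scanum'; // scanum-eye-pro only works with images
  }
  return preferEyePro ? 'scanum-eye-pro' : 'scanum';
}

console.log(chooseModel('invoice.pdf', true)); // scanum
console.log(chooseModel('photo.png', true)); // scanum-eye-pro
```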

removeFileAfterScan

(boolean) Set to true to remove the file after the scan.

If true, the file is permanently deleted once the scan finishes.

returnOcrFullResult

(boolean) Set to true to return the full OCR result.

The OCR result is a large JSON object with all the text extracted from the document, including the coordinates of every word and every character. It also includes all the text inline.

maxPages

(number) set the maximum number of pages to scan in a PDF file

Example: if you set maxPages to 5, and the PDF file has 10 pages, only the first 5 pages will be scanned.

pdfPages

(array of numbers) set the pages to scan in a PDF file.

Example: [1, 5, 10] (only pages 1, 5 and 10 will be scanned)
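
Both page options are easy to get wrong by one. As a minimal sketch, the hypothetical helper below (not part of the Totalum SDK) rejects values that cannot describe PDF pages before the request is sent. It assumes page numbers start at 1, as the example [1, 5, 10] suggests; how the API treats invalid values, or both options set together, is not covered here.

```javascript
// Hypothetical client-side check; not part of the Totalum SDK.
function checkPageOptions(options) {
  const errors = [];
  // Assumption: pages are numbered from 1.
  const isPage = (n) => Number.isInteger(n) && n >= 1;

  if ('maxPages' in options && !isPage(options.maxPages)) {
    errors.push('maxPages must be an integer of at least 1');
  }
  if ('pdfPages' in options) {
    const pages = options.pdfPages;
    if (!Array.isArray(pages) || pages.length === 0 || !pages.every(isPage)) {
      errors.push('pdfPages must be a non-empty array of page numbers starting at 1');
    }
  }
  return errors;
}

console.log(checkPageOptions({ maxPages: 5, pdfPages: [1, 5, 10] })); // []
console.log(checkPageOptions({ pdfPages: [0, 2] })); // one error about pdfPages
```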

scanDescription

(string) An additional description that gives more context about what to scan and how to obtain it.

Example: "The file is an invoice and the Invoice Issuer company is Apple Inc."

processEveryPdfPageAsDifferentScan

(boolean) Set to true to process every page of a PDF as a separate scan. The result is then an array of objects.

If you have a PDF with 3 pages and set this option to true, the result is an array with 3 objects, each containing the data extracted from one page.

Example:

Imagine a PDF with 3 pages where every page contains a different invoice. With this option set to true, you get an array with 3 objects, one per invoice. With it set to false, you get a single object with data extracted from all 3 invoices, so values can be wrong or mixed.
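
Because the result is a single object in one mode and an array in the other, calling code may prefer one shape to work with. A minimal sketch, assuming `scanData` is the `scanResult.data.data` value from the earlier examples; `toScanArray` is a hypothetical helper name and the totals are made-up sample values:

```javascript
// Hypothetical helper: always returns an array of scan objects.
function toScanArray(scanData) {
  return Array.isArray(scanData) ? scanData : [scanData];
}

// Made-up sample data standing in for real scan results.
const single = toScanArray({ total: 10 });
const perPage = toScanArray([{ total: 10 }, { total: 20 }, { total: 30 }]);
console.log(single.length, perPage.length); // 1 3
```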

2.3.1 How to add configuration options

It's easy: add an options field containing a JSON object with the configuration you want.

Using Curl

    # replace the example "name" property with your own properties
    curl -X POST https://api.totalum.app/api/v1/files/scan-document \
      -H 'Content-Type: application/json' \
      -H 'api-key: your api key here' \
      -d '{
        "fileName": "your_file_name_id",
        "properties": {
          "name": {
            "type": "string",
            "description": "the name visible in the top of document"
          }
        },
        "options": {
          "model": "scanum",
          "removeFileAfterScan": false,
          "returnOcrFullResult": false,
          "maxPages": 5,
          "pdfPages": [1,2,3],
          "scanDescription": "The file is an invoice and the Invoice Issuer company is Apple Inc.",
          "processEveryPdfPageAsDifferentScan": false
        }
      }'

Using Totalum SDK


    // previous steps go here

    const options = {
      model: 'scanum',
      removeFileAfterScan: false,
      returnOcrFullResult: false,
      maxPages: 5,
      pdfPages: [1, 2, 3],
      scanDescription: "The file is an invoice and the Invoice Issuer company is Apple Inc.",
      processEveryPdfPageAsDifferentScan: false
    };

    const scanResult = await totalumClient.files.scanDocument(fileNameId, properties, options);

Using Totalum API in Javascript


    // previous steps go here

    const options = {
      model: 'scanum',
      removeFileAfterScan: false,
      returnOcrFullResult: false,
      maxPages: 5,
      pdfPages: [1, 2, 3],
      scanDescription: "The file is an invoice and the Invoice Issuer company is Apple Inc.",
      processEveryPdfPageAsDifferentScan: false
    };

    const scanResult = await axios.post('https://api.totalum.app/api/v1/files/scan-document', {
      "fileName": fileNameId,
      "properties": properties,
      "options": options
    }, {
      headers: {
        'Content-Type': 'application/json',
        'api-key': 'your api key here', // replace 'your api key here' with your api key
      }
    });
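
Not every option has to be sent on each call. As a sketch, the hypothetical helper `buildScanBody` (not part of the Totalum SDK) assembles the request body and leaves out options whose value is undefined, and the whole options field when nothing remains. It assumes the API applies its defaults, such as the scanum model, to omitted options.

```javascript
// Hypothetical helper; not part of the Totalum SDK.
function buildScanBody(fileNameId, properties, options = {}) {
  // Keep only the options that were actually given a value.
  const cleaned = Object.fromEntries(
    Object.entries(options).filter(([, value]) => value !== undefined)
  );
  const body = { fileName: fileNameId, properties };
  if (Object.keys(cleaned).length > 0) {
    body.options = cleaned;
  }
  return body;
}

const body = buildScanBody('your_file_name_id', {}, { maxPages: 5, scanDescription: undefined });
console.log(JSON.stringify(body));
// {"fileName":"your_file_name_id","properties":{},"options":{"maxPages":5}}
```

The returned object can be passed as the second argument of the `axios.post` call shown above.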