Tutorial: How to Send Tables for Recognition

Learning Objectives

In this tutorial, you’ll learn how to:

Prepare a table image for recognition
Authenticate with the Aspose.OCR Cloud API
Configure optimal recognition settings
Submit a table image for OCR processing
Handle the API response

Prerequisites

Before starting this tutorial, you should have:

An Aspose Cloud account with an active subscription or free trial
Client credentials (Client ID and Client Secret) from the Aspose Cloud Dashboard
Basic knowledge of REST APIs and JSON
A tool for making HTTP requests (cURL, Postman, or your preferred programming language)
A scanned or photographed table image for testing

Introduction

Extracting text from tables in scanned documents or photographs is a common requirement in many document processing applications. Aspose.OCR Cloud provides a powerful API that makes this process straightforward. In this tutorial, we’ll walk through the process of sending a table image to the Aspose.OCR Cloud API for recognition.

Real-World Scenario

Imagine you work for a financial services company that receives hundreds of scanned financial statements every day. These statements contain tables with critical data that needs to be extracted and stored in a database. Manually entering this data would be time-consuming and error-prone. By implementing table recognition with Aspose.OCR Cloud, you can automate this process.

Step 1: Obtain an Access Token

Before sending a table for recognition, you need to authenticate with the Aspose.OCR Cloud API by obtaining an access token.

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"

The response will contain an access token that you’ll use in the next step:

{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expires_in": 3600,
  "token_type": "bearer"
}

Step 2: Prepare Your Table Image

For this tutorial, you’ll need a scanned or photographed table. The image should be:

Clearly visible with good contrast
In a common format (JPEG, PNG, BMP, TIFF)
Reasonably aligned (though the API can correct minor skew)

You’ll need to convert your image to a Base64 string. Here’s a simple way to do it:

Using Linux/macOS Terminal:

base64 -i your_table_image.jpg -o base64_output.txt

Using Windows PowerShell:

[Convert]::ToBase64String([IO.File]::ReadAllBytes("your_table_image.jpg")) | Out-File base64_output.txt

Step 3: Configure Recognition Settings

Before sending your request, you need to decide on the recognition settings. Here’s a breakdown of the available options:

Setting	Description	Recommendation
`language`	Recognition language	Use “English” for English text
`makeSkewCorrect`	Auto-correct image tilt	Enable for slightly tilted images (≤15°)
`rotate`	Manual rotation angle	Use if image is rotated >15°
`makeBinarization`	Convert to black and white	Enable for low-contrast images
`makeContrastCorrection`	Enhance contrast	Enable for poor quality scans
`makeUpsampling`	Intelligently upscale image	Enable for small fonts
`makeSpellCheck`	Auto-correct misspelled words	Enable for improved accuracy
`dsrMode`	Document structure analysis	Use “Regions” for tables
`dsrConfidence`	Content block filtering	Use “Default” to start
`resultTypeTable`	Result format	Use “Csv” for structured data

Step 4: Send the Table for Recognition

Now, let’s submit the table image for recognition using a POST request to the Aspose.OCR Cloud API:

curl --request POST --location 'https://api.aspose.cloud/v5.0/ocr/RecognizeTable' \
--header 'Accept: text/plain' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_ACCESS_TOKEN' \
--data-raw '{
  "image": "YOUR_BASE64_IMAGE_STRING",
  "settings": {
    "language": "English",
    "makeSkewCorrect": true,
    "rotate": 0,
    "makeBinarization": false,
    "makeContrastCorrection": true,
    "makeUpsampling": false,
    "makeSpellCheck": true,
    "dsrMode": "Regions",
    "dsrConfidence": "Default",
    "resultTypeTable": "Csv"
  }
}'

Try it yourself!

Replace YOUR_ACCESS_TOKEN with the token you obtained in Step 1 and YOUR_BASE64_IMAGE_STRING with your Base64-encoded image. Run the command and observe the response.

Step 5: Understand the Response

If your request is successful, the API will return a unique identifier (GUID) for your recognition task, like this:

db212989-42b9-422c-8e0d-70acb08474a6

This identifier is crucial as you’ll use it to fetch the recognition results once processing is complete.

Troubleshooting Common Issues

Error: 401 Unauthorized

Verify that your access token is valid and hasn’t expired
Ensure you’re using the correct authorization header format

Error: 400 Bad Request

Check that your JSON body is properly formatted
Ensure your Base64-encoded image string is valid

Command Line Length Limitations

When using cURL in a shell command, you might encounter errors with very large Base64-encoded strings. Use the getconf ARG_MAX command to check the maximum command length on your system. To work around this limitation:

Save your request body to a file and use @filename in your cURL command
Use a REST client like Postman instead of command line
Implement the request using a programming language

What You’ve Learned

In this tutorial, you’ve learned how to:

Authenticate with the Aspose.OCR Cloud API
Prepare a table image for recognition
Configure recognition settings based on your needs
Send a table image to the API for processing
Handle the API response

Next Steps

Now that you’ve successfully submitted a table for recognition, the next step is to learn how to fetch and process the recognition results. Continue to our next tutorial:

Tutorial: How to Fetch Table Recognition Results

Further Practice

To reinforce what you’ve learned:

Try submitting images with different quality levels to see how the settings affect recognition
Experiment with different recognition settings and observe the differences in results
Implement the table submission process in your preferred programming language

Helpful Resources

Have questions about this tutorial? Feel free to post them on our support forum.