Tutorial: How to Send PDFs for Recognition
Learning Objectives
By the end of this tutorial, you will be able to:
- Properly format a PDF recognition request
- Submit a scanned PDF document to Aspose.OCR Cloud API
- Configure recognition settings to optimize text extraction
- Handle the API response for further processing
Prerequisites
Before starting this tutorial, make sure you have:
- An Aspose Cloud account with an active subscription or free trial
- Your Client ID and Client Secret from the Aspose Cloud Dashboard
- Basic knowledge of REST APIs and HTTP requests
- An API client like cURL, Postman, or your programming language’s HTTP library
- A sample scanned PDF document to use for testing
Understanding the Process
When working with scanned PDF documents, the first step is to submit them to the Aspose.OCR Cloud API for processing. This tutorial focuses specifically on how to properly send your PDF files to the recognition service.
Step 1: Obtain an Access Token
Before sending any PDF for recognition, you need to authenticate your request with an access token.
curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
The response will contain your access token:
{
"access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
"expires_in": 3600,
"token_type": "bearer"
}
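Save this token; you'll need it in the following steps. The expires_in value is the token lifetime in seconds (3600 in this sample), so request a new token once it has elapsed.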
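If you prefer to request the token from a script rather than cURL, the same call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names build_token_request and fetch_access_token are hypothetical, and the endpoint, form fields, and headers are copied from the cURL command above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the cURL example in this step.
TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the form-encoded POST request for the client-credentials grant."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the access_token field of the JSON response."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Calling fetch_access_token("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET") performs the network call. Building the request in a separate function makes it possible to inspect what will be sent before sending it.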
Step 2: Prepare Your PDF File
For this tutorial, you’ll need to convert your PDF file to a Base64 string. Here’s how you can do it:
Using Command Line:
Linux/macOS:
base64 -i your_pdf_file.pdf | tr -d '\n'
Windows (PowerShell):
[Convert]::ToBase64String([IO.File]::ReadAllBytes("your_pdf_file.pdf"))
Try it yourself:
Convert a small PDF document to Base64 using the method appropriate for your operating system. For large files, consider using a programming language that can handle the conversion more efficiently.
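For the programming-language route mentioned above, a minimal Python sketch follows. The helper name pdf_to_base64 is hypothetical, and the check for the %PDF- marker is only a sanity check added for illustration, not something the API requires.

```python
import base64
from pathlib import Path


def pdf_to_base64(path: str) -> str:
    """Read a PDF from disk and return its contents as a single-line Base64 string."""
    data = Path(path).read_bytes()
    # Sanity check (an assumption of this sketch): PDF files carry a "%PDF-"
    # marker near the start of the file.
    if b"%PDF-" not in data[:1024]:
        raise ValueError(f"{path} does not look like a PDF file")
    # b64encode never inserts line breaks, so no post-processing is needed.
    return base64.b64encode(data).decode("ascii")
```

Unlike the base64 command-line tool, base64.b64encode returns the result on one line, so the output can be placed directly into the JSON request body.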
Step 3: Configure Recognition Settings
Aspose.OCR Cloud allows you to customize how your PDF is processed. Let’s look at the available settings:
{
"image": "YOUR_BASE64_PDF_STRING",
"settings": {
"language": "English",
"makeSkewCorrect": true,
"rotate": 0,
"makeBinarization": false,
"makeContrastCorrection": true,
"makeUpsampling": false,
"makeSpellCheck": false,
"dsrMode": "Regions",
"dsrConfidence": "Default",
"resultType": "Text"
}
}
Let’s understand these settings:
| Setting | Purpose | Recommended Value |
|---|---|---|
| language | Recognition language | Choose based on your document's language |
| makeSkewCorrect | Fix tilted pages | true for slightly skewed documents |
| rotate | Rotate page if needed | 0 for normal orientation |
| makeBinarization | Convert to black and white | false for most documents |
| makeContrastCorrection | Improve contrast | true for low contrast scans |
| makeUpsampling | Enhance small text | true for documents with small fonts |
| makeSpellCheck | Fix recognition errors | true for improved accuracy |
| dsrMode | Document structure analysis | Regions for most documents |
| dsrConfidence | Content block filtering | Default for balanced results |
| resultType | Output format | Text, Pdf, TextAndPdf or Json |
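The request body above can also be assembled in code, which makes it easier to change one setting at a time. The sketch below is illustrative only: build_payload and DEFAULT_SETTINGS are hypothetical names, and the default values simply mirror the sample JSON in this step rather than any documented server-side defaults.

```python
import json

# Defaults copied from the sample request body in this step.
DEFAULT_SETTINGS = {
    "language": "English",
    "makeSkewCorrect": True,
    "rotate": 0,
    "makeBinarization": False,
    "makeContrastCorrection": True,
    "makeUpsampling": False,
    "makeSpellCheck": False,
    "dsrMode": "Regions",
    "dsrConfidence": "Default",
    "resultType": "Text",
}


def build_payload(pdf_base64: str, **overrides) -> str:
    """Return the JSON request body, applying any per-call setting overrides."""
    unknown = set(overrides) - set(DEFAULT_SETTINGS)
    if unknown:
        # Catch misspelled setting names before they reach the API.
        raise KeyError(f"Unknown settings: {sorted(unknown)}")
    settings = {**DEFAULT_SETTINGS, **overrides}
    return json.dumps({"image": pdf_base64, "settings": settings})
```

For example, build_payload(encoded_pdf, makeUpsampling=True) keeps every sample value except makeUpsampling, which suits a scan with small fonts.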
Learning Checkpoint:
What setting would you adjust if your PDF has small text that’s difficult to read?
Answer:
You would set makeUpsampling to true to intelligently upscale the content for better recognition of small fonts.
Step 4: Send the PDF for Recognition
Now that you have your access token and have prepared your PDF file, you can send it for recognition:
curl --request POST --location 'https://api.aspose.cloud/v5.0/ocr/RecognizePdf' \
--header 'Accept: text/plain' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_ACCESS_TOKEN' \
--data-raw '{
"image": "JVBERi0xLjUNJeLjz9...YOUR_BASE64_PDF_STRING...g0xMTYNJSVFT0YN",
"settings": {
"language": "English",
"makeSpellCheck": true,
"resultType": "Text"
}
}'
Try it yourself:
Create a complete request with your own access token and PDF file. Start with basic settings and gradually experiment with different parameters to see how they affect recognition quality.
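As a starting point for this exercise, the cURL request above might be reproduced in Python as follows. This is a sketch under stated assumptions: build_recognize_request and submit_pdf are hypothetical helper names, while the URL and headers are taken from the cURL command in this step.

```python
import json
import urllib.request

# Endpoint taken from the cURL example in this step.
RECOGNIZE_URL = "https://api.aspose.cloud/v5.0/ocr/RecognizePdf"


def build_recognize_request(access_token: str, pdf_base64: str,
                            settings: dict) -> urllib.request.Request:
    """Build the JSON POST request that submits a Base64-encoded PDF."""
    body = json.dumps({"image": pdf_base64, "settings": settings}).encode("utf-8")
    return urllib.request.Request(
        RECOGNIZE_URL,
        data=body,
        headers={
            "Accept": "text/plain",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


def submit_pdf(access_token: str, pdf_base64: str, settings: dict) -> str:
    """Send the request and return the plain-text response body."""
    request = build_recognize_request(access_token, pdf_base64, settings)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

A call such as submit_pdf(token, encoded_pdf, {"language": "English", "resultType": "Text"}) sends the same body as the cURL example, minus makeSpellCheck. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.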
Step 5: Understanding the Response
If your request is successful, the API will return a unique task ID (GUID) as a plain text response:
db03b9ea-3eed-4954-a1d4-b2712773bbe
This ID is crucial as it will be used to retrieve the recognition results once processing is complete.
Troubleshooting Tips:
- 413 Request Entity Too Large: Your Base64 encoded PDF might be too large. Try splitting your document into smaller parts or using a different approach to submit the file.
- 401 Unauthorized: Check that your access token is valid and properly formatted in the Authorization header.
- 400 Bad Request: Verify your JSON structure, and make sure the Base64 string contains no line breaks or stray characters.
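In a script, the status codes above can be mapped to the same hints so that failures are reported clearly. The sketch below is one possible shape, not part of any Aspose SDK: extract_task_id and RecognitionRequestError are hypothetical names, and treating any 2xx status as success is an assumption.

```python
# Hints paraphrased from the troubleshooting list above.
HINTS = {
    400: "Verify the JSON structure and the Base64 string.",
    401: "Check that the access token is valid and sent as 'Bearer <token>'.",
    413: "The encoded PDF is too large; try splitting the document.",
}


class RecognitionRequestError(Exception):
    """Raised when the submission did not yield a task ID."""


def extract_task_id(status: int, body: str) -> str:
    """Return the task ID from a successful response, or raise with a hint."""
    if 200 <= status < 300:  # assumption: any 2xx status means success
        task_id = body.strip()
        if not task_id:
            raise RecognitionRequestError("Response succeeded but contained no task ID.")
        return task_id
    hint = HINTS.get(status, "Unexpected status; inspect the response body.")
    raise RecognitionRequestError(f"HTTP {status}: {hint}")
```

Keep the returned task ID, since it is what you use later to retrieve the recognition results.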
What You’ve Learned
Congratulations! In this tutorial, you’ve learned how to:
- Authenticate with the Aspose.OCR Cloud API
- Convert a PDF file to Base64 format
- Configure optimal recognition settings for your document
- Submit a PDF document for OCR processing
- Handle the task ID response for later retrieval
Next Steps
Now that you know how to send PDFs for recognition, the next logical step is to learn how to retrieve and process the recognition results. Continue your learning journey with our next tutorial:
Tutorial: Fetching PDF Recognition Results
Further Practice
To reinforce what you’ve learned, try these exercises:
- Submit the same PDF with different recognition settings and compare the results
- Create a script in your preferred programming language that automates the PDF submission process
- Process a multi-page PDF document and analyze the response
Helpful Resources
Have questions about this tutorial? Feel free to post in our support forum for assistance!