Tutorial: Using Aspose.OCR Cloud SDK for Label Recognition
Learning Objectives
In this tutorial, you’ll learn:
- How to set up and configure the Aspose.OCR Cloud SDK
- How to use the SDK to submit labels for recognition
- How to retrieve and process recognition results via the SDK
- How the SDK simplifies error handling and authentication
- How to integrate label recognition into your application workflow
Prerequisites
Before starting this tutorial, you should have:
- Completed the previous tutorials on sending labels and fetching results
- Your Aspose Cloud Client ID and Client Secret
- Basic knowledge of your preferred programming language (.NET in this tutorial)
- Development environment set up for your chosen platform
Practical Scenario
You’re developing a retail inventory management application that needs to quickly scan and process product labels. Using the SDK means less code to write and maintain than calling the REST API directly, because authentication, request encoding, and response parsing are handled for you.
Tutorial Steps
1. Understanding the Benefits of Using the SDK
While you can directly call the Aspose.OCR Cloud REST API as we learned in previous tutorials, using the SDK offers several advantages:
- Simplified Authentication: The SDK handles authentication token management automatically
- Type Safety: Language-specific data structures instead of working with raw JSON
- Error Handling: Better exception handling and error messages
- Reduced Boilerplate: Common operations like encoding/decoding are handled for you
2. Setting Up the SDK
Let’s start by setting up the Aspose.OCR Cloud SDK for .NET:
Create a Basic Project Structure
Create a new console application, add the Aspose.OCR Cloud SDK package to it with your package manager, and then add the following namespace imports:
using Aspose.OCR.Cloud.SDK.Api;
using Aspose.OCR.Cloud.SDK.Model;
using System;
using System.IO;
using System.Text;
3. Initializing the SDK Client
Now, let’s initialize the SDK client with your credentials:
// Initialize the Aspose.OCR Cloud SDK client
RecognizeLabelApi recognizeLabelApi = new RecognizeLabelApi("<Your Client ID>", "<Your Client Secret>");
Security Tip: In a production environment, don’t hardcode your credentials. Use secure configuration management like environment variables, Azure Key Vault, or AWS Secrets Manager to store and retrieve sensitive information.
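As a minimal sketch of the environment-variable option mentioned in the tip, the helper below reads a required setting and fails fast when it is missing. The class name `CredentialLoader` and the variable names `ASPOSE_CLIENT_ID` and `ASPOSE_CLIENT_SECRET` are illustrative assumptions, not names defined by Aspose.

```csharp
using System;

public static class CredentialLoader
{
    // Reads a required setting from the environment and throws if it is missing or blank.
    public static string GetRequired(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }
}

// Usage (variable names are hypothetical):
// string clientId = CredentialLoader.GetRequired("ASPOSE_CLIENT_ID");
// string clientSecret = CredentialLoader.GetRequired("ASPOSE_CLIENT_SECRET");
// RecognizeLabelApi recognizeLabelApi = new RecognizeLabelApi(clientId, clientSecret);
```

Failing at startup with a clear message is usually easier to diagnose than an authentication error returned later by the service.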
4. Configuring Recognition Settings
Next, we’ll configure the recognition settings for our label:
// Configure recognition settings
OCRSettingsRecognizeLabel recognitionSettings = new OCRSettingsRecognizeLabel
{
    Language = Language.English,   // Currently only English is supported
    MakeSkewCorrect = true,        // Automatically correct image tilt
    MakeUpsampling = false,        // Set to true for small text
    ResultType = ResultType.Text   // Get plain text result
};
5. Sending a Label for Recognition
Now, let’s read an image file and send it for recognition:
// Declared outside the try block so the next step can use it
string taskId = null;
try
{
    // Read image file to byte array
    byte[] imageData = File.ReadAllBytes("path_to_your_label_image.jpg");
    // Create recognition request body
    OCRRecognizeLabelBody requestBody = new OCRRecognizeLabelBody(imageData, recognitionSettings);
    // Send image for recognition and get task ID
    taskId = recognizeLabelApi.PostRecognizeLabel(requestBody);
    Console.WriteLine($"Label submitted successfully. Task ID: {taskId}");
    // Continue to the next step to fetch results
}
catch (Exception ex)
{
    Console.WriteLine($"Error sending label for recognition: {ex.Message}");
}
6. Fetching and Processing Recognition Results
After submitting the image, we can fetch the recognition results:
try
{
    // Wait a moment for the recognition to complete
    Console.WriteLine("Waiting for recognition to complete...");
    System.Threading.Thread.Sleep(2000); // Simple delay for demonstration
    // Fetch recognition result using task ID
    OCRResponse result = recognizeLabelApi.GetRecognizeLabel(taskId);
    // Check task status
    if (result.TaskStatus == TaskStatus.Completed)
    {
        // Process completed result
        if (result.Results != null && result.Results.Count > 0)
        {
            // Decode the result - SDK handles Base64 decoding automatically
            string recognizedText = Encoding.UTF8.GetString(result.Results[0].Data);
            Console.WriteLine("Recognition completed successfully!");
            Console.WriteLine("Recognized text:");
            Console.WriteLine(recognizedText);
        }
        else
        {
            Console.WriteLine("Recognition completed but no text was found in the image.");
        }
    }
    else if (result.TaskStatus == TaskStatus.Pending || result.TaskStatus == TaskStatus.Processing)
    {
        Console.WriteLine("Recognition is still in progress. Implement polling to check again.");
    }
    else
    {
        // Handle error or other statuses
        Console.WriteLine($"Recognition failed with status: {result.TaskStatus}");
        if (result.Error != null && result.Error.Messages != null)
        {
            foreach (var message in result.Error.Messages)
            {
                Console.WriteLine($"Error: {message}");
            }
        }
    }
}
catch (Exception ex)
{
    Console.WriteLine($"Error fetching recognition result: {ex.Message}");
}
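The recognized text arrives as a single string, while an inventory workflow will usually want individual label lines. The sketch below shows one way to split and clean that string. It assumes nothing about the SDK; `LabelText` and `ToLines` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LabelText
{
    // Splits recognized text into trimmed, non-empty lines.
    // Handles Windows (\r\n), Unix (\n), and old Mac (\r) line endings.
    public static List<string> ToLines(string recognizedText)
    {
        var lines = new List<string>();
        if (string.IsNullOrEmpty(recognizedText))
        {
            return lines;
        }

        string[] rawLines = recognizedText.Split(
            new[] { "\r\n", "\n", "\r" }, StringSplitOptions.None);

        foreach (string raw in rawLines)
        {
            string line = raw.Trim();
            if (line.Length > 0)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}

// Usage:
// List<string> labelLines = LabelText.ToLines(recognizedText);
```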
7. Implementing a Complete Solution with Polling
For a production-ready implementation, we need to poll the API until the results are available or an error occurs. This is a more robust approach than using a simple delay:
public static string RecognizeLabelWithPolling(string imagePath, int maxAttempts = 10, int delayMs = 1500)
{
    try
    {
        // Initialize API client
        RecognizeLabelApi recognizeLabelApi = new RecognizeLabelApi("<Your Client ID>", "<Your Client Secret>");
        // Configure recognition settings
        OCRSettingsRecognizeLabel recognitionSettings = new OCRSettingsRecognizeLabel
        {
            Language = Language.English,
            MakeSkewCorrect = true,
            MakeUpsampling = true, // Enable upsampling for better accuracy
            ResultType = ResultType.Text
        };
        // Read image file
        Console.WriteLine($"Reading image from: {imagePath}");
        byte[] imageData = File.ReadAllBytes(imagePath);
        // Create recognition request
        OCRRecognizeLabelBody requestBody = new OCRRecognizeLabelBody(imageData, recognitionSettings);
        // Send image for recognition
        Console.WriteLine("Sending label for recognition...");
        string taskId = recognizeLabelApi.PostRecognizeLabel(requestBody);
        Console.WriteLine($"Label submitted. Task ID: {taskId}");
        // Poll for results
        Console.WriteLine("Polling for results...");
        int attempts = 0;
        while (attempts < maxAttempts)
        {
            attempts++;
            Console.WriteLine($"Attempt {attempts} of {maxAttempts}...");
            // Get current status
            OCRResponse result = recognizeLabelApi.GetRecognizeLabel(taskId);
            switch (result.TaskStatus)
            {
                case TaskStatus.Completed:
                    // Success! Process the results
                    if (result.Results != null && result.Results.Count > 0)
                    {
                        string recognizedText = Encoding.UTF8.GetString(result.Results[0].Data);
                        Console.WriteLine("Recognition completed successfully!");
                        return recognizedText;
                    }
                    else
                    {
                        return "Recognition completed but no text was found in the image.";
                    }
                case TaskStatus.Error:
                    // Handle error
                    string errorMsg = "Recognition failed with error: ";
                    if (result.Error != null && result.Error.Messages != null)
                    {
                        errorMsg += string.Join(", ", result.Error.Messages);
                    }
                    else
                    {
                        errorMsg += "Unknown error";
                    }
                    return errorMsg;
                case TaskStatus.NotExist:
                    return "Task ID not found or results expired.";
                default:
                    // Still processing, wait and try again
                    Console.WriteLine($"Status: {result.TaskStatus} - Waiting for processing to complete...");
                    System.Threading.Thread.Sleep(delayMs);
                    break;
            }
        }
        // If we get here, polling timed out
        return "Recognition timed out after maximum polling attempts.";
    }
    catch (Exception ex)
    {
        return $"Error in recognition process: {ex.Message}";
    }
}
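The fixed-delay loop above can also be written as a reusable helper that backs off between attempts. This is only a sketch, not part of the SDK: `Poller.TryPoll` is a hypothetical name, and the doubling delay with a cap is one common choice rather than anything the service requires. The helper takes delegates, so it does not depend on any SDK type.

```csharp
using System;
using System.Threading;

public static class Poller
{
    // Calls fetch() until isDone(result) returns true or maxAttempts is reached.
    // The delay doubles after each unfinished attempt, capped at maxDelayMs.
    // Returns true when isDone was satisfied, false when attempts ran out.
    public static bool TryPoll<T>(
        Func<T> fetch,
        Func<T, bool> isDone,
        out T result,
        int maxAttempts = 10,
        int initialDelayMs = 500,
        int maxDelayMs = 8000)
    {
        int delayMs = initialDelayMs;
        result = default(T);

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            result = fetch();
            if (isDone(result))
            {
                return true;
            }

            // No point in sleeping after the final attempt.
            if (attempt < maxAttempts)
            {
                Thread.Sleep(delayMs);
                delayMs = Math.Min(delayMs * 2, maxDelayMs);
            }
        }
        return false;
    }
}

// Usage with the calls from this tutorial:
// bool finished = Poller.TryPoll(
//     () => recognizeLabelApi.GetRecognizeLabel(taskId),
//     r => r.TaskStatus != TaskStatus.Pending && r.TaskStatus != TaskStatus.Processing,
//     out OCRResponse response);
```

Starting with a short delay returns fast results quickly, while the growing delay avoids hammering the API when a task takes longer.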
8. Creating a Complete Application
Let’s put everything together in a complete console application:
using Aspose.OCR.Cloud.SDK.Api;
using Aspose.OCR.Cloud.SDK.Model;
using System;
using System.IO;
using System.Text;
namespace LabelRecognitionApp
{
    class Program
    {
        // Credentials - in real app, store these securely
        private const string ClientId = "<Your Client ID>";
        private const string ClientSecret = "<Your Client Secret>";
        static void Main(string[] args)
        {
            Console.WriteLine("Aspose.OCR Cloud Label Recognition Demo");
            Console.WriteLine("======================================");
            try
            {
                // Prompt for image path
                Console.Write("Enter path to label image: ");
                string imagePath = Console.ReadLine();
                if (!File.Exists(imagePath))
                {
                    Console.WriteLine("Error: File not found!");
                    return;
                }
                // Process the image
                string recognizedText = RecognizeLabelWithPolling(imagePath);
                // Display results
                Console.WriteLine("\n=== Recognition Result ===");
                Console.WriteLine(recognizedText);
                Console.WriteLine("=========================");
                // Example of how you might use the result
                Console.WriteLine("\nWould you like to save the result to a file? (y/n)");
                // ReadLine returns null when input is closed, so guard before normalizing
                string response = (Console.ReadLine() ?? string.Empty).Trim().ToLowerInvariant();
                if (response == "y" || response == "yes")
                {
                    File.WriteAllText("recognition_result.txt", recognizedText);
                    Console.WriteLine("Result saved to recognition_result.txt");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"An error occurred: {ex.Message}");
            }
            Console.WriteLine("\nPress any key to exit...");
            Console.ReadKey();
        }
        public static string RecognizeLabelWithPolling(string imagePath, int maxAttempts = 10, int delayMs = 1500)
        {
            // Implementation from previous section, passing ClientId and ClientSecret
            // to RecognizeLabelApi instead of the inline placeholders
            // ...
        }
    }
}
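One thing to notice in the application above: `RecognizeLabelWithPolling` returns recognized text and error messages through the same string, so an error message can end up saved to `recognition_result.txt`. A small result type keeps the two apart. The sketch below is an illustration only; `RecognitionOutcome` is a hypothetical type and is not part of the SDK.

```csharp
using System;

// Hypothetical result type: carries either recognized text or an error message.
public sealed class RecognitionOutcome
{
    public bool Succeeded { get; }
    public string Text { get; }
    public string ErrorMessage { get; }

    private RecognitionOutcome(bool succeeded, string text, string errorMessage)
    {
        Succeeded = succeeded;
        Text = text;
        ErrorMessage = errorMessage;
    }

    // Creates a successful outcome; a null text is stored as an empty string.
    public static RecognitionOutcome Success(string text)
    {
        return new RecognitionOutcome(true, text ?? string.Empty, string.Empty);
    }

    // Creates a failed outcome; a null message is stored as an empty string.
    public static RecognitionOutcome Failure(string errorMessage)
    {
        return new RecognitionOutcome(false, string.Empty, errorMessage ?? string.Empty);
    }
}

// Usage, assuming the polling method is changed to return RecognitionOutcome:
// if (outcome.Succeeded)
//     File.WriteAllText("recognition_result.txt", outcome.Text);
// else
//     Console.Error.WriteLine(outcome.ErrorMessage);
```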
Try It Yourself
Now it’s your turn to implement label recognition using the SDK:
- Create a new project in your preferred development environment
- Install the Aspose.OCR Cloud SDK package
- Set up your credentials (get them from the Aspose Cloud Dashboard)
- Implement the complete recognition flow with polling
- Test it with different types of label images
Troubleshooting Common Issues
| Issue | Possible Solution |
|---|---|
| SDK installation failures | Check your package manager settings and internet connection |
| Authentication errors | Verify your Client ID and Client Secret are correct and active |
| File not found exceptions | Double-check your file paths and ensure the application has read permissions |
| Memory issues with large images | Resize large images before processing or increase your application’s memory allocation |
| Timeout during polling | Increase the maximum polling attempts or delay between attempts |
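Two of the issues in the table, missing files and oversized images, can be caught before any request is sent. The following sketch shows a pre-flight check along those lines. `ImagePreflight.TryValidate` is a hypothetical helper, and the 10 MB default is an illustrative threshold, not a documented service limit.

```csharp
using System;
using System.IO;

public static class ImagePreflight
{
    // Returns true when the file exists, is not empty, and is within maxBytes.
    // On failure, 'problem' describes what is wrong; on success it is empty.
    public static bool TryValidate(
        string imagePath,
        out string problem,
        long maxBytes = 10L * 1024 * 1024) // illustrative limit, adjust to your needs
    {
        problem = string.Empty;

        if (string.IsNullOrWhiteSpace(imagePath) || !File.Exists(imagePath))
        {
            problem = $"File not found: {imagePath}";
            return false;
        }

        long length = new FileInfo(imagePath).Length;
        if (length == 0)
        {
            problem = "The image file is empty.";
            return false;
        }
        if (length > maxBytes)
        {
            problem = $"The image is {length} bytes, above the {maxBytes}-byte limit.";
            return false;
        }
        return true;
    }
}

// Usage:
// if (!ImagePreflight.TryValidate(imagePath, out string problem))
// {
//     Console.WriteLine(problem);
//     return;
// }
```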
What You’ve Learned
In this tutorial, you’ve learned:
- How to set up and configure the Aspose.OCR Cloud SDK
- How to structure your code to handle the complete label recognition workflow
- How to implement proper polling and error handling
- How to process and utilize recognition results
- How to create a complete console application for label recognition
Next Steps
Now that you’ve mastered the basics of using the SDK for label recognition, you might want to:
- Explore how to optimize recognition accuracy in the tutorial Optimizing Label Recognition Accuracy
Further Practice
To reinforce your learning:
- Convert the console application to a web API that can be called from a mobile app
- Add support for batch processing multiple images
- Implement more advanced error handling and logging
- Create a simple user interface for uploading images and displaying results
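As a starting point for the batch-processing exercise, the sketch below runs a recognition function over several image paths and records a result or an error for each one. It is a sequential illustration under stated assumptions: `BatchRecognizer` is a hypothetical name, and the `recognize` delegate stands in for a method such as `RecognizeLabelWithPolling`.

```csharp
using System;
using System.Collections.Generic;

public static class BatchRecognizer
{
    // Runs 'recognize' for every path and returns a map of path to result text.
    // An exception for one image is recorded and does not stop the batch.
    public static Dictionary<string, string> RecognizeAll(
        IEnumerable<string> imagePaths,
        Func<string, string> recognize)
    {
        var results = new Dictionary<string, string>();
        foreach (string path in imagePaths)
        {
            try
            {
                results[path] = recognize(path);
            }
            catch (Exception ex)
            {
                results[path] = $"Error: {ex.Message}";
            }
        }
        return results;
    }
}

// Usage:
// var results = BatchRecognizer.RecognizeAll(
//     Directory.GetFiles("labels", "*.jpg"),
//     path => RecognizeLabelWithPolling(path));
```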
Example Implementation in Other Languages
While this tutorial focused on .NET, the Aspose.OCR Cloud SDK is available for other languages as well. Here’s a brief example of how you might implement similar functionality in Python:
# This is a conceptual example - check Aspose documentation for the actual Python SDK API
import aspose_ocr_cloud
import time

def recognize_label_with_polling(image_path, max_attempts=10, delay_seconds=1.5):
    # Initialize client
    client = aspose_ocr_cloud.RecognizeLabelApi(client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET")
    # Read image file
    with open(image_path, 'rb') as image_file:
        image_data = image_file.read()
    # Configure settings
    settings = aspose_ocr_cloud.OCRSettingsRecognizeLabel(
        language="English",
        make_skew_correct=True,
        make_upsampling=True,
        result_type="Text"
    )
    # Send image for recognition
    request_body = aspose_ocr_cloud.OCRRecognizeLabelBody(image=image_data, settings=settings)
    task_id = client.post_recognize_label(request_body)
    # Poll for results
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        print(f"Checking status: attempt {attempts} of {max_attempts}")
        # Get current status
        result = client.get_recognize_label(task_id)
        if result.task_status == "Completed":
            # Process completed result
            if result.results and len(result.results) > 0:
                # Decode result
                recognized_text = result.results[0].data.decode('utf-8')
                return recognized_text
            else:
                return "No text found in image"
        elif result.task_status in ["Error", "NotExist"]:
            # Handle error states
            error_msg = "Error occurred: "
            if result.error and result.error.messages:
                error_msg += ", ".join(result.error.messages)
            return error_msg
        else:
            # Still processing
            time.sleep(delay_seconds)
    return "Recognition timed out"
Helpful Resources
Have questions about this tutorial? Feel free to post them on our support forum for assistance!