Tutorial: Understanding Available Object Detection Labels

Learning Objectives

In this tutorial, you’ll learn:

What object detection labels are and why they matter
The complete list of object categories supported by Aspose.Imaging Cloud
How to effectively use different object categories in your applications
Strategies for working with specific object types
Best practices for object detection with various categories

Prerequisites

Before starting this tutorial, it’s helpful to:

Have basic familiarity with object detection concepts
Understand how the Aspose.Imaging Cloud API works (covered in our previous tutorials

Introduction

Object detection labels (also called classes or categories) define what types of objects the AI model can recognize in images. Aspose.Imaging Cloud’s object detection service is built on a comprehensive model that can identify a wide range of everyday objects, from people and animals to household items and environmental elements.

Understanding available labels is crucial for:

Developing targeted applications that focus on specific object types
Interpreting detection results accurately
Filtering detections to focus on objects relevant to your use case

1. What Are Object Detection Labels?

In computer vision, labels are predefined categories that an AI model is trained to recognize. When the model processes an image, it assigns each detected object to one of these categories based on its visual features.

Aspose.Imaging Cloud uses the Single Shot Detector (SSD) method with a rich set of labels organized into intuitive categories.

2. Complete List of Available Labels

The Aspose object detection service currently supports the following labels:

People and Personal Items

person
hat
backpack
umbrella
shoe
eye-glasses
handbag
tie
suitcase

Vehicles and Transportation

bicycle
car
motorcycle
airplane
bus
train
truck
boat

Animals

bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe

Street and Traffic Objects

traffic-light
fire-hydrant
street-sign
stop-sign
parking-meter
bench

Sports and Recreation

frisbee
skis
snowboard
sports-ball
kite
baseball-bat
baseball-glove
skateboard
surfboard
tennis-racket

Food and Dining

bottle
plate
wine-glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
salad

Household and Furniture

chair
couch
potted-plant
bed
mirror
dining-table
window
desk
toilet
door
tv
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
blender
book
clock
vase
scissors
teddy-bear
hair-drier
toothbrush
hair-brush

Indoor Materials and Structures

banner
blanket
cabinet
cage
cardboard
carpet
ceiling-other
ceiling-tile
cloth
clothes
counter
cupboard
curtain
desk-stuff
mirror-stuff
napkin
paper
pillow
plastic
rug
shelf
towel
textile-other

Outdoor and Environmental Elements

branch
bridge
building-other
bush
clouds
dirt
fence
flower
fog
food-other
fruit
furniture-other
grass
gravel
ground-other
hill
house
leaves
light
mat
metal
moss
mountain
mud
net
pavement
plant-other
platform
playingfield
railing
railroad
river
road
rock
roof
sand
sea
sky-other
skyscraper
snow
solid-other
stairs
stone
straw
structural-other
table
tent
tree
vegetable
water-other
waterdrops
wood

Building Materials and Surfaces

floor-marble
floor-other
floor-stone
floor-tile
floor-wood
wall-brick
wall-concrete
wall-other
wall-panel
wall-stone
wall-tile
wall-wood
window-blind
window-other

3. Using Labels in Your API Requests

When using the Aspose.Imaging Cloud API for object detection, you don’t need to specify which labels to detect—the service automatically identifies all recognizable objects in the image. However, you can filter the results in your application code based on the labels you’re interested in.

Try it yourself: Filtering Detection Results by Label

Here’s a simple C# example of filtering detection results to focus on specific labels:

// Tutorial Code Example - Filtering Object Detection Results by Label
using Aspose.Imaging.Cloud.Sdk.Api;
using Aspose.Imaging.Cloud.Sdk.Model.Requests;
using System;
using System.Collections.Generic;
using System.Linq;

namespace AsposeTutorials
{
    class ObjectLabelFilteringTutorial
    {
        // Your Client ID and Client Secret
        private const string ClientId = "YOUR_CLIENT_ID";
        private const string ClientSecret = "YOUR_CLIENT_SECRET";

        static void Main(string[] args)
        {
            // Create instance of API
            var imagingApi = new ImagingApi(ClientId, ClientSecret);
            
            // Detect objects in an image
            string imageName = "street_scene.jpg";
            
            try
            {
                var request = new GetObjectBoundsRequest(
                    name: imageName,
                    method: "ssd",
                    threshold: 40, // Lower threshold to catch more objects
                    includeLabel: true,
                    includeScore: true
                );

                var detectionResult = imagingApi.GetObjectBounds(request);
                
                // Define categories of interest
                var vehicleLabels = new HashSet<string> { "car", "truck", "motorcycle", "bus", "bicycle" };
                var personLabels = new HashSet<string> { "person" };
                var trafficLabels = new HashSet<string> { "traffic-light", "stop-sign" };
                
                // Filter results by category
                var vehicles = detectionResult.DetectedObjects
                    .Where(obj => vehicleLabels.Contains(obj.Label))
                    .ToList();
                    
                var people = detectionResult.DetectedObjects
                    .Where(obj => personLabels.Contains(obj.Label))
                    .ToList();
                    
                var trafficSigns = detectionResult.DetectedObjects
                    .Where(obj => trafficLabels.Contains(obj.Label))
                    .ToList();
                
                // Report results by category
                Console.WriteLine($"Found {vehicles.Count} vehicles:");
                foreach (var vehicle in vehicles)
                {
                    Console.WriteLine($"- {vehicle.Label} (Score: {vehicle.Score:F2})");
                }
                
                Console.WriteLine($"\nFound {people.Count} people");
                
                Console.WriteLine($"\nFound {trafficSigns.Count} traffic signs:");
                foreach (var sign in trafficSigns)
                {
                    Console.WriteLine($"- {sign.Label} (Score: {sign.Score:F2})");
                }
                
                // Count all unique object types
                var labelCounts = detectionResult.DetectedObjects
                    .GroupBy(obj => obj.Label)
                    .Select(g => new { Label = g.Key, Count = g.Count() })
                    .OrderByDescending(x => x.Count);
                    
                Console.WriteLine("\nObject types detected:");
                foreach (var label in labelCounts)
                {
                    Console.WriteLine($"- {label.Label}: {label.Count}");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Error: {ex.Message}");
            }
        }
    }
}

4. Practical Applications for Different Label Categories

Different label categories are suited for specific application domains:

Retail and Inventory

Focus on product-related labels like:

food items (apple, banana, etc.)
packaged goods (bottle)
clothing items (backpack, shoe, etc.)

Traffic Monitoring

Focus on transportation and infrastructure labels:

vehicles (car, truck, bus, etc.)
traffic signs (traffic-light, stop-sign)
roads and bridges

Home Automation

Focus on household and furniture labels:

appliances (refrigerator, microwave, etc.)
furniture (chair, couch, bed, etc.)
electronics (tv, laptop, etc.)

Security Systems

Focus on people and activity-related labels:

person
vehicles (car, motorcycle)
personal items (suitcase, backpack)

5. Label Accuracy and Challenges

The accuracy of object detection varies by label and is influenced by factors like:

Object size: Larger objects typically have higher detection accuracy
Occlusion: Partially hidden objects are harder to detect correctly
Lighting conditions: Poor lighting can reduce detection accuracy
Object orientation: Unusual angles can challenge the detection model
Visual similarity: Some objects may be confused with visually similar categories

Label Confidence Tips

For critical applications, filter objects by confidence score (e.g., above 0.7)
For general-purpose applications, a threshold of 0.5 (50%) often works well
For maximum recall at the cost of precision, use a lower threshold (e.g., 0.3)

6. Advanced Label Strategies

Hierarchical Grouping

You can create logical hierarchies to organize detection results:

# Tutorial Code Example - Hierarchical Label Grouping in Python
def group_detections_hierarchically(detections):
    hierarchy = {
        "living_beings": {
            "person": [],
            "animals": {
                "pets": ["dog", "cat"],
                "farm": ["cow", "horse", "sheep"],
                "wild": ["elephant", "bear", "zebra", "giraffe", "bird"]
            }
        },
        "vehicles": {
            "road": ["car", "motorcycle", "bus", "truck", "bicycle"],
            "rail": ["train"],
            "air": ["airplane"],
            "water": ["boat"]
        },
        # Add more categories as needed
    }
    
    # Implementation would map detections to this hierarchy
    # (Code for the actual mapping would go here)
    
    return grouped_results

Label Co-occurrence

Detecting relationships between commonly co-occurring objects can provide context:

// Tutorial Code Example - Detecting Label Co-occurrence
function detectScenes(detectedObjects) {
    // Define scene patterns based on co-occurring labels
    const scenePatterns = {
        "kitchen": ["refrigerator", "microwave", "sink", "oven"],
        "dining": ["dining-table", "chair", "plate", "cup", "fork", "knife"],
        "living_room": ["couch", "tv", "remote", "chair"],
        "office": ["desk", "laptop", "keyboard", "mouse"],
        "traffic": ["car", "traffic-light", "road"],
        // Add more scenes as needed
    };
    
    // Count matches for each scene
    let sceneScores = {};
    const labels = detectedObjects.map(obj => obj.label);
    
    for (const [scene, sceneLabels] of Object.entries(scenePatterns)) {
        const matchCount = sceneLabels.filter(label => labels.includes(label)).length;
        const matchPercentage = matchCount / sceneLabels.length;
        sceneScores[scene] = matchPercentage;
    }
    
    // Find the best matching scene(s)
    return Object.entries(sceneScores)
        .filter(([_, score]) => score > 0.5) // At least 50% match
        .sort(([_, scoreA], [_, scoreB]) => scoreB - scoreA)
        .map(([scene, score]) => ({ scene, confidence: score }));
}

What You’ve Learned

In this tutorial, you’ve learned:

The complete list of object categories supported by Aspose.Imaging Cloud
How to filter detection results by specific labels
Practical applications for different label categories
Advanced strategies for working with labels
Best practices for improving detection accuracy

Further Practice

To reinforce your learning:

Test the object detection on images containing multiple object types
Create a label filtering system for your specific application domain
Build a simple scene recognition system based on detected labels
Experiment with different confidence thresholds for various labels

Next Steps

Now that you understand the available labels, you can build more sophisticated applications using the techniques from our previous tutorials:

Helpful Resources

Have questions about object detection labels? We’re here to help! Reach out on our support forum with any questions about implementing label-specific object detection in your applications.