Document Metadata and Properties: Complete Tutorial
Prerequisites
Before starting this tutorial, make sure you have:
- An Aspose Cloud account (sign up for a free trial if you don’t have one)
- Your Client ID and Client Secret from the Aspose Cloud Dashboard
- Basic understanding of REST APIs
- Familiarity with your chosen programming language
Introduction to Document Properties in Excel
Excel files can contain various metadata, also known as document properties. These properties provide information about the file itself, such as its author, creation date, last modified date, and more. Document properties fall into two categories:
- Built-in properties: Standard properties like Title, Author, Subject, Keywords, Category, etc.
- Custom properties: User-defined properties that you can create to store specific information.
Using Aspose.Cells Cloud API, you can programmatically manage these document properties in your Excel files without requiring Microsoft Excel to be installed, providing a powerful way to automate metadata management.
Tutorial Steps
1. Authentication
First, let’s set up authentication to access the Aspose.Cells Cloud API:
# Base request URL
BASE_URL="https://api.aspose.cloud/v3.0/cells"
# Get JWT token
TOKEN=$(curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
| jq -r '.access_token')
Remember to replace YOUR_CLIENT_ID
and YOUR_CLIENT_SECRET
with your actual credentials.
2. Getting All Document Properties
Let’s start by retrieving all document properties from an Excel file:
Using cURL:
curl -X GET "$BASE_URL/test.xlsx/documentproperties" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json"
Using Python SDK:
import asposecellscloud
from asposecellscloud.apis.cells_api import CellsApi
from asposecellscloud.models import *
from asposecellscloud.api_client import ApiClient
# Configure API key authorization
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"
# Create API client
api_instance = CellsApi(client_id, client_secret)
try:
# Get all document properties
response = api_instance.cells_properties_get_document_properties(
name="test.xlsx")
# Print property information
if response.document_properties and response.document_properties.document_property_list:
print(f"Document Properties:")
# Print each property
for prop in response.document_properties.document_property_list:
print(f" {prop.name}: {prop.value} (Built-in: {prop.built_in})")
else:
print("No document properties found.")
except Exception as e:
print("Exception when calling CellsApi->cells_properties_get_document_properties: %s\n" % e)
3. Getting a Specific Document Property
To retrieve a specific document property by its name:
Using cURL:
curl -X GET "$BASE_URL/test.xlsx/documentproperties/Author" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json"
Using C# SDK:
using System;
using Aspose.Cells.Cloud.SDK.Api;
using Aspose.Cells.Cloud.SDK.Model;
namespace AsposeCellsCloudTutorial
{
class Program
{
static void Main(string[] args)
{
// Configure API credentials
var cellsApi = new CellsApi("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");
// Specify parameters
string fileName = "test.xlsx";
string propertyName = "Author";
try
{
// Get the specific document property
var response = cellsApi.CellsPropertiesGetDocumentProperty(
fileName,
propertyName);
// Display property details
if (response.DocumentProperty != null)
{
Console.WriteLine("Document Property Details:");
Console.WriteLine($"Name: {response.DocumentProperty.Name}");
Console.WriteLine($"Value: {response.DocumentProperty.Value}");
Console.WriteLine($"Built-in: {response.DocumentProperty.BuiltIn}");
}
else
{
Console.WriteLine($"Property '{propertyName}' not found.");
}
}
catch (Exception e)
{
Console.WriteLine("Error: " + e.Message);
}
}
}
}
4. Adding or Updating a Document Property
To add a new document property or update an existing one:
Using cURL:
curl -X PUT "$BASE_URL/test.xlsx/documentproperties/Company" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '{"Value": "Aspose", "Name": "Company", "BuiltIn": "False"}'
Using Java SDK:
import com.aspose.cells.cloud.api.CellsApi;
import com.aspose.cells.cloud.model.*;
import com.aspose.cells.cloud.request.*;
public class UpdateDocumentPropertyExample {
public static void main(String[] args) {
// Configure API credentials
CellsApi cellsApi = new CellsApi("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");
try {
// Specify parameters
String fileName = "test.xlsx";
String propertyName = "Company";
// Create document property object
DocumentProperty property = new DocumentProperty();
property.setName(propertyName);
property.setValue("Aspose");
property.setBuiltIn("False");
// Create request
PutDocumentPropertyRequest request = new PutDocumentPropertyRequest(
fileName,
propertyName,
property,
null, // folder
null // storage name
);
// Add/update the document property
cellsApi.putDocumentProperty(request);
System.out.println("Document property added/updated successfully.");
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
e.printStackTrace();
}
}
}
5. Deleting a Specific Document Property
To delete a specific document property by its name:
Using cURL:
curl -X DELETE "$BASE_URL/test.xlsx/documentproperties/Company" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json"
Using Node.js SDK:
const { CellsApi } = require('asposecellscloud');
// Configure API credentials
const cellsApi = new CellsApi('YOUR_CLIENT_ID', 'YOUR_CLIENT_SECRET');
// Specify parameters
const fileName = "test.xlsx";
const propertyName = "Company";
// Delete the document property
cellsApi.cellsPropertiesDeleteDocumentProperty(fileName, propertyName)
.then(() => {
console.log(`Document property '${propertyName}' deleted successfully.`);
})
.catch(error => {
console.error("Error:", error);
});
6. Deleting All Document Properties
To remove all document properties from an Excel file:
Using cURL:
curl -X DELETE "$BASE_URL/test.xlsx/documentproperties" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json"
Using Python SDK:
import asposecellscloud
from asposecellscloud.apis.cells_api import CellsApi
from asposecellscloud.models import *
from asposecellscloud.api_client import ApiClient
# Configure API key authorization
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"
# Create API client
api_instance = CellsApi(client_id, client_secret)
try:
# Specify parameters
name = "test.xlsx"
# Delete all document properties
response = api_instance.cells_properties_delete_document_properties(
name=name
)
print("All document properties deleted successfully.")
except Exception as e:
print("Exception when calling CellsApi->cells_properties_delete_document_properties: %s\n" % e)
7. Getting Metadata Without Using Storage
Aspose.Cells Cloud API also allows you to get metadata from Excel files without storing them first:
Using cURL:
curl -X POST "$BASE_URL/metadata/get" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: multipart/form-data" \
-H "Accept: application/json" \
-F "file=@test.xlsx"
Using C# SDK:
using System;
using System.IO;
using Aspose.Cells.Cloud.SDK.Api;
using Aspose.Cells.Cloud.SDK.Model;
namespace AsposeCellsCloudTutorial
{
class Program
{
static void Main(string[] args)
{
// Configure API credentials
var cellsApi = new CellsApi("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");
// Specify file path
string filePath = "test.xlsx";
try
{
// Read file content
var fileStream = File.OpenRead(filePath);
// Get metadata without storage
var response = cellsApi.CellsMetadataGet(fileStream);
// Display metadata
if (response != null && response.Count > 0)
{
Console.WriteLine("Document Properties:");
foreach (var property in response)
{
Console.WriteLine($"{property.Name}: {property.Value}");
}
}
else
{
Console.WriteLine("No metadata found.");
}
}
catch (Exception e)
{
Console.WriteLine("Error: " + e.Message);
}
}
}
}
8. Updating Metadata Without Using Storage
You can also update metadata in Excel files without storing them first:
Using cURL:
curl -X POST "$BASE_URL/metadata/update" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: multipart/form-data" \
-H "Accept: application/json" \
-F "file=@test.xlsx" \
-F 'properties=[{"name":"Author","value":"John Doe"},{"name":"Company","value":"Aspose"},{"name":"Category","value":"Tutorial"}]'
Using Java SDK:
import com.aspose.cells.cloud.api.CellsApi;
import com.aspose.cells.cloud.model.*;
import com.aspose.cells.cloud.request.*;
import java.io.File;
import java.util.ArrayList;
import java.util.List;
public class UpdateMetadataWithoutStorageExample {
public static void main(String[] args) {
// Configure API credentials
CellsApi cellsApi = new CellsApi("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET");
try {
// Specify file path
String filePath = "test.xlsx";
// Create properties to update
List<DocumentProperty> properties = new ArrayList<>();
DocumentProperty author = new DocumentProperty();
author.setName("Author");
author.setValue("John Doe");
properties.add(author);
DocumentProperty company = new DocumentProperty();
company.setName("Company");
company.setValue("Aspose");
properties.add(company);
DocumentProperty category = new DocumentProperty();
category.setName("Category");
category.setValue("Tutorial");
properties.add(category);
// Update metadata without storage
File result = cellsApi.cellsMetadataUpdate(new File(filePath), properties);
System.out.println("Metadata updated successfully. Updated file: " + result.getAbsolutePath());
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
e.printStackTrace();
}
}
}
9. Deleting Metadata Without Using Storage
Similarly, you can delete metadata from Excel files without storing them:
Using cURL:
curl -X POST "$BASE_URL/metadata/delete" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: multipart/form-data" \
-H "Accept: application/json" \
-F "file=@test.xlsx" \
-F "type=all"
Using Python SDK:
import asposecellscloud
from asposecellscloud.apis.cells_api import CellsApi
from asposecellscloud.models import *
from asposecellscloud.api_client import ApiClient
# Configure API key authorization
client_id = "YOUR_CLIENT_ID"
client_secret = "YOUR_CLIENT_SECRET"
# Create API client
api_instance = CellsApi(client_id, client_secret)
try:
# Specify file path
file_path = "test.xlsx"
# Delete metadata without storage
with open(file_path, 'rb') as file:
response = api_instance.cells_metadata_delete(
file=file,
type="all" # Delete all properties (can also be "BuiltIn" or "Custom")
)
# Save updated file
with open("updated_" + file_path, 'wb') as out_file:
for file_info in response.files:
out_file.write(file_info.file_content)
print("Metadata deleted successfully. Updated file saved as: updated_" + file_path)
except Exception as e:
print("Exception when calling CellsApi->cells_metadata_delete: %s\n" % e)
Try It Yourself
Now it’s time to practice what you’ve learned:
- Create a new Excel file or use an existing one
- Retrieve all document properties and see what’s already there
- Add several custom properties (e.g., Project, Department, Status)
- Update one of the built-in properties (e.g., Author, Title)
- Delete a specific property
- Try the methods that don’t require storage to process metadata
- Finally, delete all properties and verify they’re gone
Troubleshooting Tips
- Error 401 (Unauthorized): Make sure your Client ID and Client Secret are correct and that your token hasn’t expired.
- Error 404 (Not Found): Check that your file name and property names are correct.
- Property not being updated: For built-in properties, ensure you’re using the exact property name with correct capitalization (e.g., “Author” not “author”).
- Custom property not appearing: When creating custom properties, make sure to set BuiltIn to “False”.
- Values not appearing as expected: Document property values are stored as strings, so ensure proper conversion if you’re working with numbers or dates.
What You’ve Learned
In this tutorial, you’ve learned how to:
- Retrieve all document properties from an Excel file
- Get a specific document property by its name
- Add new document properties or update existing ones
- Delete specific document properties by name
- Remove all document properties from a file
- Work with metadata without using storage (direct file processing)
- Update multiple properties at once
You now have the skills to programmatically manage document metadata in Excel files using the Aspose.Cells Cloud API, allowing you to automate document information management.
Next Steps
Consider exploring these related tutorials: