Computer Vision
Overview
Invictus AI's Computer Vision capabilities let your applications interpret images and video. The platform covers object detection, image classification, face analysis, OCR, and scene understanding, along with custom model training and video analytics, all through the same client API.
Key Capabilities
Object Detection
Identify and locate multiple objects within images with bounding boxes.
const result = await client.vision.detectObjects({
  image: imageBuffer, // or a URL or base64-encoded string
  minConfidence: 0.6
});
Example response:
{
  "objects": [
    {
      "label": "car",
      "confidence": 0.98,
      "boundingBox": {
        "x1": 45,
        "y1": 120,
        "x2": 325,
        "y2": 250
      }
    },
    {
      "label": "person",
      "confidence": 0.95,
      "boundingBox": {
        "x1": 580,
        "y1": 90,
        "x2": 660,
        "y2": 380
      }
    }
  ]
}
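The example response can be post-processed with a few small helpers. The sketch below is not part of the Invictus AI client; `boxSize`, `filterDetections`, and `largestDetection` are hypothetical names, and it assumes that `(x1, y1)` is the top-left corner and `(x2, y2)` the bottom-right corner in pixels, which the response above suggests but does not state.

```javascript
// Hypothetical helpers for a detectObjects response of the shape shown above.
// Assumption: (x1, y1) is the top-left and (x2, y2) the bottom-right corner, in pixels.

/** Width, height, and area of a bounding box. */
function boxSize({ x1, y1, x2, y2 }) {
  const width = x2 - x1;
  const height = y2 - y1;
  return { width, height, area: width * height };
}

/** Keep detections that match an optional label and a minimum confidence. */
function filterDetections(result, { label, minConfidence = 0 } = {}) {
  return result.objects.filter(
    (obj) =>
      (label === undefined || obj.label === label) &&
      obj.confidence >= minConfidence
  );
}

/** Return the detection with the largest bounding-box area, or null if there are none. */
function largestDetection(result) {
  let best = null;
  for (const obj of result.objects) {
    if (
      best === null ||
      boxSize(obj.boundingBox).area > boxSize(best.boundingBox).area
    ) {
      best = obj;
    }
  }
  return best;
}
```

Applied to the example response, `boxSize` gives a 280 by 130 box for the car and an 80 by 290 box for the person, so `largestDetection` returns the car.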
Image Classification
Categorize images into predefined or custom classes.
const result = await client.vision.classifyImage({
  image: imageBuffer,
  categories: ["landscape", "portrait", "food", "architecture", "animals"]
});
Face Detection and Analysis
Detect faces in images and analyze facial attributes.
const result = await client.vision.analyzeFaces({
  image: imageBuffer,
  returnAttributes: ["age", "gender", "emotion"]
});
OCR (Optical Character Recognition)
Extract text from images with advanced layout recognition.
const result = await client.vision.extractText({
  image: imageBuffer,
  language: "auto", // or a specific language code
  enhanceText: true
});
Scene Understanding
Analyze and understand the content and context of complex scenes.
const result = await client.vision.analyzeScene({
  image: imageBuffer,
  returnDetails: ["objects", "activities", "environment"]
});
Advanced Features
Custom Vision Models
Train custom vision models using your own labeled data for specialized use cases.
// First, create a custom vision project
const project = await client.vision.createCustomProject({
  name: "Defect Detection Model",
  type: "objectDetection" // or "classification"
});

// Upload and tag training images
await client.vision.uploadTrainingImages({
  projectId: project.id,
  images: [
    {
      data: image1Buffer,
      tags: ["defect_type_a"]
    },
    {
      data: image2Buffer,
      tags: ["defect_type_b"]
    }
  ]
});

// Train the model
const trainingJob = await client.vision.trainModel({
  projectId: project.id
});

// Once trained, use your custom model
const prediction = await client.vision.predict({
  projectId: project.id,
  modelId: trainingJob.modelId,
  image: testImageBuffer
});
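Real training sets usually contain far more than two images, so it can help to build the `images` payload programmatically and upload it in chunks. The sketch below is one way to do that. `toTrainingBatches`, `uploadInBatches`, the `{ buffer, tags }` sample shape, and the default `batchSize` of 50 are all assumptions made for illustration; this page does not document a per-request upload limit, so check the API reference before choosing a batch size.

```javascript
// Hypothetical helpers; only uploadTrainingImages and the { data, tags }
// image shape come from the example above.

/** Convert labeled samples into batches of { data, tags } objects. */
function toTrainingBatches(samples, batchSize = 50) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const images = samples.map(({ buffer, tags }) => ({
    data: buffer,
    tags: [...tags] // copy so later changes to the sample do not affect the payload
  }));
  const batches = [];
  for (let i = 0; i < images.length; i += batchSize) {
    batches.push(images.slice(i, i + batchSize));
  }
  return batches;
}

/** Upload all samples one batch at a time; resolves to the number of requests made. */
async function uploadInBatches(client, projectId, samples, batchSize = 50) {
  const batches = toTrainingBatches(samples, batchSize);
  for (const images of batches) {
    // Sequential uploads keep the request rate low; parallel uploads are
    // possible but depend on rate limits not described here.
    await client.vision.uploadTrainingImages({ projectId, images });
  }
  return batches.length;
}
```

With a project created as above, a call such as `await uploadInBatches(client, project.id, samples)` would replace the single `uploadTrainingImages` call.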
Video Analytics
Process video streams or files to extract valuable insights.
// Start a video analysis job
const job = await client.vision.analyzeVideo({
  videoUrl: "https://example.com/video.mp4",
  features: ["objectTracking", "activityDetection", "sceneChange"]
});

// Check job status
const status = await client.vision.getJobStatus({
  jobId: job.id
});

// Retrieve results when complete
if (status.state === "completed") {
  const results = await client.vision.getJobResults({
    jobId: job.id
  });
}
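The snippet above checks the job status only once, and a video job will rarely be finished by then. A minimal polling sketch follows. `waitForJobResults`, `intervalMs`, and `maxAttempts` are hypothetical names, and the `"failed"` state is an assumption; only `getJobStatus`, `getJobResults`, and the `"completed"` state appear in this guide.

```javascript
// Hypothetical polling helper built on getJobStatus / getJobResults.

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Poll a job until it completes, then return its results.
 * Throws if the job reports failure or does not finish within maxAttempts checks.
 */
async function waitForJobResults(
  client,
  jobId,
  { intervalMs = 5000, maxAttempts = 60 } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await client.vision.getJobStatus({ jobId });

    if (status.state === "completed") {
      return client.vision.getJobResults({ jobId });
    }
    // Assumption: a failed job reports state "failed"; adjust to the real value.
    if (status.state === "failed") {
      throw new Error(`Job ${jobId} failed`);
    }
    // Wait before the next check, but not after the final attempt.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(
    `Job ${jobId} did not complete after ${maxAttempts} status checks`
  );
}
```

After starting a job, `const results = await waitForJobResults(client, job.id);` would replace the one-off status check. A fixed interval keeps the sketch short; exponential backoff may suit long videos better.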
Industry Applications
Retail
Visual Search: Allow customers to search for products using images
Shelf Analysis: Monitor product placement and stock levels
Customer Journey Analytics: Track customer paths and interactions in stores
Manufacturing
Quality Control: Detect defects in products during manufacturing
Equipment Monitoring: Monitor equipment for signs of wear or malfunction
Safety Compliance: Ensure workers are wearing proper safety equipment
Healthcare
Medical Imaging Analysis: Assist in diagnosis through medical image interpretation
Patient Monitoring: Monitor patient movements and activities
Medication Verification: Ensure correct medication administration
Security
Surveillance: Detect unusual activities or unauthorized personnel
Access Control: Facial recognition for secured areas
Crowd Analysis: Monitor crowd density and movement patterns
Performance Metrics
| Capability | Accuracy | Latency |
| --- | --- | --- |
| Object Detection | 94.7% | ~300ms |
| Image Classification | 96.2% | ~150ms |
| Face Analysis | 95.5% | ~200ms |
| OCR | 98.3% (for clear text) | ~250ms |
| Scene Understanding | 92.1% | ~400ms |
Next Steps
Try our Image Recognition Tutorial
Explore Computer Vision API Reference
Check out Use Cases for your industry