Computer Vision

Overview

Invictus AI's Computer Vision capabilities let your applications interpret visual information from images and videos. The platform provides object detection, image classification, face analysis, OCR, and scene understanding, along with custom model training and video analytics, all through the same client API.

Key Capabilities

Object Detection

Identify and locate multiple objects within images with bounding boxes.

const result = await client.vision.detectObjects({
  image: imageBuffer, // or URL or base64 encoded string
  minConfidence: 0.6
});

Example response:

{
  "objects": [
    {
      "label": "car",
      "confidence": 0.98,
      "boundingBox": {
        "x1": 45,
        "y1": 120,
        "x2": 325,
        "y2": 250
      }
    },
    {
      "label": "person",
      "confidence": 0.95,
      "boundingBox": {
        "x1": 580,
        "y1": 90,
        "x2": 660,
        "y2": 380
      }
    }
  ]
}
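
Each bounding box in the example response is given as two opposite corners, (x1, y1) and (x2, y2). The following is a minimal sketch of turning those corners into a width, height, and area, and of picking out detections by label. The helper names toRect and findByLabel are hypothetical and not part of the client, and the sketch assumes the coordinates are pixel offsets, which the response above does not state explicitly.

```javascript
// Hypothetical helpers; not part of the Invictus AI client.
// Assumes boundingBox coordinates are pixel offsets where (x1, y1) and
// (x2, y2) are opposite corners, as in the example response.

function toRect(boundingBox) {
  const { x1, y1, x2, y2 } = boundingBox;
  const width = x2 - x1;
  const height = y2 - y1;
  return { x: x1, y: y1, width, height, area: width * height };
}

// Returns detections with the given label at or above minConfidence.
function findByLabel(result, label, minConfidence = 0) {
  return result.objects
    .filter((o) => o.label === label && o.confidence >= minConfidence)
    .map((o) => ({
      label: o.label,
      confidence: o.confidence,
      rect: toRect(o.boundingBox),
    }));
}

// The example response shown above.
const exampleResult = {
  objects: [
    { label: "car", confidence: 0.98, boundingBox: { x1: 45, y1: 120, x2: 325, y2: 250 } },
    { label: "person", confidence: 0.95, boundingBox: { x1: 580, y1: 90, x2: 660, y2: 380 } },
  ],
};

const cars = findByLabel(exampleResult, "car", 0.6);
console.log(cars[0].rect);
// prints { x: 45, y: 120, width: 280, height: 130, area: 36400 }
```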

Image Classification

Categorize images into predefined or custom classes.

const result = await client.vision.classifyImage({
  image: imageBuffer,
  categories: ["landscape", "portrait", "food", "architecture", "animals"]
});

Face Detection and Analysis

Detect faces in images and analyze facial attributes.

const result = await client.vision.analyzeFaces({
  image: imageBuffer,
  returnAttributes: ["age", "gender", "emotion"]
});

OCR (Optical Character Recognition)

Extract text from images with advanced layout recognition.

const result = await client.vision.extractText({
  image: imageBuffer,
  language: "auto", // or specific language code
  enhanceText: true
});

Scene Understanding

Analyze and understand the content and context of complex scenes.

const result = await client.vision.analyzeScene({
  image: imageBuffer,
  returnDetails: ["objects", "activities", "environment"]
});
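
When one image needs several of these analyses, the calls can be issued together instead of one after another. The sketch below is an illustration only: analyzeImage is a hypothetical helper, and it assumes the client allows concurrent requests and that each method returns a promise, as the await usage above suggests. It uses only the methods and parameters already shown in this section.

```javascript
// Hypothetical helper; not part of the Invictus AI client.
// Runs several of the calls shown above on one image at the same time.
// Promise.allSettled keeps the other results usable if one call fails.
async function analyzeImage(client, imageBuffer) {
  const calls = {
    objects: client.vision.detectObjects({ image: imageBuffer, minConfidence: 0.6 }),
    text: client.vision.extractText({ image: imageBuffer, language: "auto" }),
    scene: client.vision.analyzeScene({
      image: imageBuffer,
      returnDetails: ["objects", "activities", "environment"],
    }),
  };

  const names = Object.keys(calls);
  const settled = await Promise.allSettled(Object.values(calls));

  // Map each outcome back to its name: { ok: true, value } or { ok: false, error }.
  const report = {};
  settled.forEach((outcome, i) => {
    report[names[i]] =
      outcome.status === "fulfilled"
        ? { ok: true, value: outcome.value }
        : { ok: false, error: outcome.reason };
  });
  return report;
}

// Usage, given a configured client and an image buffer:
//   const report = await analyzeImage(client, imageBuffer);
//   if (report.text.ok) { /* use report.text.value */ }
```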

Advanced Features

Custom Vision Models

Train custom vision models using your own labeled data for specialized use cases.

// First, create a custom vision project
const project = await client.vision.createCustomProject({
  name: "Defect Detection Model",
  type: "objectDetection" // or "classification"
});

// Upload and tag training images
await client.vision.uploadTrainingImages({
  projectId: project.id,
  images: [
    {
      data: image1Buffer,
      tags: ["defect_type_a"]
    },
    {
      data: image2Buffer,
      tags: ["defect_type_b"]
    }
  ]
});

// Train the model
const trainingJob = await client.vision.trainModel({
  projectId: project.id
});

// Once trained, use your custom model
const prediction = await client.vision.predict({
  projectId: project.id,
  modelId: trainingJob.modelId,
  image: testImageBuffer
});
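
For more than a handful of training images, writing the images array by hand becomes tedious. The sketch below builds it from a map of tag names to image buffers, using the same { data, tags } shape as the upload example above. buildTrainingImages is a hypothetical helper rather than part of the client, and it assumes one tag per image.

```javascript
// Hypothetical helper; not part of the Invictus AI client.
// Turns { tagName: [Buffer, ...] } into the [{ data, tags }] array
// that the uploadTrainingImages example above passes as `images`.
function buildTrainingImages(buffersByTag) {
  const images = [];
  for (const [tag, buffers] of Object.entries(buffersByTag)) {
    for (const data of buffers) {
      images.push({ data, tags: [tag] });
    }
  }
  return images;
}

// Stand-in buffers; in practice these would hold the bytes of image files.
const trainingImages = buildTrainingImages({
  defect_type_a: [Buffer.from("image-1"), Buffer.from("image-2")],
  defect_type_b: [Buffer.from("image-3")],
});

console.log(trainingImages.map((img) => img.tags[0]));
// prints [ 'defect_type_a', 'defect_type_a', 'defect_type_b' ]
```

The resulting array can then be passed as the images field of uploadTrainingImages together with the project ID.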

Video Analytics

Process video streams or files to extract valuable insights.

// Start a video analysis job
const job = await client.vision.analyzeVideo({
  videoUrl: "https://example.com/video.mp4",
  features: ["objectTracking", "activityDetection", "sceneChange"]
});

// Check job status (the analysis runs as a background job, so repeat this check until it finishes)
const status = await client.vision.getJobStatus({
  jobId: job.id
});

// Retrieve results when complete
if (status.state === "completed") {
  const results = await client.vision.getJobResults({
    jobId: job.id
  });
}
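
The snippet above checks the job status only once, so a job that is still running yields no results. A minimal polling sketch follows. waitForJobResults is a hypothetical helper, not part of the client. The "completed" state comes from the example above, while the "failed" state, the polling interval, and the timeout are assumptions to adjust for your setup.

```javascript
// Hypothetical helper; not part of the Invictus AI client.
// Polls getJobStatus until the job reports "completed", then fetches results.
// Assumptions: "completed" is the success state (as in the snippet above);
// "failed" as a terminal error state is a guess and may need adjusting.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForJobResults(
  client,
  jobId,
  { intervalMs = 5000, timeoutMs = 600000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await client.vision.getJobStatus({ jobId });
    if (status.state === "completed") {
      return client.vision.getJobResults({ jobId });
    }
    if (status.state === "failed") {
      throw new Error(`Job ${jobId} failed`);
    }
    // Stop before sleeping past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Job ${jobId} did not complete within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Usage, given a configured client and the job from analyzeVideo:
//   const results = await waitForJobResults(client, job.id);
```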

Industry Applications

Retail

Visual Search: Allow customers to search for products using images
Shelf Analysis: Monitor product placement and stock levels
Customer Journey Analytics: Track customer paths and interactions in stores

Manufacturing

Quality Control: Detect defects in products during manufacturing
Equipment Monitoring: Monitor equipment for signs of wear or malfunction
Safety Compliance: Ensure workers are wearing proper safety equipment

Healthcare

Medical Imaging Analysis: Assist in diagnosis through medical image interpretation
Patient Monitoring: Monitor patient movements and activities
Medication Verification: Ensure correct medication administration

Security

Surveillance: Detect unusual activities or unauthorized personnel
Access Control: Facial recognition for secured areas
Crowd Analysis: Monitor crowd density and movement patterns

Performance Metrics

Task                    Accuracy                  Processing Time
----------------------  ------------------------  ---------------
Object Detection        94.7%                     ~300 ms
Image Classification    96.2%                     ~150 ms
Face Analysis           95.5%                     ~200 ms
OCR                     98.3% (for clear text)    ~250 ms
Scene Understanding     92.1%                     ~400 ms
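
The processing times listed above can be used for a rough estimate of how long a batch will take. The sketch below is back-of-the-envelope arithmetic only. It assumes each figure applies to a single image, that requests run one after another, and that network latency and rate limits are negligible, none of which the figures above specify. estimateSeconds is a hypothetical helper.

```javascript
// Approximate processing times listed above, in milliseconds.
const processingTimeMs = {
  "Object Detection": 300,
  "Image Classification": 150,
  "Face Analysis": 200,
  "OCR": 250,
  "Scene Understanding": 400,
};

// Rough sequential estimate; assumes each figure is per image and ignores
// network latency, queuing, and any rate limits.
function estimateSeconds(task, imageCount) {
  const ms = processingTimeMs[task];
  if (ms === undefined) throw new Error(`Unknown task: ${task}`);
  return (ms * imageCount) / 1000;
}

console.log(estimateSeconds("Object Detection", 1000)); // prints 300
```

Under these assumptions, 1,000 object detection requests take about 300 seconds, or five minutes, when run sequentially.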

Next Steps

Try our Image Recognition Tutorial
Explore Computer Vision API Reference
Check out Use Cases for your industry
