Quickstart
This guide walks you through installing inertialai-chroma, connecting to a Chroma collection, and running your first embedding and similarity search.
Prerequisites
- Python 3.11 or later
- An InertialAI API key (sign up here)
- Docker installed (to run Chroma locally)
Step 1: Start a Chroma Instance
Run Chroma using the official Docker image:
docker run -d \
--name chroma \
-p 8000:8000 \
chromadb/chroma
Verify it's running:
curl http://localhost:8000/api/v2/heartbeat
Step 2: Install the Package
pip install inertialai-chroma
Or, with uv:
uv add inertialai-chroma
Step 3: Set Your API Key
Set your InertialAI API key as an environment variable:
export INERTIALAI_API_KEY="your-api-key"
InertialAIEmbeddingFunction reads this variable by default. If you use a different variable name, see the Configuration Reference.
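If your deployment stores the key under another variable name, one workaround is to copy it to the default name before constructing the embedding function. This is a minimal sketch assuming the constructor reads only `INERTIALAI_API_KEY`; the helper name and the `MY_DEPLOYMENT_API_KEY` variable are illustrative, and the Configuration Reference documents the supported options:

```python
import os

def export_api_key(source_var: str, target_var: str = "INERTIALAI_API_KEY") -> bool:
    """Copy an API key from a custom env var to the default one.

    Returns True if the key was copied, False if the source is missing
    or the target is already set.
    """
    value = os.environ.get(source_var)
    if value and target_var not in os.environ:
        os.environ[target_var] = value
        return True
    return False

# Example: key stored under a hypothetical deployment-specific name
os.environ["MY_DEPLOYMENT_API_KEY"] = "your-api-key"
export_api_key("MY_DEPLOYMENT_API_KEY")
# InertialAIEmbeddingFunction() will now pick the key up as usual.
```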
Step 4: Embed Text Documents
Create a collection and add text documents. Chroma invokes InertialAIEmbeddingFunction automatically whenever you add() documents without precomputed embeddings and whenever you query() with query_texts:
import chromadb
from inertialai_chroma import InertialAIEmbeddingFunction
# Connect to the running Chroma instance
client = chromadb.HttpClient(host="localhost", port=8000)
# Create the embedding function — reads INERTIALAI_API_KEY from env
ef = InertialAIEmbeddingFunction()
# Create a collection with the embedding function attached
collection = client.create_collection("sensors", embedding_function=ef)
# Add text documents — InertialAI's API is called in the background
collection.add(
documents=[
"temperature spike detected at noon on sensor array B",
"stable overnight temperature readings within normal range",
"humidity levels elevated in zone 3 during afternoon hours",
"pressure anomaly recorded at 14:32 on sensor unit 7",
"all systems nominal — environmental conditions within threshold",
],
ids=["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"],
)
# Query — again, embedding happens automatically
results = collection.query(
query_texts=["unusual thermal event"],
n_results=2,
)
print(results["documents"])
# [['temperature spike detected at noon on sensor array B',
# 'pressure anomaly recorded at 14:32 on sensor unit 7']]
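The return value of query() is a dict of parallel lists, one inner list per query text. If you also request distances with include=["documents", "distances"], you can pair each hit with its score. The sketch below runs on a hard-coded dict shaped like Chroma's return value; the distance values are illustrative, not real output:

```python
# Results mirror the shape Chroma returns: one inner list per query text.
results = {
    "ids": [["doc-1", "doc-4"]],
    "documents": [[
        "temperature spike detected at noon on sensor array B",
        "pressure anomaly recorded at 14:32 on sensor unit 7",
    ]],
    "distances": [[0.18, 0.42]],  # illustrative values
}

# Pair each hit with its distance (lower means more similar for the
# default distance space).
for doc_id, doc, dist in zip(
    results["ids"][0], results["documents"][0], results["distances"][0]
):
    print(f"{doc_id}  dist={dist:.2f}  {doc}")
```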
Step 5: Embed Time-Series Data
To embed raw sensor readings, serialize them as a JSON string using json.dumps(). Time-series data is structured as a list of channels, where each channel is a list of numerical readings ordered by time:
import json
import chromadb
from inertialai_chroma import InertialAIEmbeddingFunction
client = chromadb.HttpClient(host="localhost", port=8000)
ef = InertialAIEmbeddingFunction()
collection = client.create_collection("cnc-machines", embedding_function=ef)
collection.add(
documents=[
json.dumps({
"time_series": [
[2100, 2150, 2180, 2140, 2120], # RPM
[65, 66, 68, 67, 66], # Temperature (°C)
[8.2, 8.5, 8.7, 8.4, 8.3], # Vibration (mm/s)
]
}),
json.dumps({
"time_series": [
[1800, 1820, 1850, 1800, 1790], # RPM
[72, 74, 78, 80, 82], # Temperature (°C) — rising
[9.1, 9.4, 10.2, 11.0, 12.1], # Vibration — elevated
]
}),
],
ids=["machine-42-normal", "machine-17-fault"],
)
# Query with a new reading to find the most similar stored pattern
results = collection.query(
query_texts=[
json.dumps({
"time_series": [
[1810, 1830, 1860, 1810, 1800],
[71, 73, 77, 79, 81],
[9.0, 9.3, 10.1, 10.8, 11.9],
]
})
],
n_results=1,
)
print(results["ids"])
# [['machine-17-fault']]
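Each document above serializes channels of equal length. A small helper can catch ragged channels before they reach the embedding service. This is a sketch: the function name and the equal-length requirement are assumptions on my part, not part of the package API:

```python
import json

def serialize_series(channels: list[list[float]]) -> str:
    """Validate and serialize sensor channels for collection.add().

    Assumes every channel holds the same number of time-ordered readings.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    lengths = {len(ch) for ch in channels}
    if len(lengths) != 1:
        raise ValueError(f"channels have mismatched lengths: {sorted(lengths)}")
    return json.dumps({"time_series": channels})

doc = serialize_series([
    [2100, 2150, 2180, 2140, 2120],  # RPM
    [65, 66, 68, 67, 66],            # Temperature (°C)
])
```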
Step 6: Embed Multi-Modal Data
The most powerful capability of inertialai-chroma is combining raw sensor readings with natural-language context into a single vector. Include both text and time_series keys in your JSON object:
import json
import chromadb
from inertialai_chroma import InertialAIEmbeddingFunction
client = chromadb.HttpClient(host="localhost", port=8000)
ef = InertialAIEmbeddingFunction()
collection = client.create_collection("patient-vitals", embedding_function=ef)
collection.add(
documents=[
json.dumps({
"text": "Post-exercise recovery, patient ID 1001, age 28, male, marathon runner",
"time_series": [
[155, 148, 140, 133, 127, 122], # Heart rate (BPM) — recovering
[98.9, 98.8, 98.7, 98.6, 98.5, 98.4], # Body temp (°F)
],
}),
json.dumps({
"text": "Resting baseline, patient ID 1002, age 45, female, sedentary lifestyle",
"time_series": [
[72, 71, 73, 72, 71, 72], # Heart rate (BPM) — stable
[98.2, 98.2, 98.3, 98.2, 98.2, 98.1], # Body temp (°F)
],
}),
json.dumps({
"text": "Atrial fibrillation episode, patient ID 1003, age 67, male, cardiac history",
"time_series": [
[88, 112, 79, 134, 95, 118], # Heart rate (BPM) — irregular
[98.6, 98.7, 98.8, 98.9, 99.0, 99.1], # Body temp (°F)
],
}),
],
ids=["patient-1001", "patient-1002", "patient-1003"],
)
# Find stored records most similar to a new patient's readings
results = collection.query(
query_texts=[
json.dumps({
"text": "Elevated heart rate post-activity, patient ID 2001, age 31, male",
"time_series": [
[162, 153, 144, 136, 129, 124],
[99.0, 98.9, 98.8, 98.7, 98.6, 98.5],
],
})
],
n_results=2,
)
print(results["ids"])
# [['patient-1001', 'patient-1002']]
The multi-modal vector captures both the shape of the signal and the semantic meaning of the text — producing more precise matches than either modality alone.
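When you have many records, keeping the ids and documents lists aligned by hand gets error-prone. A sketch of one way to build both lists from a single mapping (the helper name is mine; only the parallel ids/documents contract of collection.add() and the text/time_series payload shape come from the examples above):

```python
import json

def build_batch(records: dict[str, dict]) -> tuple[list[str], list[str]]:
    """Turn {id: payload-dict} into the parallel ids/documents lists
    that collection.add() expects, serializing each payload to JSON."""
    ids = list(records)
    documents = [json.dumps(records[i]) for i in ids]
    return ids, documents

ids, documents = build_batch({
    "patient-1002": {
        "text": "Resting baseline, patient ID 1002, age 45, female",
        "time_series": [[72, 71, 73], [98.2, 98.2, 98.3]],
    },
})
# collection.add(documents=documents, ids=ids)
```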
Next Steps
- Multi-Modal Embeddings — Learn more about the multi-modal input format with examples across industrial IoT, healthcare, financial, and other domains.
- Collection Persistence — Understand how collections are safely persisted to disk and reloaded across process restarts.
- Configuration Reference — Explore all constructor options including dimensionality reduction, custom timeouts, and distance spaces.