Collection Persistence
Chroma can persist collection configuration to disk so that collections — including their embedding function settings — survive process restarts. InertialAIEmbeddingFunction fully supports this via Chroma's get_config() / build_from_config() protocol, while ensuring that API credentials are never written to disk.
How It Works
When Chroma serializes a collection, it calls get_config() on the embedding function and stores the result. When the collection is reopened, Chroma calls build_from_config() with the stored config to reconstruct the embedding function automatically.
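The round trip described above can be sketched with a small stand-in class. This is only an illustration of the get_config() / build_from_config() pattern: ToyEmbeddingFunction and its settings are hypothetical, it does not subclass anything from Chroma, and a JSON string stands in for whatever Chroma writes to disk.

```python
import json
from typing import Any, Dict


class ToyEmbeddingFunction:
    """Hypothetical stand-in; not part of Chroma or inertialai_chroma."""

    def __init__(self, model_name: str = "toy-model", timeout: float = 60.0) -> None:
        self.model_name = model_name
        self.timeout = timeout

    def get_config(self) -> Dict[str, Any]:
        # Return only the JSON-serializable settings needed to rebuild this object.
        return {"model_name": self.model_name, "timeout": self.timeout}

    @staticmethod
    def build_from_config(config: Dict[str, Any]) -> "ToyEmbeddingFunction":
        # Recreate an equivalent instance from the stored settings.
        return ToyEmbeddingFunction(
            model_name=config["model_name"],
            timeout=config["timeout"],
        )


# Simulate "store on serialize, rebuild on reopen" with a JSON string as the disk.
original = ToyEmbeddingFunction(model_name="toy-model-v2", timeout=30.0)
stored = json.dumps(original.get_config())
restored = ToyEmbeddingFunction.build_from_config(json.loads(stored))
```

The restored object carries the same settings as the original, which is the property that makes it unnecessary to pass the embedding function again when reopening a collection.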
InertialAIEmbeddingFunction.get_config() returns the name of the environment variable that holds the API key, not the key value itself:
{
    "api_key_env_var": "INERTIALAI_API_KEY",  # name only, never the key value
    "model_name": "inertial-embed-alpha",
    "dimensions": None,
    "timeout": 60.0,
}
At load time, build_from_config() reads the API key from the named environment variable. This means:
- Persisted collection files contain no credentials, so committing them to version control does not expose your API key.
- The environment variable must be set in the process that reopens the collection.
- Rotating your API key only requires updating the environment variable — no changes to stored collection data are needed.
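The following sketch illustrates why rotation needs no change to stored data. It uses a hypothetical stand-in class, ToyKeyedFunction, and a made-up variable name, TOY_API_KEY. Whether the real InertialAIEmbeddingFunction resolves the key at construction time or on first use is not stated here, so the eager lookup below is an assumption.

```python
import json
import os
from typing import Any, Dict


class ToyKeyedFunction:
    """Hypothetical stand-in showing name-only credential storage."""

    def __init__(self, api_key_env_var: str = "TOY_API_KEY") -> None:
        self.api_key_env_var = api_key_env_var
        key = os.environ.get(api_key_env_var)
        if not key:
            raise ValueError(f"Environment variable {api_key_env_var} is not set")
        self._api_key = key  # kept in memory only, never serialized

    def get_config(self) -> Dict[str, Any]:
        # Only the variable *name* goes into the persisted config.
        return {"api_key_env_var": self.api_key_env_var}

    @staticmethod
    def build_from_config(config: Dict[str, Any]) -> "ToyKeyedFunction":
        # The key is looked up again from the environment at load time.
        return ToyKeyedFunction(api_key_env_var=config["api_key_env_var"])


# Session 1: create and "persist" the config while the old key is active.
os.environ["TOY_API_KEY"] = "old-secret"
stored = json.dumps(ToyKeyedFunction().get_config())

# Key rotation: only the environment changes; the stored config is untouched.
os.environ["TOY_API_KEY"] = "new-secret"
reloaded = ToyKeyedFunction.build_from_config(json.loads(stored))
```

In this sketch the stored string never contains either secret, and the reloaded object picks up the rotated key without any edit to the persisted config.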
Example: Persist and Reload a Collection
import json
import chromadb
from inertialai_chroma import InertialAIEmbeddingFunction
# --- Session 1: Create the collection and add data ---
client = chromadb.PersistentClient(path="./chroma-data")
ef = InertialAIEmbeddingFunction()
collection = client.create_collection("sensors", embedding_function=ef)
collection.add(
    documents=[
        json.dumps({
            "text": "CNC machine #42, cutting aluminum, normal operation",
            "time_series": [
                [2100, 2150, 2180, 2140, 2120],
                [65, 66, 68, 67, 66],
            ],
        }),
        json.dumps({
            "text": "CNC machine #42, cutting aluminum, tool wear detected",
            "time_series": [
                [2100, 2090, 2070, 2050, 2030],
                [65, 68, 72, 77, 83],
            ],
        }),
    ],
    ids=["run-001", "run-002"],
)
print("Collection created and data added.")
# --- Session 2: Reopen the collection (in a new process) ---
client2 = chromadb.PersistentClient(path="./chroma-data")
# Chroma reconstructs InertialAIEmbeddingFunction from the stored config,
# resolving INERTIALAI_API_KEY from the environment automatically.
# No need to pass embedding_function= here.
collection2 = client2.get_collection("sensors")
results = collection2.query(
    query_texts=["elevated vibration and temperature"],
    n_results=1,
)
print(results["ids"])
# [['run-002']]
Note: When reopening a persisted collection, do not pass embedding_function= to get_collection(). Chroma reconstructs the embedding function from the stored config automatically. If you pass an embedding function here, Chroma will use it to validate the stored config, which can raise an error if settings such as model_name or dimensions do not match.
Custom Environment Variable Names
If you manage multiple InertialAI API keys — for example, separate keys for development and production — configure api_key_env_var to point to the appropriate variable:
import chromadb
from inertialai_chroma import InertialAIEmbeddingFunction
client = chromadb.PersistentClient(path="./chroma-prod")
# The name "INERTIALAI_API_KEY_PROD" is stored in the collection config
ef = InertialAIEmbeddingFunction(api_key_env_var="INERTIALAI_API_KEY_PROD")
collection = client.create_collection("prod-sensors", embedding_function=ef)
When this collection is reopened, Chroma will look for INERTIALAI_API_KEY_PROD in the environment. Ensure the variable is available in every process that opens the collection.
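One way to catch a missing variable early is a small preflight check that runs before the collection is opened. The helpers below, missing_env_vars and require_env_vars, are hypothetical and are not part of inertialai_chroma or Chroma; they use only the standard library.

```python
import os
from typing import Iterable, List


def missing_env_vars(names: Iterable[str]) -> List[str]:
    """Return the names that are unset or empty in the current environment."""
    return [name for name in names if not os.environ.get(name)]


def require_env_vars(names: Iterable[str]) -> None:
    """Raise a descriptive error if any required variable is unavailable."""
    missing = missing_env_vars(names)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


# Intended use, before reopening a collection in a new process:
#   require_env_vars(["INERTIALAI_API_KEY_PROD"])
#   collection = client.get_collection("prod-sensors")
```

Failing at startup with the variable name in the message is usually easier to diagnose than an error raised later from inside a query.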
Security Considerations
- Do not pass the API key via api_key=. That parameter is deprecated. Its value is not included in get_config(), which means the embedding function cannot be reconstructed automatically when the collection is reopened. Use api_key_env_var instead.
- API key values are never written to disk. The persisted config contains only the environment variable name.
- Rotate keys by updating the environment variable. No stored collection data needs to change.