Observability for Any LLM Provider¶
Anosys is vendor-agnostic. While we offer native integrations for OpenAI and Anthropic, you can connect any LLM provider to the Anosys Platform using OpenTelemetry (OTLP) or our REST API. If your model can be called from code, it can be observed with Anosys.
Supported LLM Providers¶
Anosys works with every major LLM vendor and model hosting platform:
| Provider | Models | Integration Method |
|---|---|---|
| Google Gemini | Gemini 2.5 Pro, Gemini 2.5 Flash | OTLP, REST API |
| Meta Llama | Llama 4, Llama 3.3, Code Llama | OTLP, REST API |
| Mistral AI | Mistral Large, Mistral Medium, Codestral | OTLP, REST API |
| Cohere | Command R+, Embed, Rerank | OTLP, REST API |
| AWS Bedrock | Claude, Llama, Titan, Mistral via Bedrock | OTLP, REST API |
| Azure OpenAI | GPT-4o, GPT-4.1, GPT-5 via Azure | OTLP, REST API |
| Google Vertex AI | Gemini, PaLM, custom models | OTLP, REST API |
| Hugging Face | Inference API, Inference Endpoints | OTLP, REST API |
| Fireworks AI | Llama, Mixtral, custom fine-tunes | OTLP, REST API |
| Together AI | Open-source models at scale | OTLP, REST API |
| Groq | Llama, Mixtral on LPU hardware | OTLP, REST API |
| Replicate | Open-source models on demand | OTLP, REST API |
| Ollama | Local models (Llama, Mistral, Phi) | OTLP, REST API |
| vLLM / TGI | Self-hosted inference servers | OTLP, REST API |
| Custom / Private | Fine-tuned models, proprietary endpoints | OTLP, REST API |
Don't see your provider listed? It doesn't matter — if you can call it from code, you can observe it with Anosys.
How to Integrate Any LLM¶
There are three approaches to adding observability for any model provider:
Option 1 — OpenTelemetry (Recommended)¶
If your application is already instrumented with OpenTelemetry, or you're using a framework that supports it (LangChain, LlamaIndex, Haystack, CrewAI, AutoGen, etc.), point your OTLP exporter at your Anosys endpoint and data flows automatically.
Configure your environment:
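The standard OpenTelemetry environment variables are enough for OTLP over HTTP; the service name and auth header below are illustrative placeholders:

```shell
# Point any OTLP exporter at Anosys via the standard OTel environment variables.
export OTEL_EXPORTER_OTLP_ENDPOINT="YOUR_ANOSYS_OTLP_ENDPOINT"
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_SERVICE_NAME="my-llm-app"
# If your endpoint requires auth, pass it as an OTLP header (assumed header name):
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer YOUR_ANOSYS_API_KEY"
```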
Replace `YOUR_ANOSYS_OTLP_ENDPOINT` with the OTLP endpoint URL from your Agentic AI pixel in the Anosys Console.
This works with any OTEL-compatible library, including:
- Python: `opentelemetry-sdk`, `opentelemetry-instrumentation-*`
- JavaScript/TypeScript: `@opentelemetry/sdk-node`
- Go: `go.opentelemetry.io/otel`
- Java: `io.opentelemetry`
For a full OTLP/HTTP setup example in Python, see the OpenTelemetry integration guide.
Option 2 — REST API¶
Wrap your LLM calls with a simple HTTP POST to the Anosys ingestion endpoint. This works from any language without any SDK dependency.
Python example — instrumenting a Google Gemini call:
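A minimal sketch using only the standard library for the POST; the ingestion URL, auth header, and API keys are placeholders to replace with values from your Anosys Console, and the Gemini call uses the `google-generativeai` package:

```python
import json
import time
import urllib.request

# Placeholders -- copy the real ingestion URL and key from your Anosys Console.
ANOSYS_INGEST_URL = "https://YOUR_ANOSYS_INGEST_ENDPOINT"
ANOSYS_API_KEY = "YOUR_ANOSYS_API_KEY"

def build_llm_event(prompt, response_text, model, provider,
                    duration_ms, input_tokens, output_tokens):
    """Assemble an event using the field schema from 'What to Capture'."""
    return {
        "event_type": "llm_call",
        "s1": prompt,
        "s2": response_text,
        "s3": model,
        "s4": provider,
        "n1": duration_ms,
        "n2": input_tokens,
        "n3": output_tokens,
    }

def log_to_anosys(event):
    """POST one event to the Anosys ingestion endpoint (stdlib only)."""
    req = urllib.request.Request(
        ANOSYS_INGEST_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {ANOSYS_API_KEY}"},
    )
    urllib.request.urlopen(req, timeout=5)

def instrumented_gemini_call(prompt):
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key="YOUR_GEMINI_API_KEY")
    model = genai.GenerativeModel("gemini-2.5-flash")

    start = time.perf_counter()
    result = model.generate_content(prompt)
    duration_ms = (time.perf_counter() - start) * 1000

    usage = result.usage_metadata
    log_to_anosys(build_llm_event(
        prompt, result.text, "gemini-2.5-flash", "google",
        duration_ms, usage.prompt_token_count, usage.candidates_token_count,
    ))
    return result.text

# With credentials in place:
# instrumented_gemini_call("Explain observability in one sentence.")
```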
cURL example:
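The same event can be sent from any shell; the URL and key are placeholders, and the field values are illustrative:

```shell
# URL and key are placeholders -- copy real values from your Anosys Console.
curl -X POST "https://YOUR_ANOSYS_INGEST_ENDPOINT" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ANOSYS_API_KEY" \
  -d '{
    "event_type": "llm_call",
    "s1": "Explain observability in one sentence.",
    "s2": "Observability means being able to ...",
    "s3": "gemini-2.5-flash",
    "s4": "google",
    "n1": 842,
    "n2": 12,
    "n3": 45
  }'
```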
Option 3 — Python Decorator¶
For the lightest instrumentation, use the Anosys decorator to auto-capture any function that wraps an LLM call:
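The exact import path and name of the decorator come from the Anosys Python package; as an illustration of what it does, a hand-rolled equivalent (hypothetical `observe` name, event stored locally instead of being shipped) looks like:

```python
import functools
import time

def observe(func):
    """Capture function name, arguments, return value, and execution time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_event = {
            "event_type": "llm_call",
            "function": func.__name__,
            "args": repr(args),
            "kwargs": repr(kwargs),
            "return_value": repr(result),
            "n1": (time.perf_counter() - start) * 1000,  # duration in ms
        }
        # The real decorator would ship this event to Anosys; here it is
        # simply stored on the wrapper for inspection.
        return result
    return wrapper

@observe
def ask_llm(prompt: str) -> str:
    # Stand-in for any provider call.
    return f"echo: {prompt}"

ask_llm("hello")
```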
The decorator automatically captures the function name, arguments, return value, and execution time.
What to Capture¶
Regardless of the integration method, we recommend sending these fields for maximum observability:
| Field | Type | Description |
|---|---|---|
| `event_type` | String | Event category (e.g. `llm_call`, `agent_step`, `embedding`) |
| `s1` / `prompt` | String | The user's input prompt |
| `s2` / `response` | String | The model's response text |
| `s3` / `model` | String | Model name and version |
| `s4` / `provider` | String | Vendor name (e.g. `google`, `meta`, `mistral`) |
| `n1` / `duration` | Number | End-to-end latency in milliseconds |
| `n2` / `input_tokens` | Number | Input token count |
| `n3` / `output_tokens` | Number | Output token count |
| `n4` / `cost` | Number | Estimated cost per request (optional) |
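Put together, a single event conforming to this schema might look like the following (all values illustrative):

```python
import json

event = {
    "event_type": "llm_call",
    "s1": "Summarize this support ticket.",
    "s2": "The user reports intermittent 502 errors after deploys.",
    "s3": "llama-3.3-70b",
    "s4": "meta",
    "n1": 1240,    # duration, ms
    "n2": 87,      # input tokens
    "n3": 152,     # output tokens
    "n4": 0.0004,  # estimated cost, USD (optional)
}
print(json.dumps(event, indent=2))
```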
Agentic Framework Support¶
Many popular agentic frameworks already emit OpenTelemetry traces. Configure their OTLP exporter to point at Anosys and you get observability for free:
| Framework | Language | OTEL Support |
|---|---|---|
| LangChain / LangGraph | Python, JS | Built-in via callbacks |
| LlamaIndex | Python | Built-in instrumentation |
| CrewAI | Python | OTEL-compatible |
| AutoGen | Python | OTEL-compatible |
| Haystack | Python | OTEL-compatible |
| Semantic Kernel | C#, Python | OTEL-compatible |
| Vercel AI SDK | TypeScript | OTEL-compatible |
| Mastra | TypeScript | OTEL-compatible |
For frameworks without built-in OTEL support, use the REST API or Python decorator approach.
What You'll See in Anosys¶
Once data is flowing from any LLM provider, the Anosys Platform surfaces:
- Request traces — every LLM call with prompt, response, model, and timing metadata
- Cross-model comparison — side-by-side latency, cost, and quality metrics across providers and models
- Token usage trends — track consumption over time by model, provider, or project
- Latency analysis — identify slow calls, p50/p95/p99 breakdowns, and time-to-first-token
- Error tracking — rate limits, timeouts, and API errors with automatic classification
- Cost dashboards — per-request and aggregate cost estimates based on token usage and model pricing
- Anomaly detection — ML-powered baselines that alert on latency spikes, cost overruns, or quality degradation without manual threshold configuration
- Root cause analysis — causal graphs that connect failures to upstream triggers across providers, models, and agent steps
- Alerts — context-aware notifications via Slack, email, PagerDuty, or webhooks for errors, cost overruns, and performance regressions
- Automated metric generation — Anosys automatically generates key metrics from your traces and logs so you get dashboards in minutes, not days
- Custom dashboards — build your own views or start with auto-generated dashboards for model health, provider comparison, and cost attribution
- Custom pipelines — enrich, route, and transform your LLM telemetry with automated remediation workflows
- Labeling — tag and annotate calls by provider, model, project, or team for segmentation and drill-down analysis
- Natural language interface — ask questions about your LLM data in plain English and get answers backed by your telemetry
Next Steps¶
- OpenAI Agents — native Python SDK integration for OpenAI
- OpenAI ChatKit Apps — observability for ChatKit-powered chat widgets
- Anthropic Agents — zero-code setup for Claude Code
- OpenTelemetry Integration — universal OTEL guide for any system that speaks OpenTelemetry
- Data Ingestion Options — all integration methods with code examples
- FAQ — frequently asked questions