<p><picture><img src="https://github.com/user-attachments/assets/47d67430-386d-4675-82ad-d4734d3262d9" alt="TensorZero Logo" width="128" height="128"></picture></p>
# TensorZero
<p><picture><img src="https://www.tensorzero.com/github-trending-badge.svg" alt="#1 Repository Of The Day"></picture></p>
**TensorZero is an open-source stack for _industrial-grade LLM applications_:**
- **Gateway:** access every LLM provider through a unified API, built for performance (<1ms p99 latency)
- **Observability:** store inferences and feedback in your database, available programmatically or in the UI
- **Optimization:** collect metrics and human feedback to optimize prompts, models, and inference strategies
- **Evaluation:** benchmark individual inferences or end-to-end workflows using heuristics, LLM judges, etc.
- **Experimentation:** ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.
Take what you need, adopt incrementally, and complement with other tools.
<video src="https://github.com/user-attachments/assets/04a8466e-27d8-4189-b305-e7cecb6881ee"></video>
---
<p align="center">
<b><a href="https://www.tensorzero.com/" target="_blank">Website</a></b>
·
<b><a href="https://www.tensorzero.com/docs" target="_blank">Docs</a></b>
·
<b><a href="https://www.x.com/tensorzero" target="_blank">Twitter</a></b>
·
<b><a href="https://www.tensorzero.com/slack" target="_blank">Slack</a></b>
·
<b><a href="https://www.tensorzero.com/discord" target="_blank">Discord</a></b>
<br>
<br>
<b><a href="https://www.tensorzero.com/docs/quickstart" target="_blank">Quick Start (5min)</a></b>
·
<b><a href="https://www.tensorzero.com/docs/gateway/deployment" target="_blank">Deployment Guide</a></b>
·
<b><a href="https://www.tensorzero.com/docs/gateway/api-reference" target="_blank">API Reference</a></b>
·
<b><a href="https://www.tensorzero.com/docs/gateway/configuration-reference" target="_blank">Configuration Reference</a></b>
</p>
---
> [!NOTE]
>
> ### **Coming Soon: TensorZero Autopilot**
>
> TensorZero Autopilot is an **automated AI engineer** (powered by the TensorZero Stack) that analyzes LLM observability data, optimizes prompts and models, sets up evals, and runs A/B tests.
>
> **Learn more** · **Join the waitlist**
## Features
### 🌐 LLM Gateway
**Integrate with TensorZero once and access every major LLM provider.**
- [x] **Call any LLM** (API or self-hosted) through a single unified API
- [x] Infer with **streaming**, **tool use**, **structured outputs (JSON)**, **batch**, **embeddings**, **multimodal (images, files)**, **caching**, etc.
- [x] **Create prompt templates and schemas** to enforce a consistent, typed interface between your application and the LLMs
- [x] Satisfy extreme throughput and latency needs, thanks to 🦀 Rust: **<1ms p99 latency overhead at 10k+ QPS**
- [x] Use any programming language: **integrate via our Python SDK, any OpenAI SDK, or our HTTP API**
- [x] **Ensure high availability** with routing, retries, fallbacks, load balancing, granular timeouts, etc.
- [x] **Enforce custom rate limits** with granular scopes (e.g. user-defined tags) to keep usage under control
- [x] **Set up auth for TensorZero** to allow clients to access models without sharing provider API keys
- [ ] Soon: spend tracking and budgeting
<br>
**Supported Model Providers:**
**Anthropic**,
**AWS Bedrock**,
**AWS SageMaker**,
**Azure**,
**DeepSeek**,
**Fireworks**,
**GCP Vertex AI Anthropic**,
**GCP Vertex AI Gemini**,
**Google AI Studio (Gemini API)**,
**Groq**,
**Hyperbolic**,
**Mistral**,
**OpenAI**,
**OpenRouter**,
**SGLang**,
**TGI**,
**Together AI**,
**vLLM**, and
**xAI (Grok)**.
Need something else? TensorZero also supports **any OpenAI-compatible API (e.g. Ollama)**.
<br>
<details open>
<summary><b>Usage: Python — TensorZero SDK</b></summary>
You can access any provider using the TensorZero Python SDK.
```bash
pip install tensorzero
```
Optional: Set up the TensorZero configuration.
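A minimal `tensorzero.toml` sketch, modeled on the Quick Start; the `generate_haiku` function name and the model choice are illustrative assumptions, not fixed by this README:

```toml
# tensorzero.toml — illustrative configuration sketch.
# Declares one chat function with a single variant backed by an OpenAI model.
[functions.generate_haiku]
type = "chat"

[functions.generate_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "openai::gpt-4o-mini"
```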
Run inference:
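A guarded sketch of an inference call with the TensorZero Python SDK. The `generate_haiku` function name and the `TENSORZERO_GATEWAY_URL` environment variable are assumptions for illustration; the network call only fires when a gateway is actually running:

```python
import os

# Chat-style input payload (OpenAI-like message format).
payload = {
    "messages": [
        {"role": "user", "content": "Write a haiku about artificial intelligence."}
    ]
}

# Assumption: TENSORZERO_GATEWAY_URL points at a running gateway,
# e.g. http://localhost:3000. The call is skipped otherwise.
gateway_url = os.environ.get("TENSORZERO_GATEWAY_URL")
if gateway_url:
    from tensorzero import TensorZeroGateway

    with TensorZeroGateway.build_http(gateway_url=gateway_url) as client:
        response = client.inference(
            function_name="generate_haiku",  # must be defined in tensorzero.toml
            input=payload,
        )
        print(response)
```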
See **Quick Start** for more information.
</details>
<details>
<summary><b>Usage: Python — OpenAI SDK</b></summary>
You can access any provider using the OpenAI Python SDK with TensorZero.
```bash
pip install tensorzero
```
Optional: Set up the TensorZero configuration.
Run inference:
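A guarded sketch using the OpenAI Python SDK pointed at the gateway's OpenAI-compatible endpoint. The `generate_haiku` function name and the `TENSORZERO_GATEWAY_URL` environment variable are assumptions:

```python
import os

# TensorZero routes OpenAI-SDK calls via special model strings:
# "tensorzero::function_name::<function>" (assumed function below) or
# "tensorzero::model_name::<provider>::<model>".
model = "tensorzero::function_name::generate_haiku"
messages = [
    {"role": "user", "content": "Write a haiku about artificial intelligence."}
]

# Only contact the gateway when one is actually running.
base_url = os.environ.get("TENSORZERO_GATEWAY_URL")
if base_url:
    from openai import OpenAI

    # Provider API keys live on the gateway, so the client-side key is a dummy.
    client = OpenAI(base_url=f"{base_url}/openai/v1", api_key="not-used")
    response = client.chat.completions.create(model=model, messages=messages)
    print(response.choices[0].message.content)
```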
See **Quick Start** for more information.
</details>
<details>
<summary><b>Usage: JavaScript / TypeScript (Node) — OpenAI SDK</b></summary>
You can access any provider using the OpenAI Node SDK with TensorZero.
Deploy `tensorzero/gateway` using Docker.
**Detailed instructions →**
Set up the TensorZero configuration.
Run inference:
See **Quick Start** for more information.
</details>
<details>
<summary><b>Usage: Other Languages & Platforms — HTTP API</b></summary>
TensorZero supports virtually any programming language or platform via its HTTP API.
Deploy `tensorzero/gateway` using Docker.
**Detailed instructions →**
Optional: Set up the TensorZero configuration.
Run inference:
See **Quick Start** for more information.
</details>
### 🔍 LLM Observability
**Zoom in to debug individual API calls, or zoom out to monitor metrics across models and prompts over time — all using the open-source TensorZero UI.**
- [x] Store inferences and **feedback (metrics, human edits, etc.)** in your own database
- [x] Dive into individual inferences or high-level aggregate patterns using the TensorZero UI or programmatically
- [x] **Build datasets** for optimization, evaluation, and other workflows
- [x] Replay historical inferences with new prompts, models, inference strategies, etc.
- [x] **Export OpenTelemetry traces (OTLP)** and **Prometheus metrics** to your favorite application observability tools
- [ ] Soon: AI-assisted debugging and root cause analysis; AI-assisted data labeling
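Feedback is attached to an inference by ID. A guarded sketch with the Python SDK; the `haiku_rating` metric, the `generate_haiku` function, and the `TENSORZERO_GATEWAY_URL` variable are illustrative assumptions (metrics are declared in `tensorzero.toml`):

```python
import os

# Assumed metric, e.g. declared as [metrics.haiku_rating] with type = "boolean".
metric_name = "haiku_rating"

gateway_url = os.environ.get("TENSORZERO_GATEWAY_URL")
if gateway_url:
    from tensorzero import TensorZeroGateway

    with TensorZeroGateway.build_http(gateway_url=gateway_url) as client:
        result = client.inference(
            function_name="generate_haiku",
            input={"messages": [{"role": "user", "content": "Write a haiku."}]},
        )
        # Store feedback alongside the inference in your database; it becomes
        # signal for later optimization and evaluation runs.
        client.feedback(
            metric_name=metric_name,
            inference_id=result.inference_id,
            value=True,
        )
```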
<table>
<tr></tr> <!-- flip highlight order -->
<tr>
<td width="50%" align="center" valign="middle"><b>Observability » UI</b></td>
<td width="50%" align="center" valign="middle"><b>Observability » Programmatic</b></td>
</tr>
<tr>
<td width="50%" align="center" valign="middle"><video s…