
tokenspy

cProfile for LLMs — find which function is burning your AI budget.

v0.2.0 Python 3.10+ MIT License zero dependencies
$ pip install tokenspy

Overview

tokenspy is your local LLM profiler. One line. Runs on your machine. Forever free.

💡
One-line install: pip install tokenspy — zero required dependencies.

The Problem

You get an OpenAI invoice — $800 this month. You have no idea which function caused it. Langfuse and Braintrust force you to reroute traffic through their cloud proxy, create accounts, and pay per seat.

tokenspy is your local alternative. It wraps your existing LLM calls, records every token, and shows you exactly where your costs are going — with a live dashboard, flame graph, and trace explorer. All on your machine. No accounts. No cloud.

What's in v0.2.0

🔥 Cost Flame Graphs
See exactly which function is burning your budget. Drill down into nested calls.
🔍 Structured Tracing
Full trace + span tree with inputs, outputs, token counts, and latency.
📊 Evaluations
Run LLM functions against golden test sets. Track pass/fail over time.
📝 Prompt Versioning
Every prompt version stored. Diff when costs spike. Roll back instantly.
📺 Live Dashboard
Web UI with cost charts, trace explorer, and token heatmaps. Launch it with tokenspy serve.
📡 OpenTelemetry
Export spans to Grafana, Jaeger, Datadog, or any OTLP collector.
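The Evaluations feature above runs LLM functions against golden test sets and tracks pass/fail. The gist of such an evaluation loop, sketched with illustrative names (this is not tokenspy's actual evaluation API, and the model here is a hard-coded stand-in):

```python
# Illustrative golden-set evaluation: run a function over known
# input/expected pairs and compute a pass rate to track over time.
golden_set = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]

def model(prompt):
    # Stand-in for a real LLM call.
    return {"2+2": "4", "capital of France": "Paris"}.get(prompt, "?")

def evaluate(fn, cases):
    results = [fn(c["input"]) == c["expected"] for c in cases]
    return sum(results) / len(results)  # pass rate in [0, 1]

print(evaluate(model, golden_set))  # → 1.0
```

Recording this pass rate on every prompt change is what turns a one-off check into regression tracking.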

Why tokenspy over Langfuse / Braintrust?

|                          | tokenspy   | Langfuse / Braintrust |
|--------------------------|------------|-----------------------|
| Account required         | None       | Yes                   |
| Data leaves your machine | Never      | Always                |
| Setup time               | 30 seconds | 15–30 min             |
| Cost                     | Free forever | Paid tiers          |
| Works offline            | Yes        | No                    |
| License                  | MIT        | Various               |