ooai_llm.callbacks¶

Usage and cost callback helpers.

Purpose:

Provide ergonomic callback helpers that work with LangChain usage metadata and the native LiteLLM callback interface.

Design:

Normalize usage and cost into a shared UsageEvent model.
Offer a recorder object that can accumulate usage across many calls.
Expose a LiteLLM-compatible success callback factory for cost and usage tracking.
Keep callback helpers transport-agnostic so they can support chat, embeddings, and future model families.

Examples

>>> from ooai_llm.callbacks import UsageRecorder, make_litellm_cost_callback
>>> recorder = UsageRecorder()
>>> callback = make_litellm_cost_callback(recorder)
>>> callable(callback)
True

Attributes¶

`CountSource`
`logger`

Exceptions¶

BudgetExceededError

Raised when a usage or cost budget is exceeded.

Classes¶

`UsageEvent`	Normalized usage and cost event.
`UsageSummary`	Aggregate view over recorded usage events.
`BudgetPolicy`	Simple budget and warning thresholds for usage tracking.
`UsageRecorder`	In-memory recorder for normalized usage events.
`LangChainUsageCallbackHandler`	LangChain callback handler that records observed LLM usage metadata.

Functions¶

`build_langchain_usage_event`(→ UsageEvent)	Build a normalized event from LangChain usage metadata.
`make_litellm_cost_callback`(→ Any)	Return a LiteLLM success callback that records cost and usage.
`estimate_and_record_langchain_usage`(→ UsageEvent)	Estimate cost from LangChain usage metadata and record the result.
`extract_usage_metadata`(→ dict[str, Any] \| None)	Extract best-effort usage metadata from LangChain-style responses.
`extract_response_model_name`(→ str \| None)	Extract a best-effort model name from a LangChain-style response.
`record_langchain_response_usage`(→ UsageEvent \| None)	Record usage from a LangChain response when usage metadata is present.

Module Contents¶

ooai_llm.callbacks.CountSource[source]¶

ooai_llm.callbacks.logger[source]¶

exception ooai_llm.callbacks.BudgetExceededError[source]¶

Bases: RuntimeError

Raised when a usage or cost budget is exceeded.

class ooai_llm.callbacks.UsageEvent(/, **data: Any)[source]¶

Bases: pydantic.BaseModel

Normalized usage and cost event.

Parameters:

source – Origin of the event, such as langchain or litellm.
model – Typed model string.
input_tokens – Input token count.
output_tokens – Output token count.

total_tokens – Total token count.
cost_usd – Actual or estimated USD cost.
latency_ms – Measured latency in milliseconds when available.
count_source – Provenance for token counts.
run_name – Optional LangChain run name or logical application run.
tags – Optional framework/application tags.
metadata – Optional framework/application metadata.
cost_labels – Optional normalized labels used for cost grouping.
raw – Original payload fragments.

model_config[source]¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

source: str[source]¶

model: ooai_llm.types.ModelString[source]¶

input_tokens: int = 0[source]¶

output_tokens: int = 0[source]¶

total_tokens: int = 0[source]¶

cost_usd: decimal.Decimal | None = None[source]¶

latency_ms: decimal.Decimal | None = None[source]¶

count_source: CountSource = 'framework_callback'[source]¶

run_name: str | None = None[source]¶

tags: list[str] = None[source]¶

metadata: dict[str, Any] = None[source]¶

cost_labels: dict[str, str] = None[source]¶

raw: dict[str, Any] = None[source]¶

class ooai_llm.callbacks.UsageSummary(/, **data: Any)[source]¶

Bases: pydantic.BaseModel

Aggregate view over recorded usage events.

model_config[source]¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

event_count: int = 0[source]¶

input_tokens: int = 0[source]¶

output_tokens: int = 0[source]¶

total_tokens: int = 0[source]¶

total_cost_usd: decimal.Decimal[source]¶

by_model: dict[str, int] = None[source]¶

by_provider: dict[str, int] = None[source]¶

by_run: dict[str, int] = None[source]¶

class ooai_llm.callbacks.BudgetPolicy(/, **data: Any)[source]¶

Bases: pydantic.BaseModel

Simple budget and warning thresholds for usage tracking.

Parameters:

warn_cost_usd – Optional single-event cost warning threshold.
error_cost_usd – Optional single-event hard cost threshold.
warn_total_tokens – Optional single-event total-token warning threshold.
error_total_tokens – Optional single-event hard token threshold.

model_config[source]¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

warn_cost_usd: decimal.Decimal | None = None[source]¶

error_cost_usd: decimal.Decimal | None = None[source]¶

warn_total_tokens: int | None = None[source]¶

error_total_tokens: int | None = None[source]¶

check(event: UsageEvent) → list[str][source]¶

Return warning messages for crossed warning thresholds, or raise when a hard threshold is crossed.

Parameters:

event – Usage event to evaluate.

Returns:

Warning messages.

Raises:

BudgetExceededError – If a hard threshold is exceeded.
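
The check contract described above (warnings are returned, hard thresholds raise) can be sketched with standard-library stand-ins. `SketchEvent`, `SketchPolicy`, and `SketchBudgetExceeded` are hypothetical names that only mirror the documented fields; the strict greater-than comparison, the order of evaluation, and the message wording are assumptions rather than the library's confirmed behavior.

```python
from __future__ import annotations

from dataclasses import dataclass
from decimal import Decimal


class SketchBudgetExceeded(RuntimeError):
    """Hypothetical stand-in for BudgetExceededError."""


@dataclass
class SketchEvent:
    """Hypothetical stand-in with only the fields a policy inspects."""

    total_tokens: int = 0
    cost_usd: Decimal | None = None


@dataclass
class SketchPolicy:
    """Hypothetical stand-in mirroring the documented threshold names."""

    warn_cost_usd: Decimal | None = None
    error_cost_usd: Decimal | None = None
    warn_total_tokens: int | None = None
    error_total_tokens: int | None = None

    def check(self, event: SketchEvent) -> list[str]:
        cost, tokens = event.cost_usd, event.total_tokens
        # Hard thresholds first (assumed order): a breach raises instead of returning.
        if self.error_cost_usd is not None and cost is not None and cost > self.error_cost_usd:
            raise SketchBudgetExceeded(f"cost {cost} USD exceeds {self.error_cost_usd} USD")
        if self.error_total_tokens is not None and tokens > self.error_total_tokens:
            raise SketchBudgetExceeded(f"{tokens} tokens exceeds {self.error_total_tokens}")
        # Soft thresholds only collect messages for the caller.
        messages: list[str] = []
        if self.warn_cost_usd is not None and cost is not None and cost > self.warn_cost_usd:
            messages.append(f"cost {cost} USD above warning level {self.warn_cost_usd} USD")
        if self.warn_total_tokens is not None and tokens > self.warn_total_tokens:
            messages.append(f"{tokens} tokens above warning level {self.warn_total_tokens}")
        return messages
```

In this sketch an event with no known cost (`cost_usd` is `None`) skips the cost thresholds entirely, so only the token thresholds apply to it.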

class ooai_llm.callbacks.UsageRecorder(/, **data: Any)[source]¶

Bases: pydantic.BaseModel

In-memory recorder for normalized usage events.

model_config[source]¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

events: list[UsageEvent] = None[source]¶

warnings: list[str] = None[source]¶

record(event: UsageEvent, *, budget: BudgetPolicy | None = None) → UsageEvent[source]¶: Record an event and optionally apply a budget policy.

property total_tokens: int[source]¶: Return total tokens recorded so far.

property total_cost_usd: decimal.Decimal[source]¶: Return total cost recorded so far.

property input_tokens: int[source]¶: Return input tokens recorded so far.

property output_tokens: int[source]¶: Return output tokens recorded so far.

summary() → UsageSummary[source]¶: Return an aggregate usage summary grouped by model, provider, and run.
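
A grouped summary of this kind can be illustrated with plain dictionaries. This is a minimal sketch, not the library's implementation: the function `sketch_summarize` is hypothetical, the per-group integers are assumed to be total-token counts, and the provider is assumed to be the prefix of a `provider/model` string. None of those choices is confirmed by the reference above.

```python
from collections import defaultdict
from decimal import Decimal


def sketch_summarize(events):
    """Aggregate plain-dict events; all grouping rules here are assumptions."""
    by_model = defaultdict(int)
    by_provider = defaultdict(int)
    by_run = defaultdict(int)
    total_cost = Decimal("0")
    for event in events:
        tokens = event.get("total_tokens", 0)
        model = event["model"]
        # Assumption: the provider is the prefix of a "provider/model" string.
        provider = model.split("/", 1)[0] if "/" in model else "unknown"
        by_model[model] += tokens
        by_provider[provider] += tokens
        by_run[event.get("run_name") or "unnamed"] += tokens
        # Events without a known cost contribute nothing to the total.
        total_cost += event.get("cost_usd") or Decimal("0")
    return {
        "event_count": len(events),
        "total_cost_usd": total_cost,
        "by_model": dict(by_model),
        "by_provider": dict(by_provider),
        "by_run": dict(by_run),
    }
```

Keeping cost as `Decimal` throughout avoids the rounding drift that repeated float addition would introduce over many small per-call costs.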

ooai_llm.callbacks.build_langchain_usage_event(*, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, cost_usd: decimal.Decimal | None = None, latency_ms: decimal.Decimal | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, Ellipsis] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent[source]¶: Build a normalized event from LangChain usage metadata.

ooai_llm.callbacks.make_litellm_cost_callback(recorder: UsageRecorder, *, budget: BudgetPolicy | None = None) → Any[source]¶

Return a LiteLLM success callback that records cost and usage.

Parameters:

recorder – Recorder instance to update.
budget – Optional budget policy to evaluate for each event.

Returns:

LiteLLM-compatible callback function.
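
To show the shape of such a factory, here is a minimal sketch that does not import LiteLLM or this module. It assumes LiteLLM's custom success-callback signature `(kwargs, completion_response, start_time, end_time)` and the `response_cost` key in `kwargs`; `sketch_make_cost_callback`, the plain-list sink, and the emitted dictionary keys are hypothetical stand-ins for the recorder and `UsageEvent`.

```python
from collections.abc import Mapping
from datetime import datetime, timedelta
from decimal import Decimal


def _get(obj, key, default=None):
    """Read a key from a mapping or an attribute from an object."""
    if isinstance(obj, Mapping):
        return obj.get(key, default)
    return getattr(obj, key, default)


def sketch_make_cost_callback(sink):
    """Return a closure that appends one normalized dict per successful call."""

    def callback(kwargs, completion_response, start_time, end_time):
        usage = _get(completion_response, "usage") or {}
        cost = kwargs.get("response_cost")
        elapsed_ms = (end_time - start_time).total_seconds() * 1000
        sink.append(
            {
                "source": "litellm",
                "model": kwargs.get("model"),
                "input_tokens": _get(usage, "prompt_tokens", 0),
                "output_tokens": _get(usage, "completion_tokens", 0),
                "total_tokens": _get(usage, "total_tokens", 0),
                # Convert via str so float noise does not leak into Decimal.
                "cost_usd": Decimal(str(cost)) if cost is not None else None,
                "latency_ms": Decimal(str(elapsed_ms)),
            }
        )

    return callback


# Simulated invocation with hand-built arguments (no network call).
sink = []
callback = sketch_make_cost_callback(sink)
started = datetime(2024, 1, 1, 12, 0, 0)
callback(
    {"model": "gpt-4o", "response_cost": 0.0015},
    {"usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}},
    started,
    started + timedelta(milliseconds=250),
)
```

With LiteLLM installed, a callback of this shape would typically be registered through `litellm.success_callback`; whether the function returned by `make_litellm_cost_callback` is meant to be registered the same way should be checked against the module source.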

ooai_llm.callbacks.estimate_and_record_langchain_usage(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, Ellipsis] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent[source]¶

Estimate cost from LangChain usage metadata and record the result.

Parameters:

recorder – Recorder instance to update.
model – Raw or typed model string.
usage_metadata – LangChain usage metadata.
budget – Optional budget policy.
settings – Optional app settings used for LiteLLM enrichment.
profile – Optional LangChain model profile.

Returns:

Recorded usage event.

ooai_llm.callbacks.extract_usage_metadata(value: Any) → dict[str, Any] | None[source]¶: Extract best-effort usage metadata from LangChain-style responses.
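
"Best-effort" extraction usually means trying several known locations and returning None when none of them yields data. The sketch below illustrates that idea under stated assumptions: it looks first for a `usage_metadata` mapping (the field LangChain messages expose), then falls back to an OpenAI-style `token_usage` entry inside `response_metadata`. The function `sketch_extract_usage` and its lookup order are hypothetical; the actual helper may check other locations.

```python
from __future__ import annotations

from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any


def _read(value: Any, key: str) -> Any:
    """Read a key from a mapping or an attribute from an object."""
    if isinstance(value, Mapping):
        return value.get(key)
    return getattr(value, key, None)


def sketch_extract_usage(value: Any) -> dict[str, Any] | None:
    # Preferred: a standardized usage_metadata mapping on the response.
    usage = _read(value, "usage_metadata")
    if isinstance(usage, Mapping) and usage:
        return dict(usage)
    # Fallback (assumption): provider-style token_usage in response_metadata.
    token_usage = _read(_read(value, "response_metadata"), "token_usage")
    if isinstance(token_usage, Mapping) and token_usage:
        prompt = token_usage.get("prompt_tokens", 0)
        completion = token_usage.get("completion_tokens", 0)
        return {
            "input_tokens": prompt,
            "output_tokens": completion,
            "total_tokens": token_usage.get("total_tokens", prompt + completion),
        }
    return None


# Stand-in objects shaped like LangChain responses.
with_usage = SimpleNamespace(
    usage_metadata={"input_tokens": 3, "output_tokens": 5, "total_tokens": 8}
)
legacy = {"response_metadata": {"token_usage": {"prompt_tokens": 2, "completion_tokens": 4}}}
```

Returning None instead of raising lets callers skip recording when a response carries no usage data.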

ooai_llm.callbacks.extract_response_model_name(value: Any) → str | None[source]¶: Extract a best-effort model name from a LangChain-style response.

ooai_llm.callbacks.record_langchain_response_usage(recorder: UsageRecorder, *, response: Any, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'provider_usage_metadata', run_name: str | None = None, tags: list[str] | tuple[str, Ellipsis] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent | None[source]¶: Record usage from a LangChain response when usage metadata is present.

class ooai_llm.callbacks.LangChainUsageCallbackHandler(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, run_name: str | None = None, tags: list[str] | tuple[str, Ellipsis] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None)[source]¶

LangChain callback handler that records observed LLM usage metadata.

recorder[source]¶

model[source]¶

budget = None[source]¶

settings = None[source]¶

profile = None[source]¶

run_name = None[source]¶

tags = [][source]¶

metadata[source]¶

cost_labels[source]¶

on_llm_end(response: Any, **_: Any) → None[source]¶: Record usage when LangChain reports an LLM result.
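
As a rough illustration of what an `on_llm_end` hook might do, the sketch below walks a result shaped like LangChain's `LLMResult` (a list of candidate lists under `generations`) and collects any `usage_metadata` found on the generated messages. `SketchUsageHandler` is hypothetical and uses a plain list as its sink; a real handler would subclass LangChain's `BaseCallbackHandler`, and the traversal used by `LangChainUsageCallbackHandler` itself is not documented here.

```python
from types import SimpleNamespace


class SketchUsageHandler:
    """Hypothetical handler that collects usage dicts into a plain list."""

    def __init__(self, sink):
        self.sink = sink

    def on_llm_end(self, response, **_):
        # generations is assumed to hold one inner list of candidates per prompt.
        for candidates in getattr(response, "generations", None) or []:
            for generation in candidates:
                message = getattr(generation, "message", None)
                usage = getattr(message, "usage_metadata", None)
                if usage:
                    self.sink.append(dict(usage))


# Stand-in result with one prompt and one candidate carrying usage data.
fake_message = SimpleNamespace(
    usage_metadata={"input_tokens": 7, "output_tokens": 2, "total_tokens": 9}
)
fake_result = SimpleNamespace(generations=[[SimpleNamespace(message=fake_message)]])

collected = []
SketchUsageHandler(collected).on_llm_end(fake_result, run_id="ignored")
```

Extra keyword arguments are accepted and ignored, matching the `**_` in the documented signature.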