ooai_llm.callbacks

Usage and cost callback helpers.

Purpose:

Provide ergonomic callback helpers that work with LangChain usage metadata and the native LiteLLM callback interface.

Design:
  • Normalize usage and cost into a shared UsageEvent model.

  • Offer a recorder object that can accumulate usage across many calls.

  • Expose a LiteLLM-compatible success callback factory for cost and usage tracking.

  • Keep callback helpers transport-agnostic so they can support chat, embeddings, and future model families.

Examples

>>> recorder = UsageRecorder()
>>> callback = make_litellm_cost_callback(recorder)
>>> callable(callback)
True

Attributes

Exceptions

BudgetExceededError

Raised when a usage or cost budget is exceeded.

Classes

UsageEvent

Normalized usage and cost event.

UsageSummary

Aggregate view over recorded usage events.

BudgetPolicy

Simple budget and warning thresholds for usage tracking.

UsageRecorder

In-memory recorder for normalized usage events.

LangChainUsageCallbackHandler

LangChain callback handler that records observed LLM usage metadata.

Functions

build_langchain_usage_event(→ UsageEvent)

Build a normalized event from LangChain usage metadata.

make_litellm_cost_callback(→ Any)

Return a LiteLLM success callback that records cost and usage.

estimate_and_record_langchain_usage(→ UsageEvent)

Estimate cost from LangChain usage metadata and record the result.

extract_usage_metadata(→ dict[str, Any] | None)

Extract best-effort usage metadata from LangChain-style responses.

extract_response_model_name(→ str | None)

Extract a best-effort model name from a LangChain-style response.

record_langchain_response_usage(→ UsageEvent | None)

Record usage from a LangChain response when usage metadata is present.

Module Contents

ooai_llm.callbacks.CountSource[source]
ooai_llm.callbacks.logger[source]
exception ooai_llm.callbacks.BudgetExceededError[source]

Bases: RuntimeError

Raised when a usage or cost budget is exceeded.

class ooai_llm.callbacks.UsageEvent(/, **data: Any)[source]

Bases: pydantic.BaseModel

Normalized usage and cost event.

Parameters:
  • source – Origin of the event, such as langchain or litellm.

  • model – Typed model string.

  • input_tokens – Input token count.

  • output_tokens – Output token count.

  • total_tokens – Total token count.

  • cost_usd – Actual or estimated USD cost.

  • latency_ms – Measured latency in milliseconds when available.

  • count_source – Provenance for token counts.

  • run_name – Optional LangChain run name or logical application run.

  • tags – Optional framework/application tags.

  • metadata – Optional framework/application metadata.

  • cost_labels – Optional normalized labels used for cost grouping.

  • raw – Original payload fragments.

model_config[source]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

source: str[source]
model: ooai_llm.types.ModelString[source]
input_tokens: int = 0[source]
output_tokens: int = 0[source]
total_tokens: int = 0[source]
cost_usd: decimal.Decimal | None = None[source]
latency_ms: decimal.Decimal | None = None[source]
count_source: CountSource = 'framework_callback'[source]
run_name: str | None = None[source]
tags: list[str] = None[source]
metadata: dict[str, Any] = None[source]
cost_labels: dict[str, str] = None[source]
raw: dict[str, Any] = None[source]
class ooai_llm.callbacks.UsageSummary(/, **data: Any)[source]

Bases: pydantic.BaseModel

Aggregate view over recorded usage events.

model_config[source]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

event_count: int = 0[source]
input_tokens: int = 0[source]
output_tokens: int = 0[source]
total_tokens: int = 0[source]
total_cost_usd: decimal.Decimal[source]
by_model: dict[str, int] = None[source]
by_provider: dict[str, int] = None[source]
by_run: dict[str, int] = None[source]
class ooai_llm.callbacks.BudgetPolicy(/, **data: Any)[source]

Bases: pydantic.BaseModel

Simple budget and warning thresholds for usage tracking.

Parameters:
  • warn_cost_usd – Optional single-event cost warning threshold.

  • error_cost_usd – Optional single-event hard cost threshold.

  • warn_total_tokens – Optional single-event total-token warning threshold.

  • error_total_tokens – Optional single-event hard token threshold.

model_config[source]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

warn_cost_usd: decimal.Decimal | None = None[source]
error_cost_usd: decimal.Decimal | None = None[source]
warn_total_tokens: int | None = None[source]
error_total_tokens: int | None = None[source]
check(event: UsageEvent) → list[str][source]

Return warning messages or raise when hard thresholds are crossed.

Parameters:

event – Usage event to evaluate.

Returns:

Warning messages.

Raises:

BudgetExceededError – If a hard threshold is exceeded.
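
The warn-or-raise contract described above can be sketched with standard-library stand-ins. This is not the library's implementation: EventSketch and PolicySketch are hypothetical names, and treating "exceeded" as strictly greater than the threshold is an assumption, since the reference does not say whether the comparison is inclusive.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


class BudgetExceededError(RuntimeError):
    """Local stand-in for the exception documented above."""


@dataclass
class EventSketch:
    """Hypothetical stand-in carrying only the fields a budget check reads."""
    total_tokens: int = 0
    cost_usd: Optional[Decimal] = None


@dataclass
class PolicySketch:
    """Hypothetical stand-in mirroring the four BudgetPolicy thresholds."""
    warn_cost_usd: Optional[Decimal] = None
    error_cost_usd: Optional[Decimal] = None
    warn_total_tokens: Optional[int] = None
    error_total_tokens: Optional[int] = None

    def check(self, event: EventSketch) -> List[str]:
        cost, tokens = event.cost_usd, event.total_tokens
        # Hard thresholds are evaluated first and raise immediately.
        # Assumption: "exceeded" means strictly greater than the threshold.
        if cost is not None and self.error_cost_usd is not None and cost > self.error_cost_usd:
            raise BudgetExceededError(f"cost {cost} USD is above {self.error_cost_usd} USD")
        if self.error_total_tokens is not None and tokens > self.error_total_tokens:
            raise BudgetExceededError(f"{tokens} tokens is above {self.error_total_tokens}")
        # Soft thresholds only produce messages for the caller to log or store.
        messages: List[str] = []
        if cost is not None and self.warn_cost_usd is not None and cost > self.warn_cost_usd:
            messages.append(f"cost {cost} USD is above warning level {self.warn_cost_usd} USD")
        if self.warn_total_tokens is not None and tokens > self.warn_total_tokens:
            messages.append(f"{tokens} tokens is above warning level {self.warn_total_tokens}")
        return messages
```

Every threshold is optional, so an unset threshold never triggers, and an event with no known cost skips both cost checks.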

class ooai_llm.callbacks.UsageRecorder(/, **data: Any)[source]

Bases: pydantic.BaseModel

In-memory recorder for normalized usage events.

model_config[source]

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

events: list[UsageEvent] = None[source]
warnings: list[str] = None[source]
record(event: UsageEvent, *, budget: BudgetPolicy | None = None) → UsageEvent[source]

Record an event and optionally apply a budget policy.

property total_tokens: int[source]

Return total tokens recorded so far.

property total_cost_usd: decimal.Decimal[source]

Return total cost recorded so far.

property input_tokens: int[source]

Return input tokens recorded so far.

property output_tokens: int[source]

Return output tokens recorded so far.

summary() → UsageSummary[source]

Return an aggregate usage summary grouped by model, provider, and run.
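
The accumulate-then-group behaviour can be sketched as below, using dataclass stand-ins rather than the real pydantic models. Several points are assumptions: that the by_model, by_provider, and by_run maps hold token totals (the dict[str, int] type would equally fit event counts), that the provider is the part of the model string before the first "/", and that events without a known cost add nothing to the total. The model names are made up.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, Dict, List, Optional


@dataclass
class EventSketch:
    """Hypothetical stand-in for a normalized usage event."""
    model: str
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: Optional[Decimal] = None
    run_name: Optional[str] = None

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class RecorderSketch:
    """Hypothetical in-memory recorder: append events, aggregate on demand."""
    events: List[EventSketch] = field(default_factory=list)

    def record(self, event: EventSketch) -> EventSketch:
        self.events.append(event)
        return event

    @property
    def total_cost_usd(self) -> Decimal:
        # Assumption: events with unknown cost contribute zero.
        return sum((e.cost_usd for e in self.events if e.cost_usd is not None), Decimal("0"))

    def tokens_by(self, key: Callable[[EventSketch], str]) -> Dict[str, int]:
        # Group total tokens under whatever label `key` derives from each event.
        grouped: Dict[str, int] = defaultdict(int)
        for event in self.events:
            grouped[key(event)] += event.total_tokens
        return dict(grouped)


rec = RecorderSketch()
rec.record(EventSketch("provider-a/model-x", 100, 20, Decimal("0.0010"), "run-a"))
rec.record(EventSketch("provider-a/model-x", 50, 10, None, "run-b"))
rec.record(EventSketch("provider-b/model-y", 30, 5, Decimal("0.0020"), "run-a"))

by_model = rec.tokens_by(lambda e: e.model)
by_provider = rec.tokens_by(lambda e: e.model.split("/", 1)[0])
by_run = rec.tokens_by(lambda e: e.run_name or "")
```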

ooai_llm.callbacks.build_langchain_usage_event(*, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, cost_usd: decimal.Decimal | None = None, latency_ms: decimal.Decimal | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent[source]

Build a normalized event from LangChain usage metadata.
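
One normalization step such a builder plausibly performs is coercing the usage mapping into three integer counts. The helper below is a hypothetical sketch, not the library function: the key names follow LangChain's usage metadata (input_tokens, output_tokens, total_tokens), while the fallback of summing input and output when total_tokens is missing is an assumption.

```python
from typing import Any, Dict, Mapping, Optional


def normalize_token_counts(usage_metadata: Optional[Mapping[str, Any]]) -> Dict[str, int]:
    """Hypothetical helper: coerce LangChain-style usage metadata to three ints."""
    usage = usage_metadata or {}
    input_tokens = int(usage.get("input_tokens") or 0)
    output_tokens = int(usage.get("output_tokens") or 0)
    # Assumption: when total_tokens is absent, fall back to input + output.
    total = usage.get("total_tokens")
    total_tokens = int(total) if total is not None else input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }
```

Accepting None mirrors the usage_metadata: Mapping | None parameter in the signature above, so a response without usage data still yields an all-zero result.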

ooai_llm.callbacks.make_litellm_cost_callback(recorder: UsageRecorder, *, budget: BudgetPolicy | None = None) → Any[source]

Return a LiteLLM success callback that records cost and usage.

Parameters:
  • recorder – Recorder instance to update.

  • budget – Optional budget policy to evaluate for each event.

Returns:

LiteLLM-compatible callback function.
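
A factory of this kind can be sketched as a closure with the four-argument shape LiteLLM uses for custom callback functions (kwargs, completion_response, start_time, end_time), reading the cost from kwargs["response_cost"]. Everything else is an assumption for illustration: make_cost_callback and the plain list used as a sink are hypothetical, the response is a fake object so the block runs without LiteLLM installed, and the model name is made up.

```python
from datetime import datetime, timedelta
from decimal import Decimal
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


def make_cost_callback(sink: List[Dict[str, Any]]) -> Callable[..., None]:
    """Hypothetical factory returning a closure with LiteLLM's callback signature."""

    def callback(kwargs, completion_response, start_time, end_time) -> None:
        usage = getattr(completion_response, "usage", None)
        cost = kwargs.get("response_cost")
        elapsed_ms = (end_time - start_time).total_seconds() * 1000
        sink.append({
            "source": "litellm",
            "model": kwargs.get("model"),
            "input_tokens": getattr(usage, "prompt_tokens", 0) or 0,
            "output_tokens": getattr(usage, "completion_tokens", 0) or 0,
            # Go through str() so a float cost keeps its short decimal form.
            "cost_usd": Decimal(str(cost)) if cost is not None else None,
            "latency_ms": Decimal(str(elapsed_ms)),
        })

    return callback


# Exercise the callback with fake objects instead of a live LiteLLM call.
events: List[Dict[str, Any]] = []
callback = make_cost_callback(events)
start = datetime(2024, 1, 1, 12, 0, 0)
callback(
    {"model": "provider-a/model-x", "response_cost": 0.0015},
    SimpleNamespace(usage=SimpleNamespace(prompt_tokens=100, completion_tokens=20)),
    start,
    start + timedelta(milliseconds=250),
)
```

With LiteLLM installed, a callable of this shape is registered by appending it to litellm.success_callback.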

ooai_llm.callbacks.estimate_and_record_langchain_usage(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent[source]

Estimate cost from LangChain usage metadata and record the result.

Parameters:
  • recorder – Recorder instance to update.

  • model – Raw or typed model string.

  • usage_metadata – LangChain usage metadata.

  • budget – Optional budget policy.

  • settings – Optional app settings used for LiteLLM enrichment.

  • profile – Optional LangChain model profile.

Returns:

Recorded usage event.
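
The reference does not say how the estimate is computed. As an illustration only, a token-based estimate is commonly the sum of input and output tokens multiplied by their per-token prices. The price table, the model name, and estimate_cost_usd below are all hypothetical, and the rates are invented.

```python
from decimal import Decimal
from typing import Dict, Optional

# Hypothetical prices in USD per one million tokens; not real rates.
PRICES_PER_MILLION: Dict[str, Dict[str, Decimal]] = {
    "provider-a/model-x": {"input": Decimal("0.50"), "output": Decimal("1.50")},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> Optional[Decimal]:
    """Hypothetical estimator: None when no price is known for the model."""
    price = PRICES_PER_MILLION.get(model)
    if price is None:
        return None
    # Decimal arithmetic avoids binary floating-point drift in currency totals.
    weighted = input_tokens * price["input"] + output_tokens * price["output"]
    return weighted / Decimal(1_000_000)


# 2000 * 0.50 + 400 * 1.50 = 1600, divided by 1,000,000 gives 0.0016 USD.
cost = estimate_cost_usd("provider-a/model-x", input_tokens=2000, output_tokens=400)
```

Returning None for an unknown model matches the cost_usd: decimal.Decimal | None field on UsageEvent, which allows an event to be recorded without a cost.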

ooai_llm.callbacks.extract_usage_metadata(value: Any) → dict[str, Any] | None[source]

Extract best-effort usage metadata from LangChain-style responses.

ooai_llm.callbacks.extract_response_model_name(value: Any) → str | None[source]

Extract a best-effort model name from a LangChain-style response.
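
"Best-effort" extraction usually means probing several known shapes and returning None when none match. The sketch below shows that pattern for both helpers using duck typing. The function names and the probe order are assumptions. The attribute names usage_metadata and response_metadata follow LangChain message objects, and token_usage and model_name are keys some providers place in response_metadata.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any, Dict, Optional


def find_usage(value: Any) -> Optional[Dict[str, Any]]:
    """Hypothetical best-effort lookup across common LangChain-style shapes."""
    # 1. Message objects expose `usage_metadata` directly.
    usage = getattr(value, "usage_metadata", None)
    if isinstance(usage, Mapping):
        return dict(usage)
    # 2. Plain dicts may carry the same key.
    if isinstance(value, Mapping) and isinstance(value.get("usage_metadata"), Mapping):
        return dict(value["usage_metadata"])
    # 3. Some providers nest counts under response_metadata["token_usage"].
    meta = getattr(value, "response_metadata", None)
    if isinstance(meta, Mapping) and isinstance(meta.get("token_usage"), Mapping):
        return dict(meta["token_usage"])
    return None


def find_model_name(value: Any) -> Optional[str]:
    """Hypothetical best-effort model-name lookup from response metadata."""
    meta = getattr(value, "response_metadata", None)
    if isinstance(meta, Mapping):
        name = meta.get("model_name") or meta.get("model")
        return str(name) if name else None
    return None


# Fake message object standing in for a LangChain response.
message = SimpleNamespace(
    usage_metadata={"input_tokens": 7, "output_tokens": 2, "total_tokens": 9},
    response_metadata={"model_name": "model-x"},
)
```

Because every probe tolerates a missing attribute or key, unrelated values such as strings fall through to None instead of raising.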

ooai_llm.callbacks.record_langchain_response_usage(recorder: UsageRecorder, *, response: Any, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'provider_usage_metadata', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) → UsageEvent | None[source]

Record usage from a LangChain response when usage metadata is present.

class ooai_llm.callbacks.LangChainUsageCallbackHandler(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None)[source]

LangChain callback handler that records observed LLM usage metadata.

recorder[source]
model[source]
budget = None[source]
settings = None[source]
profile = None[source]
run_name = None[source]
tags = [][source]
metadata[source]
cost_labels[source]
on_llm_end(response: Any, **_: Any) → None[source]

Record usage when LangChain reports an LLM result.
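
A handler hook of this shape can be sketched without importing LangChain by duck-typing the result object. The sketch assumes the response resembles a LangChain LLMResult, whose generations attribute is a list of lists and whose chat generations carry a message with usage_metadata. UsageHandlerSketch is a hypothetical class that writes to a plain list; a real handler would typically subclass LangChain's BaseCallbackHandler.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


class UsageHandlerSketch:
    """Hypothetical duck-typed handler; collects usage dicts into a list."""

    def __init__(self, sink: List[Dict[str, Any]]) -> None:
        self.sink = sink

    def on_llm_end(self, response: Any, **_: Any) -> None:
        # `generations` is a list of lists: one inner list per prompt.
        for batch in getattr(response, "generations", None) or []:
            for generation in batch:
                message = getattr(generation, "message", None)
                usage = getattr(message, "usage_metadata", None)
                if usage:
                    # Only generations that report usage are recorded.
                    self.sink.append(dict(usage))


# Fake result: one chat generation with usage, one text generation without.
recorded: List[Dict[str, Any]] = []
handler = UsageHandlerSketch(recorded)
fake_result = SimpleNamespace(generations=[[
    SimpleNamespace(message=SimpleNamespace(
        usage_metadata={"input_tokens": 7, "output_tokens": 2, "total_tokens": 9})),
    SimpleNamespace(text="no usage here"),
]])
handler.on_llm_end(fake_result, run_id="ignored")
```

Extra keyword arguments are swallowed by **_, matching the signature above, so the hook tolerates whatever context the framework passes along.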