ooai_llm.callbacks¶
Usage and cost callback helpers.
- Purpose:
Provide ergonomic callback helpers that work with LangChain usage metadata and the native LiteLLM callback interface.
- Design:
Normalize usage and cost into a shared UsageEvent model.
Offer a recorder object that can accumulate usage across many calls.
Expose a LiteLLM-compatible success callback factory for cost and usage tracking.
Keep callback helpers transport-agnostic so they can support chat, embeddings, and future model families.
Examples
>>> from ooai_llm.callbacks import UsageRecorder, make_litellm_cost_callback
>>> recorder = UsageRecorder()
>>> callback = make_litellm_cost_callback(recorder)
>>> callable(callback)
True
Attributes¶
Exceptions¶
BudgetExceededError | Raised when a usage or cost budget is exceeded.
Classes¶
UsageEvent | Normalized usage and cost event.
UsageSummary | Aggregate view over recorded usage events.
BudgetPolicy | Simple budget and warning thresholds for usage tracking.
UsageRecorder | In-memory recorder for normalized usage events.
LangChainUsageCallbackHandler | LangChain callback handler that records observed LLM usage metadata.
Functions¶
build_langchain_usage_event | Build a normalized event from LangChain usage metadata.
make_litellm_cost_callback | Return a LiteLLM success callback that records cost and usage.
estimate_and_record_langchain_usage | Estimate cost from LangChain usage metadata and record the result.
extract_usage_metadata | Extract best-effort usage metadata from LangChain-style responses.
extract_response_model_name | Extract a best-effort model name from a LangChain-style response.
record_langchain_response_usage | Record usage from a LangChain response when usage metadata is present.
Module Contents¶
- exception ooai_llm.callbacks.BudgetExceededError[source]¶
Bases: RuntimeError
Raised when a usage or cost budget is exceeded.
- class ooai_llm.callbacks.UsageEvent(/, **data: Any)[source]¶
Bases: pydantic.BaseModel
Normalized usage and cost event.
- Parameters:
source – Origin of the event, such as langchain or litellm.
model – Typed model string.
input_tokens – Input token count.
output_tokens – Output token count.
total_tokens – Total token count.
cost_usd – Actual or estimated USD cost.
latency_ms – Measured latency in milliseconds, when available.
count_source – Provenance for token counts.
run_name – Optional LangChain run name or logical application run.
tags – Optional framework/application tags.
metadata – Optional framework/application metadata.
cost_labels – Optional normalized labels used for cost grouping.
raw – Original payload fragments.
- model_config[source]¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- count_source: CountSource = 'framework_callback'[source]¶
- class ooai_llm.callbacks.UsageSummary(/, **data: Any)[source]¶
Bases: pydantic.BaseModel
Aggregate view over recorded usage events.
- class ooai_llm.callbacks.BudgetPolicy(/, **data: Any)[source]¶
Bases: pydantic.BaseModel
Simple budget and warning thresholds for usage tracking.
- Parameters:
warn_cost_usd – Optional single-event cost warning threshold.
error_cost_usd – Optional single-event hard cost threshold.
warn_total_tokens – Optional single-event total-token warning threshold.
error_total_tokens – Optional single-event hard token threshold.
- model_config[source]¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- check(event: UsageEvent) -> list[str][source]¶
Return warning messages for soft thresholds, or raise when a hard threshold is crossed.
- Parameters:
event – Usage event to evaluate.
- Returns:
Warning messages.
- Raises:
BudgetExceededError – If a hard threshold is exceeded.
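The warn-or-raise behavior described for check can be illustrated with a minimal, self-contained sketch. The classes `Event`, `Policy`, and `BudgetExceeded` below are hypothetical stand-ins, not the library's implementation; the strict greater-than comparison and the message wording are assumptions.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


class BudgetExceeded(RuntimeError):
    """Hypothetical stand-in for BudgetExceededError."""


@dataclass
class Event:
    """Hypothetical stand-in carrying only the fields the check needs."""
    cost_usd: Optional[Decimal] = None
    total_tokens: Optional[int] = None


@dataclass
class Policy:
    """Hypothetical stand-in mirroring the four documented thresholds."""
    warn_cost_usd: Optional[Decimal] = None
    error_cost_usd: Optional[Decimal] = None
    warn_total_tokens: Optional[int] = None
    error_total_tokens: Optional[int] = None

    def check(self, event: Event) -> List[str]:
        warnings: List[str] = []
        pairs = [
            ("cost_usd", event.cost_usd, self.warn_cost_usd, self.error_cost_usd),
            ("total_tokens", event.total_tokens,
             self.warn_total_tokens, self.error_total_tokens),
        ]
        for name, value, warn_at, error_at in pairs:
            if value is None:
                continue  # nothing was observed for this dimension
            # Assumption: a threshold is crossed only when strictly exceeded.
            if error_at is not None and value > error_at:
                raise BudgetExceeded(f"{name}={value} exceeds hard limit {error_at}")
            if warn_at is not None and value > warn_at:
                warnings.append(f"{name}={value} exceeds warning threshold {warn_at}")
        return warnings


policy = Policy(warn_cost_usd=Decimal("0.01"), error_cost_usd=Decimal("1"))
soft = policy.check(Event(cost_usd=Decimal("0.05")))  # one warning, no exception
```

Returning warnings as a list while reserving the exception for hard limits lets a caller log soft breaches without wrapping every call in try/except.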
- class ooai_llm.callbacks.UsageRecorder(/, **data: Any)[source]¶
Bases: pydantic.BaseModel
In-memory recorder for normalized usage events.
- model_config[source]¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- events: list[UsageEvent] = None[source]¶
- record(event: UsageEvent, *, budget: BudgetPolicy | None = None) -> UsageEvent[source]¶
Record an event and optionally apply a budget policy.
- summary() -> UsageSummary[source]¶
Return an aggregate usage summary grouped by model, provider, and run.
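As a rough illustration of the record-then-summarize pattern, the sketch below accumulates events in memory and groups them by model only. `Event` and `Recorder` are hypothetical stand-ins, and the summary shape (a plain dict keyed by model) and the model strings are assumptions; the real UsageSummary fields are not shown in this reference.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict, List


@dataclass
class Event:
    """Hypothetical stand-in for a normalized usage event."""
    model: str
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: Decimal = Decimal("0")


@dataclass
class Recorder:
    """Hypothetical in-memory recorder; events live only for the process lifetime."""
    events: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> Event:
        self.events.append(event)
        return event

    def summary(self) -> Dict[str, dict]:
        # Group by model only; the documented summary also groups by provider and run.
        buckets: Dict[str, dict] = defaultdict(
            lambda: {"calls": 0, "tokens": 0, "cost_usd": Decimal("0")}
        )
        for event in self.events:
            bucket = buckets[event.model]
            bucket["calls"] += 1
            bucket["tokens"] += event.input_tokens + event.output_tokens
            bucket["cost_usd"] += event.cost_usd
        return dict(buckets)


recorder = Recorder()
recorder.record(Event("model-a", input_tokens=10, output_tokens=5, cost_usd=Decimal("0.002")))
recorder.record(Event("model-a", input_tokens=20, output_tokens=5, cost_usd=Decimal("0.003")))
recorder.record(Event("model-b", input_tokens=7, output_tokens=3))
totals = recorder.summary()
```

Using Decimal for cost avoids the float rounding drift that can build up when many small per-call costs are summed.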
- ooai_llm.callbacks.build_langchain_usage_event(*, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, cost_usd: decimal.Decimal | None = None, latency_ms: decimal.Decimal | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) -> UsageEvent[source]¶
Build a normalized event from LangChain usage metadata.
- ooai_llm.callbacks.make_litellm_cost_callback(recorder: UsageRecorder, *, budget: BudgetPolicy | None = None) -> Any[source]¶
Return a LiteLLM success callback that records cost and usage.
- Parameters:
recorder – Recorder instance to update.
budget – Optional budget policy to evaluate for each event.
- Returns:
LiteLLM-compatible callback function.
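To show what a LiteLLM-compatible success callback looks like, here is a minimal sketch of a factory that closes over a sink. LiteLLM calls custom success callbacks with `(kwargs, completion_response, start_time, end_time)` and exposes the computed cost under `kwargs["response_cost"]`. The factory name `make_cost_callback_sketch`, the list sink, and the event dict layout are hypothetical; the real helper records into a UsageRecorder, and the response here is simulated.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace


def make_cost_callback_sketch(sink: list):
    """Return a callback with LiteLLM's custom success-callback signature."""

    def _on_success(kwargs, completion_response, start_time, end_time):
        usage = getattr(completion_response, "usage", None)
        sink.append(
            {
                "source": "litellm",
                "model": kwargs.get("model"),
                "input_tokens": getattr(usage, "prompt_tokens", None),
                "output_tokens": getattr(usage, "completion_tokens", None),
                "total_tokens": getattr(usage, "total_tokens", None),
                "cost_usd": kwargs.get("response_cost"),
                # start_time and end_time are datetime objects.
                "latency_ms": (end_time - start_time).total_seconds() * 1000,
            }
        )

    return _on_success


# Simulated invocation; no network call and no litellm import needed.
events = []
callback = make_cost_callback_sketch(events)
started = datetime(2024, 1, 1, 12, 0, 0)
fake_response = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30, total_tokens=42)
)
callback(
    {"model": "example-model", "response_cost": 0.00021},
    fake_response,
    started,
    started + timedelta(milliseconds=250),
)
```

In a real application the returned function would typically be registered with `litellm.success_callback = [callback]`, so that it runs after each successful completion.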
- ooai_llm.callbacks.estimate_and_record_langchain_usage(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, usage_metadata: collections.abc.Mapping[str, Any] | None, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'framework_callback', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) -> UsageEvent[source]¶
Estimate cost from LangChain usage metadata and record the result.
- Parameters:
recorder – Recorder instance to update.
model – Raw or typed model string.
usage_metadata – LangChain usage metadata.
budget – Optional budget policy.
settings – Optional app settings used for LiteLLM enrichment.
profile – Optional LangChain model profile.
- Returns:
Recorded usage event.
- ooai_llm.callbacks.extract_usage_metadata(value: Any) -> dict[str, Any] | None[source]¶
Extract best-effort usage metadata from LangChain-style responses.
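"Best-effort" extraction usually means trying several known locations and returning None when none match. The sketch below is one possible approach, not the library's code: it checks a `usage_metadata` attribute or key (as found on LangChain AI messages), then falls back to OpenAI-style `response_metadata["token_usage"]`. The function name `extract_usage_sketch` and the fallback order are assumptions.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any, Optional


def _get(obj: Any, key: str) -> Any:
    """Read `key` from a mapping or an object attribute; None when absent."""
    if isinstance(obj, Mapping):
        return obj.get(key)
    return getattr(obj, key, None)


def extract_usage_sketch(value: Any) -> Optional[dict]:
    # 1. Preferred: standardized usage metadata on the message itself.
    usage = _get(value, "usage_metadata")
    if isinstance(usage, Mapping) and usage:
        return dict(usage)

    # 2. Fallback: provider-specific token usage, mapped to normalized keys.
    token_usage = _get(_get(value, "response_metadata"), "token_usage")
    if isinstance(token_usage, Mapping) and token_usage:
        return {
            "input_tokens": token_usage.get("prompt_tokens"),
            "output_tokens": token_usage.get("completion_tokens"),
            "total_tokens": token_usage.get("total_tokens"),
        }

    # 3. Nothing recognizable: report absence instead of guessing.
    return None


direct = extract_usage_sketch(
    SimpleNamespace(usage_metadata={"input_tokens": 3, "output_tokens": 5, "total_tokens": 8})
)
fallback = extract_usage_sketch(
    {"response_metadata": {"token_usage": {"prompt_tokens": 3, "completion_tokens": 5, "total_tokens": 8}}}
)
missing = extract_usage_sketch("plain text response")
```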
- ooai_llm.callbacks.extract_response_model_name(value: Any) -> str | None[source]¶
Extract a best-effort model name from a LangChain-style response.
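A model name can be recovered in a similar best-effort way. This sketch assumes the name sits in `response_metadata` under `model_name` or `model`, keys that some LangChain chat integrations populate; the function name `extract_model_name_sketch` and the key order are assumptions, not the library's logic.

```python
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any, Optional


def extract_model_name_sketch(value: Any) -> Optional[str]:
    # Accept either a mapping or an object exposing response_metadata.
    if isinstance(value, Mapping):
        meta = value.get("response_metadata")
    else:
        meta = getattr(value, "response_metadata", None)
    if not isinstance(meta, Mapping):
        return None
    # Try candidate keys in order; skip empty or non-string values.
    for key in ("model_name", "model"):
        name = meta.get(key)
        if isinstance(name, str) and name:
            return name
    return None


named = extract_model_name_sketch(
    SimpleNamespace(response_metadata={"model_name": "example-model"})
)
unnamed = extract_model_name_sketch(SimpleNamespace(response_metadata={}))
```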
- ooai_llm.callbacks.record_langchain_response_usage(recorder: UsageRecorder, *, response: Any, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, count_source: CountSource = 'provider_usage_metadata', run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None) -> UsageEvent | None[source]¶
Record usage from a LangChain response when usage metadata is present.
- class ooai_llm.callbacks.LangChainUsageCallbackHandler(recorder: UsageRecorder, *, model: str | ooai_llm.types.ModelString, budget: BudgetPolicy | None = None, settings: Any = None, profile: collections.abc.Mapping[str, Any] | None = None, run_name: str | None = None, tags: list[str] | tuple[str, ...] | None = None, metadata: collections.abc.Mapping[str, Any] | None = None, cost_labels: collections.abc.Mapping[str, str] | None = None)[source]¶
LangChain callback handler that records observed LLM usage metadata.
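The general shape of such a handler can be sketched without importing LangChain. LangChain invokes `on_llm_end(response, **kwargs)` with a result whose `generations` attribute is a list of lists; chat generations carry a `message` that may expose `usage_metadata`. `UsageHandlerSketch` below is a hypothetical duck-typed stand-in fed with simulated objects; a real handler would subclass `langchain_core.callbacks.BaseCallbackHandler`.

```python
from types import SimpleNamespace


class UsageHandlerSketch:
    """Hypothetical handler that appends observed usage to a list sink."""

    def __init__(self, sink: list, *, model: str):
        self.sink = sink
        self.model = model

    def on_llm_end(self, response, **kwargs):
        # generations is a list (one per prompt) of lists (one per candidate).
        for batch in getattr(response, "generations", None) or []:
            for generation in batch:
                message = getattr(generation, "message", None)
                usage = getattr(message, "usage_metadata", None)
                if usage:  # skip generations without usage metadata
                    self.sink.append({"model": self.model, **dict(usage)})


# Simulated LLM result: one generation with usage metadata and one without.
observed = []
handler = UsageHandlerSketch(observed, model="example-model")
fake_result = SimpleNamespace(
    generations=[
        [
            SimpleNamespace(
                message=SimpleNamespace(
                    usage_metadata={"input_tokens": 4, "output_tokens": 6, "total_tokens": 10}
                )
            ),
            SimpleNamespace(message=SimpleNamespace(usage_metadata=None)),
        ]
    ]
)
handler.on_llm_end(fake_result)
```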