CLI Guide¶
The CLI is designed around the same mental model as the Python package:
Catalog and suite commands help you choose and compare models.
ChatModelProfile stores serializable model configuration.
LLM is the runtime wrapper that invokes a model and records observed usage/cost.
Use ooai-llm --help when you need orientation. Use ooai-llm recipes when
you want copy/paste CLI and Python package examples.
Install¶
The base console script is installed with the package. Rich table rendering is optional:
pip install "ooai-llm[cli]"
For provider catalogs through LiteLLM:
pip install "ooai-llm[litellm]"
For the interactive Textual UI:
pip install "ooai-llm[tui]"
Common Commands¶
Print guided help:
ooai-llm --help
ooai-llm models --help
ooai-llm models cheapest --help
Print package and terminal recipes:
ooai-llm recipes --topic cheapest
ooai-llm recipes --topic coding
ooai-llm recipes --topic rich
ooai-llm recipes --topic runtime --format markdown
Pretty terminal output is opt-in through extras. Install ooai-llm[cli] for
Rich tables and ooai-llm[tui] for the Textual explorer. Rich tables are used
automatically when installed; pass --no-rich for a plain deterministic table.
Machine output is available for catalog, comparison, suite, and benchmark commands:
ooai-llm models cheapest --providers mistral --format json
ooai-llm models coding --providers openai,mistral --format csv
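When the machine output feeds a script, it can help to build the command as an argument list instead of a shell string. The sketch below uses only flags shown in this guide; the helper names build_cheapest_command and run_json are hypothetical, and no output schema is assumed, so inspect the real JSON before relying on any field.

```python
import json
import shlex
import subprocess


def build_cheapest_command(providers, fmt="json"):
    """Build argv for `ooai-llm models cheapest` from flags shown in this guide."""
    return [
        "ooai-llm", "models", "cheapest",
        "--providers", ",".join(providers),
        "--format", fmt,
    ]


def run_json(argv):
    """Run the command and decode stdout as JSON (the schema is not assumed here)."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


argv = build_cheapest_command(["mistral"])
print(shlex.join(argv))
# To execute it for real (requires the package to be installed):
# data = run_json(argv)
```

Passing a list to subprocess.run avoids shell quoting issues when provider lists are assembled dynamically.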
Interactive exploration:
ooai-llm tui \
--source litellm \
--providers openai,anthropic,mistral \
--input-tokens 10000 \
--output-tokens 2000 \
--budget-usd 10 \
--theme slate \
--views cheapest,coding,catalog \
--refresh-cooldown 2
Inside the TUI, use the top navigation buttons, Tab/Shift+Tab, n/p,
or 1-7 to switch views. Use / to search visible metadata, the provider
selector to narrow rows, and r to refresh after the cooldown. Use
--refresh-cooldown 0 only when you are sure the selected source will not make
expensive provider calls.
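The refresh cooldown can be pictured as a small time gate. This is a minimal sketch of the idea, not the TUI's actual implementation; RefreshGate and its injectable clock are hypothetical names introduced for illustration.

```python
import time


class RefreshGate:
    """Allow a refresh only when `cooldown` seconds have passed since the last one."""

    def __init__(self, cooldown, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock
        self._last = None  # time of the last accepted refresh

    def try_refresh(self):
        now = self.clock()
        if self._last is not None and now - self._last < self.cooldown:
            return False  # still cooling down; skip the expensive call
        self._last = now
        return True


# Fake clock so the behavior is deterministic in this example.
fake_now = [0.0]
gate = RefreshGate(cooldown=2, clock=lambda: fake_now[0])

first = gate.try_refresh()   # accepted: nothing was refreshed before
fake_now[0] = 1.0
second = gate.try_refresh()  # rejected: only 1 second has elapsed
fake_now[0] = 2.5
third = gate.try_refresh()   # accepted: 2.5 seconds since the last accepted refresh
```

With a cooldown of 0 the gate never rejects, which is why that setting is only safe when refreshes are cheap.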
Use --theme paper, --theme slate, --theme mono, or --theme forest to
control the TUI palette.
Use --views cheapest,catalog or repeat --view catalog to load only selected
table-backed surfaces. This limits snapshot work before catalog/comparison/suite
helpers are called, so unselected views are never computed. That saves more
work than loading every view and filtering afterwards.
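The difference between selecting views up front and filtering after loading can be sketched with a table of loader callables. Everything here is illustrative: load_selected_views and the loader names are assumptions, not the package's internals.

```python
def load_selected_views(loaders, selected):
    """Call only the loaders whose view names were requested."""
    unknown = [name for name in selected if name not in loaders]
    if unknown:
        raise ValueError(f"unknown views: {unknown}")
    return {name: loaders[name]() for name in selected}


calls = []  # records which loaders actually ran


def make_loader(name):
    def loader():
        calls.append(name)  # stands in for expensive snapshot work
        return f"{name} rows"
    return loader


loaders = {name: make_loader(name) for name in ["cheapest", "coding", "catalog"]}
views = load_selected_views(loaders, ["cheapest", "catalog"])
```

Only the two requested loaders run; the coding loader is never called, so its cost is never paid.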
CLI Package Layout¶
The terminal surface is implemented as a package so new command groups and rendering helpers can be added without growing one giant module:
ooai_llm.cli.app: parser construction, dispatch, and command handlers
ooai_llm.cli.help: examples, choices, and shared parser arguments
ooai_llm.cli.recipes: copy/paste CLI and Python recipes
The console script still points at ooai_llm.cli:main, so existing installs and
tests do not need a new entry point. The old ooai_llm.cli_help and
ooai_llm.cli_recipes imports remain as compatibility shims.
Future CLI extensions should stay optional. Python’s standard-library
argparse remains the base because it avoids another runtime
dependency, while Rich handles current table output. If shell
completion becomes a priority, add argcomplete as an optional
cli-completion extra. If the command surface becomes truly typed/declarative,
consider a Typer migration only after the current argparse
behavior is covered by compatibility tests.
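A nested argparse layout with an optional completion hook might look like the sketch below. It is not the package's parser: the program name, defaults, and the enable_completion helper are assumptions, and only a few flags from this guide are reproduced.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="ooai-llm-sketch")
    groups = parser.add_subparsers(dest="group", required=True)

    models = groups.add_parser("models", help="catalog and comparison commands")
    model_cmds = models.add_subparsers(dest="command", required=True)

    cheapest = model_cmds.add_parser("cheapest", help="rank models by call cost")
    cheapest.add_argument("--providers", type=lambda s: s.split(","), default=[])
    cheapest.add_argument("--input-tokens", type=int, default=1000)   # assumed default
    cheapest.add_argument("--output-tokens", type=int, default=1000)  # assumed default
    cheapest.add_argument("--budget-usd", type=float, default=None)
    return parser


def enable_completion(parser):
    """Turn on shell completion only if the optional dependency is present."""
    try:
        import argcomplete  # optional; would come from a cli-completion extra
    except ImportError:
        return False
    argcomplete.autocomplete(parser)
    return True


parser = build_parser()
enable_completion(parser)
args = parser.parse_args(
    ["models", "cheapest", "--providers", "openai,mistral", "--input-tokens", "10000"]
)
```

Keeping the argcomplete import inside a try block means the base install keeps working without the extra.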
Cheapest Models¶
Use models cheapest when the question is “how many calls do I get for this
budget and token shape?”
ooai-llm models cheapest \
--source litellm \
--providers mistral \
--input-tokens 10000 \
--output-tokens 2000 \
--budget-usd 10 \
--limit 20
Compare one winner per provider:
ooai-llm models cheapest \
--source litellm \
--providers openai,anthropic,google,deepseek,mistral \
--input-tokens 10000 \
--output-tokens 2000 \
--per-provider
Equivalent Python:
from ooai_llm import compare_model_catalog

comparison = compare_model_catalog(
    providers=["mistral"],
    source="litellm",
    input_tokens=10_000,
    output_tokens=2_000,
    budget_usd=10,
    sort_by="call_cost",
)

for row in comparison.estimates:
    print(row.model.as_langchain(), row.call_cost_usd, row.calls_per_budget)
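The arithmetic behind a budget question can be sketched independently of the package. The prices and model names below are invented for illustration, and the per-provider step is one plausible reading of --per-provider (keep the cheapest row for each provider), not a description of the actual implementation.

```python
import math

# Hypothetical prices in USD per 1M tokens: (provider, model, input, output).
PRICES = [
    ("alpha", "alpha-small", 0.50, 1.50),
    ("alpha", "alpha-large", 3.00, 15.00),
    ("beta", "beta-mini", 0.25, 2.00),
]


def call_cost_usd(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Cost of one call with the given token shape."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


def calls_per_budget(budget_usd, cost):
    """Whole calls that fit in the budget."""
    return math.floor(budget_usd / cost)


def estimate(input_tokens, output_tokens, budget_usd):
    rows = []
    for provider, model, in_price, out_price in PRICES:
        cost = call_cost_usd(input_tokens, output_tokens, in_price, out_price)
        rows.append((provider, model, cost, calls_per_budget(budget_usd, cost)))
    return sorted(rows, key=lambda row: row[2])  # cheapest call first


def per_provider_winner(rows):
    winners = {}
    for row in rows:  # rows are already sorted by cost
        winners.setdefault(row[0], row)  # keep the first (cheapest) row per provider
    return winners


rows = estimate(10_000, 2_000, budget_usd=10)
winners = per_provider_winner(rows)
```

The token shape matters: a model with a cheap input price can still lose on output-heavy workloads, so the ranking should be recomputed whenever the assumed shape changes.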
Coding Models¶
Use models coding when you want models marked or inferred as coding-oriented.
It accepts the same cost, context, date, and capability filters as
models compare.
ooai-llm models coding \
--source litellm \
--providers openai,anthropic,google,deepseek,mistral \
--tool-calling-only \
--structured-output-only \
--input-tokens 10000 \
--output-tokens 2000 \
--per-provider
Useful sorts:
--sort call_cost ranks by representative call cost.
--sort output_tokens finds larger completion budgets.
--sort context finds larger input/context windows.
--sort calls_per_usd ranks by calls per dollar.
Equivalent Python:
from ooai_llm import get_coding_model_comparison

comparison = get_coding_model_comparison(
    providers=["openai", "anthropic", "mistral"],
    source="litellm",
    input_tokens=10_000,
    output_tokens=2_000,
    capabilities=["tool_calling", "structured_output"],
)

for row in comparison.estimates:
    print(row.model.as_langchain(), row.call_cost_usd, row.capabilities)
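The --sort options in this section can be thought of as ordering keys over comparison rows. The sketch below uses invented rows and placeholder field names (call_cost_usd, max_output_tokens, context_tokens); it does not reflect the package's row schema.

```python
# Hypothetical rows; the field names are placeholders, not the package schema.
ROWS = [
    {"model": "a", "call_cost_usd": 0.008, "max_output_tokens": 8_000, "context_tokens": 128_000},
    {"model": "b", "call_cost_usd": 0.002, "max_output_tokens": 4_000, "context_tokens": 32_000},
    {"model": "c", "call_cost_usd": 0.060, "max_output_tokens": 16_000, "context_tokens": 200_000},
]

SORT_KEYS = {
    "call_cost": lambda r: r["call_cost_usd"],           # cheapest call first
    "output_tokens": lambda r: -r["max_output_tokens"],  # largest completion budget first
    "context": lambda r: -r["context_tokens"],           # largest context window first
    "calls_per_usd": lambda r: -(1 / r["call_cost_usd"]),  # most calls per dollar first
}


def sort_rows(rows, sort):
    """Return rows ordered by one of the named sort keys."""
    return sorted(rows, key=SORT_KEYS[sort])


by_cost = [r["model"] for r in sort_rows(ROWS, "call_cost")]
by_output = [r["model"] for r in sort_rows(ROWS, "output_tokens")]
by_calls = [r["model"] for r in sort_rows(ROWS, "calls_per_usd")]
```

In this sketch, ranking by calls per dollar gives the same order as ranking by call cost, because one is the reciprocal of the other.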
Catalog Filters¶
Use models list when you want raw catalog metadata:
ooai-llm models list \
--source litellm \
--providers mistral \
--sort cost \
--limit 0
Capability filters can be combined:
ooai-llm models list \
--source litellm \
--tool-calling-only \
--parallel-tool-calls-only \
--structured-output-only \
--min-input-tokens 128000 \
--min-output-tokens 8000 \
--sort output_tokens
Cost and release filters:
ooai-llm models list \
--source litellm \
--released-after 2026-01 \
--max-input-cost-per-1m 1 \
--max-output-cost-per-1m 5
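Combined filters act like a conjunction: a row is kept only if it passes every filter that was supplied. The sketch below illustrates that idea with invented catalog rows. CatalogRow and matches are hypothetical names, and treating --released-after 2026-01 as "on or after 1 January 2026" is an assumption about how the date is interpreted.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogRow:
    model: str
    tool_calling: bool
    max_input_tokens: int
    input_cost_per_1m: float
    output_cost_per_1m: float
    released: date


def matches(row, *, tool_calling_only=False, min_input_tokens=0,
            max_input_cost_per_1m=None, max_output_cost_per_1m=None,
            released_after=None):
    """Return True when the row passes every filter that was supplied."""
    checks = [
        not tool_calling_only or row.tool_calling,
        row.max_input_tokens >= min_input_tokens,
        max_input_cost_per_1m is None or row.input_cost_per_1m <= max_input_cost_per_1m,
        max_output_cost_per_1m is None or row.output_cost_per_1m <= max_output_cost_per_1m,
        released_after is None or row.released >= released_after,  # assumed inclusive
    ]
    return all(checks)


CATALOG = [
    CatalogRow("old-cheap", True, 32_000, 0.2, 0.6, date(2025, 3, 1)),
    CatalogRow("new-cheap", True, 128_000, 0.8, 4.0, date(2026, 2, 1)),
    CatalogRow("new-pricey", True, 200_000, 3.0, 15.0, date(2026, 2, 15)),
]

selected = [
    row.model
    for row in CATALOG
    if matches(row, max_input_cost_per_1m=1, max_output_cost_per_1m=5,
               released_after=date(2026, 1, 1))
]
```

Leaving a filter unset (None or the default) makes it a no-op, which mirrors omitting the corresponding flag on the command line.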
Model Suites¶
Suites are reusable shortlists. They are useful for LangGraph node variants, experiments, and enum/dict-style runtime setup.
ooai-llm models suite \
--suite comparison \
--providers openai,anthropic,mistral
Build a suite from catalog filters:
ooai-llm models suite \
--from-catalog \
--source litellm \
--coding-only \
--tool-calling-only \
--structured-output-only \
--sort cost \
--limit 5 \
--format json
Equivalent Python:
from ooai_llm import UsageRecorder, model_suite_from_catalog

suite = model_suite_from_catalog(
    providers=["mistral"],
    source="litellm",
    capabilities=["coding"],
    max_output_cost_per_1m=3,
    sort_by="cost",
    limit=5,
    temperature=0,
    parallel_tool_calls=True,
)

recorder = UsageRecorder()
runtimes = {
    key: profile.create_runtime(recorder=recorder, id=key)
    for key, profile in suite.to_profiles().items()
}
Profiles And Runtime¶
Validate, normalize, and resolve profile files:
ooai-llm profiles validate --input profile.json
ooai-llm profiles render --input profile.json
ooai-llm profiles resolve --input profile.json --format json
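To try these commands you need a profile file. The sketch below writes one from Python. The file shape is an assumption: the keys simply mirror the ChatModelProfile arguments used elsewhere in this guide, so check the output of profiles validate before relying on it.

```python
import json
import tempfile
from pathlib import Path

# Assumed file shape: keys mirror the ChatModelProfile arguments used in this guide.
profile = {
    "id": "cheap",
    "model": "openai:gpt-5-mini",
    "temperature": 0,
}

# Write into a temporary directory so the example leaves the working tree alone.
path = Path(tempfile.mkdtemp()) / "profile.json"
path.write_text(json.dumps(profile, indent=2), encoding="utf-8")

# Reading it back confirms the file is valid JSON before handing it to the CLI.
loaded = json.loads(path.read_text(encoding="utf-8"))
print(f"ooai-llm profiles validate --input {path}")
```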
Package-side runtime setup:
from enum import StrEnum

from ooai_llm import ChatModelProfile, UsageRecorder


class ModelChoice(StrEnum):
    CHEAP = "cheap"
    CODING = "coding"


profiles = {
    ModelChoice.CHEAP: ChatModelProfile(
        id="cheap",
        model="openai:gpt-5-mini",
        temperature=0,
    ),
    ModelChoice.CODING: ChatModelProfile(
        id="coding",
        model="mistral:YOUR_CODING_MODEL_FROM_CATALOG",
        temperature=0,
        parallel_tool_calls=True,
    ),
}

recorder = UsageRecorder()
runtimes = {
    key: profile.create_runtime(recorder=recorder, id=str(key))
    for key, profile in profiles.items()
}
Catalog comparison estimates cost from assumed token counts. Runtime accounting records observed usage after model calls when LangChain/provider metadata is available.
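The gap between an estimate and observed usage can be illustrated with a toy recorder. SketchRecorder is a hypothetical stand-in written for this example, not the package's UsageRecorder, and the prices and token counts are invented.

```python
from dataclasses import dataclass, field


@dataclass
class SketchRecorder:
    """Accumulates observed token usage per runtime id."""

    usage: dict = field(default_factory=dict)

    def record(self, runtime_id, input_tokens, output_tokens):
        seen_in, seen_out = self.usage.get(runtime_id, (0, 0))
        self.usage[runtime_id] = (seen_in + input_tokens, seen_out + output_tokens)

    def cost_usd(self, runtime_id, input_per_1m, output_per_1m):
        seen_in, seen_out = self.usage.get(runtime_id, (0, 0))
        return (seen_in * input_per_1m + seen_out * output_per_1m) / 1_000_000


IN_PRICE, OUT_PRICE = 0.5, 1.5  # hypothetical USD per 1M tokens

# Estimate: two calls at the assumed shape of 10,000 input and 2,000 output tokens.
estimated = 2 * (10_000 * IN_PRICE + 2_000 * OUT_PRICE) / 1_000_000

# Observation: what two real calls might actually report.
recorder = SketchRecorder()
recorder.record("cheap", 9_200, 2_600)
recorder.record("cheap", 11_000, 1_400)
observed = recorder.cost_usd("cheap", IN_PRICE, OUT_PRICE)
```

The two numbers differ whenever real prompts and completions deviate from the assumed token shape, which is why both views are worth keeping.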