February 15, 2026 · Research

B01-NUna model card and orchestration overview

How multi-model routing, reasoning paths, and published benchmarks fit together in production.

B01-NUna is an orchestration layer: it analyzes each request, selects among configured model providers, and composes tools behind one interface. It is not a single monolithic model trained end-to-end in this repository.

We publish a model card so users, partners, and researchers can see how production routing works — without mystique or hand-waving.

Production routing, honestly

Default cloud chat routes general turns to Groq-hosted Llama 3.3 70B (llama-3.3-70b-versatile). Reasoning-heavy queries can route to Groq-hosted GPT-OSS 120B (openai/gpt-oss-120b). Both defaults are configurable via environment variables.
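As a rough illustration of how environment-configurable defaults like these could be wired up, here is a minimal Python sketch. The two model IDs come from the paragraph above; the environment variable names `B01_GENERAL_MODEL` and `B01_REASONING_MODEL`, the `RoutingDefaults` class, and both helper functions are hypothetical and are not taken from the B01-NUna codebase.

```python
import os
from dataclasses import dataclass

# Model IDs quoted from the text; used as fallbacks when no override is set.
DEFAULT_GENERAL_MODEL = "llama-3.3-70b-versatile"
DEFAULT_REASONING_MODEL = "openai/gpt-oss-120b"


@dataclass(frozen=True)
class RoutingDefaults:
    general_model: str
    reasoning_model: str


def load_routing_defaults(env=None) -> RoutingDefaults:
    """Read model overrides from a mapping (os.environ by default).

    The variable names below are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    return RoutingDefaults(
        general_model=env.get("B01_GENERAL_MODEL", DEFAULT_GENERAL_MODEL),
        reasoning_model=env.get("B01_REASONING_MODEL", DEFAULT_REASONING_MODEL),
    )


def pick_default_model(is_reasoning_heavy: bool, defaults: RoutingDefaults) -> str:
    """Return the reasoning model for reasoning-heavy turns, else the general one."""
    return defaults.reasoning_model if is_reasoning_heavy else defaults.general_model
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test; with no overrides set, `pick_default_model(False, load_routing_defaults({}))` falls back to the Llama 3.3 70B ID.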

Routing is heuristic and score-based, not a jointly trained softmax router. Each candidate provider is scored from query analysis, provider capabilities, latency and cost hints, circuit-breaker state, and trust-weighted feedback.
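To make the idea of heuristic, score-based routing concrete, the following Python sketch combines the signals named above into a weighted score. Everything in it is an assumption for illustration: the `Provider` fields, the keyword-based query analysis, the weights, and the choice to treat an open circuit breaker as a hard exclusion are invented here and do not describe the production router.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    name: str
    supports_reasoning: bool
    latency_hint: float         # 0 (slow) to 1 (fast); hypothetical normalised hint
    cost_hint: float            # 0 (expensive) to 1 (cheap); hypothetical normalised hint
    trust: float                # 0 to 1, aggregated from feedback (hypothetical)
    circuit_open: bool = False  # True while the provider's breaker is tripped


# Illustrative weights only; not taken from the model card.
WEIGHTS = {"capability": 0.4, "latency": 0.2, "cost": 0.1, "trust": 0.3}


def looks_reasoning_heavy(query: str) -> bool:
    """Toy query analysis: a keyword match stands in for a real analyzer."""
    cues = ("prove", "derive", "step by step")
    text = query.lower()
    return any(cue in text for cue in cues)


def score(provider: Provider, needs_reasoning: bool) -> float:
    """Weighted sum of capability fit, latency, cost, and trust."""
    capable = provider.supports_reasoning or not needs_reasoning
    return (
        WEIGHTS["capability"] * (1.0 if capable else 0.0)
        + WEIGHTS["latency"] * provider.latency_hint
        + WEIGHTS["cost"] * provider.cost_hint
        + WEIGHTS["trust"] * provider.trust
    )


def route(query: str, providers: list[Provider]) -> Provider:
    """Pick the highest-scoring provider whose circuit breaker is closed."""
    needs_reasoning = looks_reasoning_heavy(query)
    candidates = [p for p in providers if not p.circuit_open]
    if not candidates:
        raise RuntimeError("no provider available: all circuit breakers are open")
    return max(candidates, key=lambda p: score(p, needs_reasoning))
```

Because the score is a plain weighted sum rather than a learned function, each routing decision can be traced back to individual signals, which is one plausible reason to prefer heuristics over a trained router.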

Benchmarks and R&D

In-product backbone benchmark figures are sourced from Meta’s Llama 3.3 70B Instruct model card.
End-to-end app performance depends on routing, provider load, prompt policy, and tool usage.
Optional Hub LoRA adapter work (pejmantheory/B01-NUna) is R&D and backup, not the default production path on helloblue.ai.
Availability targets in the card are operational goals, not contractual SLAs.

Read the full model card