Building a Production-Grade Evaluation Stack for Agentic Systems
A technical framework for measuring agent outcome correctness, external state mutation, token flow, runtime cost, and optimization surfaces.
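This blog documents implementation details from a pharmaceutical and healthcare AI platform: model routing, knowledge orchestration, workbench delivery, and compliance audit trails. Articles are original work by the team or reviewed technical write-ups.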