Multi-Agent System, Benchmark Construction, Embodied AI, Task Planning
Abstract

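To address the time-consuming construction, poor reusability, and rapid saturation of embodied spatial intelligence benchmarks, this paper proposes Embodied-BenchClaw, an autonomous multi-agent system. The system automatically generates updatable benchmark packages through a five-stage pipeline: intent blueprint, data collection, structured cleaning, benchmark synthesis, and evaluation reporting. Three agents (planning, building, and evaluation) work together, and an extensible skill library and process-level quality control are introduced to improve reliability. Experiments show that the method efficiently builds diverse embodied benchmarks that are verifiable, executable, and diagnostically useful.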
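
The five-stage flow described in the abstract can be pictured with a minimal Python sketch. Only the stage names come from the abstract; `BenchmarkPackage`, `quality_gate`, `run_pipeline`, and the toy skills are hypothetical names introduced for illustration and are not the paper's implementation. The quality gate is a stand-in for the process-level quality control the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Callable

# Stage names follow the abstract; everything else in this sketch is hypothetical.
STAGES = [
    "intent_blueprint",
    "data_collection",
    "structured_cleaning",
    "benchmark_synthesis",
    "evaluation_report",
]


@dataclass
class BenchmarkPackage:
    """Accumulates the output of each stage for one benchmark request."""
    intent: str
    artifacts: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def quality_gate(stage: str, output: object) -> bool:
    """Placeholder process check (an assumption): reject empty stage outputs."""
    return bool(output)


def run_pipeline(
    intent: str,
    skills: dict[str, Callable[[BenchmarkPackage], object]],
) -> BenchmarkPackage:
    """Run the stages in order, stopping at the first failed quality check."""
    package = BenchmarkPackage(intent=intent)
    for stage in STAGES:
        output = skills[stage](package)
        if not quality_gate(stage, output):
            raise ValueError(f"stage {stage!r} failed its quality check")
        package.artifacts[stage] = output
        package.log.append(stage)
    return package


# Toy skill library: each skill just returns a descriptive string.
toy_skills = {
    stage: (lambda pkg, s=stage: f"{s} output for {pkg.intent}")
    for stage in STAGES
}

result = run_pipeline("spatial reasoning", toy_skills)
print(result.log)
```

Keeping the skills in a plain dictionary mirrors the idea of an extensible skill library: swapping or adding a skill does not change the pipeline loop itself.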

AI Recommendation Rationale

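The paper centers on a multi-agent system in which a planning agent coordinates the five-stage pipeline, with task planning and decomposition playing a key role.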

Affiliations
QiYuan Lab; School of Information and Software Engineering, University of Electronic Science and Technology of China; Beijing University of Posts and Telecommunications; School of Computer Science and Engineering, Northeastern University; School of Computer Science and Engineering, Beihang University
Paper Information
Authors: Baoyang Jiang, Fengchun Zhang, Leyuan Wang, Haotian Li, Yida Wang et al.
Publication date: 2026-06-10
arXiv ID 2606.11909
Relevance score: 8/10 (highly relevant)