Towards Enterprise-Ready Computer Using Generalist Agent
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (5)
verdicts: UNVERDICTED (5)
roles: baseline (1)
polarities: baseline (1)
citing papers explorer
- X-SYNTH: Beyond Retrieval -- Enterprise Context Synthesis from Observed Digital Human Attention
  X-SYNTH synthesizes enterprise context from digital human attention using Digital Twin Signatures and seven attention filters, raising true lead rate from 9.5% to 61.9% while cutting false lead rate to 18.8%.
- Learning and Reusing Policy Decompositions for Hierarchical Generalized Planning with LLM Agents
  HCL-GP learns parameterized policies and reuses extracted components to achieve 98% accuracy on AppWorld benchmark tasks for LLM agents, outperforming static synthesis by 15.8 points on challenges.
- Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents
  Empirical evaluation of four natural language plan representations in a static planner-executor framework shows that plan formulation and the underlying LLM both affect LLM web-agent robustness and task success on hard WebArena tasks.
- Governance by Construction for Generalist Agents
  CUGA introduces a runtime governance architecture that enforces policies at five checkpoints in generalist agent execution pipelines for predictable and compliant behavior.
- Agent Mentor: Framing Agent Knowledge through Semantic Trajectory Analysis
  Agent Mentor analyzes semantic trajectories in agent logs to identify undesired behaviors and derives corrective prompt instructions, yielding measurable accuracy gains on benchmark tasks across three agent setups.