The work creates a new benchmark for humanizing GUI agent touch dynamics via a MinMax detector-agent model, a mobile touch dataset, and methods showing agents can match human behavior without losing task performance.
The rise and potential of large language model based agents: A survey.Science China Information Sciences, 68(2):121101
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 8roles
background 2polarities
background 2representative citing papers
TS-Agent is an agentic framework that uses LLMs only for evidence-based reasoning while delegating extraction to raw time series tools, matching or exceeding baselines on four benchmarks with largest gains on reasoning tasks.
Verifiable Process Rewards (VPR) converts symbolic oracles into dense turn-level supervision for reinforcement learning in agentic reasoning, outperforming outcome-only rewards and transferring to general benchmarks.
BadSkill poisons embedded models in agent skills to achieve up to 99.5% attack success rate on triggered tasks with only 3% poison rate while preserving normal behavior on non-trigger inputs.
LLM agent societies develop power-law coordination cascades and intellectual elites through an integration bottleneck that grows with system size.
Holos is a five-layer LLM-based multi-agent system architecture using the Nuwa engine for agent generation, a market-driven Orchestrator for coordination, and an endogenous value cycle for incentive-compatible persistence in the Agentic Web.
Personal agents require edge deployment to preserve high-fidelity local context and zero-latency loops, as claimed through three structural shifts away from cloud-centric designs.
Explicit provenance across the full agentic AI lifecycle is the necessary condition for making responsibility computable and actionable.
citing papers explorer
-
Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
The work creates a new benchmark for humanizing GUI agent touch dynamics via a MinMax detector-agent model, a mobile touch dataset, and methods showing agents can match human behavior without losing task performance.
-
TS-Agent: Understanding and Reasoning Over Raw Time Series via Iterative Insight Gathering
TS-Agent is an agentic framework that uses LLMs only for evidence-based reasoning while delegating extraction to raw time series tools, matching or exceeding baselines on four benchmarks with largest gains on reasoning tasks.
-
Verifiable Process Rewards for Agentic Reasoning
Verifiable Process Rewards (VPR) converts symbolic oracles into dense turn-level supervision for reinforcement learning in agentic reasoning, outperforming outcome-only rewards and transferring to general benchmarks.
-
BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
BadSkill poisons embedded models in agent skills to achieve up to 99.5% attack success rate on triggered tasks with only 3% poison rate while preserving normal behavior on non-trigger inputs.
-
Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems
LLM agent societies develop power-law coordination cascades and intellectual elites through an integration bottleneck that grows with system size.
-
Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web
Holos is a five-layer LLM-based multi-agent system architecture using the Nuwa engine for agent generation, a market-driven Orchestrator for coordination, and an endogenous value cycle for incentive-compatible persistence in the Agentic Web.
-
Beyond Scaling: Agents Are Heading to the Edge
Personal agents require edge deployment to preserve high-fidelity local context and zero-latency loops, as claimed through three structural shifts away from cloud-centric designs.
-
Responsible Agentic AI Requires Explicit Provenance
Explicit provenance across the full agentic AI lifecycle is the necessary condition for making responsibility computable and actionable.