Evotool: Self-Evolving Tool-Use Policy Optimization in LLM Agents via Blame-Aware Mutation and Diversity-Aware Selection
3 Pith papers cite this work.
3 representative citing papers (2026)
-
AEL: Agent Evolving Learning for Open-Ended Environments
AEL combines a fast-timescale bandit for memory policy selection with slow-timescale LLM reflection for causal insights, achieving a Sharpe ratio of 2.13 on a 208-episode portfolio benchmark, while its experiments indicate that adding further mechanisms degrades performance.
-
From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs
SFI-Bench shows current multimodal LLMs struggle to integrate spatial memory with functional reasoning and external knowledge in video tasks.
-
ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing
ActorMind is a four-agent chain-of-thought framework that emulates human actors to produce spontaneous, emotion-infused speech responses for role-playing scenarios.
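The AEL summary above mentions a fast-timescale bandit that selects a memory policy per episode. A minimal sketch of how such a selector could work, using a standard UCB1 bandit; the policy names, reward values, and episode loop are illustrative assumptions, not details from the paper:

```python
import math
import random

class PolicyBandit:
    """UCB1 bandit over a small set of candidate memory policies."""

    def __init__(self, policies):
        self.policies = list(policies)
        self.counts = {p: 0 for p in self.policies}   # times each policy was chosen
        self.values = {p: 0.0 for p in self.policies}  # running mean reward per policy
        self.t = 0

    def select(self):
        self.t += 1
        # Play every arm once before applying the UCB rule.
        for p in self.policies:
            if self.counts[p] == 0:
                return p
        # Pick the arm maximizing mean reward plus an exploration bonus.
        return max(
            self.policies,
            key=lambda p: self.values[p]
            + math.sqrt(2 * math.log(self.t) / self.counts[p]),
        )

    def update(self, policy, reward):
        self.counts[policy] += 1
        n = self.counts[policy]
        # Incremental update of the mean observed reward.
        self.values[policy] += (reward - self.values[policy]) / n

# Hypothetical policies and stand-in rewards; a real agent would feed
# back per-episode returns instead of these fixed means.
bandit = PolicyBandit(["episodic", "summary", "none"])
random.seed(0)
for _ in range(208):  # episode count taken from the summary above
    arm = bandit.select()
    reward = {"episodic": 0.6, "summary": 0.5, "none": 0.3}[arm]
    reward += random.uniform(-0.1, 0.1)
    bandit.update(arm, reward)
```

After the loop, `bandit.counts` concentrates on the highest-reward policy while still occasionally sampling the others, which is the behavior a per-episode policy selector needs.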