EDV decouples execution, distillation by a third-party agent, and consensus verification to filter erroneous trajectories in LLM agent experience learning, outperforming baselines on tau2-bench, Mind2Web, and MMTB.
Tanzib Hosain, Salman Rahman, Md
4 Pith papers cite this work.
representative citing papers
MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.
A multi-agent forensic system integrates multiple evidence sources and debate to detect AI-generated images, reporting 97.05% accuracy on a 6,000-image benchmark while outperforming traditional classifiers.
Empirical comparison shows APPLE, FedGC, and FedProto outperform other PFL algorithms on MNIST, SignMNIST, and Digit5 using accuracy, precision, recall, and F1 score.
citing papers explorer
- Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning
  EDV decouples execution, distillation by a third-party agent, and consensus verification to filter erroneous trajectories in LLM agent experience learning, outperforming baselines on tau2-bench, Mind2Web, and MMTB.
- MetaPS: Adaptive Programmatic Strategy Selection for Market Agents
  MetaPS trains models via simulation rollouts to select from programmatic strategy libraries for market agents, yielding better performance than fixed or direct LLM baselines across model sizes.
- Pattern Recognition Tasks with Personalized Federated Learning
  Empirical comparison shows APPLE, FedGC, and FedProto outperform other PFL algorithms on MNIST, SignMNIST, and Digit5 using accuracy, precision, recall, and F1 score.