VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Covering Human Action Space for Computer Use: Data Synthesis and Benchmark
Presents CUActSpot benchmark and renderer-LLM data synthesis that lets a 4B model outperform larger open-source models on complex computer interactions.
- CTM-AI: A Blueprint for General AI Inspired by a Model of Consciousness
CTM-AI combines a formal consciousness model with foundation models to report state-of-the-art results on sarcasm detection, humor, and agentic tool-use benchmarks.
- QoS-QoE Translation with Large Language Model
A new QoS-QoE Translation dataset is constructed from multimedia literature, and fine-tuned LLMs demonstrate strong performance on bidirectional continuous and discrete QoS-QoE predictions.