Herm: Benchmarking and enhancing multimodal LLMs for human-centric understanding

Keliang Li, Zaifei Yang, Jiahe Zhao, Hongze Shen, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen · 2024 · arXiv 2410.06777

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks

cs.CV · 2024-12-23 · unverdicted · novelty 7.0

HumanVBench provides a 16-task benchmark for human-centric video understanding in MLLMs, created through automated annotation and distractor synthesis pipelines, and shows top models lag human performance on emotion perception and cross-modal alignment.

Employing Vision-Language Models for Face Image Quality Assessment

cs.CV · 2026-05-17 · unverdicted · novelty 5.0

Vision-language models enable zero-shot face image quality assessment whose biometric utility depends on model architecture rather than size, with outputs that align with traditional methods but vary by prompt.

citing papers explorer

Showing 2 of 2 citing papers.

HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks · cs.CV · 2024-12-23 · unverdicted · none · ref 31
Employing Vision-Language Models for Face Image Quality Assessment · cs.CV · 2026-05-17 · unverdicted · none · ref 20
