arxiv: 2406.14877 · v1 · pith:PVHCEXZ7new · submitted 2024-06-21 · 💻 cs.CL

Sports Intelligence: Assessing the Sports Understanding Capabilities of Language Models through Question Answering from Text to Video

classification 💻 cs.CL
keywords sports · understanding · language · capabilities · models · reasoning · analysis · challenges

Understanding sports is crucial for the advancement of Natural Language Processing (NLP) due to its intricate and dynamic nature. Reasoning over complex sports scenarios requires advanced cognitive capabilities and has posed significant challenges to current NLP technologies. To address the limitations of existing sports understanding benchmarks in the NLP field, we extensively evaluated mainstream large language models on a variety of sports tasks. Our evaluation spans from simple queries on basic rules and historical facts to complex, context-specific reasoning, leveraging strategies that range from zero-shot to few-shot learning, as well as chain-of-thought techniques. In addition to this unimodal analysis, we assessed the sports reasoning capabilities of mainstream video language models to bridge the gap in multimodal sports understanding benchmarking. Our findings highlighted the critical challenges that sports understanding poses for NLP. We proposed a new benchmark based on a comprehensive overview of existing sports datasets and provided an extensive error analysis, which we hope can help identify future research priorities in this field.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Counterfactual Reasoning for Fine-Grained Evidence Disentanglement in VideoQA

    cs.CV · 2026-06 · unverdicted · novelty 6.0

    CREDiT applies counterfactual reasoning via structural causal models to decompose video representations into causal and non-causal parts for more reliable VideoQA on datasets like NExT-GQA and SportsQA.