Recognition: 3 Lean theorem links
Microsoft COCO Captions: Data Collection and Evaluation Server
Pith reviewed 2026-05-12 21:32 UTC · model grok-4.3
The pith
The Microsoft COCO Captions dataset collects over 1.5 million human captions for more than 330,000 images and supplies an evaluation server that scores submissions with standard metrics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided. To ensure consistency in evaluation of automatic caption generation algorithms, an evaluation server is used. The evaluation server receives candidate captions and scores them using several popular metrics, including BLEU, METEOR, ROUGE and CIDEr.
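As a concrete illustration of that structure, the sketch below loads the captions attached to one training image with the standard pycocotools COCO API; the annotation path and the choice of image are illustrative assumptions, not details taken from the paper.

# Sketch: inspect the five human captions attached to one training image.
# Assumes pycocotools is installed and the 2014 caption annotations are
# downloaded; the path and the image choice are illustrative placeholders.
from pycocotools.coco import COCO

coco_caps = COCO("annotations/captions_train2014.json")

img_id = coco_caps.getImgIds()[0]             # any annotated image
ann_ids = coco_caps.getAnnIds(imgIds=img_id)  # caption annotation ids
anns = coco_caps.loadAnns(ann_ids)

for ann in anns:                              # typically five entries
    print(ann["caption"])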
What carries the argument
The COCO Captions dataset with five independent human captions per training and validation image together with the public evaluation server that scores submitted captions using BLEU, METEOR, ROUGE, and CIDEr.
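A minimal sketch of that scoring loop, using the pycocoevalcap package released alongside the server; the captions here are invented, and real use first normalizes both sides with the package's PTBTokenizer (METEOR additionally shells out to a bundled Java jar).

# Sketch: score one candidate caption against five references with the
# four server metrics via pycocoevalcap. Inputs are dicts mapping an
# image id to a list of caption strings; all captions below are invented.
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor  # requires Java on PATH
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

gts = {1: ["a man rides a horse on the beach",
           "a person riding a horse near the ocean",
           "a man on horseback by the sea",
           "someone rides a brown horse along the shore",
           "a rider and horse walk through shallow water"]}
res = {1: ["a man riding a horse on a beach"]}

for name, scorer in [("BLEU", Bleu(4)), ("METEOR", Meteor()),
                     ("ROUGE_L", Rouge()), ("CIDEr", Cider())]:
    score, _ = scorer.compute_score(gts, res)
    print(name, score)  # BLEU returns a list of BLEU-1..4 scores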
Load-bearing premise
The collected human captions are sufficiently consistent, high-quality, and representative to serve as reliable ground truth for automatic systems.
What would settle it
An experiment in which human judges systematically prefer machine captions that receive low scores from the server on all four metrics would falsify the claim that the server provides a useful proxy for caption quality.
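A hedged sketch of what that test would compute: a rank correlation between human preference and each metric across candidate systems, where systematically negative correlations on all four metrics would count as falsification. Every number below is an invented placeholder.

# Sketch of the falsification test: correlate human preference with metric
# scores across candidate systems. Strongly negative correlations on all
# four metrics would undermine the server as a proxy for caption quality.
from scipy.stats import kendalltau

human_pref = [0.81, 0.74, 0.62, 0.55, 0.40]   # e.g. fraction of pairwise wins
metric_scores = {
    "BLEU-4":  [0.30, 0.28, 0.25, 0.21, 0.18],
    "METEOR":  [0.25, 0.24, 0.22, 0.20, 0.17],
    "ROUGE-L": [0.53, 0.51, 0.48, 0.45, 0.41],
    "CIDEr":   [0.95, 0.88, 0.77, 0.64, 0.51],
}

for name, scores in metric_scores.items():
    tau, p = kendalltau(human_pref, scores)
    print(f"{name}: tau={tau:.2f} (p={p:.3f})")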
read the original abstract
In this paper we describe the Microsoft COCO Caption dataset and evaluation server. When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided. To ensure consistency in evaluation of automatic caption generation algorithms, an evaluation server is used. The evaluation server receives candidate captions and scores them using several popular metrics, including BLEU, METEOR, ROUGE and CIDEr. Instructions for using the evaluation server are provided.
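The instructions themselves are not reproduced here. As a hedged sketch of the expected input, the server's companion tooling consumes a single JSON array with one image_id/caption entry per test image; the id, caption, and filename below are illustrative.

# Sketch: write a submission file in the results format the server expects,
# a JSON array of {"image_id", "caption"} objects, one per test image.
# The entry and filename below are illustrative, not taken from the dataset.
import json

results = [{"image_id": 404464,
            "caption": "a black and white photo of a street"}]
with open("captions_val2014_example_results.json", "w") as f:
    json.dump(results, f)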
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper describes the Microsoft COCO Captions dataset and evaluation server. Upon completion, the dataset will contain over 1.5 million captions describing over 330,000 images, with five independent human-generated captions provided for each training and validation image. The evaluation server receives candidate captions and scores them using standard metrics including BLEU, METEOR, ROUGE, and CIDEr to promote consistent evaluation of automatic caption generation algorithms.
Significance. If the dataset is collected and the server implemented as described, this provides a valuable large-scale resource for image captioning research, enabling training on substantial data volumes and standardized benchmarking via established metrics on a public platform. The scale exceeds prior caption datasets and supports reproducible comparisons across methods.
minor comments (2)
- [Abstract] The abstract and description focus on planned scale and metrics but omit details of the annotation protocol, quality control, and inter-annotator agreement, which would strengthen the presentation of the data collection process (a sketch of one cheap consistency check follows this list).
- No example captions, sample images, or illustrative server output are provided; a few would improve clarity for readers unfamiliar with the dataset style or evaluation format.
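On the first point, one cheap proxy for inter-annotator agreement is leave-one-out consistency among the five human captions: score each caption against the other four with the same server metrics. The sketch below uses BLEU from pycocoevalcap, with invented captions; it is an assumed check, not a protocol from the paper.

# Sketch: leave-one-out consistency among five human captions. Each
# caption in turn plays the "candidate" role against the other four.
from pycocoevalcap.bleu.bleu import Bleu

captions = ["a man rides a horse on the beach",
            "a person riding a horse near the ocean",
            "a man on horseback by the sea",
            "someone rides a brown horse along the shore",
            "a rider and horse walk through shallow water"]

bleu = Bleu(4)
for i, held_out in enumerate(captions):
    refs = {0: [c for j, c in enumerate(captions) if j != i]}
    cand = {0: [held_out]}
    (b1, b2, b3, b4), _ = bleu.compute_score(refs, cand)
    print(f"caption {i}: BLEU-4 vs peers = {b4:.3f}")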
Simulated Author's Rebuttal
We thank the referee for their positive review of the manuscript. The referee's summary correctly reflects the scope and purpose of the Microsoft COCO Captions dataset and evaluation server.
Circularity Check
No significant circularity identified
full rationale
The paper is a purely descriptive release of the COCO Captions dataset and evaluation server. It states planned scale (over 1.5M captions for >330k images, five per training/validation image) and adoption of existing metrics (BLEU, METEOR, ROUGE, CIDEr) via a public server. No derivations, equations, predictions, fitted parameters, or optimality claims appear. Consequently no load-bearing steps exist that could reduce to self-definition, fitted inputs, or self-citation chains. The central content is factual description of data collection and tooling.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith.Foundation.LawOfExistence.defect_zero_iff_one (unclear)
  Relation between the paper passage and the cited Recognition theorem.
  "When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided."
- IndisputableMonolith.Foundation.DAlembert.Inevitability.bilinear_family_forced (unclear)
  Relation between the paper passage and the cited Recognition theorem.
  "The evaluation server receives candidate captions and scores them using several popular metrics, including BLEU, METEOR, ROUGE and CIDEr."
- IndisputableMonolith.Foundation.PhiForcing.phi_equation (unclear)
  Relation between the paper passage and the cited Recognition theorem.
  "Instructions for using the evaluation server are provided."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 35 Pith papers
- Challenging Vision-Language Models with Physically Deployable Multimodal Semantic Lighting Attacks
  MSLA is the first physically deployable attack that uses adversarial lighting to break semantic alignment in VLMs such as CLIP, LLaVA, and BLIP, causing classification failures and hallucinations in real scenes.
- Tessera: Unlocking Heterogeneous GPUs through Kernel-Granularity Disaggregation
  Tessera performs kernel-granularity disaggregation on heterogeneous GPUs, achieving up to 2.3x throughput and 1.6x cost efficiency gains for large model inference while generalizing beyond prior methods.
- Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
  Molmo VLMs trained on newly collected PixMo open datasets achieve state-of-the-art performance among open-weight models and surpass multiple proprietary VLMs including Claude 3.5 Sonnet and Gemini 1.5 Pro.
- OxyEcomBench: Benchmarking Multimodal Foundation Models across E-Commerce Ecosystems
  OxyEcomBench is a unified multimodal benchmark covering 6 capability areas and 29 tasks with authentic e-commerce data to measure how well foundation models handle real platform, merchant, and customer challenges.
- Exploring Hierarchical Consistency and Unbiased Objectness for Open-Vocabulary Object Detection
  Hierarchical confidence calibration and LoCLIP adaptation improve pseudo-label quality for open-vocabulary object detection, achieving new state-of-the-art results on COCO and LVIS benchmarks.
- GaLa: Hypergraph-Guided Visual Language Models for Procedural Planning
  GaLa uses hypergraph representations of objects and a TriView encoder with contrastive learning to improve vision-language models on procedural planning benchmarks.
- S-GRPO: Unified Post-Training for Large Vision-Language Models
  S-GRPO unifies SFT and RL for LVLMs via conditional ground-truth injection that supplies a maximal-reward anchor when group exploration fails completely.
- Vision-Language Foundation Models for Comprehensive Automated Pavement Condition Assessment
  Instruction-tuned vision-language model PaveGPT, trained on a large unified pavement dataset, achieves substantial gains over general models in comprehensive, standard-compliant pavement condition assessment.
- DetailVerifyBench: A Benchmark for Dense Hallucination Localization in Long Image Captions
  DetailVerifyBench supplies 1,000 images and densely annotated long captions to evaluate precise hallucination localization in multimodal large language models.
- Batch Loss Score for Dynamic Data Pruning
  BLS approximates per-sample loss importance via EMA of batch losses, enabling simple and effective dynamic pruning of 20-50% samples losslessly across many datasets and models.
- VideoChat: Chat-Centric Video Understanding
  VideoChat integrates video models and LLMs via a learnable interface for chat-based spatiotemporal and causal video reasoning, trained on a new video-centric instruction dataset.
- A Generalist Agent
  Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.
- Flamingo: a Visual Language Model for Few-Shot Learning
  Flamingo models reach new state-of-the-art few-shot results on image and video tasks by bridging frozen vision and language models with cross-attention layers trained on interleaved web-scale data.
- Learning to See What You Need: Gaze Attention for Multimodal Large Language Models
  Gaze Attention groups visual embeddings into selectable regions and dynamically restricts attention to task-relevant ones, matching dense baselines with up to 90% fewer visual KV entries via added context tokens.
- MSD-Score: Multi-Scale Distributional Scoring for Reference-Free Image Caption Evaluation
  MSD-Score introduces multi-scale distributional scoring on von Mises-Fisher mixtures to evaluate image captions without references and reports state-of-the-art correlation with human judgments.
- Sentinel2Cap: A Human-Annotated Benchmark Dataset for Multimodal Remote Sensing Image Captioning
  Sentinel2Cap provides human-annotated captions for multimodal Sentinel satellite images, with zero-shot tests showing RGB outperforming SAR and prompts helping performance.
- Statistical Consistency and Generalization of Contrastive Representation Learning
  Contrastive representation learning is statistically consistent for optimal retrieval and admits generalization bounds of order O(1/m + 1/sqrt(n)) supervised and O(1/sqrt(m) + 1/sqrt(n)) self-supervised that benefit f...
- EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure
  EASE closes three residual anchors in federated multimodal unlearning using bilateral displacement, cosine-sine decomposition, and forget lock, achieving near-retrain performance on forget and retain data.
- TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
  TIPSv2 improves dense patch-text alignment in vision-language pretraining through distillation and iBOT++ modifications, yielding models on par with or better than recent baselines on 9 tasks across 20 datasets.
- Saliency-R1: Enforcing Interpretable and Faithful Vision-language Reasoning via Saliency-map Alignment Reward
  Saliency-R1 uses a novel saliency map technique and GRPO with human bounding-box overlap as reward to improve VLM reasoning faithfulness and interpretability.
- LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation
  LinguDistill recovers approximately 10% of lost performance on language benchmarks in VLMs by selectively distilling from a frozen LM teacher using KV-cache sharing, while preserving vision performance.
- Emu3: Next-Token Prediction is All You Need
  Emu3 shows that next-token prediction on a unified discrete token space for text, images, and video lets a single transformer outperform task-specific models such as SDXL and LLaVA-1.6 in multimodal generation and perception.
- ShareGPT4V: Improving Large Multi-Modal Models with Better Captions
  A new 1.2M-caption dataset generated via GPT-4V improves LMMs on MME and MMBench by 222.8/22.0/22.3 and 2.7/1.3/1.5 points respectively when used for supervised fine-tuning.
- MMBench: Is Your Multi-modal Model an All-around Player?
  MMBench is a new bilingual benchmark that uses curated questions, CircularEval, and LLM-assisted answer conversion to provide objective, fine-grained evaluation of vision-language models.
- MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
  MME is a manually annotated benchmark evaluating MLLMs on perception and cognition across 14 subtasks to avoid data leakage and support fair model comparisons.
- Otter: A Multi-Modal Model with In-Context Instruction Tuning
  Otter is a multi-modal model instruction-tuned on the MIMIC-IT dataset of over 3 million in-context instruction-response pairs to improve convergence and generalization on tasks with multiple images and videos.
- CoCa: Contrastive Captioners are Image-Text Foundation Models
  CoCa unifies contrastive and generative pretraining in one image-text model to reach 86.3% zero-shot ImageNet accuracy and new state-of-the-art results on multiple downstream benchmarks.
- VLA Foundry: A Unified Framework for Training Vision-Language-Action Models
  VLA Foundry provides a single training stack for VLA models and releases open models that match prior closed-source performance or outperform baselines on multi-task manipulation in simulation.
- From Heads to Neurons: Causal Attribution and Steering in Multi-Task Vision-Language Models
  HONES ranks feed-forward neurons by their causal contributions from task-relevant attention heads and uses lightweight scaling to steer performance on multiple vision-language tasks.
- InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
  InternVL scales a vision model to 6B parameters and aligns it with LLMs using web data to achieve state-of-the-art results on 32 visual-linguistic benchmarks.
- LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
  LLaMA-Adapter V2 achieves open-ended visual instruction following in LLMs by unlocking more parameters, early fusion of visual tokens, and joint training on disjoint parameter groups with only 14M added parameters.
- ZAYA1-VL-8B Technical Report
  ZAYA1-VL-8B is a new MoE vision-language model with vision-specific LoRA adapters and bidirectional image attention that reports competitive performance against several 3B-4B models on image, reasoning, and counting b...
- Empowering Video Translation using Multimodal Large Language Models
  The paper offers the first focused review of MLLM-based video translation organized by a three-role taxonomy of Semantic Reasoner, Expressive Performer, and Visual Synthesizer, plus open challenges.
- How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
  InternVL 1.5 narrows the performance gap to proprietary multimodal models via a stronger transferable vision encoder, dynamic high-resolution tiling, and curated English-Chinese training data.
- OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
  OpenFlamingo provides open-source autoregressive vision-language models that achieve 80-89% of Flamingo performance on seven vision-language datasets.
Reference graph
Works this paper leans on
- [1] K. Barnard and D. Forsyth, "Learning the semantics of words and pictures," in ICCV, vol. 2, 2001, pp. 408–415.
- [2] K. Barnard, P. Duygulu, D. Forsyth, N. De Freitas, D. M. Blei, and M. I. Jordan, "Matching words and pictures," JMLR, vol. 3, pp. 1107–1135, 2003.
- [3] V. Lavrenko, R. Manmatha, and J. Jeon, "A model for learning the semantics of pictures," in NIPS, 2003.
- [4] G. Kulkarni, V. Premraj, S. Dhar, S. Li, Y. Choi, A. C. Berg, and T. L. Berg, "Baby talk: Understanding and generating simple image descriptions," in CVPR, 2011.
- [5] M. Mitchell, X. Han, J. Dodge, A. Mensch, A. Goyal, A. Berg, K. Yamaguchi, T. Berg, K. Stratos, and H. Daumé III, "Midge: Generating image descriptions from computer vision detections," in EACL, 2012.
- [6] A. Farhadi, M. Hejrati, M. A. Sadeghi, P. Young, C. Rashtchian, J. Hockenmaier, and D. Forsyth, "Every picture tells a story: Generating sentences from images," in ECCV, 2010.
- [7] M. Hodosh, P. Young, and J. Hockenmaier, "Framing image description as a ranking task: Data, models and evaluation metrics," JAIR, vol. 47, pp. 853–899, 2013.
- [8] P. Kuznetsova, V. Ordonez, A. C. Berg, T. L. Berg, and Y. Choi, "Collective generation of natural image descriptions," in ACL, 2012.
- [9] Y. Yang, C. L. Teo, H. Daumé III, and Y. Aloimonos, "Corpus-guided sentence generation of natural images," in EMNLP, 2011.
- [10] A. Gupta, Y. Verma, and C. Jawahar, "Choosing linguistics over vision to describe images," in AAAI, 2012.
- [11] E. Bruni, G. Boleda, M. Baroni, and N.-K. Tran, "Distributional semantics in technicolor," in ACL, 2012.
- [12] Y. Feng and M. Lapata, "Automatic caption generation for news images," TPAMI, vol. 35, no. 4, pp. 797–812, 2013.
- [13] D. Elliott and F. Keller, "Image description using visual dependency representations," in EMNLP, 2013, pp. 1292–1302.
- [14] A. Karpathy, A. Joulin, and F.-F. Li, "Deep fragment embeddings for bidirectional image sentence mapping," in NIPS, 2014.
- [15] Y. Gong, L. Wang, M. Hodosh, J. Hockenmaier, and S. Lazebnik, "Improving image-sentence embeddings using large weakly annotated photo collections," in ECCV, 2014, pp. 529–545.
- [16] R. Mason and E. Charniak, "Nonparametric method for data-driven image captioning," in ACL, 2014.
- [17] P. Kuznetsova, V. Ordonez, T. Berg, and Y. Choi, "Treetalk: Composition and compression of trees for image descriptions," TACL, vol. 2, pp. 351–362, 2014.
- [18] K. Ramnath, S. Baker, L. Vanderwende, M. El-Saban, S. N. Sinha, A. Kannan, N. Hassan, M. Galley, Y. Yang, D. Ramanan, A. Bergamo, and L. Torresani, "Autocaption: Automatic caption generation for personal photos," in WACV, 2014.
- [19] A. Lazaridou, E. Bruni, and M. Baroni, "Is this a wampimuk? cross-modal mapping between distributional semantics and the visual world," in ACL, 2014.
- [20] R. Kiros, R. Salakhutdinov, and R. Zemel, "Multimodal neural language models," in ICML, 2014.
- [21] J. Mao, W. Xu, Y. Yang, J. Wang, and A. L. Yuille, "Explain images with multimodal recurrent neural networks," arXiv preprint arXiv:1410.1090, 2014.
- [22] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and tell: A neural image caption generator," arXiv preprint arXiv:1411.4555, 2014.
- [23] A. Karpathy and L. Fei-Fei, "Deep visual-semantic alignments for generating image descriptions," arXiv preprint arXiv:1412.2306, 2014.
- [24] R. Kiros, R. Salakhutdinov, and R. S. Zemel, "Unifying visual-semantic embeddings with multimodal neural language models," arXiv preprint arXiv:1411.2539, 2014.
- [25] J. Donahue, L. A. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell, "Long-term recurrent convolutional networks for visual recognition and description," arXiv preprint arXiv:1411.4389, 2014.
- [26] H. Fang, S. Gupta, F. Iandola, R. Srivastava, L. Deng, P. Dollár, J. Gao, X. He, M. Mitchell, J. Platt et al., "From captions to visual concepts and back," arXiv preprint arXiv:1411.4952, 2014.
- [27] X. Chen and C. L. Zitnick, "Learning a recurrent visual representation for image caption generation," arXiv preprint arXiv:1411.5654, 2014.
- [28] R. Lebret, P. O. Pinheiro, and R. Collobert, "Phrase-based image captioning," arXiv preprint arXiv:1502.03671, 2015.
- [29] R. Lebret, P. O. Pinheiro, and R. Collobert, "Simple image description generator via a linear phrase-based approach," arXiv preprint arXiv:1412.8419, 2014.
- [30] A. Lazaridou, N. T. Pham, and M. Baroni, "Combining language and vision with a multimodal skip-gram model," arXiv preprint arXiv:1501.02598, 2015.
- [31] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet classification with deep convolutional neural networks," in NIPS, 2012.
- [32] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
- [33] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A Large-Scale Hierarchical Image Database," in CVPR, 2009.
- [34] M. Grubinger, P. Clough, H. Müller, and T. Deselaers, "The IAPR TC-12 benchmark: A new evaluation resource for visual information systems," in LREC Workshop on Language Resources for Content-based Image Retrieval, 2006.
- [35] V. Ordonez, G. Kulkarni, and T. Berg, "Im2text: Describing images using 1 million captioned photographs," in NIPS, 2011.
- [36] P. Young, A. Lai, M. Hodosh, and J. Hockenmaier, "From image descriptions to visual denotations: New similarity metrics for semantic inference over event descriptions," TACL, vol. 2, pp. 67–78, 2014.
- [37] J. Chen, P. Kuznetsova, D. Warren, and Y. Choi, "Déjà image-captions: A corpus of expressive image descriptions in repetition," in NAACL, 2015.
- [38] T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in ECCV, 2014.
- [39] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu, "Bleu: a method for automatic evaluation of machine translation," in ACL, 2002.
- [40] C.-Y. Lin, "Rouge: A package for automatic evaluation of summaries," in ACL Workshop, 2004.
- [41] M. Denkowski and A. Lavie, "Meteor universal: Language specific translation evaluation for any target language," in EACL Workshop on Statistical Machine Translation, 2014.
- [42] R. Vedantam, C. L. Zitnick, and D. Parikh, "Cider: Consensus-based image description evaluation," arXiv preprint arXiv:1411.5726, 2014.
- [43] C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky, "The Stanford CoreNLP natural language processing toolkit," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2014, pp. 55–60. Available: http://www.aclweb.org/anthology/P/P14/P14-5010
- [44] G. A. Miller, "WordNet: a lexical database for English," Communications of the ACM, vol. 38, no. 11, pp. 39–41, 1995.
- [45] D. Elliott and F. Keller, "Comparing automatic evaluation measures for image description," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, vol. 2, 2014, pp. 452–457.
- [46] C. Callison-Burch, M. Osborne, and P. Koehn, "Re-evaluating the role of bleu in machine translation research," in EACL, vol. 6, 2006, pp. 249–256.