Convolutional Neural Networks for Sentence Classification

Yoon Kim

arXiv: 1408.5882 · v2 · pith:ZDYN3K7Nnew · submitted 2014-08-25 · cs.CL · cs.NE

keywords: vectors, classification, convolutional, networks, neural, simple, static, task-specific

We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
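The abstract names the ingredients but not the mechanics, so the following is only a minimal pure-Python sketch of how such a model could compute a prediction. It assumes convolution filters that span a few consecutive word vectors, a ReLU, max-over-time pooling, and a softmax output layer; those details, the toy vectors, and every name here (`pooled_feature`, `classify`, `static`, `tuned`) are illustrative assumptions, not taken from the abstract. The two-channel input stands in for the proposed combination of static and task-specific vectors: each filter is applied to both channels and the responses are summed.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pooled_feature(channels, filt, bias=0.0):
    """Apply one filter to every window of words, then max-pool over time.

    channels: list of embedding matrices (one per channel), each a list of
              word vectors for the same sentence.
    filt:     list of `width` weight vectors, one per position in the window.
    """
    width, dim = len(filt), len(filt[0])
    # Zero-pad sentences shorter than the filter so one window always exists.
    padded = [list(ch) + [[0.0] * dim] * max(0, width - len(ch)) for ch in channels]
    activations = []
    for start in range(len(padded[0]) - width + 1):
        total = bias
        for channel in padded:  # responses from all channels are summed
            for offset in range(width):
                total += dot(filt[offset], channel[start + offset])
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # max-over-time pooling


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def classify(channels, filters, weights, biases):
    """One pooled feature per filter, then a linear layer and softmax."""
    features = [pooled_feature(channels, f) for f in filters]
    scores = [dot(row, features) + b for row, b in zip(weights, biases)]
    return softmax(scores)


# Toy 3-word sentence with 2-dimensional embeddings (hypothetical values).
static = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # frozen pre-trained vectors
tuned = [[0.9, 0.1], [0.1, 0.8], [1.0, 1.2]]   # copy that would be fine-tuned
filters = [
    [[1.0, 0.0], [0.0, 1.0]],              # window of 2 words
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # window of 3 words
]
weights = [[1.0, -1.0], [-1.0, 1.0]]  # 2 classes x 2 pooled features
biases = [0.0, 0.0]

probs = classify([static, tuned], filters, weights, biases)
print(probs)
```

Passing `[static]` alone corresponds to the static-vector setting, while passing both channels mimics the combined variant. Training (which is what would actually update `tuned`, the filters, and the output layer) is omitted from this sketch.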

Forward citations

Cited by 16 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

iTAG: Inverse Design for Natural Text Generation with Accurate Causal Graph Annotations
cs.CL 2026-04 unverdicted novelty 7.0

iTAG generates natural text paired with accurate causal graph annotations by framing concept assignment as an inverse problem and refining selections via chain-of-thought reasoning until the text's relations align wit...
CodeSearchNet Challenge: Evaluating the State of Semantic Code Search
cs.LG 2019-09 accept novelty 7.0

Releases a large multi-language code corpus and expert-annotated challenge to benchmark semantic code search.
DRIFT: Drift-Resilient Invariant-Feature Transformer for DGA Detection
cs.CR 2026-05 unverdicted novelty 6.0

DRIFT uses hybrid character and subword tokenization plus multi-task self-supervised pre-training to build DGA detectors that resist temporal drift and outperform baselines in forward-chaining evaluations over nine ye...
Versatile yet Efficient Network Traffic Analysis: Offloading Network Foundation Model to SmartNIC
cs.NI 2025-08 unverdicted novelty 6.0

Nepco offloads network foundation models to SmartNICs using localized byte-sequence modeling and a pattern-aware convolutional architecture to achieve competitive macro F1 scores with 328x lower end-to-end latency tha...
Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
q-fin.CP 2021-08 unverdicted novelty 6.0

News embeddings from financial text improve out-of-sample realized volatility forecasts for stocks, with stronger effects for stock-specific news and high-volatility periods, and yield gains when combined with benchmarks.
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
cs.SE 2021-02 unverdicted novelty 6.0

CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
cs.CL 2020-02 unverdicted novelty 6.0

CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.
Deep Mixture Point Processes: Spatio-temporal Event Prediction with Rich Contextual Information
stat.ML 2019-06 unverdicted novelty 6.0

DMPP models spatio-temporal event intensity as a deep NN-weighted mixture of kernels to incorporate high-dimensional context while keeping likelihood integration tractable.
ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
cs.AR 2025-12 unverdicted novelty 5.0

ODMA raises KV-cache utilization by up to 19.25% and throughput by 23-27% on Cambricon MLU accelerators by dynamically adjusting prediction buckets and using a safety pool for LLM serving.
Automatically Learning Construction Injury Precursors from Text
cs.CL 2019-07 unverdicted novelty 4.0

Standard NLP classifiers can surface valid injury precursors from raw construction safety reports.
Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection
cs.CL 2019-07 unverdicted novelty 4.0

Neural language model augments limited labeled rumor tweets using unlabeled event data, expanding datasets by ~200% and improving F-score by 12.1% in detection models.
Ranking sentences from product description & bullets for better search
cs.IR 2019-07 unverdicted novelty 4.0

Two RL-based extractive summarization models rank sentences from product fields by leveraging titles and click-through logs to improve search relevance.
Cross-lingual Data Transformation and Combination for Text Classification
cs.IR 2019-06 unverdicted novelty 3.0

Cross-lingual data combined via translation or aligned embeddings can improve performance of CNN and RNN text classifiers.
Deep neural network-based classification model for Sentiment Analysis
cs.CL 2019-07 unverdicted novelty 2.0

Empirical comparison of DNN, LSTM variants and CNN for implicit sentiment classification finds Bi-LSTM with word-level attention best on positive class in a public dataset.
Water Preservation in Soan River Basin using Deep Learning Techniques
cs.NE 2019-06 unverdicted novelty 2.0

RNN and LSTM models outperform other algorithms in predicting stream flow from precipitation, land use, and temperature, with a public dataset released.
Machine Reading Comprehension: a Literature Review
cs.CL 2019-06 unverdicted novelty 1.0

A 2019 survey of machine reading comprehension corpora and methods.