Convolutional Neural Networks for Sentence Classification
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
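The abstract does not spell out the architecture, so the following is only a minimal sketch of the kind of model it describes, under stated assumptions: a single convolutional layer whose filters slide over windows of word vectors, a ReLU, and max-over-time pooling to produce one feature per filter. The two-channel case (static plus task-specific vectors) is modeled here by applying the same filter to each channel and summing the results. All names (`window_score`, `feature`, `sentence_features`) and the toy vectors are hypothetical, and plain Python lists stand in for a tensor library.

```python
# Minimal sketch (assumptions: one conv layer, ReLU, max-over-time pooling,
# the same filter shared across channels). Not the authors' implementation.

def relu(x):
    return x if x > 0.0 else 0.0

def window_score(channel, start, filt):
    """Dot product of a filter with the window of word vectors at `start`."""
    window = channel[start:start + len(filt)]
    return sum(
        w * x
        for row, vec in zip(filt, window)
        for w, x in zip(row, vec)
    )

def feature(channels, filt, bias):
    """One feature: convolve over time, sum across channels, ReLU, take the max.

    Assumes the sentence has at least len(filt) words (pad shorter ones).
    """
    n_words = len(channels[0])
    width = len(filt)
    scores = [
        relu(sum(window_score(ch, i, filt) for ch in channels) + bias)
        for i in range(n_words - width + 1)
    ]
    return max(scores)

def sentence_features(channels, filters, biases):
    """Fixed-length feature vector, one entry per filter, for a classifier."""
    return [feature(channels, f, b) for f, b in zip(filters, biases)]

# Toy 3-word sentence with 2-dimensional vectors (made-up numbers).
static_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # kept frozen
tuned_vectors = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]    # would be fine-tuned

filters = [
    [[1.0, 0.0], [0.0, 1.0]],   # width-2 filter
    [[-1.0, -1.0]],             # width-1 filter
]
biases = [0.0, 0.0]

features = sentence_features([static_vectors, tuned_vectors], filters, biases)
print(features)  # [3.0, 0.0]
```

Because pooling takes the maximum over all window positions, the feature vector has one entry per filter regardless of sentence length, which is what lets a fixed-size classifier layer sit on top. Passing a single channel (`[static_vectors]`) corresponds to the static-vector setting in this sketch.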
Forward citations
Cited by 16 Pith papers
- iTAG: Inverse Design for Natural Text Generation with Accurate Causal Graph Annotations
  iTAG generates natural text paired with accurate causal graph annotations by framing concept assignment as an inverse problem and refining selections via chain-of-thought reasoning until the text's relations align wit...
- CodeSearchNet Challenge: Evaluating the State of Semantic Code Search
  Releases a large multi-language code corpus and expert-annotated challenge to benchmark semantic code search.
- DRIFT: Drift-Resilient Invariant-Feature Transformer for DGA Detection
  DRIFT uses hybrid character and subword tokenization plus multi-task self-supervised pre-training to build DGA detectors that resist temporal drift and outperform baselines in forward-chaining evaluations over nine ye...
- Versatile yet Efficient Network Traffic Analysis: Offloading Network Foundation Model to SmartNIC
  Nepco offloads network foundation models to SmartNICs using localized byte-sequence modeling and a pattern-aware convolutional architecture to achieve competitive macro F1 scores with 328x lower end-to-end latency tha...
- Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
  News embeddings from financial text improve out-of-sample realized volatility forecasts for stocks, with stronger effects for stock-specific news and high-volatility periods, and yield gains when combined with benchmarks.
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
  CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.
- Deep Mixture Point Processes: Spatio-temporal Event Prediction with Rich Contextual Information
  DMPP models spatio-temporal event intensity as a deep NN-weighted mixture of kernels to incorporate high-dimensional context while keeping likelihood integration tractable.
- ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
  ODMA raises KV-cache utilization by up to 19.25% and throughput by 23-27% on Cambricon MLU accelerators by dynamically adjusting prediction buckets and using a safety pool for LLM serving.
- Automatically Learning Construction Injury Precursors from Text
  Standard NLP classifiers can surface valid injury precursors from raw construction safety reports.
- Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection
  Neural language model augments limited labeled rumor tweets using unlabeled event data, expanding datasets by ~200% and improving F-score by 12.1% in detection models.
- Ranking sentences from product description & bullets for better search
  Two RL-based extractive summarization models rank sentences from product fields by leveraging titles and click-through logs to improve search relevance.
- Cross-lingual Data Transformation and Combination for Text Classification
  Cross-lingual data combined via translation or aligned embeddings can improve performance of CNN and RNN text classifiers.
- Deep neural network-based classification model for Sentiment Analysis
  Empirical comparison of DNN, LSTM variants and CNN for implicit sentiment classification finds Bi-LSTM with word-level attention best on positive class in a public dataset.
- Water Preservation in Soan River Basin using Deep Learning Techniques
  RNN and LSTM models outperform other algorithms in predicting stream flow from precipitation, land use, and temperature, with a public dataset released.
- Machine Reading Comprehension: a Literature Review
  A 2019 survey of machine reading comprehension corpora and methods.