RoBERTa: A Robustly Optimized BERT Pretraining Approach
8 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background 1
Citation polarities: background 1
citing papers explorer
- Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation
Smoothie performs diffusion by smoothing token embeddings based on semantic similarity, outperforming prior diffusion models on sequence-to-sequence and unconditional text generation tasks.
- LoRA: Low-Rank Adaptation of Large Language Models
Adapting large language models by training only a low-rank decomposition BA added to frozen weight matrices matches full fine-tuning while cutting trainable parameters by orders of magnitude and adding no inference latency.
- HyperPersona: A Multi-Level Hypergraph Framework for Text-Based Automatic Personality Prediction
HyperPersona is a hypergraph framework that jointly models document, sentence, and word levels of text via hyperedges and nodes, then uses a transformer graph encoder to predict Big Five personality traits from text alone.
- Computer Science Conferences Should Require Nonrepudiable Experimental Results
CS conferences should require nonrepudiable experimental results via signed attestations that prevent authors from later altering or denying reported numbers.
- Nomic Embed: Training a Reproducible Long Context Text Embedder
Nomic AI produced and open-sourced a reproducible 8192-context English text embedder that exceeds OpenAI Ada-002 and text-embedding-3-small performance on MTEB short-context and LoCo long-context benchmarks.
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
ALiBi enables transformers trained on length-1024 sequences to extrapolate to length-2048 with the same perplexity as a sinusoidal model trained on 2048, while training 11% faster and using 11% less memory.
- LoRA-FA: Efficient and Effective Low Rank Representation Fine-tuning
LoRA-FA freezes LoRA's A matrix and trains only B with gradient corrections to approximate full fine-tuning gradients more closely.
- Team Fusion@ SU@ BC8 SympTEMIST track: transformer-based approach for symptom recognition and linking
A RoBERTa NER model with BiLSTM-CRF and SapBERT candidate generation via cosine similarity was applied to symptom recognition and linking, where knowledge base selection most affected accuracy.