Datasheet for the Pile
6 papers cite this work. Polarity classification is still indexing.
Citing papers explorer
- Membership Inference Attacks for Retrieval Based In-Context Learning for Document Question Answering
  Black-box membership inference attacks on retrieval-based in-context learning for document QA succeed via query prefixes, with a novel weighted-averaging method that outperforms prior attacks even under paraphrasing (a generic score-aggregation sketch appears after this list).
- Eliciting Latent Predictions from Transformers with the Tuned Lens
  Training per-layer affine probes on frozen transformers yields more reliable latent predictions than the logit lens and enables detection of malicious inputs from prediction trajectories (a minimal probe sketch appears after this list).
- Adaptive Inverted-Index Routing for Granular Mixtures-of-Experts
  AIR-MoE introduces a two-stage inverted-index routing method based on vector quantization that approximates optimal expert selection for granular MoE models at lower cost and with empirical performance gains.
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
  MiniCPM 1.2B and 2.4B models reach parity with 7B-13B LLMs via model wind-tunnel scaling experiments and a warmup-stable-decay (WSD) scheduler that yields a higher optimal data-to-model ratio than Chinchilla scaling (a minimal scheduler sketch appears after this list).
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  BLOOM is a 176B-parameter open-access multilingual language model trained on the ROOTS corpus that achieves competitive performance on benchmarks, with improved results after multitask prompted finetuning.
- Reflections and New Directions for Human-Centered Large Language Models
  Model developers must address human concerns, preferences, values, and goals with rigor at every stage of the LLM pipeline rather than only in post-training.
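The membership-inference entry only names a weighted-averaging attack without describing it, so the sketch below is a generic illustration of combining per-query-prefix membership signals into one score, not the cited paper's actual method; every name here (score_with_prefix, the weighting scheme) is a hypothetical stand-in.

```python
from typing import Callable, Optional, Sequence

def weighted_membership_score(
    prefixes: Sequence[str],
    score_with_prefix: Callable[[str], float],  # hypothetical per-prefix membership signal in [0, 1]
    weights: Optional[Sequence[float]] = None,  # e.g. how informative each prefix is; uniform if None
) -> float:
    """Generic weighted average of per-prefix membership signals.

    Not the cited attack itself; it only shows the aggregation idea:
    probe the QA system with several query prefixes, score each response
    for membership evidence, and combine the scores with weights.
    """
    scores = [score_with_prefix(p) for p in prefixes]
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total if total > 0 else 0.0
```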
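The Tuned Lens summary is concrete enough to sketch: with the transformer frozen, one affine translator per layer maps that layer's hidden state toward the final residual stream, and the result is decoded through the frozen final layer norm and unembedding, trained to match the model's own output distribution under a KL objective. Below is a minimal PyTorch sketch under those assumptions; names such as ln_f, unembed, and hidden_states stand in for whatever the frozen model exposes, and details like the zero (identity) initialization follow my reading rather than verified code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TunedLens(nn.Module):
    """One affine translator per layer of a frozen transformer (names are illustrative)."""
    def __init__(self, d_model: int, n_layers: int):
        super().__init__()
        self.translators = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_layers)])
        for t in self.translators:            # zero-init so each probe starts as the plain logit lens
            nn.init.zeros_(t.weight)
            nn.init.zeros_(t.bias)

    def forward(self, hidden: torch.Tensor, layer: int) -> torch.Tensor:
        return hidden + self.translators[layer](hidden)   # residual affine map

def tuned_lens_loss(lens, hidden_states, ln_f, unembed, final_logits):
    """KL from each layer's probed distribution to the frozen model's final-layer distribution."""
    target = F.log_softmax(final_logits, dim=-1)
    losses = []
    for layer, h in enumerate(hidden_states[:-1]):      # every layer except the last
        probed_logits = unembed(ln_f(lens(h, layer)))    # decode through the frozen head
        losses.append(F.kl_div(F.log_softmax(probed_logits, dim=-1), target,
                               reduction="batchmean", log_target=True))
    return torch.stack(losses).mean()
```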
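The MiniCPM entry's WSD (warmup-stable-decay) schedule is also easy to illustrate: linear warmup, a long plateau at the peak learning rate, and a short final decay. One practical consequence the paper leans on is that decay runs can be branched from any plateau checkpoint, which makes it cheap to measure loss at many data budgets and supports the data-to-model-ratio comparison mentioned in the summary. The sketch below uses a linear decay and made-up default fractions; treat the specific shape and constants as assumptions.

```python
def wsd_lr(step: int, total_steps: int, peak_lr: float,
           warmup_steps: int, decay_frac: float = 0.1,
           min_lr_ratio: float = 0.1) -> float:
    """Warmup-Stable-Decay learning-rate schedule (illustrative shape and constants)."""
    decay_steps = int(total_steps * decay_frac)
    stable_end = total_steps - decay_steps
    if step < warmup_steps:                        # 1) linear warmup to the peak rate
        return peak_lr * step / max(1, warmup_steps)
    if step < stable_end:                          # 2) long stable plateau
        return peak_lr
    # 3) short final decay; linear here, other shapes (e.g. exponential) are possible
    progress = (step - stable_end) / max(1, decay_steps)
    return peak_lr * (1.0 - (1.0 - min_lr_ratio) * min(progress, 1.0))
```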