CommonsenseQA 2.0: Exposing the Limits of AI through Gamification
2 Pith papers cite this work.
Representative citing papers:
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models: chain-of-thought prompting elicits strong reasoning from large language models on arithmetic, commonsense, and symbolic tasks by including intermediate reasoning steps in the few-shot exemplars (see the prompt sketch after this list).
- Scaling and evaluating sparse autoencoders: k-sparse autoencoders with fixes for dead latents yield clean scaling laws and feature-quality metrics that improve with autoencoder size, demonstrated by training a 16-million-latent autoencoder on GPT-4 activations (see the PyTorch sketch after this list).
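For the first paper, here is a minimal sketch of the few-shot chain-of-thought prompt format, patterned after the paper's well-known tennis-ball exemplar; the helper name `build_cot_prompt` and the second question are invented for illustration.

```python
# Minimal sketch of few-shot chain-of-thought prompting: the exemplar
# pairs a question with its intermediate reasoning steps before the
# final answer, and the new question is appended so the model
# continues in the same step-by-step style.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked exemplar to a new question."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

if __name__ == "__main__":
    # Hypothetical target question, for illustration only.
    print(build_cot_prompt(
        "A baker had 23 cupcakes and sold 15. How many cupcakes are left?"
    ))
```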
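For the second paper, the summary names a concrete mechanism: a TopK activation that keeps only the k largest latents per example. Below is a minimal PyTorch sketch of that part under toy sizes; the dead-latent remedies (e.g. the paper's auxiliary loss) are omitted, so this is an assumption-laden illustration rather than the paper's implementation.

```python
# Minimal k-sparse autoencoder sketch: a TopK activation keeps the k
# largest encoder pre-activations per row and zeroes the rest.
# Layer sizes are toy values; the paper trains far larger models.
import torch
import torch.nn as nn

class KSparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_latents: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_latents)
        self.decoder = nn.Linear(n_latents, d_model)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x)
        # TopK sparsity: keep only the k largest latents per example.
        topk = torch.topk(z, self.k, dim=-1)
        sparse = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
        return self.decoder(sparse)

# Illustrative usage with toy dimensions.
ae = KSparseAutoencoder(d_model=64, n_latents=512, k=8)
x = torch.randn(4, 64)
recon = ae(x)
loss = torch.mean((recon - x) ** 2)  # reconstruction MSE
```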