hub

Magpie: Alignment data synthesis from scratch by prompting aligned llms with nothing.arXiv preprint arXiv:2406.08464

Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, Radha Poovendran, Yejin Choi, Bill Yuchen Lin · 2024 · arXiv 2406.08464

16 Pith papers cite this work. Polarity classification is still indexing.

16 Pith papers citing it

read on arXiv browse 16 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 1 baseline 1 dataset 1

citation-polarity summary

background 1 baseline 1 use dataset 1

representative citing papers

Large Language Diffusion Models

cs.CL · 2025-02-14 · unverdicted · novelty 8.0

LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

cs.LG · 2025-02-07 · unverdicted · novelty 7.0

A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.

Explaining and Breaking the Safety-Helpfulness Ceiling via Preference Dimensional Expansion

cs.AI · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

MORA breaks the safety-helpfulness ceiling in LLMs by pre-sampling single-reward prompts and rewriting them to incorporate multi-dimensional intents, delivering 5-12.4% gains in sequential alignment and 4.6% overall improvement in simultaneous alignment.

PARD-2: Target-Aligned Parallel Draft Model for Dual-Mode Speculative Decoding

cs.CL · 2026-05-09 · unverdicted · novelty 6.0

PARD-2 uses Confidence-Adaptive Token optimization to align draft model training with acceptance length in speculative decoding, enabling dual-mode operation and up to 6.94x lossless speedup on Llama3.1-8B.

Hypothesis generation and updating in large language models

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.

ZAYA1-8B Technical Report

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.

TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning

cs.CR · 2026-04-30 · unverdicted · novelty 6.0

TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.

To Adapt or not to Adapt, Rethinking the Value of Medical Knowledge-Aware Large Language Models

cs.CL · 2026-04-08 · unverdicted · novelty 6.0

Domain-adapted clinical LLMs provide only marginal and unstable gains over general models on English clinical MCQA benchmarks, while new Spanish Marmoka models perform better.

SmolVLM: Redefining small and efficient multimodal models

cs.AI · 2025-04-07 · unverdicted · novelty 6.0

SmolVLM-256M outperforms a 300-times larger model using under 1 GB GPU memory, while the 2.2B version matches state-of-the-art VLMs at half the memory cost.

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

cs.CV · 2024-12-06 · unverdicted · novelty 6.0

InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.

Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training

cs.CL · 2026-05-06 · unverdicted · novelty 5.0 · 2 refs

LoPT achieves competitive task performance in LLM post-training by limiting task gradients to the upper model half and training the lower half with local feature reconstruction.

Kimi-Audio Technical Report

eess.AS · 2025-04-25 · unverdicted · novelty 5.0

Kimi-Audio is an open-source audio foundation model that achieves state-of-the-art results on speech recognition, audio understanding, question answering, and conversation after pre-training on more than 13 million hours of speech, sound, and music data.

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

cs.CL · 2025-02-04 · unverdicted · novelty 5.0

SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

cs.CV · 2024-12-13 · accept · novelty 5.0

DeepSeek-VL2 is a series of MoE vision-language models using dynamic tiling and latent attention that reach competitive or state-of-the-art results on VQA, OCR, document understanding and grounding with 1.0B to 4.5B activated parameters.

LLaVA-OneVision: Easy Visual Task Transfer

cs.CV · 2024-08-06 · unverdicted · novelty 5.0

LLaVA-OneVision is the first single open LMM to simultaneously achieve strong performance in single-image, multi-image, and video scenarios with cross-scenario transfer capabilities.

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

cs.CV · 2025-01-22 · unverdicted · novelty 4.0

VideoLLaMA3 uses a vision-centric training paradigm and token-reduction design to reach competitive results on image and video benchmarks.

citing papers explorer

Showing 15 of 15 citing papers after filters.

Large Language Diffusion Models cs.CL · 2025-02-14 · unverdicted · none · ref 104
LLaDA is a scalable diffusion-based language model that matches autoregressive LLMs like LLaMA3 8B on tasks and surpasses GPT-4o on reversal poem completion.
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach cs.LG · 2025-02-07 · unverdicted · none · ref 171
A recurrent-depth architecture enables language models to improve reasoning performance by iterating computation in latent space, achieving gains equivalent to much larger models on benchmarks.
Explaining and Breaking the Safety-Helpfulness Ceiling via Preference Dimensional Expansion cs.AI · 2026-05-12 · unverdicted · none · ref 60 · 2 links
MORA breaks the safety-helpfulness ceiling in LLMs by pre-sampling single-reward prompts and rewriting them to incorporate multi-dimensional intents, delivering 5-12.4% gains in sequential alignment and 4.6% overall improvement in simultaneous alignment.
PARD-2: Target-Aligned Parallel Draft Model for Dual-Mode Speculative Decoding cs.CL · 2026-05-09 · unverdicted · none · ref 31
PARD-2 uses Confidence-Adaptive Token optimization to align draft model training with acceptance length in speculative decoding, enabling dual-mode operation and up to 6.94x lossless speedup on Llama3.1-8B.
Hypothesis generation and updating in large language models cs.LG · 2026-05-07 · unverdicted · none · ref 52
LLMs exhibit Bayesian-like hypothesis updating with strong-sampling bias and an evaluation-generation gap but generalize poorly outside observed data.
ZAYA1-8B Technical Report cs.AI · 2026-05-06 · unverdicted · none · ref 65
ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.
TwinGate: Stateful Defense against Decompositional Jailbreaks in Untraceable Traffic via Asymmetric Contrastive Learning cs.CR · 2026-04-30 · unverdicted · none · ref 32
TwinGate deploys a stateful dual-encoder system with asymmetric contrastive learning to detect decompositional jailbreaks in untraceable LLM traffic at high recall and low false-positive rate with negligible latency.
To Adapt or not to Adapt, Rethinking the Value of Medical Knowledge-Aware Large Language Models cs.CL · 2026-04-08 · unverdicted · none · ref 5
Domain-adapted clinical LLMs provide only marginal and unstable gains over general models on English clinical MCQA benchmarks, while new Spanish Marmoka models perform better.
SmolVLM: Redefining small and efficient multimodal models cs.AI · 2025-04-07 · unverdicted · none · ref 34
SmolVLM-256M outperforms a 300-times larger model using under 1 GB GPU memory, while the 2.2B version matches state-of-the-art VLMs at half the memory cost.
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling cs.CV · 2024-12-06 · unverdicted · none · ref 267
InternVL 2.5 is the first open-source MLLM to surpass 70% on the MMMU benchmark via model, data, and test-time scaling, with a 3.7-point gain from chain-of-thought reasoning.
Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training cs.CL · 2026-05-06 · unverdicted · none · ref 27 · 2 links
LoPT achieves competitive task performance in LLM post-training by limiting task gradients to the upper model half and training the lower half with local feature reconstruction.
Kimi-Audio Technical Report eess.AS · 2025-04-25 · unverdicted · none · ref 75
Kimi-Audio is an open-source audio foundation model that achieves state-of-the-art results on speech recognition, audio understanding, question answering, and conversation after pre-training on more than 13 million hours of speech, sound, and music data.
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model cs.CL · 2025-02-04 · unverdicted · none · ref 242
SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.
LLaVA-OneVision: Easy Visual Task Transfer cs.CV · 2024-08-06 · unverdicted · none · ref 144
LLaVA-OneVision is the first single open LMM to simultaneously achieve strong performance in single-image, multi-image, and video scenarios with cross-scenario transfer capabilities.
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding cs.CV · 2025-01-22 · unverdicted · none · ref 95
VideoLLaMA3 uses a vision-centric training paradigm and token-reduction design to reach competitive results on image and video benchmarks.

Magpie: Alignment data synthesis from scratch by prompting aligned llms with nothing.arXiv preprint arXiv:2406.08464

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer