Maksym Andriushchenko — Pith Author Registry

Identifiers

name variant Maksym Andriushchenko 0.60 · backfill

Papers (40)

What Shapes Emergent Misalignment? Insights from Training Dynamics, Model Priors, and Data cs.AI · 2026 · author #4
Decomposing and Measuring Evaluation Awareness cs.LG · 2026 · author #6
FutureSim: Replaying World Events to Evaluate Adaptive Agents cs.LG · 2026 · author #7
Europe and the Geopolitics of AGI: The Need for a Preparedness Plan cs.CY · 2026 · author #11
Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors cs.AI · 2026 · author #3
Characterizing the Consistency of the Emergent Misalignment Persona cs.AI · 2026 · author #3
QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals cs.LG · 2026 · author #2
Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs cs.LG · 2026 · author #6
Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks cs.CR · 2026 · author #4
Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents cs.CL · 2026 · author #4
Monitoring Decomposition Attacks in LLMs with Lightweight Sequential Monitors cs.CR · 2025 · author #4
Exploring Memorization and Copyright Violation in Frontier LLMs: A Study of the New York Times v. OpenAI 2023 Lawsuit cs.LG · 2024 · author #4
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents cs.LG · 2024 · author #1
Does Refusal Training in LLMs Generalize to the Past Tense? cs.CL · 2024 · author #1
Improving Alignment and Robustness with Circuit Breakers cs.LG · 2024 · author #6
Is In-Context Learning Sufficient for Instruction Following in LLMs? cs.CL · 2024 · author #2
Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs cs.CL · 2024 · author #5
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks cs.CR · 2024 · author #1
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models cs.CR · 2024 · author #4
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning cs.CL · 2024 · author #2
Scaling Compute Is Not All You Need for Adversarial Robustness cs.LG · 2023 · author #3
Critical Influence of Overparameterization on Sharpness-aware Minimization cs.LG · 2023 · author #3
Why Do We Need Weight Decay in Modern Deep Learning? cs.LG · 2023 · author #2
Layer-wise Linear Mode Connectivity cs.LG · 2023 · author #2
Transferable Adversarial Robustness for Categorical Data via Universal Robust Embeddings cs.LG · 2023 · author #2
Sharpness-Aware Minimization Leads to Low-Rank Features cs.LG · 2023 · author #1
A Modern Look at the Relationship between Sharpness and Generalization cs.LG · 2023 · author #1
SGD with Large Step Sizes Learns Sparse Features cs.LG · 2022 · author #1
Towards Understanding Sharpness-Aware Minimization cs.LG · 2022 · author #1
ARIA: Adversarially Robust Image Attribution for Content Provenance cs.CV · 2022 · author #1
On the effectiveness of adversarial training against common corruptions cs.LG · 2021 · author #2
RobustBench: a standardized adversarial robustness benchmark cs.LG · 2020 · author #2
Understanding and Improving Fast Adversarial Training cs.LG · 2020 · author #1
Sparse-RS: a versatile framework for query-efficient sparse black-box adversarial attacks cs.LG · 2020 · author #2
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines cs.LG · 2020 · author #2
Square Attack: a query-efficient black-box adversarial attack via random search cs.LG · 2019 · author #1
Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks cs.LG · 2019 · author #1
Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem cs.LG · 2018 · author #2
Logit Pairing Methods Can Fool Gradient-Based Attacks cs.LG · 2018 · author #2
Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation cs.LG · 2017 · author #2

Mentions

2506.10949 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2311.17539 #3 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2410.09024 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2407.11969 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2405.19874 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2404.02151 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2412.06370 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2310.04415 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2404.01318 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2406.04313 #6 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2404.14461 #5 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2402.04833 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2307.06966 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2312.13131 #3 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2306.04064 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2305.16292 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2302.07011 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2210.05337 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2206.06232 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2202.12860 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2006.12834 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2103.02325 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2010.09670 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2006.04884 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2007.02617 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
1912.00049 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
1906.03526 #1 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
1812.05720 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
1810.12042 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
1705.08475 #2 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2606.20814 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2603.24511 #6 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2605.23055 #6 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2602.16346 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko
2602.20156 #4 · arxiv_oai · confidence 0.70 Maksym Andriushchenko

Frequent Coauthors

Nicolas Flammarion 18 shared papers
Francesco Croce 9 shared papers
Matthias Hein 8 shared papers
Edoardo Debenedetti 4 shared papers
Vikash Sehwag 3 shared papers
Aditya Varre 2 shared papers
Andy Zou 2 shared papers
Anietta Weckauff 2 shared papers
Dan Hendrycks 2 shared papers
Derek Duenas 2 shared papers
Dietrich Klakow 2 shared papers
Hao Zhao 2 shared papers
Jonas Geiping 2 shared papers
Justin Wang 2 shared papers
Klim Kireev 2 shared papers
Marius Mosbach 2 shared papers
Matt Fredrikson 2 shared papers
Maxwell Lin 2 shared papers
Sahar Abdelnabi 2 shared papers
Yuchen Zhang 2 shared papers