Maksym Andriushchenko

Identifiers

  • name variant Maksym Andriushchenko 0.60 · backfill

Papers (15)

  1. What Shapes Emergent Misalignment? Insights from Training Dynamics, Model Priors, and Data · cs.AI · 2026 · author #4
  2. Decomposing and Measuring Evaluation Awareness · cs.LG · 2026 · author #6
  3. FutureSim: Replaying World Events to Evaluate Adaptive Agents · cs.LG · 2026 · author #7
  4. Europe and the Geopolitics of AGI: The Need for a Preparedness Plan · cs.CY · 2026 · author #11
  5. Instrumental Choices: Measuring the Propensity of LLM Agents to Pursue Instrumental Behaviors · cs.AI · 2026 · author #3
  6. Characterizing the Consistency of the Emergent Misalignment Persona · cs.AI · 2026 · author #3
  7. QuantSightBench: Evaluating LLM Quantitative Forecasting with Prediction Intervals · cs.LG · 2026 · author #2
  8. Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs · cs.LG · 2026 · author #6
  9. Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks · cs.CR · 2026 · author #4
  10. Helpful to a Fault: Measuring Illicit Assistance in Multi-Turn, Multilingual LLM Agents · cs.CL · 2026 · author #4
  11. AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents · cs.LG · 2024 · author #1
  12. JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models · cs.CR · 2024 · author #4
  13. Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem · cs.LG · 2018 · author #2
  14. Logit Pairing Methods Can Fool Gradient-Based Attacks · cs.LG · 2018 · author #2
  15. Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation · cs.LG · 2017 · author #2

Mentions

  • 2606.20814 #4 · arxiv_oai · confidence 0.70 · Maksym Andriushchenko
  • 2603.24511 #6 · arxiv_oai · confidence 0.70 · Maksym Andriushchenko
  • 2605.23055 #6 · arxiv_oai · confidence 0.70 · Maksym Andriushchenko
  • 2602.16346 #4 · arxiv_oai · confidence 0.70 · Maksym Andriushchenko
  • 2602.20156 #4 · arxiv_oai · confidence 0.70 · Maksym Andriushchenko

Frequent Coauthors