Mohit Bansal

Identifiers

  • name variant Mohit Bansal 0.60 · backfill

Papers (85)

  1. GPU Forecasters: Language Models as Selective Surrogates for Kernel Runtime Optimization cs.LG · 2026 · author #5
  2. Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)? cs.CV · 2026 · author #6
  3. STORM: Internalized Modeling for Spatial-Temporal Reasoning in Video-Language Models cs.CV · 2026 · author #10
  4. AVSD: Adaptive-View Self-Distillation by Balancing Consensus and Teacher-Specific Privileged Signals cs.LG · 2026 · author #10
  5. MINTEval: Evaluating Memory under Multi-Target Interference in Long-Horizon Agent Systems cs.CL · 2026 · author #6
  6. PhyMotion: Structured 3D Motion Reward for Physics-Grounded Human Video Generation cs.CV · 2026 · author #9
  7. Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty cs.CL · 2026 · author #8
  8. EgoMemReason: A Memory-Driven Reasoning Benchmark for Long-Horizon Egocentric Video Understanding cs.CV · 2026 · author #9
  9. Stabilizing Efficient Reasoning with Step-Level Advantage Selection cs.CL · 2026 · author #6
  10. MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments cs.CL · 2026 · author #9
  11. Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind cs.CL · 2026 · author #6
  12. The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment cs.LG · 2026 · author #8
  13. Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems cs.LG · 2026 · author #7
  14. Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style cs.CV · 2026 · author #10
  15. Multimodal Fact-Level Attribution for Verifiable Reasoning cs.CL · 2026 · author #6
  16. Effective Reasoning Chains Reduce Intrinsic Dimensionality cs.CL · 2026 · author #4
  17. When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning cs.CV · 2026 · author #7
  18. StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos cs.CV · 2025 · author #9
  19. One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration cs.AI · 2025 · author #5
  20. VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing cs.RO · 2025 · author #9
  21. OpenThoughts: Data Recipes for Reasoning Models cs.LG · 2025 · author #37
  22. SiLVR: A Simple Language-based Video Reasoning Framework cs.CV · 2025 · author #4
  23. EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance cs.CV · 2025 · author #7
  24. Skill-Based Mixture-of-Experts: Adaptive Routing for Heterogeneous Reasoning via Inferred Skills cs.CL · 2025 · author #5
  25. On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective cs.CY · 2025 · author #56
  26. Self-Correcting Text-to-Video Generation with Misalignment Detection and Localized Refinement cs.CV · 2024 · author #4
  27. Evaluating Very Long-Term Conversational Memory of LLM Agents cs.CL · 2024 · author #4
  28. TrustLLM: Trustworthiness in Large Language Models cs.CL · 2024 · author #31
  29. Analyzing and Mitigating Object Hallucination in Large Vision-Language Models cs.LG · 2023 · author #7
  30. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models cs.CL · 2022 · author #276
  31. Expressing Visual Relationships via Language cs.CL · 2019 · author #5
  32. Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA cs.CL · 2019 · author #2
  33. Improving Visual Question Answering by Referring to Generated Paragraph Captions cs.CL · 2019 · author #2
  34. Continual and Multi-Task Architecture Search cs.CL · 2019 · author #2
  35. Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension cs.CL · 2019 · author #4
  36. Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation cs.CL · 2019 · author #6
  37. Multi-Target Embodied Question Answering cs.CV · 2019 · author #4
  38. Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout cs.CL · 2019 · author #3
  39. AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning cs.CL · 2019 · author #3
  40. Combining Fact Extraction and Verification with Neural Semantic Matching Networks cs.CL · 2018 · author #3
  41. Analyzing Compositionality-Sensitivity of NLI Models cs.CL · 2018 · author #3
  42. Commonsense for Generative Multi-Hop Question Answering Tasks cs.CL · 2018 · author #3
  43. SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories cs.CL · 2018 · author #2
  44. Closed-Book Training to Improve Summarization Encoder Memory cs.CL · 2018 · author #2
  45. Game-Based Video-Context Dialogue cs.CL · 2018 · author #2
  46. Adversarial Over-Sensitivity and Over-Stability Strategies for Dialogue Models cs.CL · 2018 · author #2
  47. TVQA: Localized, Compositional Video Question Answering cs.CL · 2018 · author #3
  48. Dynamic Multi-Level Multi-Task Learning for Sentence Simplification cs.CL · 2018 · author #3
  49. Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting cs.CL · 2018 · author #2
  50. Soft Layer-Specific Multi-Task Summarization with Entailment and Question Generation cs.CL · 2018 · author #3
  51. Polite Dialogue Generation Without Parallel Data cs.CL · 2018 · author #2
  52. Object Ordering with Bidirectional Matchings for Visual Reasoning cs.CL · 2018 · author #2
  53. Robust Machine Comprehension Models via Adversarial Training cs.CL · 2018 · author #2
  54. Multi-Reward Reinforced Summarization with Saliency and Entailment cs.CL · 2018 · author #2
  55. Detecting Linguistic Characteristics of Alzheimer's Dementia by Interpreting Neural Models cs.CL · 2018 · author #3
  56. MAttNet: Modular Attention Network for Referring Expression Comprehension cs.CV · 2018 · author #6
  57. Hierarchically-Attentive RNN for Album Summarization and Storytelling cs.CL · 2017 · author #2
  58. Shortcut-Stacked Sentence Encoders for Multi-Domain Inference cs.CL · 2017 · author #2
  59. Reinforced Video Captioning with Entailment Rewards cs.CL · 2017 · author #2
  60. Video Highlight Prediction Using Audience Chat Reactions cs.CL · 2017 · author #3
  61. Source-Target Inference Models for Spatial Instruction Understanding cs.CL · 2017 · author #2
  62. Efficient Generation of Motion Plans from Attribute-Based Natural Language Instructions Using Dynamic Constraint Mapping cs.RO · 2017 · author #3
  63. Punny Captions: Witty Wordplay in Image Descriptions cs.CL · 2017 · author #3
  64. Multi-Task Video Captioning with Video and Entailment Generation cs.CL · 2017 · author #2
  65. Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information cs.CL · 2017 · author #3
  66. A Joint Speaker-Listener-Reinforcer Model for Referring Expressions cs.CV · 2016 · author #3
  67. Coherent Dialogue with Attention-based Language Models cs.CL · 2016 · author #2
  68. Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation cs.RO · 2016 · author #2
  69. Interpreting Neural Networks to Improve Politeness Comprehension cs.CL · 2016 · author #2
  70. Contextual RNN-GANs for Abstract Reasoning Diagram Generation cs.CV · 2016 · author #5
  71. Who did What: A Large-Scale Person-Centered Cloze Dataset cs.CL · 2016 · author #3
  72. Charagram: Embedding Words and Sentences via Character n-grams cs.CL · 2016 · author #2
  73. Sort Story: Sorting Jumbled Images and Captions into Stories cs.CL · 2016 · author #5
  74. Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions cs.CV · 2016 · author #3
  75. The Role of Context Types and Dimensionality in Learning Word Embeddings cs.CL · 2016 · author #4
  76. End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures cs.CL · 2016 · author #2
  77. We Are Humor Beings: Understanding and Predicting Visual Humor cs.CV · 2015 · author #4
  78. Towards Universal Paraphrastic Sentence Embeddings cs.CL · 2015 · author #2
  79. Learning Articulated Motion Models from Visual and Lingual Signals cs.RO · 2015 · author #2
  80. Accurate Vision-based Vehicle Localization using Satellite Imagery cs.RO · 2015 · author #3
  81. Mapping Unseen Words to Task-Trained Embedding Spaces cs.CL · 2015 · author #2
  82. What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment cs.CL · 2015 · author #2
  83. Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences cs.CL · 2015 · author #2
  84. From Paraphrase Database to Compositional Paraphrase Model and Back cs.CL · 2015 · author #2
  85. Web-scale Surface and Syntactic n-gram Features for Dependency Parsing cs.CL · 2015 · author #2

Mentions

  • 1502.07038 #2 · backfill · confidence 0.70 Mohit Bansal
  • 2602.08236 #7 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2503.05641 #5 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2605.31464 #5 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2605.30557 #6 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2602.09276 #4 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2505.21876 #7 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2605.26014 #10 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2605.20643 #10 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2603.11024 #10 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2605.18565 #6 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2502.14296 #56 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2401.05561 #31 · arxiv_oai · confidence 0.70 Mohit Bansal
  • 2310.00754 #7 · arxiv_oai · confidence 0.70 Mohit Bansal

Frequent Coauthors