{"paper":{"title":"Evaluating and Learning Robust Bandit Policies Under Uncertain Causal Mechanisms","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Structural equation models let bandit algorithms evaluate and learn policies accurately even when causal mechanisms remain uncertain.","cross_cats":[],"primary_cat":"cs.LG","authors_text":"Chinmay Pendse, David Jensen, Katherine Avery","submitted_at":"2025-08-04T18:29:29Z","abstract_excerpt":"Causal graphical models can encode large amounts of structural knowledge, both from the background knowledge of domain experts and the structural knowledge discovered from randomized experiments or observational data. However, though we may know the general structure of causal relationships, we often do not know the exact causal mechanisms. In this work, we propose a causal multi-armed bandit evaluation and learning algorithm that can reason effectively despite uncertainty over conditional probability distributions. Further, we show how conditional independence testing can be used to choose varia"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"the structural equation model (SEM) approach gives more accurate evaluations compared to traditional approaches, particularly as the range of possible causal mechanisms grows. 
Further, the SEM approach learns low-variance policies, and it learns an optimal policy, assuming the model is sufficiently well-specified.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The structural equation model must be sufficiently well-specified for the method to learn an optimal policy; this premise is invoked in the abstract when stating convergence to optimality and is structurally required for the superiority claims to hold.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A SEM-based causal bandit method provides more accurate policy evaluations and learns low-variance optimal policies under uncertain conditional distributions compared to traditional approaches.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Structural equation models let bandit algorithms evaluate and learn policies accurately even when causal mechanisms remain uncertain.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"b2919aa1ea95f4ca2c78abbed6ba51323b9ba8dfc7f6486d5d7b41ce736db2e5"},"source":{"id":"2508.02812","kind":"arxiv","version":3},"verdict":{"id":"230646d2-d35c-4aa5-954a-841a4738527f","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T00:31:48.767498Z","strongest_claim":"the structural equation model (SEM) approach gives more accurate evaluations compared to traditional approaches, particularly as the range of possible causal mechanisms grows. 
Further, the SEM approach learns low-variance policies, and it learns an optimal policy, assuming the model is sufficiently well-specified.","one_line_summary":"A SEM-based causal bandit method provides more accurate policy evaluations and learns low-variance optimal policies under uncertain conditional distributions compared to traditional approaches.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The structural equation model must be sufficiently well-specified for the method to learn an optimal policy; this premise is invoked in the abstract when stating convergence to optimality and is structurally required for the superiority claims to hold.","pith_extraction_headline":"Structural equation models let bandit algorithms evaluate and learn policies accurately even when causal mechanisms remain uncertain."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2508.02812/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"936ede161f76e2e7ceed9ddc10561de281d6435b2bac2b37ae564d875b4955c7"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}