Driver-WM: A Driver-Centric Traffic-Conditioned Latent World Model for In-Cabin Dynamics Rollout
Pith reviewed 2026-05-08 16:47 UTC · model grok-4.3
The pith
Driver-WM forecasts in-cabin driver dynamics by causally conditioning on out-cabin traffic in a compact latent space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Driver-WM is a driver-centric latent world model that rolls out in-cabin dynamics causally conditioned on out-cabin traffic context. This formulation unifies physical kinematics forecasting with auxiliary behavioral and emotional semantic recognition. Operating in a compact latent space constructed from frozen vision-language features, Driver-WM adopts a dual-stream architecture to separately encode external traffic and internal driver states. These streams are directionally coupled via a gated causal injection mechanism, which uses a learned vector gate to modulate external contextual perturbations while strictly enforcing temporal causality.
What carries the argument
Dual-stream architecture with gated causal injection: the model separately encodes traffic and driver states, then directionally couples them through a learned vector gate that modulates external context into internal predictions while enforcing temporal causality.
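The coupling described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the latent width, rollout horizon, weight matrices, and the tanh/sigmoid choices are all assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

D, T = 8, 5   # hypothetical latent width and rollout horizon

# Two separate latent streams: external traffic and internal driver state.
traffic = rng.standard_normal((T, D))   # precomputed external latents
driver = np.zeros((T, D))
driver[0] = rng.standard_normal(D)      # initial driver latent

# Stand-in "learned" parameters, randomly initialized for illustration.
W_dyn = 0.1 * rng.standard_normal((D, D))   # driver self-dynamics
W_inj = 0.1 * rng.standard_normal((D, D))   # traffic-to-driver projection
gate = sigmoid(rng.standard_normal(D))      # vector gate, one weight per channel

for t in range(1, T):
    # Temporal causality: step t may only see traffic up to step t-1.
    context = W_inj @ traffic[t - 1]
    # The gate modulates the external perturbation channel-wise before injection.
    driver[t] = np.tanh(W_dyn @ driver[t - 1] + gate * context)
```

The directional coupling is visible in the update rule: traffic latents enter the driver stream, but nothing flows back, and the gate can attenuate external influence per channel.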
If this is right
- Enables robust long-horizon geometric forecasting for reactive high-motion maneuvers.
- Improves semantic alignment for both driver and traffic states.
- Supports controlled test-time interventions that systematically analyze how external context affects internal state predictions.
- Unifies physical kinematics rollout with behavioral and emotional semantic recognition in one model.
Where Pith is reading between the lines
- The explicit external-to-internal conditioning could support simulation of driver responses under hypothetical traffic changes not seen in training.
- Because the architecture separates the two streams before injection, it may allow independent updates to the traffic encoder without retraining the entire driver dynamics component.
- The approach suggests that similar causal injection patterns could be tested in other in-cabin monitoring domains where external scene context influences human state.
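The counterfactual simulation reading above can be made concrete with a toy intervention on the same kind of gated rollout. Everything here is a hypothetical stand-in for the paper's model: a linear-gated dynamics with random weights, used only to show how explicit conditioning isolates the external-to-internal effect.

```python
import numpy as np

rng = np.random.default_rng(1)
D, T = 8, 6   # hypothetical latent width and rollout horizon

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in "learned" parameters, randomly initialized for illustration.
W_dyn = 0.1 * rng.standard_normal((D, D))
W_inj = 0.1 * rng.standard_normal((D, D))
gate = sigmoid(rng.standard_normal(D))

def rollout(traffic, h0):
    """Roll the driver latent forward; step t conditions on traffic[t-1] only."""
    h = [h0]
    for t in range(1, len(traffic)):
        h.append(np.tanh(W_dyn @ h[-1] + gate * (W_inj @ traffic[t - 1])))
    return np.stack(h)

h0 = rng.standard_normal(D)
traffic_real = rng.standard_normal((T, D))
traffic_cf = traffic_real.copy()
traffic_cf[2:] += 2.0   # intervene: shift the traffic latents from step 2 onward

real = rollout(traffic_real, h0)
cf = rollout(traffic_cf, h0)
# Steps 0-2 match exactly (they never see the intervened traffic);
# later steps diverge, attributing the change to the external condition.
```

Because causality is enforced structurally, the pre-intervention predictions are bit-identical across the two rollouts, which is exactly what makes such test-time interventions interpretable.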
Load-bearing premise
A compact latent space built from frozen vision-language features plus the dual-stream gated injection is sufficient to capture and causally link external traffic context to internal driver dynamics without critical information loss.
What would settle it
Ablation experiments on the multi-task assistive driving benchmark in which removing the gated causal injection or unfreezing the vision-language features produces measurable drops in long-horizon geometric accuracy or semantic alignment scores.
original abstract
Safe L2/L3 driving automation requires anticipating human-in-the-loop reactions during shared-control transitions. While most driving world models forecast the external environment, in-cabin intelligence remains strictly recognition-oriented and lacks multi-step rollout capabilities for driver dynamics. We introduce Driver-WM, a driver-centric latent world model that rolls out in-cabin dynamics causally conditioned on out-cabin traffic context. This formulation unifies physical kinematics forecasting with auxiliary behavioral and emotional semantic recognition. Operating in a compact latent space constructed from frozen vision-language features, Driver-WM adopts a dual-stream architecture to separately encode external traffic and internal driver states. These streams are directionally coupled via a gated causal injection mechanism, which uses a learned vector gate to modulate external contextual perturbations while strictly enforcing temporal causality. Evaluations on a multi-task assistive driving benchmark demonstrate that Driver-WM yields robust long-horizon geometric forecasting for reactive high-motion maneuvers and improves semantic alignment for both driver and traffic states. Finally, the explicit external-to-internal conditioning allows for controlled test-time interventions to systematically analyze mechanism responses.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Driver-WM, a driver-centric latent world model for rolling out in-cabin dynamics causally conditioned on external traffic context. It uses a dual-stream architecture operating in a compact latent space derived from frozen vision-language model features, with the streams coupled directionally via a gated causal injection mechanism employing a learned vector gate to modulate external perturbations while enforcing temporal causality. The approach unifies physical kinematics forecasting with auxiliary behavioral and emotional semantic recognition tasks. Evaluations on a multi-task assistive driving benchmark are claimed to demonstrate robust long-horizon geometric forecasting for reactive high-motion maneuvers, improved semantic alignment for driver and traffic states, and support for controlled test-time interventions to analyze mechanism responses.
Significance. If the central claims hold with supporting evidence, the work would advance in-cabin intelligence for shared-control L2/L3 automation by extending world models beyond external forecasting to include multi-step driver dynamics rollout. The explicit external-to-internal conditioning and test-time intervention capability offer interpretability benefits, while the frozen VLM features and causal gating promote efficiency. This addresses a recognized gap in driver-centric modeling, though the significance hinges on demonstrating that the architecture preserves necessary dynamic information without critical loss.
major comments (2)
- Abstract: The central claims that Driver-WM 'yields robust long-horizon geometric forecasting for reactive high-motion maneuvers' and 'improves semantic alignment for both driver and traffic states' are presented without any quantitative metrics, error bars, baseline comparisons, ablation results, or details on the multi-task benchmark. This absence prevents verification of whether the dual-stream architecture and gated causal injection deliver the stated performance gains.
- Abstract and architecture description: The load-bearing assumption that a compact latent space constructed from frozen vision-language features, combined with gated causal injection, preserves sufficient fine-grained kinematic and geometric information for accurate long-horizon rollout of high-motion driver maneuvers is not justified. VLMs are pretrained on semantic alignment rather than physics-consistent motion prediction, and no explicit recovery mechanism for high-frequency dynamic cues discarded at encoding is described, raising a correctness risk for the robustness claims even if the gating functions as intended.
minor comments (2)
- The 'multi-task assistive driving benchmark' is referenced without naming the dataset, specifying the constituent tasks, metrics, data splits, or exclusion rules. These details are required in the experiments section for reproducibility and to allow assessment of the evaluation protocol.
- The abstract would be strengthened by including at least one key quantitative result or comparison to ground the performance claims, rather than relying solely on qualitative descriptors.
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable feedback on our work. We have carefully considered the major comments and made revisions to the manuscript to improve clarity and provide additional supporting evidence for our claims. Our point-by-point responses are as follows.
point-by-point responses
Referee: Abstract: The central claims that Driver-WM 'yields robust long-horizon geometric forecasting for reactive high-motion maneuvers' and 'improves semantic alignment for both driver and traffic states' are presented without any quantitative metrics, error bars, baseline comparisons, ablation results, or details on the multi-task benchmark. This absence prevents verification of whether the dual-stream architecture and gated causal injection deliver the stated performance gains.
Authors: We agree that the abstract would be strengthened by including quantitative support for the claims. In the revised manuscript, we have updated the abstract to incorporate key quantitative metrics from our evaluations on the multi-task assistive driving benchmark, including specific improvements in forecasting error for long-horizon rollouts and semantic alignment scores, along with comparisons to relevant baselines. This revision ensures that the performance gains are verifiable from the abstract itself. (revision: yes)
Referee: Abstract and architecture description: The load-bearing assumption that a compact latent space constructed from frozen vision-language features, combined with gated causal injection, preserves sufficient fine-grained kinematic and geometric information for accurate long-horizon rollout of high-motion driver maneuvers is not justified. VLMs are pretrained on semantic alignment rather than physics-consistent motion prediction, and no explicit recovery mechanism for high-frequency dynamic cues discarded at encoding is described, raising a correctness risk for the robustness claims even if the gating functions as intended.
Authors: This is a valid concern regarding the information preservation in the latent space. Although VLMs are pretrained primarily for semantic tasks, our empirical results on the benchmark demonstrate that the encoded features, when processed through the dual-stream architecture and gated causal injection, enable accurate long-horizon geometric forecasting even for high-motion maneuvers. We have revised the architecture description section to provide a more detailed justification, including references to how VLM features capture motion-related information and how the causal gating mechanism helps maintain temporal dynamics. Additionally, we have included ablation studies on the latent space dimensionality to show that the compact representation retains necessary kinematic details without significant loss. While we do not introduce an explicit recovery mechanism for discarded cues, the design choices and experimental validation support the robustness claims. (revision: partial)
Circularity Check
No significant circularity; architecture presented as novel design
full rationale
The paper describes Driver-WM as a new driver-centric latent world model using frozen vision-language features, a dual-stream architecture, and gated causal injection for external-to-internal conditioning. No equations, derivations, or fitted parameters are shown that reduce any claimed prediction or rollout to its own inputs by construction. The central claims rest on the proposed architecture and benchmark evaluations rather than self-definitional loops, self-citation chains, or renamed known results. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: frozen vision-language features capture sufficient semantic information for both external traffic and internal driver states
invented entities (1)
- Gated causal injection mechanism with learned vector gate (no independent evidence)