pith. sign in

arxiv: 2603.13679 · v2 · pith:LACMDHMWnew · submitted 2026-03-14 · 💻 cs.HC · cs.CV

Toward Scalable Co-located Practical Learning: Assisting with Computer Vision and Multimodal Analytics

classification 💻 cs.HC cs.CV
keywords behavioursessionsinteractiontargettraceadaptationanalysesco-located
0
0 comments X
read the original abstract

Co-located practical learning leaves evidence in visible actions around patients, task resources and room zones, but these traces are often recovered through live observation or retrospective video review. Fixed wide-angle video could reduce sensing burden, yet a debriefing pipeline must do more than detect behaviours: it must maintain detection after small camera-position shifts, relate the detector-derived behaviour trace to instructor-labelled outcomes and preserve room-zone context. This study evaluates a fixed-camera pipeline in repeated nursing simulation. Using a harmonised six-code taxonomy, we tested YOLO26 target-only training and two-stage source-to-target adaptation across two same-room side-view data sources. We then converted detections from 51 instructor-labelled sessions into one-second behaviour and behaviour-zone traces for rate, ordered-network, transition-network and sequence analyses. Two-stage adaptation improved mean mAP50 from 0.815 to 0.848 for the 2021 target view and from 0.690 to 0.855 for the smaller 2022 target view; with a balanced target quota of \(N = 22\), the 2022 model reached 0.850 mAP50. In the detector-derived behaviour trace analyses, higher phone use characterised low task-performance sessions. Zone labels changed the interpretation of patient interaction: primary patient-care-zone interaction was stronger in higher-performance sessions, while secondary-zone interaction was stronger in lower-performance sessions. Ordered and transition network models showed that ordered room-zone relations contributed beyond behaviour frequency, with the strongest task-performance classifier using zoned and co-presence features. The resulting trace is most appropriate for searchable simulation debriefing, where instructors inspect detected moments rather than receive automated assessment scores.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.