pith. machine review for the scientific record.

arxiv: 2605.07389 · v1 · submitted 2026-05-08 · 💻 cs.SE · cs.LG

Recognition: no theorem link

Exploring CoCo Challenges in ML Engineering Teams: Insights From the Semiconductor Industry


Pith reviewed 2026-05-11 01:44 UTC · model grok-4.3

classification 💻 cs.SE cs.LG
keywords collaboration challenges · machine learning engineering · semiconductor industry · interdisciplinary teams · hardware-centric development · qualitative interviews · roles and responsibilities · CoCo problems

The pith

Unclear roles and responsibilities are the primary collaboration challenge for machine learning teams in semiconductor development.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates collaboration and communication challenges in machine learning engineering teams operating in a hardware-centric semiconductor company. Through interviews with 12 practitioners, it identifies 16 recurring issues, pinpointing unclear roles and responsibilities as the most critical one. It also outlines practices that teams find helpful in addressing these problems. Understanding these dynamics is important because hardware environments impose strict data rules, long timelines, and physical process ties that differ from typical software development, making ML system maintenance harder.

Core claim

A qualitative study based on interviews with 12 practitioners in a global semiconductor company uncovered 16 recurring collaboration and communication challenges in ML engineering teams. Unclear roles and responsibilities stood out as the most significant issue, and the analysis also captured practitioner-recommended practices for mitigation. These challenges are shaped by the hardware-centric setting, including tight coupling with physical manufacturing processes and extended development cycles.

What carries the argument

Thematic analysis of semi-structured interviews identifying 16 CoCo challenges, with unclear roles and responsibilities ranked as most critical by participants.

If this is right

  • Clarifying roles and responsibilities early can reduce coordination problems in interdisciplinary ML teams working on manufacturing systems.
  • Practices such as establishing shared documentation standards and communication protocols help mitigate CoCo issues under long development cycles.
  • Tool support for ML projects should address data governance and coordination needs specific to hardware constraints.
  • These challenges affect the long-term reproducibility and maintenance of ML-enabled systems in production environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar role-clarity issues may arise in other regulated hardware domains such as automotive or aerospace where ML integrates with physical processes.
  • Addressing the top challenge could speed ML adoption in manufacturing by reducing delays in system deployment and updates.
  • Multi-company studies would be needed to test whether the 16 challenges hold beyond a single organizational culture.

Load-bearing premise

The assumption that findings from interviews at one global semiconductor company represent the collaboration challenges faced by ML engineering teams across hardware-centric industries.

What would settle it

Interviewing practitioners at multiple other semiconductor or hardware manufacturing firms and finding that unclear roles and responsibilities is not rated as the most critical challenge would challenge the generality of the results.

Figures

Figures reproduced from arXiv: 2605.07389 by A. Azamnouri, J. Bogner, L. Woltmann, M. Fritz, M. Haug, S. Wagner.

Figure 1: Overview of the overall research process.
Figure 2: CoCo challenges identified in the interviews, with the number of mentions.
Figure 3: Solutions for CoCo challenges identified in the interviews, with the number of mentions.
Original abstract

The integration of machine learning (ML) into complex software systems has increased challenges in collaboration and communication (CoCo) of the teams building these systems. ML engineering (MLE) teams often involve diverse roles, ML engineers, data scientists, software engineers, and domain experts, each bringing unique goals, experiences, and jargon. These interdisciplinary dynamics can make it challenging to deploy, reproduce, and maintain ML-enabled systems over the long term. Previous studies have uncovered several CoCo challenges and practices, but most have focused on software-centric companies, leaving limited empirical understanding of how these dynamics unfold in hardware-centric contexts. In hardware-centric environments, CoCo challenges are shaped by additional constraints such as strict data governance, long development cycles, and tight coupling with physical processes, which amplify coordination complexity and reduce flexibility. To strengthen empirical understanding in such settings, we present a qualitative investigation of MLE teams within a global semiconductor company, where ML-enabled systems and manufacturing processes introduce additional complexity. We interviewed 12 practitioners regarding CoCo practices, tools, challenges, and approaches. Through analysis, we identified 16 recurring challenges, with unclear roles and responsibilities emerging as the most critical, and common practices and recommendations practitioners considered effective in mitigating CoCo problems. While grounded in a single organizational context, our findings align with known issues in interdisciplinary ML-enabled systems development, but also demonstrate how these challenges manifest differently under hardware-driven constraints. Our results highlight directions for future research and tool support to strengthen CoCo in MLE projects and ensure the success of ML-enabled systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript reports a qualitative interview study with 12 practitioners from a single global semiconductor company. Using thematic analysis, the authors derive 16 recurring collaboration and communication (CoCo) challenges faced by ML engineering teams in hardware-centric settings. Unclear roles and responsibilities is identified as the most critical challenge, and the paper also surfaces practitioner-reported practices and recommendations for mitigation. The work contrasts these findings with prior software-centric studies and notes the influence of hardware-specific constraints such as strict governance, long cycles, and physical coupling.

Significance. If the empirical claims are adequately supported, the study supplies needed evidence from an under-explored hardware-centric domain where ML-enabled systems intersect with manufacturing processes. The enumeration of 16 challenges together with mitigation practices could usefully inform tool builders and process designers working on interdisciplinary ML projects in regulated industries, while the explicit alignment with and differentiation from existing literature helps delineate domain-specific research needs.

major comments (3)
  1. [Methodology] Methodology section: the account of data collection and analysis provides no detail on the interview protocol (sample questions or guide), participant recruitment and selection criteria, or the thematic analysis procedure (e.g., open vs. axial coding steps, how the 16 challenges were consolidated). These omissions are load-bearing for the central claim that 16 recurring challenges were identified, because without them the reader cannot evaluate the reliability or reproducibility of the thematic results.
  2. [Findings] Findings section: the statement that unclear roles and responsibilities is the 'most critical' challenge does not specify the operational metric (mention frequency, emphasis in transcripts, participant ranking, or another criterion). This directly affects the ranking claim and its use to prioritize future work.
  3. [Discussion] Discussion section: although the single-company limitation is acknowledged, the manuscript does not report saturation checks, inter-rater reliability statistics, or member-checking procedures. With n=12 these details are necessary to assess whether the observed patterns are robust enough to support statements about hardware-centric ML teams in general.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'common practices and recommendations practitioners considered effective' is not illustrated with even one concrete example, reducing the abstract's ability to convey the practical contribution.
  2. [Related Work] Related Work: several citations to prior CoCo studies appear; ensure the most recent (2023-2024) empirical papers on ML team coordination are included to sharpen the contrast with hardware-centric settings.
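
One common choice for the inter-rater reliability statistic raised in major comment 3 is Cohen's kappa, which corrects the raw agreement between two coders for the agreement expected by chance. The sketch below is illustrative only: the function name `cohens_kappa`, the code labels, and the six toy segments are assumptions and are not taken from the paper, which reports no such statistic.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labelled the same segments."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    n = len(labels_a)
    # Observed agreement: share of segments where both coders chose the same code.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each coder's marginal code frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up codes for six transcript segments (hypothetical, not the paper's data).
coder_1 = ["roles", "roles", "tools", "data", "roles", "tools"]
coder_2 = ["roles", "tools", "tools", "data", "roles", "tools"]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

With more than two coders, or with segments that can carry several codes at once, a different statistic would be needed; this sketch covers only the two-coder, single-label case.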

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help improve the transparency and rigor of our work. We address each major comment point by point below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: Methodology section: the account of data collection and analysis provides no detail on the interview protocol (sample questions or guide), participant recruitment and selection criteria, or the thematic analysis procedure (e.g., open vs. axial coding steps, how the 16 challenges were consolidated). These omissions are load-bearing for the central claim that 16 recurring challenges were identified, because without them the reader cannot evaluate the reliability or reproducibility of the thematic results.

    Authors: We agree that these methodological details are essential for transparency. In the revised manuscript, we will expand the Methodology section to include the full semi-structured interview protocol with sample questions, a description of participant recruitment (purposive sampling targeting practitioners with direct experience in ML-enabled systems within the semiconductor domain, selected by role and tenure), and the thematic analysis procedure. The analysis followed an inductive thematic approach based on Braun and Clarke's six-phase framework, with initial open coding, theme development, and iterative consolidation of the 16 challenges through team discussions to resolve discrepancies. This addition will enable readers to assess the reliability of the derived challenges. revision: yes

  2. Referee: Findings section: the statement that unclear roles and responsibilities is the 'most critical' challenge does not specify the operational metric (mention frequency, emphasis in transcripts, participant ranking, or another criterion). This directly affects the ranking claim and its use to prioritize future work.

    Authors: We appreciate this observation. The designation was based on the challenge being mentioned by all 12 participants and appearing as a foundational issue linked to multiple other challenges in the transcripts. In the revision, we will explicitly define the metric (frequency of mention across participants combined with thematic emphasis) and include supporting details such as the number of participants highlighting it and representative excerpts. This will clarify the basis for the claim while preserving the original analysis. revision: yes

  3. Referee: Discussion section: although the single-company limitation is acknowledged, the manuscript does not report saturation checks, inter-rater reliability statistics, or member-checking procedures. With n=12 these details are necessary to assess whether the observed patterns are robust enough to support statements about hardware-centric ML teams in general.

    Authors: We concur that these elements strengthen evaluation of robustness in qualitative work. We will revise the Discussion to report our saturation process: interviews continued until no new themes emerged, with saturation confirmed after the 10th interview and reinforced by the final two. The thematic analysis was led by the first author with iterative team reviews for consensus, but formal inter-rater reliability statistics were not computed; we will explicitly note this as a limitation. Member-checking was not conducted owing to constraints on practitioner time in the industrial setting, which we will disclose. We will also qualify statements to emphasize the exploratory, single-organization context and avoid broad generalizations. revision: partial
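
The two checks described in responses 2 and 3 (how many participants mention each challenge, and after which interview no new themes appear) can be sketched as follows. This is a minimal illustration under stated assumptions: the participant IDs, code names, and counts are invented toy data, and the helper names are hypothetical; none of it reproduces the paper's coding or results.

```python
from collections import Counter

# Hypothetical coded interviews: participant ID -> set of challenge codes.
# Toy data for illustration only; not the paper's codes or counts.
coded_interviews = {
    "P01": {"unclear_roles", "tool_fragmentation"},
    "P02": {"unclear_roles", "data_access"},
    "P03": {"unclear_roles", "tool_fragmentation", "jargon"},
    "P04": {"unclear_roles", "data_access"},
    "P05": {"unclear_roles", "jargon"},
}


def participants_per_code(interviews):
    """Count how many participants mentioned each code at least once."""
    counts = Counter()
    for codes in interviews.values():
        counts.update(codes)
    return counts


def new_codes_per_interview(interviews):
    """For each interview, in order, list the codes not seen in earlier ones."""
    seen = set()
    result = []
    for participant, codes in interviews.items():
        fresh = sorted(codes - seen)
        result.append((participant, fresh))
        seen |= codes
    return result


def last_interview_with_new_code(interviews):
    """Return the 1-based position of the last interview that added a new code."""
    last = 0
    for index, (_, fresh) in enumerate(new_codes_per_interview(interviews), start=1):
        if fresh:
            last = index
    return last


mention_counts = participants_per_code(coded_interviews)
saturation_point = last_interview_with_new_code(coded_interviews)
print(mention_counts.most_common(1), saturation_point)
```

Counting participants rather than raw mentions keeps one talkative interviewee from dominating the ranking, which matches the "frequency of mention across participants" metric the authors describe. Interview order matters for the saturation check, so the dictionary is assumed to be in the order the interviews were conducted.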

Circularity Check

0 steps flagged

No circularity: claims rest on new interview data without reduction to self-citations or fitted inputs

full rationale

The paper reports a qualitative study based on 12 practitioner interviews, followed by thematic analysis that yields 16 challenges (with unclear roles ranked most critical) and mitigation practices. No equations, parameter fitting, or predictive derivations exist that could reduce outputs to inputs by construction. Prior work is referenced only for alignment and context, not as a load-bearing justification that substitutes for the new data. The single-company limitation is explicitly noted, preserving the empirical nature of the contribution. This is standard non-circular qualitative reporting.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard assumptions of qualitative empirical research in software engineering; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption: Interview responses from practitioners accurately reflect real collaboration and communication challenges in ML engineering teams.
    Core assumption underlying all qualitative interview studies; invoked when presenting the 16 challenges as recurring issues.
  • domain assumption: Thematic analysis of interview transcripts can reliably identify and rank challenges such as unclear roles.
    Standard assumption in empirical SE studies; supports the claim that unclear roles is the most critical challenge.

pith-pipeline@v0.9.0 · 5597 in / 1361 out tokens · 42253 ms · 2026-05-11T01:44:10.594280+00:00 · methodology

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

  1. [1]

    Conducting semi-structured interviews, in: Handbook of Practical Program Evaluation

    Adams, W.C., 2015. Conducting semi-structured interviews, in: Handbook of Practical Program Evaluation. Wiley, p. 492–505. URL:http://dx.doi.org/10.1002/9781119171386.ch19, doi:10.1002/9781119171386.ch19. Almahmoud, J., DeLine, R., Drucker, S.M., 2021. How teams communicate about the quality of ml models: A case study at an international technology company. Proceedings of the ACM on Hu...

  2. [2]

    ACM Transactions on Software Engineering and Methodology 34, 1–48

    Uncovering community smells in machine learning-enabled systems: Causes, effects, and mitigation strategies. ACM Transactions on Software Engineering and Methodology 34, 1–48. URL:http://dx.doi.org/10.1145/3712198, doi:10.1145/3712198. Assres, G., Bhandari, G., Shalaginov, A., Gronli, T.M., Ghinea, G.,

  3. [3]

    ACM Computing Surveys 57, 1–35

    State-of-the-art and challenges of engineering ml-enabled software systems in the deep learning era. ACM Computing Surveys 57, 1–35. URL:http://dx.doi.org/10.1145/3731597, doi:10.1145/3731597. Azamnouri, A.,

  4. [4]

    Coco challenges in ml engineering teams: How to collaboratively build ml-enabled systems, in: 2025 IEEE/ACM 4th International Conference on AI Engineering – Software Engineering for AI (CAIN), IEEE. p. 241–243. URL:http://dx.doi.org/10.1109/cain66642.2025.00036, doi:10.1109/cain66642.2025.00036. Bhat, A., Coursey, A., Hu, G., Li, S., Nahar, N., Zhou, S., Kästner, C.,...

  5. [5]

    CoRR abs/2007.05408

    Machine learning explainability for external stakeholders. CoRR abs/2007.05408. URL: https://arxiv.org/abs/2007.05408, doi:10.48550/arXiv.2007.05408. Busquim, G., Araújo, A.A., Lima, M.J., Kalinowski, M., 2024a. Towards effective collaboration between software engineers and data scientists developing machine learning-enabled systems, in: Anais do XXXVIII ...

  6. [6]

    Gebru, J

    Datasheets for datasets. Communications of the ACM 64, 86–92. URL:http://dx.doi.org/10.1145/3458723, doi:10.1145/3458723. Guest, G., MacQueen, K., Namey, E.,

  7. [7]

    SAGE Publications, Inc

    Applied Thematic Analysis. SAGE Publications, Inc. URL:http://dx.doi.org/10.4135/9781483384436, doi:10.4135/9781483384436. Haberl, A., Fleiß, J., Kowald, D., Thalmann, S., 2024. Take the atrain. introducing an interface for the accessible transcription of interviews. Journal of Behavioral and Experimental Finance 41, 100891. URL:http://dx.doi.org/10.1016/j.jbef.2024.1...

  8. [8]

    Springer Nature Switzerland

    MLOps Adoption in the Manufacturing Industry: A Case Study with Zeiss SMT. Springer Nature Switzerland. p. 16–36. URL:http://dx.doi.org/10.1007/978-3-032-07313-6_2, doi:10.1007/978-3-032-07313-6_2. Honkanen, T., Odwyer, J., Salminen, V., 2022. Multidisciplinary teamwork in machine learning operations (mlops), in: Human Factors, Business Management and Society, ...

  9. [9]

    Experiences from conducting semi-structured interviews in empirical software engineering research, in: 11th IEEE International Software Metrics Symposium (METRICS’05), IEEE. p. 23–23. URL:http://dx.doi.org/10.1109/metrics.2005.24, doi:10.1109/metrics.2005.24. Indykov, V., Wohlrab, R., Strüber, D.,

  10. [10]

    Quality trade-offs in ml-enabled systems: a multiple-case study, in: Proceedings of the 40th ACM/SIGAPP Symposium on Applied Computing, ACM. p. 1730–1737. URL:http://dx.doi.org/10.1145/3672608.3707754, doi:10.1145/3672608.3707754. Kalinowski, M., Mendez, D., Giray, G., Santos Alves, A.P., Azevedo, K., Escovedo, T., Villamizar, H., Lopes, H., Baldassarre,...

  11. [11]

    Information and Software Technology 187, 107866

    Naming the pain in machine learning-enabled systems engineering. Information and Software Technology 187, 107866. URL:http://dx.doi.org/10.1016/j.infsof.2025.107866, doi:10.1016/j.infsof.2025.107866. Krause-Jüttler, G., Weitz, J., Bork, U.,

  12. [12]

    JMIR Human Factors 9, e36579

    Interdisciplinary collaborations in digital health research: Mixed methods case study. JMIR Human Factors 9, e36579. URL:http://dx.doi.org/10.2196/36579, doi:10.2196/36579. Latendresse, J., Abedu, S., Abdellatif, A., Shihab, E., 2024. An exploratory study on machine learning model management. ACM Transactions on Software Engineering and Methodology 34, 1–31. URL:http...

  13. [13]

    Characterizing and detecting mismatch in machine-learning-enabled systems, in: 2021 IEEE/ACM 1st Workshop on AI Engineering - Software Engineering for AI (WAIN), IEEE. p. 133–140. URL:http://dx.doi.org/10.1109/wain52551.2021.00028, doi:10.1109/wain52551.2021.00028. Li, Y., Du, J., Jiang, W.,

  14. [14]

    IISE Transactions 56, 585–599

    Reinforcement learning for process control with application in semiconductor manufacturing. IISE Transactions 56, 585–599. URL:http://dx.doi.org/10.1080/24725854.2023.2219290, doi:10.1080/24725854.2023.2219290. Lima, A., Monteiro, L., Furtado, A.,

  15. [15]

    URL:http://dx.doi.org/10.5220/0010997300003179, doi:10.5220/0010997300003179

    Mlops: Practices, maturity models, roles, tools, and challenges – a systematic literature review, in: Proceedings of the 24th International Conference on Enterprise Information Systems, SCITEPRESS - Science and Technology Publications. URL:http://dx.doi.org/10.5220/0010997300003179, doi:10.5220/0010997300003179. Mailach, A., Siegmund, N., 2023. Socio-technical ant...

  16. [16]

    Model cards for model reporting, in: Proceedings of the Conference on Fairness, Accountability, and Transparency, ACM. p. 220–229. URL:http://dx.doi.org/10.1145/3287560.3287596, doi:10.1145/3287560.32875...

  17. [17]

    Collaboration challenges in building ml-enabled systems: communication, documentation, engineering, and process, in: Proceedings of the 44th International Conference on Software Engineering, ACM. p. 413–425. URL:http://dx.doi.org/10.1145/3510003.3510209, doi:10.1145/3510003.3510209. Nazir, R., Bucaioni, A., Pelliccione, P.,

  18. [18]

    Journal of Systems and Software 207, 111860

    Architecting ml-enabled systems: Challenges, best practices, and design decisions. Journal of Systems and Software 207, 111860. URL:http://dx.doi.org/10.1016/j.jss.2023.111860, doi:10.1016/j.jss.2023.111860. Pineau, J., Vincent-Lamarre, P., Sinha, K., Lariviere, V., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Larochelle, H., 2021. Improving reproducibility in mach...

  19. [19]

    Proceedings of the ACM on Human-Computer Interaction 5, 1–25

    How ai developers overcome communication challenges in a multidisciplinary team: A case study. Proceedings of the ACM on Human-Computer Interaction 5, 1–25. URL:http://dx.doi.org/10.1145/3449205, doi:10.1145/3449205. Polyzotis, N., Roy, S., Whang, S.E., Zinkevich, M.,

  20. [20]

    ACM SIGMOD Record 47, 17–28

    Data lifecycle challenges in production machine learning: A survey. ACM SIGMOD Record 47, 17–28. URL:http://dx.doi.org/10.1145/3299887.3299891, doi:10.1145/3299887.3299891. Recupito, G., Pecorelli, F., Catolino, G., Lenarduzzi, V., Taibi, D., Di Nucci, D., Palomba, F.,

  21. [21]

    Journal of Systems and Software 216, 112151

    Technical debt in ai-enabled systems: On the prevalence, severity, impact, and management strategies for code and architecture. Journal of Systems and Software 216, 112151. URL:http://dx.doi.org/10.1016/j.jss.2024.112151, doi:10.1016/j.jss.2024.112151. Retzlaff, C.O., Angerschmid, A., Saranti, A., Schneeberger, D., Röttger, R., Müller, H., Holzinger, A.,

  22. [22]

    URL https://www.sciencedirect.com/science/article/pii/S1389041724000378

    Post-hoc vs ante-hoc explanations: xai design guidelines for data scientists. Cognitive Systems Research 86, 101243. URL:http://dx.doi.org/10.1016/j.cogsys.2024.101243, doi:10.1016/j.cogsys.2024.101243. Runeson, P., Höst, M.,

  23. [23]

    Empirical Software Engineering 14, 131–164

    Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering 14, 131–164. URL:http://dx.doi.org/10.1007/s10664-008-9102-8, doi:10.1007/s10664-008-9102-8. Saldana, J.,

  24. [24]

    2025. The Coding Manual for Qualitative Researchers

    The Coding Manual for Qualitative Researchers. SAGE Publications Ltd. URL:http://dx.doi.org/10.4135/9781036235611, doi:10.4135/9781036235611. Sambasivan, N., Kapania, S., Highfill, H., Akrong, D., Paritosh, P., Aroyo, L.M.,

  25. [25]

    Everyone wants to do the model work, not the data work

    “everyone wants to do the model work, not the data work”: Data cascades in high-stakes ai, in: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, ACM. p. 1–15. URL:http://dx.doi.org/10.1145/3411764.3445518, doi:10.1145/3411764.3445518. Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V., Youn...

  26. [26]

    (Eds.), Advances in Neural Information Processing Systems, Curran Associates, Inc

    Hidden technical debt in machine learning systems, in: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural Information Processing Systems, Curran Associates, Inc. URL:https://proceedings.neurips.cc/paper_files/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf. Seaman, C.,

  27. [27]

    IEEE Transactions on Software Engineering 25, 557–572

    Qualitative methods in empirical studies of software engineering. IEEE Transactions on Software Engineering 25, 557–572. URL:http://dx.doi.org/10.1109/32.799955, doi:10.1109/32.799955. Suresh, H., Gomez, S.R., Nam, K.K., Satyanarayan, A.,

  28. [28]

    Beyond expertise and roles: A framework to characterize the stakeholders of interpretable machine learning and their needs, in: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, ACM. p. 1–16. URL:http://dx.doi.org/10.1145/3411764.3445088, doi:10.1145/3411764.3445088. Wan, Z., Xia, X., Lo, D., Murphy, G.C., 2020. How does machine learning change so...

  29. [29]

    An exploratory study of v-model in building ml-enabled software: A systems engineering perspective, in: Proceedings of the IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI, ACM. p. 30–40. URL:http://dx.doi.org/10.1145/3644815.3644951, doi:10.1145/3644815.3644951. Xu, H.W., Zhang, Q.H., Sun, Y.N., Chen, Q.L., Qin, W., ...

  30. [30]

    Journal of Manufacturing Systems 76, 222–233

    A fast ramp-up framework for wafer yield improvement in semiconductor manufacturing systems. Journal of Manufacturing Systems 76, 222–233. URL:http://dx.doi.org/10.1016/j.jmsy.2024.07.001, doi:10.1016/j.jmsy.2024.07.001. Zaharia, M., Chen, A., Davidson, A., Ghodsi, A., Hong, S.A., Konwinski, A., Murching, S., Nykodym, T., Ogilvie, P., Parkhe, M., et al.,