MetaPlate: Counterfactual-Guided RAG-LLM Tool for Personalized Food Recommendation and Hyperglycemia Prevention

Asiful Arefeen; Carol Johnston; Hassan Ghasemzadeh

arXiv: 2606.10120 · v2 · pith:6QTAVMKInew · submitted 2026-06-08 · 💻 cs.IR · cs.AI · cs.HC

Pith reviewed 2026-06-27 14:26 UTC · model grok-4.3

classification 💻 cs.IR cs.AI cs.HC

keywords personalized meal recommendation · counterfactual explanations · RAG-LLM · postprandial hyperglycemia · continuous glucose monitoring · dietary decision support · macronutrient optimization · expert evaluation

The pith

MetaPlate uses counterfactual meal adjustments and an LLM to generate personalized food recommendations that reduce post-meal blood sugar spikes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MetaPlate as a system that takes CGM readings, wearable signals, and meal inputs to predict glucose response, then optimizes meal composition by changing macronutrient amounts so predicted levels stay at or below 140 mg/dL. A retrieval-augmented LLM layer converts those optimized values into readable suggestions drawn from the USDA food database. Structured testing with registered dietitians shows that prompt refinement raises scores for meal realism, portion suitability, and likelihood of recommendation. A sympathetic reader would care because static dietary advice often fails to deliver actionable steps that actually limit postprandial excursions in daily life.

Core claim

MetaPlate generates personalized meal recommendations via counterfactual optimization and RAG-LLM that improve meal realism, portion suitability, and recommendation likelihood as judged by registered dietitians after prompt refinement.

What carries the argument

The counterfactual optimization module that modifies macronutrient amounts to keep a machine-learning glucose-response prediction inside the target range, paired with a constrained RAG layer that produces human-readable suggestions from the USDA database.
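
A minimal sketch of what such a counterfactual search could look like, assuming a toy linear predictor and a brute-force grid search. Neither is taken from the paper: `predict_peak_glucose`, its coefficients, `find_counterfactual`, and the 5 g step size are hypothetical stand-ins, and only the 140 mg/dL target comes from the abstract.

```python
from itertools import product

TARGET_MG_DL = 140.0  # target range upper bound stated in the abstract


def predict_peak_glucose(context_baseline, carbs_g, protein_g, fat_g):
    """Toy stand-in for the trained glucose-response model.

    The coefficients are invented for illustration only.
    """
    return context_baseline + 0.9 * carbs_g - 0.15 * protein_g - 0.10 * fat_g


def find_counterfactual(context_baseline, meal, step_g=5, max_change_g=60):
    """Return the closest meal whose predicted peak is at or below the target.

    meal is a (carbs_g, protein_g, fat_g) tuple. Closeness is the summed
    absolute change in grams. Returns None if no candidate on the grid works.
    """
    deltas = range(-max_change_g, max_change_g + 1, step_g)
    best, best_cost = None, float("inf")
    for dc, dp, df in product(deltas, repeat=3):
        candidate = (meal[0] + dc, meal[1] + dp, meal[2] + df)
        if min(candidate) < 0:
            continue  # negative grams are not a meal
        if predict_peak_glucose(context_baseline, *candidate) > TARGET_MG_DL:
            continue  # still predicted to overshoot the target
        cost = abs(dc) + abs(dp) + abs(df)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


if __name__ == "__main__":
    original = (80, 20, 15)  # grams of carbohydrate, protein, fat
    print(find_counterfactual(94.0, original))
```

Minimizing the total change in grams reflects the usual counterfactual goal of staying close to what the user planned to eat. A real system would replace the grid search with whatever optimizer the authors used and would call the trained model instead of the toy formula.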

If this is right

Dietary recommendations become more contextually appropriate once domain constraints are added to the LLM stage.
Real-time adjustment of meal composition can be performed from multimodal user data without requiring extensive manual input.
Expert-in-the-loop refinement shifts outputs from implausible to actionable suggestions.
The same pipeline can support decision support for healthy adults aiming to limit postprandial excursions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Retraining the underlying glucose model on larger and more diverse groups would likely be required before deployment to populations beyond the original 25 participants.
The framework could be extended to other metabolic targets if the prediction model is swapped for a different endpoint.
Daily integration with wearable apps would turn the one-time recommendation step into a recurring tool.

Load-bearing premise

The glucose-response model trained on data from only 25 individuals will give accurate enough predictions to guide reliable meal adjustments for new users.

What would settle it

A trial in which new participants follow the generated meal plans while wearing CGM and the measured glucose values exceed 140 mg/dL at rates higher than the model's predictions.
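
The comparison described above reduces to two exceedance rates. The following is a rough sketch under stated assumptions: the function names and the sample numbers are hypothetical, and a real trial would also need confidence intervals and a pre-registered margin.

```python
THRESHOLD_MG_DL = 140.0  # target stated in the abstract


def exceedance_rate(peaks_mg_dl, threshold=THRESHOLD_MG_DL):
    """Fraction of post-meal peaks strictly above the threshold."""
    if not peaks_mg_dl:
        raise ValueError("need at least one meal")
    return sum(p > threshold for p in peaks_mg_dl) / len(peaks_mg_dl)


def prediction_gap(measured_peaks, predicted_peaks):
    """Measured minus predicted exceedance rate.

    A positive gap means the model was more optimistic than the CGM data.
    """
    if len(measured_peaks) != len(predicted_peaks):
        raise ValueError("one predicted peak is needed per measured peak")
    return exceedance_rate(measured_peaks) - exceedance_rate(predicted_peaks)


if __name__ == "__main__":
    measured = [150.0, 135.0, 142.0, 128.0]   # illustrative values only
    predicted = [138.0, 132.0, 139.0, 125.0]  # illustrative values only
    print(prediction_gap(measured, predicted))
```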

Figures

Figures reproduced from arXiv: 2606.10120 by Asiful Arefeen, Carol Johnston, Hassan Ghasemzadeh.

**Figure 1.** MetaPlate framework consists of multiple phases: (1) data acquisition from healthy adults in free-living condition using CGM sensor, wristband and smartphone application, (2) data processing, feature engineering, model training and validation for forecasting post-meal blood glucose peak, (3) CF optimization for meal macro-nutrient adjustment to achieve in-range post-meal glucose level, (4) LLM-RAG module t…

**Figure 2.** LLM comparison across RMSE, glycemic consistency, and diversity. For visualization purposes, RMSE-based metrics are normalized and inverted to accuracy such that higher values indicate better performance. Specifically, lower RMSE corresponds to higher normalized accuracy in the radar plot. [Plot labels from an adjacent chart: score axis 0 to 10; meal composition, usability, trustworthiness, consistency with clinical knowledge, ease of use, Recommen…]

**Figure 3.** Comparison of expert evaluation scores before and after prompt refinement across case-level (red) and system-level (blue) dimensions. Ratings are reported on a 10-point Likert scale with error bars indicating standard deviation across experts and cases. Substantial improvements are observed across all dimensions following prompt redesign, particularly in portion suitability, recommendation likelihood, eas…

Original abstract

Postprandial hyperglycemia is a key risk factor for metabolic disorders; however, existing dietary guidance is often static, impractical, and insufficiently personalized, providing recommendations that are difficult to follow or not impactful. While recent advances leverage continuous glucose monitoring (CGM) and machine learning to predict glycemic responses, these approaches are largely predictive and lack actionable guidance. Moreover, recommendation systems are often misaligned with user goals and require extensive input. We present MetaPlate, a counterfactual explanation (CF) guided, context-aware decision-support framework that generates personalized meal recommendations to mitigate postprandial glucose excursions in healthy adults. MetaPlate integrates multimodal data, including CGM readings, wearable-derived physiological signals, and user-provided meal inputs from $25$ individuals to model pre-meal context. A machine learning model predicts glucose response, while a CF optimization module adjusts meal composition modifying macronutrient amounts to maintain glucose levels within a target range ($\leq 140$ mg/dL). An LLM-based retrieval-augmented generation (RAG) layer enhances interpretability by producing human-readable recommendations using constrained search of the USDA food database. We evaluate MetaPlate via a structured expert-in-the-loop assessment with registered dietitians (RDs), comparing performance before and after prompt refinement. Results show improvements in meal realism, portion suitability, and recommendation likelihood, with expert feedback indicating a shift from clinically implausible outputs to actionable, contextually appropriate recommendations. Our findings emphasize the importance of domain knowledge and structured constraints in LLM-driven systems and highlight the potential of MetaPlate as a real-time personalized dietary decision-support tool.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MetaPlate wires together a small-data glucose predictor, counterfactual meal tweaks, and RAG-LLM outputs, but its abstract supplies no accuracy numbers or outcome measures to show the pipeline actually works.

The paper's core is an end-to-end system: train an ML model on CGM and wearable data from 25 people to predict post-meal glucose, use counterfactual optimization to adjust macronutrients so the prediction stays under 140 mg/dL, then feed the result into a RAG-LLM that pulls from the USDA database and produces readable meal suggestions. The only reported result is that registered dietitians rated the LLM outputs higher on realism, portion size, and likelihood of recommendation after the authors refined the prompts.

That integration is straightforward and the expert feedback loop is a reasonable way to surface obvious LLM failures. The authors also make clear that domain constraints matter for making the generated advice usable.

The gaps are just as plain. The abstract gives no accuracy, precision, or error metrics for the glucose predictor itself, no hold-out tests on new users or new meals, and no before-after glucose data from anyone actually following the recommendations. With a training cohort of 25 people, the model is at obvious risk of poor generalization, yet that risk is not quantified or even discussed. The dietitian ratings come after iterative prompt changes judged by the same experts, so the improvement is partly circular.

This is the kind of applied systems paper that might interest people already building nutrition apps who want one more example of chaining CGM models with LLMs. Readers who need either reproducible ML performance numbers or evidence that the recommendations change glucose will find little to use. The work does not reach the threshold for serious peer review; the central preventive claim rests on untested components.

Referee Report

2 major / 1 minor

Summary. The paper presents MetaPlate, a system integrating a machine-learning glucose-response predictor (trained on multimodal CGM/wearable/meal data from 25 subjects), counterfactual optimization to adjust macronutrients so predicted postprandial glucose stays ≤140 mg/dL, and a RAG-LLM layer that queries the USDA database to produce human-readable personalized meal recommendations. The central evaluation is a before/after expert-in-the-loop study with registered dietitians that reports improved ratings on meal realism, portion suitability, and recommendation likelihood after prompt refinement.

Significance. If the glucose predictor generalizes and the counterfactual adjustments prove valid, the framework could supply a practical real-time decision-support tool that translates CGM data into actionable, interpretable dietary advice. The explicit use of domain constraints inside the LLM generation step is a constructive design choice that addresses known LLM hallucination risks in health applications.

major comments (2)

[Abstract] Abstract (evaluation paragraph): the reported improvements in realism/portion/recommendation likelihood are obtained after iterative prompt refinement judged by the same experts; no independent test set, control condition, or quantitative metric of predictor accuracy or realized glycemic effect is supplied, so the evaluation cannot establish that the counterfactual module achieves its stated preventive goal.
[Abstract] Abstract (model description): the glucose-response model is trained on multimodal data from only 25 individuals; because the counterfactual optimization module directly uses this model's predictions to modify macronutrient amounts for new users and meals, the small cohort size creates a high risk that out-of-sample predictions will be unreliable, rendering the downstream recommendations' claimed effect on hyperglycemia prevention unsupported.

minor comments (1)

The manuscript does not specify the exact machine-learning architecture, feature set, or cross-validation procedure used for the glucose predictor; adding these details (even if only in supplementary material) would improve reproducibility without altering the central claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments highlighting important limitations in the scope of our evaluation and the generalizability of the glucose-response model. We address each point below and will incorporate clarifications and expanded limitations discussions in a revised manuscript.

Referee: [Abstract] Abstract (evaluation paragraph): the reported improvements in realism/portion/recommendation likelihood are obtained after iterative prompt refinement judged by the same experts; no independent test set, control condition, or quantitative metric of predictor accuracy or realized glycemic effect is supplied, so the evaluation cannot establish that the counterfactual module achieves its stated preventive goal.

Authors: We agree that the current evaluation is an expert-in-the-loop assessment of recommendation quality (realism, portion suitability, likelihood) rather than a direct test of glycemic prevention. The study design intentionally focused on refining the RAG-LLM component via dietitian feedback and does not include an independent test set, control arm, or measured postprandial glucose outcomes. We will revise the abstract to explicitly state that the reported improvements reflect expert-judged recommendation quality after prompt refinement, not clinical efficacy of the counterfactual module. We will also add a dedicated limitations paragraph clarifying this scope.

revision: yes

Referee: [Abstract] Abstract (model description): the glucose-response model is trained on multimodal data from only 25 individuals; because the counterfactual optimization module directly uses this model's predictions to modify macronutrient amounts for new users and meals, the small cohort size creates a high risk that out-of-sample predictions will be unreliable, rendering the downstream recommendations' claimed effect on hyperglycemia prevention unsupported.

Authors: The cohort of 25 subjects is a genuine limitation that restricts claims about out-of-sample reliability and broad preventive effects. The manuscript presents MetaPlate as an integrated proof-of-concept framework rather than a validated clinical tool. We will expand the limitations section to discuss the small sample size, potential overfitting risks for the glucose predictor, and the consequent need for larger validation studies before claiming reliable hyperglycemia prevention in new users.

revision: partial

Circularity Check

1 step flagged

Expert evaluation of improvements tied to iterative prompt refinement by same assessors

specific steps

self-definitional [Abstract]
"We evaluate MetaPlate via a structured expert-in-the-loop assessment with registered dietitians (RDs), comparing performance before and after prompt refinement. Results show improvements in meal realism, portion suitability, and recommendation likelihood, with expert feedback indicating a shift from clinically implausible outputs to actionable, contextually appropriate recommendations."

The claimed improvements are demonstrated via before/after comparison within the same expert-in-the-loop refinement process. The 'improvement' is therefore partly defined by the iterative prompt adjustments and expert judgments that constitute the evaluation procedure, rather than an independent external benchmark.

full rationale

The paper's evaluation of MetaPlate shows improvements by comparing RAG-LLM outputs before and after prompt refinement, with judgments from the same registered dietitians in the expert-in-the-loop process. This introduces moderate circularity because the reported gains in realism and suitability are measured within the refinement loop itself. No other circular steps were found in the model training, counterfactual optimization, or RAG components, which rely on standard techniques without reducing to self-definition or fitted inputs by construction. The n=25 sample size is a generalization concern, not a circularity issue.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

Abstract-only review prevents exhaustive enumeration; the 140 mg/dL target and the 25-person cohort size appear as fixed choices without stated justification or sensitivity analysis.

free parameters (2)

glucose_target_threshold
Fixed at ≤140 mg/dL; no derivation or external validation supplied in abstract.
training_cohort_size
25 individuals used to build the glucose model; no justification for sample size or diversity.
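
One way a sensitivity check on `glucose_target_threshold` could be set up is sketched below. It re-scores a fixed set of predicted peaks with the validity definition quoted in the reference snippets (share of counterfactuals whose predicted peak is at or below the threshold). The function names, the candidate thresholds, and the sample peaks are assumptions for illustration; a fuller analysis would regenerate the counterfactuals for each threshold.

```python
def validity(predicted_peaks_mg_dl, threshold_mg_dl):
    """Share of counterfactual meals with predicted peak <= threshold."""
    if not predicted_peaks_mg_dl:
        raise ValueError("need at least one counterfactual")
    hits = sum(p <= threshold_mg_dl for p in predicted_peaks_mg_dl)
    return hits / len(predicted_peaks_mg_dl)


def threshold_sensitivity(predicted_peaks_mg_dl, thresholds=(130, 140, 150)):
    """Validity at each candidate threshold, keyed by threshold in mg/dL."""
    return {t: validity(predicted_peaks_mg_dl, t) for t in thresholds}


if __name__ == "__main__":
    peaks = [128, 135, 139, 141, 152]  # illustrative predicted peaks, mg/dL
    for threshold, share in threshold_sensitivity(peaks).items():
        print(threshold, share)
```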

axioms (1)

domain assumption: The ML glucose-response model produces sufficiently accurate predictions to support counterfactual meal edits.
Invoked implicitly when the CF module adjusts macronutrients on the basis of model output.

Reference graph

Works this paper leans on

73 extracted references · 5 canonical work pages · 1 internal anchor

[1]

Fig. 1 panel labels and caption

Panel labels: Model Development; 3. Meal Macronutrient Adjustment; 4. Human Translatable Information Retrieval; 5. Healthy Meal Suggestion. Other labels: Subject to; Meal data; Glucose data; Mobility data; Activity data. Fig. 1. MetaPlate framework consists of multiple phases: (1) data acquisition from healthy adults in free-living condition using CGM sensor, wristband and smartphone applicat...
[2]

Since Embrace Plus records data in UTC, all timestamps are converted to a common local timezone to ensure consistency with CGM and dietary logs

Data Synchronization and Alignment: Data streams from CGM, wristband, and nutrition logs are first temporally aligned. Since Embrace Plus records data in UTC, all timestamps are converted to a common local timezone to ensure consistency with CGM and dietary logs. The CGM-derived signals are then upsampled to a uniform ...
[3]

Meal Event Identification and Aggregation: Meal events are extracted from nutrition logs. Due to the tendency of users to log multiple food items within a short time span, temporally adjacent meal entries occurring within a 30-minute interval are grouped into a single meal cluster. Each cluster is represented by its latest timestamp, corresponding to the...
[4]

Feature Extraction: For each event at time $t_m$, a feature vector $x$ is constructed using data from a two-hour pre-meal window $[t_m - 2\text{h}, t_m)$. The extracted features include statistical summaries (mean and standard deviation) of wristband-derived signals, including step counts, activity counts, METs, EDA, skin temperature, and pulse rate. Subjects’ age and ...
[5]

This definition aligns with clinical understanding of postprandial glucose excursions and captures peak glycemic response following a meal

Target Variable Construction: The forecast target $y$ was defined as the maximum glucose value observed within a two-hour postprandial window $(t_m, t_m + 2\text{h}]$. This definition aligns with clinical understanding of postprandial glucose excursions and captures peak glycemic response following a meal
[6]

Train/test split: To prevent subject-level data leakage and ensure better generalization, the dataset is partitioned at the subject level. Specifically, a 10/3 subject-wise split is followed: ten participants are randomly selected for model training, and the remaining three are set aside for evaluation and meal plan generation. The data pr...
[7]

In this cohort, participants were monitored in a controlled lab setting while consuming standardized meals and wearing a Dexcom G6 Pro and an Empatica E4 wrist-worn device

Supplementing the training data: Given the limited size of the training dataset (376 samples), the training data is supplemented with an additional dataset from the MealMeter [27] project (IRB #15102) comprising 12 subjects (n = 168 samples). In this cohort, participants were monitored in a controlled lab setting while consuming standardized meals and wea...

2026
[8]

Var.), Pearson correlation coefficient ($r$), MAPE, and sMAPE

Regression model: The forecasting model is evaluated using a set of standard regression performance metrics: RMSE, MAE, median absolute error (MedAE), coefficient of determination ($R^2$), explained variance (Exp. Var.), Pearson correlation coefficient ($r$), MAPE, and sMAPE
[9]

A CF is considered valid if the predicted postprandial glucose level falls below a predefined evaluation threshold $\tau_{\text{eval}} = 140$ mg/dL

Counterfactuals: CFs are validated using the metrics below. Validity assesses whether the generated CFs achieve the desired glycemic outcome under a regression setting. A CF is considered valid if the predicted postprandial glucose level falls below a predefined evaluation threshold $\tau_{\text{eval}} = 140$ mg/dL. $$\text{validity} = \frac{1}{|\mathcal{X}|} \sum_{(x, m_0) \in \mathcal{X}} \mathbb{1}\left[ f_\theta(x, m^*) \le \tau_{\text{eval}} \right]$$ (...
[10]

Lower RMSE indicates better adherence to the target macronutrient constraints

LLM mappings: The LLM-based meal mapping module is evaluated along three dimensions. Constraint Satisfaction (RMSE) measures how closely the LLM generated meal matches the target macronutrient profile using root mean squared error (RMSE) for each macronutrient: $$\mathrm{RMSE}_j = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( m^{\mathrm{LLM}}_{ij} - m^{*}_{ij} \right)^2} \qquad (20)$$ where $j \in \{C, P, F\}$ denotes carbohydrates, p...
[11]

this is a snack, not a meal

Expert-Based Validation of the Interventions: We evaluate the clinical relevance and practical applicability of the generated meal interventions through expert assessment. Case-level Evaluation: Experts were provided with the subject context, predicted postprandial glucose response, and the corresponding MetaPlate-generated meal recommendation for ea...
[12]

Continuous glucose monitoring in a healthy population: understanding the post-prandial glycemic response in individuals without diabetes mellitus

P. R. E. Jarvis, J. L. Cardin, P. M. Nisevich-Bede, and J. P. McCarter, “Continuous glucose monitoring in a healthy population: understanding the post-prandial glycemic response in individuals without diabetes mellitus.” Metabolism: clinical and experimental, p. 155640, 2023

2023
[13]

B. Giri, S. Dey, T. Das, M. Sarkar, J. Banerjee, and S. K. Dash, “Chronic hyperglycemia mediated physiological alteration and metabolic distortion leads to organ dysfunction, infection, cancer progression and other pathophysiological consequences: An update on glucose toxicity.” Biomedicine & pharmacotherapy = Biomedecine & pharmacotherapie, vol. 107, ...

2018
[14]

Is nondiabetic hyperglycemia a risk factor for cardiovascular disease? a meta-analysis of prospective studies

E. B. Levitan, Y. Song, E. S. Ford, and S. Liu, “Is nondiabetic hyperglycemia a risk factor for cardiovascular disease? a meta-analysis of prospective studies.” Archives of internal medicine, vol. 164, no. 19, pp. 2147–55, 2004

2004
[15]

Uncovering personalized glucose responses and circadian rhythms from multiple wearable biosensors with bayesian dynamical modeling,

N. E. Phillips, T.-H. Collet, and F. Naef, “Uncovering personalized glucose responses and circadian rhythms from multiple wearable biosensors with bayesian dynamical modeling,” Cell Reports Methods, vol. 3, 2023

2023
[16]

Personalized nutrition by prediction of glycemic responses

D. A. Zeevi, T. Korem, N. Zmora, D. Israeli, D. Rothschild, A. Weinberger, O. Ben-Yacov, D. Lador, T. Avnit-Sagi, M. Lotan-Pompan, J. Suez, J. A. Mahdi, E. Matot, G. Malka, N. Kosower, M. Rein, G. Zilberman-Schapira, L. Dohnalová, M. Pevsner-Fischer, R. Bikovsky, Z. Halpern, E. Elinav, and E. Segal, “Personalized nutrition by prediction of glycemic res...

2015
[17]

Machine learning-based glucose prediction with use of continuous glucose and physical activity monitoring data: The maastricht study,

W. P. van Doorn, Y. D. Foreman, N. C. Schaper, H. H. Savelberg, A. Koster, C. J. H. van der Kallen, A. Wesselius, M. T. Schram, R. M. A. Henry, P. C. Dagnelie, B. E. de Galan, O. Bekers, C. D. A. Stehouwer, S. J. R. Meex, and M. C. Brouwers, “Machine learning-based glucose prediction with use of continuous glucose and physical activity monitoring data: Th...

2021
[18]

Attengluco: Multimodal transformer-based blood glucose forecasting on ai-readi dataset,

E. Farahmand, R. R. Azghan, N. T. Chatrudi, E. Kim, G. K. Gudur, E. Thomaz, G. Pedrielli, P. K. Turaga, and H. Ghasemzadeh, “Attengluco: Multimodal transformer-based blood glucose forecasting on ai-readi dataset,” 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1–7, 2025

2025
[19]

Time-aware cross-attention for multi-modal sensor-based blood glucose forecasting,

A. Machiraju, E. Farahmand, S. B. Soumma, A. Arefeen, C. Johnston, and H. Ghasemzadeh, “Time-aware cross-attention for multi-modal sensor-based blood glucose forecasting,” 2025 IEEE 21st International Conference on Body Sensor Networks (BSN), pp. 1–4, 2025

2025
[20]

Population-specific glucose prediction in diabetes care with transformer-based deep learning on the edge,

T. Zhu, L. Kuang, C. Piao, J. Zeng, K. Li, and P. Georgiou, “Population-specific glucose prediction in diabetes care with transformer-based deep learning on the edge,” IEEE Transactions on Biomedical Circuits and Systems, vol. 18, pp. 236–246, 2024

2024
[21]

Glucoseassist: Personalized blood glucose level predictions and early dysglycemia detection,

P. Shroff, A. Arefeen, and H. Ghasemzadeh, “Glucoseassist: Personalized blood glucose level predictions and early dysglycemia detection,” 2023 IEEE 19th International Conference on Body Sensor Networks (BSN), pp. 1–4, 2023

2023
[22]

Glycemic-aware and architecture-agnostic training framework for blood glucose forecasting in type 1 diabetes,

S. Khamesian, A. Arefeen, M. A. Grando, B. Thompson, and H. Ghasemzadeh, “Glycemic-aware and architecture-agnostic training framework for blood glucose forecasting in type 1 diabetes,” 2025

2025
[23]

Glyrag: Context-aware retrieval-augmented framework for blood glucose forecasting,

S. B. Soumma and H. Ghasemzadeh, “Glyrag: Context-aware retrieval-augmented framework for blood glucose forecasting,” ArXiv, vol. abs/2601.05353, 2026

work page arXiv 2026
[24]

An ai-based nutrition recommendation system: technical validation with insights from mediterranean cuisine,

K. Kalpakoglou, L. Calderón-Pérez, N. Boqué, M. Guldas, Çağla Erdoğan Demir, L. P. Gymnopoulos, and K. Dimitropoulos, “An ai-based nutrition recommendation system: technical validation with insights from mediterranean cuisine,” Frontiers in Nutrition, vol. 12, 2025

2025
[25]

Computational framework for sequential diet recommendation: Integrating linear optimization and clinical domain knowledge,

A. Arefeen, N. Jaribi, B. J. Mortazavi, and H. Ghasemzadeh, “Computational framework for sequential diet recommendation: Integrating linear optimization and clinical domain knowledge,” 2022 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 91–98, 2022

2022
[26]

Ai-driven personalized nutrition: Rag-based digital health solution for obesity and type 2 diabetes,

A. K. Gavai and J. van Hillegersberg, “Ai-driven personalized nutrition: Rag-based digital health solution for obesity and type 2 diabetes,” PLOS Digital Health, vol. 4, 2025

2025
[27]

Nutrigen: Personalized meal plan generator leveraging large language models to enhance dietary and nutritional adherence,

S. Khamesian, A. Arefeen, S. M. Carpenter, and H. Ghasemzadeh, “Nutrigen: Personalized meal plan generator leveraging large language models to enhance dietary and nutritional adherence,” 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1–7, 2025

2025
[28]

Chatdiet: Empowering personalized nutrition-oriented food recommender chatbots through an llm-augmented framework,

Z. Yang, E. Khatibi, N. Nagesh, M. Abbasian, I. Azimi, R. Jain, and A. M. Rahmani, “Chatdiet: Empowering personalized nutrition-oriented food recommender chatbots through an llm-augmented framework,” Smart Health, vol. 32, p. 100465, 2024

2024
[29]

Mopi-hfrs: A multi-objective personalized health-aware food recommendation system with llm-enhanced interpretation,

Z. Zhang, Z. Wang, T. Ma, V. S. Taneja, S. Nelson, N. H. L. Le, K. Murugesan, M. Ju, N. V. Chawla, C. Zhang, and Y. Ye, “Mopi-hfrs: A multi-objective personalized health-aware food recommendation system with llm-enhanced interpretation,” Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, 2024

2024
[30]

Counterfactual explanations as interventions in latent space,

R. Crupi, A. Castelnovo, D. Regoli, and B. S. M. Gonzalez, “Counterfactual explanations as interventions in latent space,” Data Mining and Knowledge Discovery, vol. 38, pp. 2733 – 2769, 2021

2021
[31]

Designing user-centric behavioral interventions to prevent dysglycemia with novel counterfactual explanations,

A. Arefeen and H. Ghasemzadeh, “Designing user-centric behavioral interventions to prevent dysglycemia with novel counterfactual explanations,” ArXiv, vol. abs/2310.01684, 2023

work page arXiv 2023
[32]

GlyTwin: Digital Twin for Glucose Control in Type 1 Diabetes Through Optimal Behavioral Modifications Using Patient-Centric Counterfactuals

A. Arefeen, S. Khamesian, M. A. Grando, B. Thompson, and H. Ghasemzadeh, “Glytwin: Digital twin for glucose control in type 1 diabetes through optimal behavioral modifications using patient-centric counterfactuals,” ArXiv, vol. abs/2504.09846, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[33]

Glyman: Glycemic management using patient-centric counterfactuals,

——, “Glyman: Glycemic management using patient-centric counterfactuals,” 2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), pp. 1–5, 2024

2024
[34]

Entropy-based logic explanations of neural networks,

P. Barbiero, G. Ciravegna, F. Giannini, P. Liò, M. Gori, and S. Melacci, “Entropy-based logic explanations of neural networks,” ArXiv, vol. abs/2106.06804, 2021

work page arXiv 2021
[35]

Explaining machine learning classifiers through diverse counterfactual explanations,

R. K. Mothilal, A. Sharma, and C. Tan, “Explaining machine learning classifiers through diverse counterfactual explanations,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 607–617

2020
[36]

Nice: an algorithm for nearest instance counterfactual explanations,

D. Brughmans and D. Martens, “Nice: an algorithm for nearest instance counterfactual explanations,” Data Mining and Knowledge Discovery, pp. 1–39, 2021

2021
[37]

A model-agnostic and data-independent tabu search algorithm to generate counterfactuals for tabular, image, and text data,

R. M. B. de Oliveira, K. Sörensen, and D. Martens, “A model-agnostic and data-independent tabu search algorithm to generate counterfactuals for tabular, image, and text data,” European Journal of Operational Research, 2023

2023
[38]

Mealmeter: Using multimodal sensing and machine learning for automatically estimating nutrition intake,

A. Arefeen, S. N. Fessler, S. M. Mostafavi, C. Johnston, and H. Ghasemzadeh, “Mealmeter: Using multimodal sensing and machine learning for automatically estimating nutrition intake,” 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1–6, 2025

2025
[39]

Separating movement and gravity components in an acceleration signal and implications for the assessment of human daily physical activity,

V. T. van Hees, L. Gorzelniak, E. C. D. León, M. Eder, M. R. Pias, S. Taherian, U. Ekelund, F. Renström, P. W. Franks, A. Horsch, and S. Brage, “Separating movement and gravity components in an acceleration signal and implications for the assessment of human daily physical activity,” PLoS ONE, vol. 8, 2013

2013
[40]

Age group comparability of raw accelerometer output from wrist- and hip-worn monitors

M. Hildebrand, V. T. van Hees, B. H. Hansen, and U. Ekelund, “Age group comparability of raw accelerometer output from wrist- and hip-worn monitors.” Medicine and science in sports and exercise, vol. 46, no. 9, pp. 1816–24, 2014

2014
[41]

Estimation of daily energy expenditure in pregnant and non-pregnant women using a wrist-worn tri-axial accelerometer,

V. T. van Hees, F. Renström, A. Wright, A. Gradmark, M. Catt, K. Y. Chen, M. Löf, L. J. C. Bluck, J. Pomeroy, N. J. Wareham, U. Ekelund, S. Brage, and P. W. Franks, “Estimation of daily energy expenditure in pregnant and non-pregnant women using a wrist-worn tri-axial accelerometer,” PLoS ONE, vol. 6, 2011

2011
[42]

Guide to the assessment of physical activity: Clinical and research applications: a scientific statement from the american heart association

S. J. Strath, L. A. Kaminsky, B. E. Ainsworth, U. Ekelund, P. S. Freedson, R. A. Gary, C. R. Richardson, D. T. Smith, and A. M. Swartz, “Guide to the assessment of physical activity: Clinical and research applications: a scientific statement from the american heart association,” Circulation, vol. 128, no. 20, pp. 2259–79, 2013

2013
[43]

Accelerometer data collection and processing criteria to assess physical activity and other outcomes: A systematic review and practical considerations,

J. H. Migueles, C. Cadenas-Sánchez, U. Ekelund, C. D. Nyström, J. Mora-Gonzalez, M. Löf, I. Labayen, J. R. Ruiz, and F. B. Ortega, “Accelerometer data collection and processing criteria to assess physical activity and other outcomes: A systematic review and practical considerations,” Sports Medicine, vol. 47, pp. 1821–1845, 2017

2017
[44]

MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers

W. Wang, F. Wei, L. Dong, H. Bao, N. Yang, and M. Zhou, “Minilm: Deep self-attention distillation for task-agnostic compression of pre-trained transformers,” ArXiv, vol. abs/2002.10957, 2020

2020
[45]

Measuring the quality of explanations: The system causability scale (scs),

A. Holzinger, A. M. Carrington, and H. Müller, “Measuring the quality of explanations: The system causability scale (scs),” Künstliche Intelligenz, vol. 34, pp. 193–198, 2019

2019
[46]

Learning personal food preferences via food logs embedding,

A. A. Metwally, A. K. Leong, A. Desai, A. Nagarjuna, D. Perelman, and M. P. Snyder, “Learning personal food preferences via food logs embedding,” 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 2281–2286, 2021.

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS

Supplementary Materials for MetaPlate

I. MODEL FINE-TU...

2021
[47]

Professional Role: • Registered Dietitian / Nutritionist • Endocrinologist / Certified Diabetes Educator • Physician • Nurse / Nurse Practitioner • Other
[48]

Years of Experience: • 0–2 years • 3–5 years • 6–10 years • 11–15 years • >15 years
[49]

Comfort with Meal Design: Rated on a 10-point Likert scale (1 = Not at all, 10 = Very comfortable).

B. Evaluation Criteria

For each case, participants rated the following (1–10 Likert scale):
[50]

Glycemic appropriateness (maintaining glucose <140 mg/dL)
[51]

Portion size appropriateness
[52]

Alignment with dietary guidelines
[53]

Likelihood of recommendation

Participants could also provide optional free-text comments.

C. Case Descriptions
[54]

Case 1 : Subject: 23 y, Female, BMI 32 Pre-meal: 51 g carb, 27.5 g protein, 21.7 g fat at 113.7 mg/dL Predicted peak glucose: 147 mg/dL MetaPlate Recommendation: Roasted chicken breast (113 g), Brown rice (148 g), Boiled broccoli (91 g), Olive oil (17 g) Nutritional Summary: 43 g carbs, 32.5 g protein, 19.6 g fat, 475 kcal
[55]

Case 2 : Subject: 23 y, Female, BMI 32 Pre-meal: 29 g carb, 47 g protein, 8.3 g fat at 110 mg/dL Predicted peak: 150 mg/dL Recommendation: Chicken breast (140 g), white rice (150 g), asparagus (100 g) Nutritional Summary: ∼30 g carbs, ∼40 g protein, ∼6.3 g fat, ∼491 kcal

TABLE I: Hyperparameter search...
[56]

Case 3 : Subject: 23 y, Female, BMI 32 Pre-meal: 112 g carb, 33 g protein, 42 g fat at 111 mg/dL Predicted peak: 158 mg/dL Recommendation: Chicken breast (155 g), sweet potato (200 g), broccoli, olive oil Nutritional Summary: ∼45 g carbs, ∼54 g protein, ∼38 g fat, ∼600 kcal
[57]

Case 4 : Subject: 28 y, Female, BMI 26.4 Pre-meal: 35 g carb, 13 g protein, 10 g fat at 118 mg/dL Predicted peak: 160 mg/dL Recommendation: Shrimp (155 g), asparagus (100 g), butter and olive oil Nutritional Summary: ∼17.1 g carbs, ∼40.5 g protein, ∼26 g fat, ∼465 kcal
[58]

Case 5 : Subject: 26 y, Female, BMI 22.2 Pre-meal: 83 g carb, 25 g protein, 24 g fat at 121 mg/dL Predicted peak: 151 mg/dL Recommendation: Salmon (90 g), sweet potato (155 g), berry sauce, walnuts Nutritional Summary: ∼45 g carbs, ∼25 g protein, ∼25 g fat, ∼505 kcal
[59]

Case 6 : Subject: 26 y, Female, BMI 22.2 Pre-meal: 32 g carb, 7 g protein, 8 g fat at 137 mg/dL Predicted peak: 153 mg/dL Recommendation: Tuna (90 g), whole wheat crackers, avocado Nutritional Summary: ∼15.5 g carbs, ∼25 g protein, ∼9 g fat, ∼243 kcal
[60]

Case 7 : Subject: 26 y, Female, BMI 22.2 Pre-meal: 20 g carb, 10 g protein, 6 g fat at 110 mg/dL Predicted peak: 150 mg/dL Recommendation: Greek yogurt, blueberries, almonds Nutritional Summary: ∼17 g carbs, ∼11 g protein, ∼9.2 g fat, ∼195 kcal
[61]

Case 8 : Subject: 28 y, Female, BMI 26.4 Pre-meal: 101 g carb, 29 g protein, 25 g fat at 106 mg/dL Predicted peak: 161 mg/dL Recommendation: Salmon, quinoa, broccoli, olive oil Nutritional Summary: ∼39 g carbs, ∼34 g protein, ∼26.4 g fat, ∼511 kcal
[62]

Case 9 : Subject: 28 y, Female, BMI 26.4 Pre-meal: 48 g carb, 37 g protein, 17 g fat at 105 mg/dL Predicted peak: 148 mg/dL Recommendation: Ground turkey, brown rice, green beans, olive oil Nutritional Summary: ∼40 g carbs, ∼39 g protein, ∼21 g fat, ∼505 kcal
[63]


Case 10 : Subject: 23 y, Female, BMI 32 Pre-meal: 45.5 g carb, 15.5 g protein, 15.5 g fat at 125 mg/dL Predicted peak: 165 mg/dL Recommendation: Greek yogurt, walnuts, honey, egg, blueberries Nutritional Summary: ∼22 g carbs, ∼20 g protein, ∼26 g fat, ∼402 kcal
[64]

clinical plausibility and meal realism,
[65]

adherence to target macronutrients,
[66]

nutritional balance and variety,
[67]

A valid output must look like a real meal that a person could reasonably eat

simplicity. A valid output must look like a real meal that a person could reasonably eat. Do NOT output a snack, a random food pile, or a minimal macro-only plate.

Hard constraints:
- Use 3 to 5 food items whenever possible.
- Every meal should include:
  - 1 main protein source,
  - 1 carbohydrate source,
  - 1 non-starchy vegetable or fruit,
  - 0 to 1...
[68]

- Protein should generally not fall below target unless impossible

protein.
- Protein should generally not fall below target unless impossible.
- Do not over-correct by collapsing carbs to near zero when the target is moderate or high.
- Preserve a balanced distribution rather than forcing extreme macro minimization.
- If the requested macro targets imply ...
[69]

Build a meal concept first: protein + carb + produce
[70]

Search USDA items that fit the concept
[71]

Check whether the meal still looks like an actual meal
[72]

Check whether the portions are normal and edible
[73]

Finish: Return only the JSON object described above

Only then finalize the macro fit. Finish: Return only the JSON object described above

[1]

[Figure 1 panels: Model Development; 3. Meal Macronutrient Adjustment; 4. Human Translatable Information Retrieval; 5. Healthy Meal Suggestion. Inputs: meal data, glucose data, mobility data, activity data.]

Fig. 1. MetaPlate framework consists of multiple phases: (1) data acquisition from healthy adults in free-living condition using CGM sensor, wristband and smartphone applicat...

[2]

Since Embrace Plus records data in UTC, all timestamps are converted to a common local timezone to ensure consistency with CGM and dietary logs

Data Synchronization and Alignment: Data streams from CGM, wristband, and nutrition logs are first temporally aligned. Since Embrace Plus records data in UTC, all timestamps are converted to a common local timezone to ensure consistency with CGM and dietary logs. The CGM-derived signals are then upsampled to a uniform ...
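A minimal sketch of the UTC-to-local conversion described above, assuming the wristband export yields naive UTC timestamps. The fixed UTC-7 offset and the name `utc_to_local` are illustrative assumptions; the source does not state the study timezone, and a fixed offset ignores daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local offset (UTC-7); the study timezone is not given in the source.
LOCAL_TZ = timezone(timedelta(hours=-7))


def utc_to_local(ts_utc: datetime) -> datetime:
    """Interpret a naive timestamp as UTC and convert it to LOCAL_TZ."""
    return ts_utc.replace(tzinfo=timezone.utc).astimezone(LOCAL_TZ)


wristband_ts = datetime(2025, 1, 1, 19, 30)  # made-up sample recorded in UTC
local_ts = utc_to_local(wristband_ts)
```

After conversion, the wristband timestamps can be compared directly with CGM and dietary log entries expressed in the same local time.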

[3]

Meal Event Identification and Aggregation : Meal events are extracted from nutrition logs. Due to the tendency of users to log multiple food items within a short time span, temporally adjacent meal entries occurring within a 30-minute interval are grouped into a single meal cluster. Each cluster is represented by its latest timestamp, corresponding to the...
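The grouping rule above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: an entry joins the current cluster when it falls within 30 minutes of the previous entry, macronutrients are summed per cluster, and the function name and tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def cluster_meals(entries, gap=timedelta(minutes=30)):
    """Group (timestamp, carbs_g, protein_g, fat_g) log entries into meal clusters.

    Assumption: an entry is merged when it is within `gap` of the previous
    entry; summing the macronutrients per cluster is also an assumption.
    """
    clusters = []
    for ts, carbs, protein, fat in sorted(entries):
        if clusters and ts - clusters[-1]["time"] <= gap:
            current = clusters[-1]
            current["time"] = ts  # a cluster is represented by its latest timestamp
            current["carbs"] += carbs
            current["protein"] += protein
            current["fat"] += fat
        else:
            clusters.append({"time": ts, "carbs": carbs, "protein": protein, "fat": fat})
    return clusters


# Made-up log: two items logged 20 minutes apart, then a separate dinner.
log = [
    (datetime(2025, 1, 1, 12, 0), 30.0, 10.0, 5.0),
    (datetime(2025, 1, 1, 12, 20), 21.0, 17.5, 16.5),
    (datetime(2025, 1, 1, 18, 0), 40.0, 20.0, 10.0),
]
meals = cluster_meals(log)
```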

[4]

Feature Extraction: For each event at time t_m, a feature vector x is constructed using data from a two-hour pre-meal window [t_m − 2h, t_m). The extracted features include statistical summaries (mean and standard deviation) of wristband-derived signals, including step counts, activity counts, METs, EDA, skin temperature, and pulse rate. Subjects’ age and ...
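A minimal sketch of the windowed summaries described above, assuming each wristband sample is a `(timestamp, {signal: value})` pair. The function name, the data layout, and the use of the population standard deviation are assumptions; the source does not say which estimator was used.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def premeal_features(samples, t_m, window=timedelta(hours=2)):
    """Mean and standard deviation of each signal over [t_m - window, t_m)."""
    in_window = [values for ts, values in samples if t_m - window <= ts < t_m]
    features = {}
    if not in_window:
        return features
    for name in in_window[0]:
        series = [values[name] for values in in_window]
        features[f"{name}_mean"] = mean(series)
        features[f"{name}_std"] = pstdev(series)  # population std (assumption)
    return features


t_m = datetime(2025, 1, 1, 12, 0)
# Made-up pulse-rate samples; the first and last fall outside the window.
samples = [
    (datetime(2025, 1, 1, 9, 59), {"pulse_rate": 100.0}),
    (datetime(2025, 1, 1, 10, 0), {"pulse_rate": 60.0}),
    (datetime(2025, 1, 1, 11, 0), {"pulse_rate": 70.0}),
    (datetime(2025, 1, 1, 11, 59), {"pulse_rate": 80.0}),
    (datetime(2025, 1, 1, 12, 0), {"pulse_rate": 120.0}),
]
features = premeal_features(samples, t_m)
```

The half-open interval keeps the sample at the meal time itself out of the pre-meal summary.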

[5]

This definition aligns with clinical understanding of postprandial glucose excursions and captures peak glycemic response following a meal

Target Variable Construction: The forecast target y was defined as the maximum glucose value observed within a two-hour postprandial window (t_m, t_m + 2h]. This definition aligns with clinical understanding of postprandial glucose excursions and captures peak glycemic response following a meal
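The target definition above can be illustrated with a short sketch, assuming CGM readings arrive as `(timestamp, mg/dL)` pairs. The function name and the choice to return `None` when no reading falls in the window are assumptions.

```python
from datetime import datetime, timedelta


def peak_postprandial_glucose(cgm, t_m, window=timedelta(hours=2)):
    """Maximum glucose in the half-open window (t_m, t_m + window]."""
    values = [glucose for ts, glucose in cgm if t_m < ts <= t_m + window]
    return max(values) if values else None


t_m = datetime(2025, 1, 1, 12, 0)
# Made-up readings; the first and last lie outside the window.
cgm = [
    (datetime(2025, 1, 1, 12, 0), 200.0),
    (datetime(2025, 1, 1, 12, 45), 142.0),
    (datetime(2025, 1, 1, 14, 0), 150.0),
    (datetime(2025, 1, 1, 14, 5), 180.0),
]
y = peak_postprandial_glucose(cgm, t_m)
```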

[6]

Train/test split: To prevent subject-level data leakage and ensure better generalization, the dataset is partitioned at the subject level. Specifically, a 10/3 subject-wise split is used: ten participants are randomly selected for model training, and the remaining three are set aside for evaluation and meal plan generation. The data pr...
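A subject-wise split of this kind can be sketched as below. The subject identifiers, the seed, and the function name are hypothetical; the point is only that the split is made over subjects rather than over individual meal samples.

```python
import random


def subject_wise_split(subject_ids, n_train=10, seed=0):
    """Split unique subjects into disjoint train and test sets."""
    unique = sorted(set(subject_ids))
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    rng.shuffle(unique)
    return set(unique[:n_train]), set(unique[n_train:])


# Made-up sample-level labels: 13 subjects, several meal samples each.
sample_subjects = [f"S{i:02d}" for i in range(1, 14)] * 5
train_subjects, test_subjects = subject_wise_split(sample_subjects)
train_rows = [i for i, s in enumerate(sample_subjects) if s in train_subjects]
test_rows = [i for i, s in enumerate(sample_subjects) if s in test_subjects]
```

Because every sample inherits its subject's assignment, no participant contributes data to both partitions.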

[7]

In this cohort, participants were monitored in a controlled lab setting while consuming standardized meals and wearing a Dexcom G6 Pro and an Empatica E4 wrist-worn device

Supplementing the training data: Given the limited size of the training dataset (376 samples), the training data is supplemented with an additional dataset from the MealMeter [27] project (IRB #15102) comprising 12 subjects (n = 168 samples). In this cohort, participants were monitored in a controlled lab setting while consuming standardized meals and wea...

2026

[8]

Var.), Pearson correlation coefficient (r), MAPE, and sMAPE

Regression model: The forecasting model is evaluated using a set of standard regression performance metrics: RMSE, MAE, median absolute error (MedAE), coefficient of determination (R²), explained variance (Exp. Var.), Pearson correlation coefficient (r), MAPE, and sMAPE
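A few of the listed metrics can be computed as in the sketch below. The sMAPE form used here, with |y| + |ŷ| in the denominator and a factor of 2, is one common convention and is an assumption; the source does not give its exact formula. The input values are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (%), and sMAPE (%) for paired lists of values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 / n * sum(abs(e) / abs(t) for e, t in zip(errors, y_true))
    smape = 100.0 / n * sum(
        2 * abs(e) / (abs(t) + abs(p)) for e, t, p in zip(errors, y_true, y_pred)
    )
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "sMAPE": smape}


# Made-up peak glucose values in mg/dL.
metrics = regression_metrics([100.0, 200.0], [110.0, 180.0])
```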

[9]

A CF is considered valid if the predicted postprandial glucose level falls below a predefined evaluation threshold τ_eval = 140 mg/dL

Counterfactuals: CFs are validated using the metrics below. Validity assesses whether the generated CFs achieve the desired glycemic outcome under a regression setting. A CF is considered valid if the predicted postprandial glucose level falls below a predefined evaluation threshold τ_eval = 140 mg/dL.

$$\text{validity} = \frac{1}{|\mathcal{X}|} \sum_{(x, m_0) \in \mathcal{X}} \mathbb{1}\left[ f_\theta(x, m^*) \le \tau_{\text{eval}} \right]$$ (...
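The validity metric reduces to the fraction of counterfactual meals whose predicted peak is at or below the threshold. A minimal sketch, assuming the model predictions for the counterfactual meals have already been computed; the function name and the sample values are hypothetical.

```python
def validity(predicted_peaks, tau_eval=140.0):
    """Fraction of counterfactuals with predicted peak glucose <= tau_eval."""
    if not predicted_peaks:
        raise ValueError("at least one counterfactual prediction is required")
    return sum(p <= tau_eval for p in predicted_peaks) / len(predicted_peaks)


# Made-up model outputs (mg/dL) for four counterfactual meals.
score = validity([135.0, 140.0, 147.0, 128.0])
```

A prediction exactly at 140 mg/dL counts as valid here, matching the non-strict inequality in the indicator.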

[10]

Lower RMSE indicates better adherence to the target macronutrient constraints

LLM mappings: The LLM-based meal mapping module is evaluated along three dimensions. Constraint Satisfaction (RMSE) measures how closely the LLM-generated meal matches the target macronutrient profile using root mean squared error (RMSE) for each macronutrient:

$$\mathrm{RMSE}_j = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( m^{\mathrm{LLM}}_{ij} - m^{*}_{ij} \right)^2} \tag{20}$$

where j ∈ {C, P, F} denotes carbohydrates, p...
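The per-macronutrient RMSE can be sketched as follows, assuming each meal is a dictionary keyed by `C`, `P`, and `F` in grams. The function name and all numbers are illustrative and are not results from the paper.

```python
import math


def macro_rmse(llm_meals, targets):
    """RMSE between LLM-generated and target macronutrients, per macronutrient."""
    n = len(targets)
    return {
        j: math.sqrt(sum((m[j] - t[j]) ** 2 for m, t in zip(llm_meals, targets)) / n)
        for j in ("C", "P", "F")
    }


# Made-up LLM outputs and counterfactual targets (grams).
llm_meals = [{"C": 43.0, "P": 32.0, "F": 20.0}, {"C": 30.0, "P": 40.0, "F": 6.0}]
targets = [{"C": 40.0, "P": 30.0, "F": 20.0}, {"C": 34.0, "P": 40.0, "F": 9.0}]
rmse = macro_rmse(llm_meals, targets)
```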

[11]

this is a snack, not a meal

Expert-Based Validation of the Interventions: We evaluate the clinical relevance and practical applicability of the generated meal interventions through expert assessment. Case-level Evaluation: Experts were provided with the subject context, predicted postprandial glucose response, and the corresponding MetaPlate-generated meal recommendation for ea...

[12]

Continuous glucose monitoring in a healthy population: understanding the post-prandial glycemic response in individuals without diabetes mellitus

P. R. E. Jarvis, J. L. Cardin, P. M. Nisevich-Bede, and J. P. McCarter, “Continuous glucose monitoring in a healthy population: understanding the post-prandial glycemic response in individuals without diabetes mellitus,” Metabolism: Clinical and Experimental, p. 155640, 2023

2023

[13]

B. Giri, S. Dey, T. Das, M. Sarkar, J. Banerjee, and S. K. Dash, “Chronic hyperglycemia mediated physiological alteration and metabolic distortion leads to organ dysfunction, infection, cancer progression and other pathophysiological consequences: An update on glucose toxicity,” Biomedicine & pharmacotherapy = Biomedecine & pharmacotherapie, vol. 107, ...

2018

[14]

Is nondiabetic hyperglycemia a risk factor for cardiovascular disease? a meta-analysis of prospective studies

E. B. Levitan, Y. Song, E. S. Ford, and S. Liu, “Is nondiabetic hyperglycemia a risk factor for cardiovascular disease? a meta-analysis of prospective studies,” Archives of Internal Medicine, vol. 164, no. 19, pp. 2147–55, 2004

2004

[15]

Uncovering personalized glucose responses and circadian rhythms from multiple wearable biosensors with bayesian dynamical modeling,

N. E. Phillips, T.-H. Collet, and F. Naef, “Uncovering personalized glucose responses and circadian rhythms from multiple wearable biosensors with bayesian dynamical modeling,” Cell Reports Methods, vol. 3, 2023

2023

[16]

Personalized nutrition by prediction of glycemic responses

D. A. Zeevi, T. Korem, N. Zmora, D. Israeli, D. Rothschild, A. Weinberger, O. Ben-Yacov, D. Lador, T. Avnit-Sagi, M. Lotan-Pompan, J. Suez, J. A. Mahdi, E. Matot, G. Malka, N. Kosower, M. Rein, G. Zilberman-Schapira, L. Dohnalová, M. Pevsner-Fischer, R. Bikovsky, Z. Halpern, E. Elinav, and E. Segal, “Personalized nutrition by prediction of glycemic res...

2015

[17]

Machine learning-based glucose prediction with use of continuous glucose and physical activity monitoring data: The maastricht study,

W. P. van Doorn, Y. D. Foreman, N. C. Schaper, H. H. Savelberg, A. Koster, C. J. H. van der Kallen, A. Wesselius, M. T. Schram, R. M. A. Henry, P. C. Dagnelie, B. E. de Galan, O. Bekers, C. D. A. Stehouwer, S. J. R. Meex, and M. C. Brouwers, “Machine learning-based glucose prediction with use of continuous glucose and physical activity monitoring data: Th...

2021

[18]

Attengluco: Multimodal transformer-based blood glucose forecasting on ai-readi dataset,

E. Farahmand, R. R. Azghan, N. T. Chatrudi, E. Kim, G. K. Gudur, E. Thomaz, G. Pedrielli, P. K. Turaga, and H. Ghasemzadeh, “Attengluco: Multimodal transformer-based blood glucose forecasting on ai-readi dataset,” 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1–7, 2025

2025

[19]

Time-aware cross-attention for multi-modal sensor-based blood glucose forecasting,

A. Machiraju, E. Farahmand, S. B. Soumma, A. Arefeen, C. Johnston, and H. Ghasemzadeh, “Time-aware cross-attention for multi-modal sensor-based blood glucose forecasting,” 2025 IEEE 21st International Conference on Body Sensor Networks (BSN), pp. 1–4, 2025

2025

[20]

Population-specific glucose prediction in diabetes care with transformer-based deep learning on the edge,

T. Zhu, L. Kuang, C. Piao, J. Zeng, K. Li, and P. Georgiou, “Population-specific glucose prediction in diabetes care with transformer-based deep learning on the edge,” IEEE Transactions on Biomedical Circuits and Systems, vol. 18, pp. 236–246, 2024

2024

[21]

Glucoseassist: Personalized blood glucose level predictions and early dysglycemia detection,

P. Shroff, A. Arefeen, and H. Ghasemzadeh, “Glucoseassist: Personalized blood glucose level predictions and early dysglycemia detection,” 2023 IEEE 19th International Conference on Body Sensor Networks (BSN), pp. 1–4, 2023.

AREFEEN et al.: METAPLATE: COUNTERFACTUAL-GUIDED PERSONALIZED FOOD RECOMMENDATIONS

2023

[22]

Glycemic-aware and architecture-agnostic training framework for blood glucose forecasting in type 1 diabetes,

S. Khamesian, A. Arefeen, M. A. Grando, B. Thompson, and H. Ghasemzadeh, “Glycemic-aware and architecture-agnostic training framework for blood glucose forecasting in type 1 diabetes,” 2025

2025

[23]

Glyrag: Context-aware retrieval-augmented framework for blood glucose forecasting,

S. B. Soumma and H. Ghasemzadeh, “Glyrag: Context-aware retrieval-augmented framework for blood glucose forecasting,” ArXiv, vol. abs/2601.05353, 2026

2026

[24]

An ai-based nutrition recommendation system: technical validation with insights from mediterranean cuisine,

K. Kalpakoglou, L. Calderón-Pérez, N. Boqué, M. Guldas, Çağla Erdoğan Demir, L. P. Gymnopoulos, and K. Dimitropoulos, “An ai-based nutrition recommendation system: technical validation with insights from mediterranean cuisine,” Frontiers in Nutrition, vol. 12, 2025

2025

[25]

Computational framework for sequential diet recommendation: Integrating linear optimization and clinical domain knowledge,

A. Arefeen, N. Jaribi, B. J. Mortazavi, and H. Ghasemzadeh, “Computational framework for sequential diet recommendation: Integrating linear optimization and clinical domain knowledge,” 2022 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 91–98, 2022

2022

[26]

Ai-driven personalized nutrition: Rag-based digital health solution for obesity and type 2 diabetes,

A. K. Gavai and J. van Hillegersberg, “Ai-driven personalized nutrition: Rag-based digital health solution for obesity and type 2 diabetes,” PLOS Digital Health, vol. 4, 2025

2025

[27]

Nutrigen: Personalized meal plan generator leveraging large language models to enhance dietary and nutritional adherence,

S. Khamesian, A. Arefeen, S. M. Carpenter, and H. Ghasemzadeh, “Nutrigen: Personalized meal plan generator leveraging large language models to enhance dietary and nutritional adherence,” 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1–7, 2025

2025

[28]

Chatdiet: Empowering personalized nutrition-oriented food recommender chatbots through an llm-augmented framework,

Z. Yang, E. Khatibi, N. Nagesh, M. Abbasian, I. Azimi, R. Jain, and A. M. Rahmani, “Chatdiet: Empowering personalized nutrition-oriented food recommender chatbots through an llm-augmented framework,” Smart Health, vol. 32, p. 100465, 2024

2024

[29]

Mopi-hfrs: A multi-objective personalized health-aware food recommendation system with llm-enhanced interpretation,

Z. Zhang, Z. Wang, T. Ma, V. S. Taneja, S. Nelson, N. H. L. Le, K. Murugesan, M. Ju, N. V. Chawla, C. Zhang, and Y. Ye, “Mopi-hfrs: A multi-objective personalized health-aware food recommendation system with llm-enhanced interpretation,” Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, 2024

2024

[30]

Counterfactual explanations as interventions in latent space,

R. Crupi, A. Castelnovo, D. Regoli, and B. S. M. Gonzalez, “Counterfactual explanations as interventions in latent space,” Data Mining and Knowledge Discovery, vol. 38, pp. 2733–2769, 2021

2021

[31]

Designing user-centric behavioral interventions to prevent dysglycemia with novel counterfactual explanations,

A. Arefeen and H. Ghasemzadeh, “Designing user-centric behavioral interventions to prevent dysglycemia with novel counterfactual explanations,” ArXiv, vol. abs/2310.01684, 2023

2023

[32]

GlyTwin: Digital Twin for Glucose Control in Type 1 Diabetes Through Optimal Behavioral Modifications Using Patient-Centric Counterfactuals

A. Arefeen, S. Khamesian, M. A. Grando, B. Thompson, and H. Ghasemzadeh, “Glytwin: Digital twin for glucose control in type 1 diabetes through optimal behavioral modifications using patient-centric counterfactuals,” ArXiv, vol. abs/2504.09846, 2025

2025

[33]

Glyman: Glycemic management using patient-centric counterfactuals,

——, “Glyman: Glycemic management using patient-centric counterfactuals,” 2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), pp. 1–5, 2024

2024

[34]

Entropy-based logic explanations of neural networks,

P. Barbiero, G. Ciravegna, F. Giannini, P. Liò, M. Gori, and S. Melacci, “Entropy-based logic explanations of neural networks,” ArXiv, vol. abs/2106.06804, 2021

2021

[35]

Explaining machine learning classifiers through diverse counterfactual explanations,

R. K. Mothilal, A. Sharma, and C. Tan, “Explaining machine learning classifiers through diverse counterfactual explanations,” in Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, 2020, pp. 607–617

2020

[36]

Nice: an algorithm for nearest instance counterfactual explanations,

D. Brughmans and D. Martens, “Nice: an algorithm for nearest instance counterfactual explanations,” Data Mining and Knowledge Discovery, pp. 1–39, 2021

2021
