FairEnc: A Fair Vision-Language Model with Fair Vision and Text Encoders for Glaucoma Detection
Pith reviewed 2026-05-08 16:24 UTC · model grok-4.3
The pith
FairEnc debiases both vision and text encoders in a vision-language model to reduce demographic disparities in glaucoma detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FairEnc jointly mitigates biases in the textual and visual modalities with respect to multiple sensitive attributes. The textual encoder is debiased by leveraging LLM-generated synthetic clinical descriptions with varied sensitive attributes together with a contrastive alignment objective; the visual encoder is debiased by mutual information regularization plus multi-discriminator adversarial debiasing. This produces lower demographic disparity as measured by DPD and DEOdds on the Harvard-FairVLMed dataset while retaining strong diagnostic performance under zero-shot and linear-probing evaluations, and the fairness advantages generalize under cross-domain shifts.
What carries the argument
Dual-level fairness strategy that pairs contrastive alignment on attribute-varied synthetic text descriptions with mutual-information regularization plus multi-discriminator adversarial debiasing on visual features.
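To make the text-side half of this strategy concrete, here is a minimal sketch of what an InfoNCE-style contrastive alignment over attribute-varied note pairs could look like; the function name, batching, and temperature are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the text-side contrastive alignment described above.
# Nothing here is taken from the FairEnc codebase; names and the temperature
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def attribute_invariant_contrastive_loss(z_orig, z_swapped, temperature=0.07):
    """InfoNCE-style loss that pulls the embedding of a clinical note toward
    the embedding of its attribute-swapped synthetic variant (positive) and
    pushes it away from notes of other patients in the batch (negatives).

    z_orig:    (B, D) embeddings of original notes
    z_swapped: (B, D) embeddings of the same notes with sensitive attributes varied
    """
    z_orig = F.normalize(z_orig, dim=-1)
    z_swapped = F.normalize(z_swapped, dim=-1)
    logits = z_orig @ z_swapped.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(z_orig.size(0), device=z_orig.device)
    # Symmetric cross-entropy: each note should match its own swapped variant.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```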
If this is right
- Reduces the DPD and DEOdds disparity metrics on the Harvard-FairVLMed dataset (both metrics are sketched in code after this list).
- Maintains strong diagnostic performance under both zero-shot and linear probing evaluations.
- Preserves fairness advantages under cross-domain and cross-modality shifts on the FairFundus dataset.
- Keeps diagnostic performance within a competitive range across the tested settings.
- Supports potential for more equitable deployment in real-world clinical settings.
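The two disparity metrics named above have standard definitions. A minimal sketch, assuming fairlearn-style binary formulations (the paper's exact per-attribute aggregation is not specified here):

```python
# Illustrative computation of DPD and DEOdds; assumes binary predictions and
# that every group contains both positive and negative cases.
import numpy as np

def dpd(y_pred, groups):
    """Demographic Parity Difference: largest gap in positive prediction
    rate across demographic groups."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

def deodds(y_true, y_pred, groups):
    """Difference in Equalized Odds: largest gap across groups in either
    the true-positive rate or the false-positive rate."""
    tprs, fprs = [], []
    for g in np.unique(groups):
        m = groups == g
        tprs.append(y_pred[m & (y_true == 1)].mean())
        fprs.append(y_pred[m & (y_true == 0)].mean())
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))
```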
Where Pith is reading between the lines
- The multi-attribute debiasing could address intersectional fairness issues more directly than single-attribute methods.
- Synthetic clinical note generation might be adapted to create fair training data for other medical vision-language tasks.
- If the fairness gains hold on larger or more diverse populations, the method could inform fairness requirements in clinical AI guidelines.
Load-bearing premise
LLM-generated synthetic clinical descriptions with varied sensitive attributes can preserve disease semantics without introducing new biases or artifacts that undermine either fairness or diagnostic utility.
What would settle it
A direct comparison on the Harvard-FairVLMed test set in which replacing the synthetic descriptions with real clinical notes causes DPD or DEOdds to rise to the level of an unmodified baseline VLM while diagnostic accuracy stays the same.
Original abstract
Automated glaucoma detection is critical for preventing irreversible vision loss and reducing the burden on healthcare systems. However, ensuring fairness across diverse patient populations remains a significant challenge. In this paper, we propose FairEnc, a fair pretraining method for vision-language models (VLMs) that enables simultaneous debiasing across multiple sensitive attributes. FairEnc jointly mitigates biases in both textual and visual modalities with respect to multiple sensitive attributes, including race, gender, ethnicity, and language. Specifically, for the textual encoder, we leverage a large language model to generate synthetic clinical descriptions with varied sensitive attributes while preserving disease semantics, and employ a contrastive alignment objective to encourage demographic-invariant representations. For the visual encoder, we propose a dual-level fairness strategy that combines mutual information regularization to reduce statistical dependence between learned features and demographic groups, with multi-discriminator adversarial debiasing. Comprehensive experiments on the publicly available Harvard-FairVLMed dataset demonstrate that FairEnc effectively reduces demographic disparity as measured by DPD and DEOdds while achieving strong diagnostic performance under both zero-shot and linear probing evaluations. Additional experiments on the private FairFundus dataset show that FairEnc consistently preserves fairness advantages under cross-domain and cross-modality settings and maintains diagnostic performance within a competitive range. These results highlight FairEnc's ability to generalize fairness under distribution shifts, supporting its potential for more equitable deployment in real-world clinical settings. Our codebase and synthetic clinical notes are available at https://github.com/Mohamed-Elhabebe/FairEnc
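For context on the zero-shot protocol the abstract mentions: a minimal CLIP-style sketch in which a fundus image is classified by embedding similarity to text prompts. The prompt wording, encoder interfaces, and omitted logit scaling are illustrative assumptions, not FairEnc's evaluation code.

```python
# Hedged sketch of CLIP-style zero-shot classification; encoders and
# tokenizer are passed in as opaque callables.
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_glaucoma(image_encoder, text_encoder, tokenizer, image):
    prompts = ["a fundus photograph of a healthy eye",
               "a fundus photograph of an eye with glaucoma"]
    img = F.normalize(image_encoder(image), dim=-1)              # (1, D)
    txt = F.normalize(text_encoder(tokenizer(prompts)), dim=-1)  # (2, D)
    probs = (img @ txt.t()).softmax(dim=-1)  # similarities -> class probs
    return {"healthy": probs[0, 0].item(), "glaucoma": probs[0, 1].item()}
```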
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes FairEnc, a fair pretraining method for vision-language models in glaucoma detection. It jointly debiases both encoders: the text encoder via LLM-generated synthetic clinical descriptions that vary sensitive attributes (race, gender, ethnicity, language) while aiming to preserve disease semantics, trained with a contrastive alignment objective for demographic-invariant representations; and the vision encoder via mutual information regularization plus multi-discriminator adversarial debiasing. Experiments on the public Harvard-FairVLMed dataset and the private FairFundus dataset claim reduced demographic disparities (via DPD and DEOdds) alongside competitive diagnostic performance in zero-shot and linear-probing settings, with generalization under cross-domain and cross-modality shifts.
Significance. If the central empirical claims hold after addressing validation gaps, the work could advance multi-attribute fairness techniques for medical VLMs and support more equitable clinical deployment. The public release of code and synthetic notes aids reproducibility, which is a strength.
major comments (3)
- [Text encoder debiasing (Methods, LLM synthetic notes)] The approach rests on generating synthetic clinical descriptions that vary sensitive attributes while 'preserving disease semantics,' yet no quantitative checks (BLEU/ROUGE, embedding cosine similarity to the originals, or ophthalmologist ratings of glaucoma-specific findings) are reported; a minimal version of such a check is sketched after this list. Without these, the contrastive alignment objective may align on altered or diluted diagnostic cues, directly undermining the reliability of the reported DPD/DEOdds reductions and diagnostic performance.
- [Experimental claims (Abstract and Results)] The abstract asserts that FairEnc 'effectively reduces demographic disparity' and achieves 'strong diagnostic performance' on Harvard-FairVLMed, but supplies no numerical effect sizes, baseline comparisons (e.g., vs. standard CLIP or prior fair VLMs), statistical tests, or ablation tables. These omissions make it impossible to verify the load-bearing claim that fairness gains occur without performance trade-offs.
- [Visual encoder fairness (Methods, dual-level strategy)] The combination of mutual information regularization and multi-discriminator adversarial debiasing is presented as jointly mitigating bias, but no ablation isolating each component's contribution (e.g., fairness metrics with only MI regularization) is provided. This leaves unclear whether both elements are necessary for the claimed cross-dataset generalization.
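A minimal version of the semantic-fidelity check requested in the first major comment, assuming an off-the-shelf sentence encoder; the model choice and threshold are illustrative, not part of the paper:

```python
# Embed each original note and its attribute-swapped variant, then flag pairs
# whose cosine similarity falls below a threshold as possible semantic drift.
from sentence_transformers import SentenceTransformer

def flag_semantic_drift(originals, variants, threshold=0.85):
    model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice
    emb_o = model.encode(originals, convert_to_tensor=True,
                         normalize_embeddings=True)
    emb_v = model.encode(variants, convert_to_tensor=True,
                         normalize_embeddings=True)
    sims = (emb_o * emb_v).sum(dim=-1)  # cosine similarity per pair
    return [(i, s.item()) for i, s in enumerate(sims) if s < threshold]
```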
minor comments (2)
- [Abstract] The abstract uses vague qualifiers ('strong', 'competitive range', 'consistently preserves') without defining reference baselines or reporting key metrics; adding one or two concrete numbers would improve readability.
- [Code and data availability] Ensure the released GitHub repository includes the exact LLM prompts, temperature settings, and filtering criteria used to generate the synthetic notes, as these details are essential for replication.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which have helped improve the rigor and clarity of our work. We address each major point below and have revised the manuscript accordingly.
Point-by-point responses
Referee: [Text encoder debiasing (Methods, LLM synthetic notes)] The approach rests on generating synthetic clinical descriptions that vary sensitive attributes while 'preserving disease semantics,' yet no quantitative checks (BLEU/ROUGE, embedding cosine similarity to the originals, or ophthalmologist ratings of glaucoma-specific findings) are reported. Without these, the contrastive alignment objective may align on altered or diluted diagnostic cues, directly undermining the reliability of the reported DPD/DEOdds reductions and diagnostic performance.
Authors: We agree that explicit quantitative validation of the synthetic notes is needed to confirm preservation of disease semantics. In the revised manuscript, we have added BLEU and ROUGE scores between synthetic and original clinical descriptions, cosine similarity of sentence embeddings, and a small-scale ophthalmologist rating study on glaucoma-specific findings for a subset of notes. These metrics support that diagnostic cues are retained while sensitive attributes vary. revision: yes
Referee: [Experimental claims (Abstract and Results)] The abstract asserts that FairEnc 'effectively reduces demographic disparity' and achieves 'strong diagnostic performance' on Harvard-FairVLMed, but supplies no numerical effect sizes, baseline comparisons (e.g., vs. standard CLIP or prior fair VLMs), statistical tests, or ablation tables. These omissions make it impossible to verify the load-bearing claim that fairness gains occur without performance trade-offs.
Authors: We acknowledge the need for more specific reporting. The revised abstract and results section now include numerical effect sizes for DPD and DEOdds reductions, direct comparisons to CLIP and prior fair VLMs, statistical significance tests, and explicit references to ablation tables demonstrating that fairness improvements occur with minimal or no diagnostic performance loss. revision: yes
Referee: [Visual encoder fairness (Methods, dual-level strategy)] The combination of mutual information regularization and multi-discriminator adversarial debiasing is presented as jointly mitigating bias, but no ablation isolating each component's contribution (e.g., fairness metrics with only MI regularization) is provided. This leaves unclear whether both elements are necessary for the claimed cross-dataset generalization.
Authors: We agree that component ablations are required to clarify necessity. The revised manuscript includes new ablation experiments reporting fairness metrics (DPD, DEOdds) when using only mutual information regularization, only the multi-discriminator, and the full dual-level approach. These results confirm both components contribute to the observed cross-dataset and cross-modality generalization. revision: yes
Circularity Check
No circularity: empirical method with external validation
full rationale
The paper presents FairEnc as a proposed architecture combining LLM-based synthetic text generation, contrastive alignment, mutual information regularization, and adversarial debiasing, then reports empirical outcomes on the public Harvard-FairVLMed and private FairFundus datasets. No derivation chain, equations, or first-principles claims are offered that reduce by construction to fitted parameters, self-definitions, or self-citations; performance and fairness metrics (DPD, DEOdds) are measured directly from held-out evaluations rather than being presupposed by the method itself. The central assumptions (semantic fidelity of synthetics, debiasing efficacy) are testable externally and do not collapse into the inputs by definition.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM-generated synthetic clinical descriptions preserve disease semantics while varying sensitive attributes
- domain assumption: mutual information regularization plus multi-discriminator adversarial training removes demographic dependence without harming utility
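To ground the second assumption, a hedged sketch of multi-discriminator adversarial debiasing via gradient reversal, with one head per sensitive attribute; layer sizes, the reversal coefficient, and the attribute list are illustrative assumptions, and the paper's mutual-information term is not shown.

```python
# Each discriminator head tries to recover one sensitive attribute from the
# shared visual feature; the gradient-reversal layer trains the encoder to
# defeat all of them, pushing features toward attribute invariance.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        # Flip and scale the gradient flowing back into the encoder.
        return -ctx.lam * grad_out, None

class MultiDiscriminator(nn.Module):
    """One head per sensitive attribute (e.g., race, gender, ethnicity,
    language), each predicting its attribute from the visual feature."""
    def __init__(self, feat_dim, n_classes_per_attr, lam=1.0):
        super().__init__()
        self.lam = lam
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                          nn.Linear(128, k))
            for k in n_classes_per_attr
        )

    def forward(self, features):
        reversed_feat = GradReverse.apply(features, self.lam)
        return [head(reversed_feat) for head in self.heads]

# Usage: sum cross-entropy over the heads' outputs; minimizing that loss
# trains the discriminators, while the reversed gradient simultaneously
# removes attribute information from the encoder's features.
```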