pith. machine review for the scientific record. sign in

arxiv: 2604.23790 · v1 · submitted 2026-04-26 · 💻 cs.LG · stat.ML

Recognition: unknown

A General Representation-Based Approach to Multi-Source Domain Adaptation

Authors on Pith no claims yet

Pith reviewed 2026-05-08 06:33 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords domain adaptationMarkov blanketrepresentation learningidentifiabilitymulti-sourceunsupervised domain adaptationdistribution shiftcausal representation
0
0 comments X

The pith

General domain adaptation is achieved by partitioning the Markov blanket of the label into parents, children, and spouses in the learned representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing domain adaptation methods often depend on strong assumptions like independent latent factors or unchanging label distributions, which limit their use. This paper instead proposes learning representations that capture the full Markov blanket but then partitions it specifically into the label's parents, children, and spouses. This partition makes it possible to transfer knowledge across multiple source domains to an unlabeled target in a general way. The approach comes with theoretical identifiability guarantees and supports a nonparametric implementation for handling varied distribution shifts.

Core claim

General domain adaptation can be achieved by partitioning the representations of the Markov blanket into those of the label's parents, children, and spouses. Its identifiability guarantee can be established, enabling a practical nonparametric approach that handles different types of distribution shifts without restrictive assumptions on the joint distribution.

What carries the argument

The partition of the Markov blanket representations into the label's parents, children, and spouses, which carries the identifiability and transfer properties.

If this is right

  • The framework works for general settings without assuming independent latents or invariant labels.
  • It provides identifiability of the target domain joint distribution.
  • It allows developing a practical nonparametric method for multi-source unsupervised domain adaptation.
  • It handles various kinds of distribution shifts relative to the prediction task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This partitioning strategy could be applied to improve robustness in other transfer learning scenarios beyond domain adaptation.
  • Exploring how to learn such partitioned representations in deep networks might lead to more causal-aware models.
  • Connections to causal graphical models suggest this could help in settings with changing causal mechanisms across domains.

Load-bearing premise

Latent representations can be learned nonparametrically so that the partition of the Markov blanket into parents, children, and spouses is identifiable from the data.

What would settle it

A dataset with known causal structure where the learned representations do not permit recovery of the parent-child-spouse partition, resulting in no performance gain over standard methods on the target domain.

Figures

Figures reproduced from arXiv: 2604.23790 by Guangyi Chen, Ignavier Ng, Kun Zhang, Yan Li, Yujia Zheng, Zijian Li.

Figure 1
Figure 1. Figure 1: An example of the generative process considered view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the General Approach for Multi￾source Domain Adaptation (GAMA). The model first maps input images X to a latent space Z using a VAE framework. The latent variables Z are partitioned into sev￾eral components: Zmb, Zpa, Zch, and Zsps. Two additional VAEs are employed to capture the relationships among the latent variables and label, which aids in estimating θ for improved predictions. For the thr… view at source ↗
Figure 3
Figure 3. Figure 3: The t-SNE visualizations of the learned features on the view at source ↗
read the original abstract

A central problem in unsupervised domain adaptation is determining what to transfer from labeled source domains to an unlabeled target domain. To handle high-dimensional observations (e.g., images), a line of approaches use deep learning to learn latent representations of the observations, which facilitate knowledge transfer in the latent space. However, existing approaches often rely on restrictive assumptions to establish identifiability of the joint distribution in the target domain, such as independent latent variables or invariant label distributions, limiting their real-world applicability. In this work, we propose a general domain adaptation framework that learns compact latent representations to capture distribution shifts relative to the prediction task and address the fundamental question of what representations should be learned and transferred. Notably, we first demonstrate that learning representations based on all the predictive information, i.e., the label's Markov blanket in terms of the learned representations, is often underspecified in general settings. Instead, we show that, interestingly, general domain adaptation can be achieved by partitioning the representations of Markov blanket into those of the label's parents, children, and spouses. Moreover, its identifiability guarantee can be established. Building on these theoretical insights, we develop a practical, nonparametric approach for domain adaptation in a general setting, which can handle different types of distribution shifts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a general framework for multi-source unsupervised domain adaptation in high-dimensional settings. It first shows that learning latent representations of the full Markov blanket of the label is underspecified for transfer in general cases. It then claims that partitioning these representations into those of the label's parents, children, and spouses enables both practical domain adaptation across different shift types and an identifiability guarantee in a nonparametric, assumption-light setting. A practical method is developed to learn and transfer such partitioned representations.

Significance. If the identifiability result and the practical method hold without hidden restrictions, the work would be significant for representation learning in domain adaptation. It directly addresses the open question of what to transfer by grounding the choice in causal roles within the Markov blanket, potentially allowing more robust handling of complex, non-invariant shifts than prior methods that assume independent latents or invariant labels. The nonparametric claim and multi-source focus could influence both theory and applications if the proofs are rigorous and the experiments confirm the partition is recoverable from data.

major comments (3)
  1. [§3] §3 (theoretical analysis on underspecification): the demonstration that the full Markov blanket representation is underspecified for DA is load-bearing for the central claim. The manuscript must show explicitly (via counterexample or non-uniqueness construction) that multiple partitions yield the same predictive performance yet fail to transfer, and that this holds beyond specific shift mechanisms.
  2. [§4] §4 (identifiability result): the guarantee that the partition into parents/children/spouses is identifiable from multi-domain observations alone is the key theoretical contribution. Standard causal identifiability theorems require either interventions or restrictions on the joint distribution to distinguish these roles; the paper should state the precise conditions on the domain-shift mechanisms (e.g., which components are affected by which domains) that supply the necessary variation, and verify that no implicit parametric assumptions are introduced in the nonparametric estimator.
  3. [§5] §5 (practical nonparametric method): the algorithm for recovering the partitioned representations must be shown to implement the theoretical partition without supervision on which latent dimensions correspond to parents vs. children vs. spouses. If the loss or architecture implicitly assumes a particular shift structure, the generality claim is undermined.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'its identifiability guarantee' is ambiguous; clarify whether the guarantee applies to the partitioned representations or to the full adaptation procedure.
  2. [Notation] Notation: introduce the symbols for the partitioned Markov blanket components (e.g., R_P, R_C, R_S) at the first use and maintain consistent usage throughout the theoretical and algorithmic sections.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report, as well as the recommendation for major revision. We address each major comment below, providing clarifications on the theoretical results and indicating revisions where appropriate to strengthen the manuscript.

read point-by-point responses
  1. Referee: §3 (theoretical analysis on underspecification): the demonstration that the full Markov blanket representation is underspecified for DA is load-bearing for the central claim. The manuscript must show explicitly (via counterexample or non-uniqueness construction) that multiple partitions yield the same predictive performance yet fail to transfer, and that this holds beyond specific shift mechanisms.

    Authors: We agree that an explicit demonstration is essential for the central claim. Section 3 provides a non-uniqueness construction using a general structural causal model in which different recombinations of the full Markov blanket achieve identical predictive performance on the source domains but produce inconsistent transfer under target shifts that act asymmetrically on the blanket components. The construction is nonparametric and applies to arbitrary shift mechanisms rather than being restricted to specific forms. To make this more explicit as requested, we will expand the section with an additional general argument showing failure of non-partitioned representations for any shift that differentiates the causal roles. revision: partial

  2. Referee: §4 (identifiability result): the guarantee that the partition into parents/children/spouses is identifiable from multi-domain observations alone is the key theoretical contribution. Standard causal identifiability theorems require either interventions or restrictions on the joint distribution to distinguish these roles; the paper should state the precise conditions on the domain-shift mechanisms (e.g., which components are affected by which domains) that supply the necessary variation, and verify that no implicit parametric assumptions are introduced in the nonparametric estimator.

    Authors: We appreciate this point. The identifiability result in Section 4 holds under the multi-source setting where domain shifts provide differential variation across the causal roles: specifically, for each of parents, children, and spouses there is at least one domain in which the shift mechanism alters that role's distribution independently of the others. This supplies the necessary variation to distinguish the roles from the observed changes in the joint distributions alone. The result and the associated estimator are fully nonparametric, with no parametric restrictions on the conditional distributions or the estimator itself. We will add an explicit statement of these conditions as a dedicated remark in the revision. revision: yes

  3. Referee: §5 (practical nonparametric method): the algorithm for recovering the partitioned representations must be shown to implement the theoretical partition without supervision on which latent dimensions correspond to parents vs. children vs. spouses. If the loss or architecture implicitly assumes a particular shift structure, the generality claim is undermined.

    Authors: The method in Section 5 recovers the partition without any supervision on dimension-to-role assignments. The optimization combines a standard prediction objective with a multi-domain loss that, by the identifiability theorem, separates the representations according to their distinct causal properties (stability for parents, direct effect on the label for children, and conditional dependence for spouses). The architecture is a generic deep network with no built-in assumptions on shift types. Synthetic experiments confirm that the learned dimensions align with the theoretical roles from data alone. We will insert a clarifying paragraph in the revision that maps each term in the loss to the corresponding causal role. revision: no

Circularity Check

0 steps flagged

No circularity; claims rest on stated theoretical partitioning without reduction to inputs or self-citations

full rationale

The abstract presents the core result as a demonstration that the full Markov blanket is underspecified while partitioning into parents/children/spouses yields both adaptation and identifiability. No equations, fitted parameters, or self-citations are quoted that would make any prediction equivalent to its inputs by construction. The derivation chain is described as building on 'theoretical insights' to a nonparametric method, with no visible self-definitional loop, renamed known result, or load-bearing self-citation. The identifiability guarantee is asserted as established rather than presupposed, leaving the paper self-contained against external benchmarks on the provided summary.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

Abstract provides limited information on assumptions; the framework appears to rest on the existence of suitable latent representations and the ability to identify the Markov blanket partition from data.

axioms (2)
  • domain assumption Existence of latent representations that capture distribution shifts relative to the prediction task
    Invoked to justify learning compact representations for transfer
  • domain assumption The data generating process admits a Markov blanket structure for the label in the learned representation space
    Central to the partitioning argument
invented entities (1)
  • Partitioned Markov blanket representations (parents, children, spouses) no independent evidence
    purpose: To achieve identifiability and general domain adaptation
    Introduced as the key mechanism for the framework

pith-pipeline@v0.9.0 · 5530 in / 1316 out tokens · 26763 ms · 2026-05-08T06:33:43.996694+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

117 extracted references · 14 canonical work pages

  1. [1]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

  2. [2]

    Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases

    Adams, J., Hansen, N., and Zhang, K. Identification of partially observed linear causal models: Graphical conditions for the non-gaussian and heterogeneous cases. Advances in Neural Information Processing Systems, 34: 0 22822--22833, 2021

  3. [3]

    Interventional causal representation learning

    Ahuja, K., Mahajan, D., Wang, Y., and Bengio, Y. Interventional causal representation learning. In International Conference on Machine Learning, 2023

  4. [4]

    Ben-David, S., Blitzer, J., Crammer, K., Kulesza, A., Pereira, F., and Vaughan, J. W. A theory of learning from different domains. Machine learning, 79: 0 151--175, 2010

  5. [5]

    Identifying linearly-mixed causal representations from multi-node interventions

    Bing, S., Ninad, U., Wahl, J., and Runge, J. Identifying linearly-mixed causal representations from multi-node interventions. In Conference on Causal Learning and Reasoning, 2024

  6. [6]

    Domain separation networks

    Bousmalis, K., Trigeorgis, G., Silberman, N., Krishnan, D., and Erhan, D. Domain separation networks. Advances in neural information processing systems, 29, 2016

  7. [7]

    Brehmer, J., De Haan, P., Lippe, P., and Cohen, T. S. Weakly supervised causal representation learning. Advances in Neural Information Processing Systems, 35: 0 38319--38331, 2022

  8. [8]

    Function classes for identifiable nonlinear independent component analysis

    Buchholz, S., Besserve, M., and Sch \"o lkopf, B. Function classes for identifiable nonlinear independent component analysis. In Advances in Neural Information Processing Systems, 2022

  9. [9]

    Learning disentangled semantic representation for domain adaptation

    Cai, R., Li, Z., Wei, P., Qiao, J., Zhang, K., and Hao, Z. Learning disentangled semantic representation for domain adaptation. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019 a

  10. [10]

    Triad constraints for learning causal structure of latent variables

    Cai, R., Xie, F., Glymour, C., Hao, Z., and Zhang, K. Triad constraints for learning causal structure of latent variables. Advances in neural information processing systems, 32, 2019 b

  11. [11]

    Joint domain alignment and discriminative feature learning for unsupervised deep domain adaptation

    Chen, C., Chen, Z., Jiang, B., and Jin, X. Joint domain alignment and discriminative feature learning for unsupervised deep domain adaptation. In Proceedings of the AAAI conference on artificial intelligence, volume 33, pp.\ 3296--3303, 2019 a

  12. [12]

    Progressive feature alignment for unsupervised domain adaptation

    Chen, C., Xie, W., Huang, W., Rong, Y., Ding, X., Huang, Y., Xu, T., and Huang, J. Progressive feature alignment for unsupervised domain adaptation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp.\ 627--636, 2019 b

  13. [13]

    Transferability vs

    Chen, X., Wang, S., Long, M., and Wang, J. Transferability vs. discriminability: Batch spectral penalization for adversarial domain adaptation. In International conference on machine learning, pp.\ 1081--1090. PMLR, 2019 c

  14. [14]

    and B \"u hlmann, P

    Chen, Y. and B \"u hlmann, P. Domain adaptation under structural causal models. The Journal of Machine Learning Research, 22 0 (1): 0 11856--11935, 2021

  15. [15]

    Independent component analysis -- a new concept? Signal Processing, 36: 0 287--314, 1994

    Comon, P. Independent component analysis -- a new concept? Signal Processing, 36: 0 287--314, 1994

  16. [16]

    Generative model based noise robust training for unsupervised domain adaptation

    Deng, Z., Li, D., He, J., Song, Y.-Z., and Xiang, T. Generative model based noise robust training for unsupervised domain adaptation. arXiv preprint arXiv:2303.05734, 2023

  17. [17]

    A versatile causal discovery framework to allow causally-related hidden variables

    Dong, X., Huang, B., Ng, I., Song, X., Zheng, Y., Jin, S., Legaspi, R., Spirtes, P., and Zhang, K. A versatile causal discovery framework to allow causally-related hidden variables. In The Twelfth International Conference on Learning Representations, 2023

  18. [18]

    Farahani, A., Voghoei, S., Rasheed, K., and Arabnia, H. R. A brief review of domain adaptation. In Stahlbock, R., Weiss, G. M., Abou-Nasr, M., Yang, C.-Y., Arabnia, H. R., and Deligiannidis, L. (eds.), Advances in Data Science and Information Engineering, pp.\ 877--894, Cham, 2021. Springer International Publishing. ISBN 978-3-030-71704-9

  19. [19]

    and Lempitsky, V

    Ganin, Y. and Lempitsky, V. Unsupervised domain adaptation by backpropagation. In International conference on machine learning, pp.\ 1180--1189. PMLR, 2015

  20. [20]

    Domain-adversarial training of neural networks

    Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., March, M., and Lempitsky, V. Domain-adversarial training of neural networks. Journal of machine learning research, 17 0 (59): 0 1--35, 2016

  21. [21]

    A unified view of label shift estimation

    Garg, S., Wu, Y., Balakrishnan, S., and Lipton, Z. A unified view of label shift estimation. Advances in Neural Information Processing Systems, 33: 0 3290--3300, 2020

  22. [22]

    Domain adaptation with conditional transferable components

    Gong, M., Zhang, K., Liu, T., Tao, D., Glymour, C., and Sch \"o lkopf, B. Domain adaptation with conditional transferable components. In International conference on machine learning, pp.\ 2839--2848. PMLR, 2016

  23. [23]

    u gelgen, J., Stimper, V., Sch \

    Gresele, L., Von K \"u gelgen, J., Stimper, V., Sch \"o lkopf, B., and Besserve, M. Independent mechanism analysis, a new concept? Advances in neural information processing systems, 34: 0 28233--28248, 2021

  24. [24]

    Deep residual learning for image recognition

    He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 770--778, 2016

  25. [25]

    Causal discovery from heterogeneous/nonstationary data

    Huang, B., Zhang, K., Zhang, J., Ramsey, J., Sanchez-Romero, R., Glymour, C., and Sch \"o lkopf, B. Causal discovery from heterogeneous/nonstationary data. Journal of Machine Learning Research, 21 0 (89): 0 1--53, 2020

  26. [26]

    Huang, B., Low, C. J. H., Xie, F., Glymour, C., and Zhang, K. Latent hierarchical causal structure discovery with rank constraints. Advances in Neural Information Processing Systems, 35: 0 5549--5561, 2022

  27. [27]

    and Morioka, H

    Hyvarinen, A. and Morioka, H. Unsupervised feature extraction by time-contrastive learning and nonlinear ica. Advances in neural information processing systems, 29, 2016

  28. [28]

    and Morioka, H

    Hyvarinen, A. and Morioka, H. Nonlinear ica of temporally dependent stationary sources. In Artificial Intelligence and Statistics, pp.\ 460--469. PMLR, 2017

  29. [29]

    and Pajunen, P

    Hyv \"a rinen, A. and Pajunen, P. Nonlinear independent component analysis: Existence and uniqueness results. Neural networks, 12 0 (3): 0 429--439, 1999

  30. [30]

    Independent component analysis

    Hyvarinen, A., Karhunen, J., and Oja, E. Independent component analysis. Studies in informatics and control, 11 0 (2): 0 205--207, 2002

  31. [31]

    Nonlinear ICA using auxiliary variables and generalized contrastive learning

    Hyvarinen, A., Sasaki, H., and Turner, R. Nonlinear ICA using auxiliary variables and generalized contrastive learning. In International Conference on Artificial Intelligence and Statistics, 2019

  32. [32]

    Nonlinear independent component analysis for principled disentanglement in unsupervised deep learning

    Hyvärinen, A., Khemakhem, I., and Morioka, H. Nonlinear independent component analysis for principled disentanglement in unsupervised deep learning. Patterns, 4 0 (10): 0 100844, 2023. ISSN 2666-3899

  33. [33]

    Categorical reparameterization with gumbel-softmax

    Jang, E., Gu, S., and Poole, B. Categorical reparameterization with gumbel-softmax. In International Conference on Learning Representations, 2017

  34. [34]

    and Aragam, B

    Jiang, Y. and Aragam, B. Learning nonparametric latent causal graphs with unknown interventions. In Thirty-seventh Conference on Neural Information Processing Systems, 2023

  35. [35]

    and Syrgkanis, V

    Jin, J. and Syrgkanis, V. Learning causal representations from general environments: Identifiability and intrinsic ambiguity. arXiv preprint arXiv:2311.12267, 2023

  36. [36]

    Contrastive adaptation network for single-and multi-source domain adaptation

    Kang, G., Jiang, L., Wei, Y., Yang, Y., and Hauptmann, A. Contrastive adaptation network for single-and multi-source domain adaptation. IEEE transactions on pattern analysis and machine intelligence, 44 0 (4): 0 1793--1804, 2020

  37. [37]

    Variational autoencoders and nonlinear ica: A unifying framework

    Khemakhem, I., Kingma, D., Monti, R., and Hyvarinen, A. Variational autoencoders and nonlinear ica: A unifying framework. In International conference on artificial intelligence and statistics, pp.\ 2207--2217. PMLR, 2020

  38. [38]

    Kingma, D. P. and Welling, M. Auto-encoding variational Bayes . In International Conference on Learning Representations, 2014

  39. [39]

    Learning latent causal graphs via mixture oracles

    Kivva, B., Rajendran, G., Ravikumar, P., and Aragam, B. Learning latent causal graphs via mixture oracles. Advances in Neural Information Processing Systems, 34: 0 18087--18101, 2021

  40. [40]

    Partial disentanglement for domain adaptation

    Kong, L., Xie, S., Yao, W., Zheng, Y., Chen, G., Stojanov, P., Akinwande, V., and Zhang, K. Partial disentanglement for domain adaptation. In International conference on machine learning, pp.\ 11455--11472. PMLR, 2022

  41. [41]

    Kori, A., Sanchez, P., Vilouras, K., Glocker, B., and Tsaftaris, S. A. A causal ordering prior for unsupervised representation learning. arXiv preprint arXiv:2307.05704, 2023

  42. [42]

    R., Sharma, Y., Everett, K., Priol, R

    Lachapelle, S., L \'o pez, P. R., Sharma, Y., Everett, K., Priol, R. L., Lacoste, A., and Lacoste-Julien, S. Disentanglement via mechanism sparsity regularization: A new principle for nonlinear ICA . Conference on Causal Learning and Reasoning, 2022

  43. [43]

    R., Sharma, Y., Everett, K., Priol, R

    Lachapelle, S., López, P. R., Sharma, Y., Everett, K., Priol, R. L., Lacoste, A., and Lacoste-Julien, S. Nonparametric partial disentanglement via mechanism sparsity: Sparse actions, interventions and sparse temporal dependencies. arXiv preprint arXiv:2401.04890, 2024

  44. [44]

    Li, D., Yang, Y., Song, Y.-Z., and Hospedales, T. M. Deeper, broader and artier domain generalization. In Proceedings of the IEEE international conference on computer vision, pp.\ 5542--5550, 2017

  45. [45]

    Multidomain adaptation with sample and source distillation

    Li, K., Lu, J., Zuo, H., and Zhang, G. Multidomain adaptation with sample and source distillation. IEEE Transactions on Cybernetics, 54 0 (4): 0 2193--2205, 2023 a

  46. [46]

    Multi-source domain adaptation handling inaccurate label spaces

    Li, K., Lu, J., Zuo, H., and Zhang, G. Multi-source domain adaptation handling inaccurate label spaces. Neurocomputing, 594: 0 127824, 2024 a

  47. [47]

    T-svdnet: Exploring high-order prototypical correlations for multi-source domain adaptation

    Li, R., Jia, X., He, J., Chen, S., and Hu, Q. T-svdnet: Exploring high-order prototypical correlations for multi-source domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp.\ 9991--10000, 2021

  48. [48]

    Transferable feature filtration network for multi-source domain adaptation

    Li, Y., Wang, S., Wang, B., Hao, Z., and Chai, H. Transferable feature filtration network for multi-source domain adaptation. Knowledge-Based Systems, 260: 0 110113, 2023 b

  49. [49]

    Subspace identification for multi-source domain adaptation

    Li, Z., Cai, R., Chen, G., Sun, B., Hao, Z., and Zhang, K. Subspace identification for multi-source domain adaptation. Advances in Neural Information Processing Systems, 36, 2024 b

  50. [50]

    u gelgen, J., Buchholz, S., Besserve, M., Gresele, L., and Sch \

    Liang, W., Keki \'c , A., von K \"u gelgen, J., Buchholz, S., Besserve, M., Gresele, L., and Sch \"o lkopf, B. Causal component analysis. In Thirty-seventh Conference on Neural Information Processing Systems, 2023

  51. [51]

    Factorizing multivariate function classes

    Lin, J. Factorizing multivariate function classes. Advances in neural information processing systems, 10, 1997

  52. [52]

    M., Cohen, T., and Gavves, S

    Lippe, P., Magliacane, S., L \"o we, S., Asano, Y. M., Cohen, T., and Gavves, S. CITRIS : Causal identifiability from temporal intervened sequences. In International Conference on Machine Learning, 2022

  53. [53]

    M., Cohen, T., and Gavves, E

    Lippe, P., Magliacane, S., L \"o we, S., Asano, Y. M., Cohen, T., and Gavves, E. Causal representation learning for instantaneous and temporal effects in interactive systems. In The Eleventh International Conference on Learning Representations, 2023

  54. [54]

    Detecting and correcting for label shift with black box predictors

    Lipton, Z., Wang, Y.-X., and Smola, A. Detecting and correcting for label shift with black box predictors. In International conference on machine learning, pp.\ 3122--3130. PMLR, 2018

  55. [55]

    Liu, Y., Zhang, Z., Gong, D., Gong, M., Huang, B., Hengel, A. v. d., Zhang, K., and Shi, J. Q. Identifiable latent causal content for domain adaptation under latent covariate shift. arXiv preprint arXiv:2208.14161, 2022

  56. [56]

    Challenging common assumptions in the unsupervised learning of disentangled representations

    Locatello, F., Bauer, S., Lucic, M., Raetsch, G., Gelly, S., Sch \"o lkopf, B., and Bachem, O. Challenging common assumptions in the unsupervised learning of disentangled representations. In International conference on machine learning, pp.\ 4114--4124. PMLR, 2019

  57. [57]

    Learning transferable features with deep adaptation networks

    Long, M., Cao, Y., Wang, J., and Jordan, M. Learning transferable features with deep adaptation networks. In International conference on machine learning, pp.\ 97--105. PMLR, 2015

  58. [58]

    Long, M., Zhu, H., Wang, J., and Jordan, M. I. Deep transfer learning with joint adaptation networks. In International conference on machine learning, pp.\ 2208--2217. PMLR, 2017

  59. [59]

    Long, M., CAO, Z., Wang, J., and Jordan, M. I. Conditional adversarial domain adaptation. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018

  60. [60]

    M., and Sch \"o lkopf, B

    Lu, C., Wu, Y., Hern \'a ndez-Lobato, J. M., and Sch \"o lkopf, B. Invariant causal representation learning for out-of-distribution generalization. In International Conference on Learning Representations, 2021

  61. [61]

    Magliacane, S., Van Ommen, T., Claassen, T., Bongers, S., Versteeg, P., and Mooij, J. M. Domain adaptation by using causal inference to predict invariant conditional distributions. Advances in neural information processing systems, 31, 2018

  62. [62]

    R., Caputo, B., and Ricci, E

    Mancini, M., Porzi, L., Bulo, S. R., Caputo, B., and Ricci, E. Boosting domain adaptation by discovering latent domains. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3771--3780, 2018

  63. [63]

    Reliable causal discovery with improved exact search and weaker assumptions

    Ng, I., Zheng, Y., Zhang, J., and Zhang, K. Reliable causal discovery with improved exact search and weaker assumptions. In Advances in Neural Information Processing Systems, 2021

  64. [64]

    Pan, S. J. and Yang, Q. A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22 0 (10): 0 1345--1359, 2009

  65. [65]

    Park, G. Y. and Lee, S. W. Information-theoretic regularization for multi-source domain adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp.\ 9214--9223, 2021

  66. [66]

    M., Gopalan, R., Li, R., and Chellappa, R

    Patel, V. M., Gopalan, R., Li, R., and Chellappa, R. Visual domain adaptation: A survey of recent advances. IEEE Signal Processing Magazine, 32 0 (3): 0 53--69, 2015. doi:10.1109/MSP.2014.2347059

  67. [67]

    Moment matching for multi-source domain adaptation

    Peng, X., Bai, Q., Xia, X., Huang, Z., Saenko, K., and Wang, B. Moment matching for multi-source domain adaptation. In Proceedings of the IEEE/CVF international conference on computer vision, pp.\ 1406--1415, 2019

  68. [68]

    Unsupervised learning under latent label shift

    Roberts, M., Mani, P., Garg, S., and Lipton, Z. Unsupervised learning under latent label shift. Advances in Neural Information Processing Systems, 35: 0 18763--18778, 2022

  69. [69]

    Maximum classifier discrepancy for unsupervised domain adaptation

    Saito, K., Watanabe, K., Ushiku, Y., and Harada, T. Maximum classifier discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.\ 3723--3732, 2018

  70. [70]

    On causal and anticausal learning

    Sch \"o lkopf, B., Janzing, D., Peters, J., Sgouritsa, E., Zhang, K., and Mooij, J. On causal and anticausal learning. In Proceedings of the 29th International Conference on International Conference on Machine Learning, pp.\ 459--466, 2012

  71. [71]

    R., Kalchbrenner, N., Goyal, A., and Bengio, Y

    Sch \"o lkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., and Bengio, Y. Towards causal representation learning. Proceedings of the IEEE, 109 0 (5): 0 612--634, 2021

  72. [72]

    Weakly supervised disentangled generative causal representation learning

    Shen, X., Liu, F., Dong, H., Lian, Q., Chen, Z., and Zhang, T. Weakly supervised disentangled generative causal representation learning. Journal of Machine Learning Research, 23 0 (241): 0 1--55, 2022

  73. [73]

    Improving predictive inference under covariate shift by weighting the log-likelihood function

    Shimodaira, H. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of statistical planning and inference, 90 0 (2): 0 227--244, 2000

  74. [74]

    H., Narui, H., and Ermon, S

    Shu, R., Bui, H. H., Narui, H., and Ermon, S. A DIRT-T approach to unsupervised domain adaptation. arXiv preprint arXiv:1802.08735, 2018

  75. [75]

    X., and Wang, B

    Shui, C., Li, Z., Li, J., Gagn \'e , C., Ling, C. X., and Wang, B. Aggregating from multiple target-shifted sources. In International Conference on Machine Learning, pp.\ 9638--9648. PMLR, 2021

  76. [76]

    Learning the structure of linear latent variable models

    Silva, R., Scheines, R., Glymour, C., and Spirtes, P. Learning the structure of linear latent variable models. Journal of Machine Learning Research, 7 0 (8): 0 191--246, 2006. URL http://jmlr.org/papers/v7/silva06a.html

  77. [77]

    Causation, Prediction, and Search

    Spirtes, P., Glymour, C., and Scheines, R. Causation, Prediction, and Search. MIT press, 2nd edition, 2001

  78. [78]

    S., and Uhler, C

    Squires, C., Seigal, A., Bhate, S. S., and Uhler, C. Linear causal disentanglement via interventions. In International Conference on Machine Learning, 2023

  79. [79]

    Data-driven approach to multiple-source domain adaptation

    Stojanov, P., Gong, M., Carbonell, J., and Zhang, K. Data-driven approach to multiple-source domain adaptation. In The 22nd International Conference on Artificial Intelligence and Statistics, pp.\ 3487--3496. PMLR, 2019

  80. [80]

    Domain adaptation with invariant representation learning: What transformations to learn? Advances in Neural Information Processing Systems, 34: 0 24791--24803, 2021

    Stojanov, P., Li, Z., Gong, M., Cai, R., Carbonell, J., and Zhang, K. Domain adaptation with invariant representation learning: What transformations to learn? Advances in Neural Information Processing Systems, 34: 0 24791--24803, 2021

Showing first 80 references.