Shared Generative Latent Representation Learning for Multi-view Clustering

Junbin Gao; Ming Yin; Weitian Huang

arxiv: 1907.09747 · v1 · pith:XQA22FZ3new · submitted 2019-07-23 · 💻 cs.CV

Pith reviewed 2026-05-24 17:45 UTC · model grok-4.3

classification 💻 cs.CV

keywords: multi-view clustering, generative latent representation, mixture of Gaussians, deep generative model, shared embedding, cross-view correlation, nonlinear features

The pith

A shared generative latent representation modeled as a mixture of Gaussians clusters multi-view data more accurately by capturing cross-view correlations and nonlinear features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a multi-view clustering method that learns one generative latent representation shared across all input views, where this representation follows a mixture of Gaussian distributions. The approach rests on the premise that diverse views of the same objects still share an underlying common embedding. Deep generative techniques are used to extract nonlinear features from each view while explicitly modeling the statistical dependencies that link the views together. This design is intended to overcome limitations of earlier methods on large-scale data and on accurate sample reconstruction. If the shared representation works as described, clustering decisions drawn from the latent space should integrate information from every view more effectively than single-view or non-generative alternatives.
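The generative story described above can be sketched in a few lines. This is a toy illustration, not the paper's model: the function name `sample_multiview`, the `tanh` decoders, the random decoder weights, and the noise level are all assumptions made for the example. It draws a cluster label, samples one shared latent vector from that cluster's Gaussian, and pushes the same latent through a separate nonlinear map per view.

```python
import numpy as np

def sample_multiview(n, pi, mu, sigma, view_weights, noise=0.1, seed=0):
    """Draw n samples from a toy shared-latent multi-view generative model.

    c   ~ Categorical(pi)
    z   ~ N(mu[c], diag(sigma[c] ** 2))      (one latent shared by all views)
    x_v = tanh(z @ W_v) + Gaussian noise     (hypothetical decoder per view)
    """
    rng = np.random.default_rng(seed)
    pi = np.asarray(pi, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    c = rng.choice(len(pi), size=n, p=pi)                  # cluster label per sample
    z = mu[c] + sigma[c] * rng.standard_normal((n, mu.shape[1]))
    views = [
        np.tanh(z @ W) + noise * rng.standard_normal((n, W.shape[1]))
        for W in view_weights
    ]
    return c, z, views

# Three clusters in a 2-D latent space, observed through two views (5-D and 8-D).
weight_rng = np.random.default_rng(1)
pi = [0.5, 0.3, 0.2]
mu = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
sigma = np.full((3, 2), 0.5)
view_weights = [weight_rng.standard_normal((2, 5)), weight_rng.standard_normal((2, 8))]
c, z, views = sample_multiview(200, pi, mu, sigma, view_weights)
```

Because every view is a function of the same `z`, any dependence between the views in this toy comes entirely from the shared latent, which is the premise the paper's argument rests on.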

Core claim

The proposed model learns a shared generative latent representation that obeys a mixture of Gaussian distributions from multi-view data; this representation simultaneously extracts nonlinear features from each view and captures the correlations among all views, yielding improved clustering performance on datasets of varying scales.

What carries the argument

shared generative latent representation obeying a mixture of Gaussian distributions

If this is right

Clustering accuracy rises because the latent space integrates information from every view rather than treating views in isolation.
Sample reconstruction quality improves relative to prior multi-view methods that lack an explicit generative component.
The same learned representation supports clustering on both small and large-scale datasets without separate scaling adjustments.
Nonlinear feature extraction becomes automatic through the deep generative pathway instead of requiring hand-crafted kernels or linear projections.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The mixture-of-Gaussians structure could be replaced by other flexible priors to test whether the clustering gains depend on the specific distributional form.
The shared representation might transfer to related tasks such as multi-view classification or cross-view retrieval without retraining the full model.
If the assumption of a shared embedding holds only for certain data domains, the method would be expected to degrade on views with fundamentally incompatible structures.
Extending the generative component to allow view-specific noise terms could relax the strict common-embedding requirement while retaining the correlation-capturing benefit.

Load-bearing premise

Multi-view data share a single common latent embedding despite differences among the views.

What would settle it

On a dataset constructed so that the views are generated from completely independent latent factors, the method would show no accuracy gain over the best single-view clustering baseline.
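A minimal sketch of how such a control dataset could be built, assuming two views and cluster labels as the only latent factor. The function `make_views`, the dimensions, and the Gaussian cluster centers are hypothetical choices for illustration. With `shared=False` the second view receives an independently drawn label, so the two views have no common latent structure.

```python
import numpy as np

def make_views(n, k, shared, seed=0):
    """Build two toy views whose only latent factor is a cluster label.

    shared=True : both views are driven by the same label.
    shared=False: view 2 gets an independently drawn label.
    """
    rng = np.random.default_rng(seed)
    c1 = rng.integers(0, k, size=n)
    c2 = c1.copy() if shared else rng.integers(0, k, size=n)
    centers1 = 5.0 * rng.standard_normal((k, 4))   # hypothetical view-1 cluster centers
    centers2 = 5.0 * rng.standard_normal((k, 6))   # hypothetical view-2 cluster centers
    x1 = centers1[c1] + rng.standard_normal((n, 4))
    x2 = centers2[c2] + rng.standard_normal((n, 6))
    return (x1, x2), (c1, c2)

_, (a1, a2) = make_views(1000, 3, shared=True)
_, (b1, b2) = make_views(1000, 3, shared=False)
shared_agreement = float(np.mean(a1 == a2))        # labels coincide by construction
independent_agreement = float(np.mean(b1 == b2))   # about 1/k in expectation
```

Running a multi-view method and the best single-view baseline on both variants would then show whether the gain disappears in the independent case. In that case the ground truth has to be fixed per view, since the two views no longer agree on a single labelling.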

Figures

Figures reproduced from arXiv: 1907.09747 by Junbin Gao, Ming Yin, Weitian Huang.

**Figure 2.** Visualization of the latent subspaces of the Caltech-7 dataset. [PITH_FULL_IMAGE:figures/full_fig_p015_2.png]

**Figure 3.** Visualization of the latent subspaces of UCI digits learned by DMVCVAE. [PITH_FULL_IMAGE:figures/full_fig_p015_3.png]

Abstract

Clustering multi-view data has been a fundamental research topic in the computer vision community. It has been shown that integrating information from all views achieves better accuracy than using any single view alone. However, existing methods often struggle with large-scale datasets and perform poorly at reconstructing samples. This paper proposes a novel multi-view clustering method that learns a shared generative latent representation obeying a mixture of Gaussian distributions. The motivation is that multi-view data share a common latent embedding despite the diversity among the views. Specifically, benefiting from the success of deep generative learning, the proposed model not only extracts nonlinear features from the views but also shows a powerful ability to capture the correlations among all the views. Extensive experimental results on several datasets of different scales demonstrate that the proposed method outperforms state-of-the-art methods under a range of performance criteria.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper extends deep generative models with a shared GMM latent for multi-view clustering but provides no ablations to confirm the shared embedding drives the gains.

The core move here is encoding each view with its own deep network into one shared latent z drawn from a mixture of Gaussians, then clustering directly on z. That framing is presented as new relative to the cited priors, and the experiments on datasets of varying sizes claim better numbers than the baselines on the usual metrics. The work does handle the nonlinear feature extraction and large-scale angle that the abstract flags as motivation, and the results section appears to deliver on the outperformance claim at least at the level reported. The math follows the standard VAE-plus-GMM template without obvious internal contradictions.

The soft spot is the load-bearing assumption that a single shared latent is both necessary and sufficient to capture cross-view correlations. The paper states the motivation but does not run the obvious control that replaces the shared z with view-specific latents or explicit cross terms and measures the drop. Without that, the performance edge cannot be pinned specifically on the shared-embedding hypothesis rather than added model capacity. The citation pattern is conventional for the subfield and does not hide gaps.

This is incremental work aimed at the multi-view clustering crowd in computer vision. A reader looking for a generative baseline with reported numbers on standard benchmarks would find it useful, but anyone wanting to understand why the shared structure matters would need the missing controls. It is worth sending to a serious referee who can ask for those ablations and check the implementation details.
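For readers unfamiliar with the "VAE-plus-GMM template" the letter refers to, a generic version for V views can be written as follows. This is the textbook form of such models under the assumption that the views are conditionally independent given z. It is not transcribed from the paper, and the symbols π, μ_c, σ_c and the variational posterior q are notation chosen here.

```latex
% Generic shared-latent generative factorization (assumed, not the paper's exact notation)
p\bigl(x^{(1)},\dots,x^{(V)}, z, c\bigr)
  = p(c)\, p(z \mid c) \prod_{v=1}^{V} p\bigl(x^{(v)} \mid z\bigr),
\qquad
p(c) = \mathrm{Cat}(c \mid \pi),
\quad
p(z \mid c) = \mathcal{N}\bigl(z \mid \mu_c, \operatorname{diag}(\sigma_c^{2})\bigr)

% Evidence lower bound: reconstruction of every view minus a KL term to the GMM prior
\mathcal{L}_{\mathrm{ELBO}}
  = \mathbb{E}_{q(z, c \mid x^{(1:V)})}\Bigl[\sum_{v=1}^{V} \log p\bigl(x^{(v)} \mid z\bigr)\Bigr]
  - \mathrm{KL}\bigl(q(z, c \mid x^{(1:V)}) \,\big\|\, p(z, c)\bigr)
```

The first term rewards reconstructing each view from the shared z, and the KL term pulls the inferred latent toward the mixture prior, which is what makes cluster structure appear in z.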

Referee Report

1 major / 1 minor

Summary. The paper proposes a multi-view clustering method that learns a shared generative latent representation z ~ mixture of Gaussians. Each view is mapped by its own deep encoder network into this common latent space; clustering is then performed on z. The central motivation is that multi-view data share a common latent embedding despite view diversity; the model is claimed to extract nonlinear features and capture cross-view correlations, with experiments on datasets of varying scales showing outperformance over prior methods.
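The step "clustering is then performed on z" can be illustrated with the usual mixture-posterior computation. The sketch below assumes diagonal covariances and already-fitted mixture parameters. The function name `gmm_responsibilities` and the toy parameters are hypothetical, and the paper may assign clusters differently.

```python
import numpy as np

def gmm_responsibilities(z, pi, mu, var):
    """Posterior p(c | z) under a diagonal-covariance Gaussian mixture.

    z: (n, d) latent codes, pi: (k,) mixing weights,
    mu: (k, d) means, var: (k, d) variances. Returns an (n, k) array.
    """
    z, pi, mu, var = (np.asarray(a, dtype=float) for a in (z, pi, mu, var))
    diff = z[:, None, :] - mu[None, :, :]                      # (n, k, d)
    log_pdf = -0.5 * np.sum(diff**2 / var + np.log(2.0 * np.pi * var), axis=2)
    log_joint = np.log(pi) + log_pdf                           # (n, k)
    log_joint -= log_joint.max(axis=1, keepdims=True)          # stabilise before exp
    gamma = np.exp(log_joint)
    return gamma / gamma.sum(axis=1, keepdims=True)

# Toy check: evaluate the posterior at the three component means.
mu = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
var = np.full((3, 2), 0.25)
pi = np.array([0.5, 0.3, 0.2])
gamma = gmm_responsibilities(mu, pi, mu, var)
labels = gamma.argmax(axis=1)   # hard cluster assignment per latent code
```

Taking the argmax of each row gives a hard assignment, while the rows themselves can be read as soft cluster memberships.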

Significance. If the shared-embedding hypothesis is empirically supported, the work would offer a generative deep-learning route to multi-view clustering that addresses reconstruction and scalability limitations of earlier approaches. The combination of per-view encoders with a single GMM latent space is a natural extension of VAE-style models to the multi-view setting and could be reusable if the ablation gap is closed.

major comments (1)

[Model and Experiments sections] The central claim that performance gains arise from capturing correlations via a shared latent embedding (abstract and motivation) is load-bearing yet untested. No ablation replaces the single shared z with view-specific latents (or adds explicit cross-view terms) while keeping the same deep encoders and GMM clustering step; without this comparison on the same datasets, gains cannot be attributed to the shared-embedding hypothesis rather than added model capacity.

minor comments (1)

[Abstract] Abstract states only high-level motivation and claims; quantitative results, architecture details, and loss formulations appear only later, which slows assessment of the contribution.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the manuscript. The major comment raises an important point about validating the contribution of the shared latent embedding, which we address below.

Referee: [Model and Experiments sections] The central claim that performance gains arise from capturing correlations via a shared latent embedding (abstract and motivation) is load-bearing yet untested. No ablation replaces the single shared z with view-specific latents (or adds explicit cross-view terms) while keeping the same deep encoders and GMM clustering step; without this comparison on the same datasets, gains cannot be attributed to the shared-embedding hypothesis rather than added model capacity.

Authors: We agree that the current experiments do not include a direct ablation that isolates the shared latent space by replacing it with view-specific latents while holding encoder depth, GMM clustering, and other components fixed. The existing comparisons are against prior multi-view methods rather than controlled variants of the proposed architecture. To address this, the revised manuscript will add such an ablation study on the same datasets, training a view-specific latent variant (independent per-view GMMs) with matched encoder capacity for direct comparison. This will allow clearer attribution of gains to the shared-embedding design.

Revision: yes

Circularity Check

0 steps flagged

No circularity; model is an empirical architecture with external validation

The paper introduces a deep generative model that encodes views into a shared latent z ~ GMM and performs clustering on z. The central claim (nonlinear feature extraction and cross-view correlation capture) is presented as a modeling choice motivated by the shared-embedding assumption, then validated by outperforming baselines on multiple datasets. No equations reduce a 'prediction' to a fitted input by construction, no load-bearing self-citations appear, and no uniqueness theorem or ansatz is smuggled in. The derivation chain is therefore self-contained against external benchmarks rather than tautological.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that a common latent embedding exists across views and that a Gaussian mixture in that space captures the data distribution; no free parameters or invented entities are explicitly quantified in the abstract.

free parameters (1)

Number of mixture components
The number of Gaussians (clusters) must be chosen or tuned; this is a standard free parameter in GMM-based clustering.

axioms (1)

Domain assumption: multi-view data share a common latent embedding despite view diversity
Explicitly stated as the motivation for the shared representation.

