Recognition: 1 theorem link · Lean theorem
GAN-based Domain Adaptation for Image-aware Layout Generation in Advertising Poster Design
Pith reviewed 2026-05-10 17:52 UTC · model grok-4.3
The pith
A GAN with a pixel-level discriminator adapts to clean product images and generates image-aware layouts for advertising posters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PDA-GAN introduces a pixel-level discriminator attached to shallow feature maps that computes per-pixel GAN loss, allowing the layout generator trained on inpainted posters to produce high-quality image-aware layouts when given clean product images; quantitative and qualitative results on the CGL-Dataset show this outperforms prior models.
What carries the argument
The pixel-level discriminator (PD) in PDA-GAN, which applies the adversarial loss to individual pixels of shallow feature maps to close the domain gap between inpainted training posters and clean input images.
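The excerpt does not give the PD's exact formulation, but the idea it describes — an adversarial loss computed independently at every spatial position of a shallow feature map — can be sketched as follows. This is a minimal illustration, not the paper's confirmed architecture; the logit map is assumed to come from a small discriminator head (e.g. a 1×1 convolution) over shallow features:

```python
import math

def per_pixel_gan_loss(logits, is_real):
    """Binary cross-entropy applied independently at every spatial
    position of a discriminator logit map, then averaged over pixels.

    logits : 2D list of floats, one logit per feature-map position
    is_real: domain label shared by all pixels of this image
             (True = clean/real domain, False = inpainted/fake domain)
    """
    target = 1.0 if is_real else 0.0
    total, count = 0.0, 0
    for row in logits:
        for z in row:
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid over the pixel logit
            # standard BCE term at this single pixel
            total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
            count += 1
    return total / count

# Example: a 2x2 logit map judged against the "real" (clean-image) label
loss = per_pixel_gan_loss([[2.0, -1.0], [0.5, 3.0]], is_real=True)
```

Because every pixel contributes its own loss term, gradients flow to localized regions (such as inpainting artifacts) rather than to a single image-level decision, which is the mechanism the adaptation claim rests on.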
If this is right
- Layouts respect fine visual details such as product shape and texture instead of treating the image as a uniform background.
- The three new content-aware metrics give a more relevant score for how graphic elements interact with image content than position-only measures.
- The same unsupervised adaptation step can be added to other conditional layout generators that must train on imperfect or synthetic data.
Where Pith is reading between the lines
- The pixel discriminator may help any image-conditioned generative model when the training images contain localized artifacts.
- The new metrics could be reused to benchmark layout quality in mobile UI or magazine design tasks.
- If the inpainting method improves, the remaining domain gap shrinks and the adaptation network may become simpler or unnecessary.
Load-bearing premise
The visual artifacts left by inpainting create a domain gap that unsupervised pixel-level adaptation can close without any paired clean training examples.
What would settle it
Run PDA-GAN and the Gaussian-blur baseline on a held-out set of real clean product images never seen during training; if human raters or the new content-aware metrics show no improvement in layout quality or content alignment, the adaptation claim is false.
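The rater half of this test could be scored with a simple one-sided sign test: under the null hypothesis that raters have no preference between PDA-GAN and the baseline, each paired comparison is a fair coin flip. The counts below are hypothetical and only illustrate the decision rule:

```python
from math import comb

def sign_test_p_value(wins, trials):
    """One-sided binomial (sign) test: probability of observing at least
    `wins` preferences for PDA-GAN out of `trials` paired comparisons
    if raters actually had no preference (p = 0.5 under the null)."""
    return sum(comb(trials, k) for k in range(wins, trials + 1)) / 2 ** trials

# Hypothetical outcome: raters prefer the PDA-GAN layout in 41 of 50 pairs.
p = sign_test_p_value(41, 50)  # a small p-value means the adaptation claim survives
```

A p-value near 0.5 or above would be the failure condition described here: no detectable improvement on held-out clean images.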
Original abstract
Layout plays a crucial role in graphic design and poster generation. Recently, the application of deep learning models for layout generation has gained significant attention. This paper focuses on using a GAN-based model conditioned on images to generate advertising poster graphic layouts, requiring a dataset of paired product images and layouts. To address this task, we introduce the Content-aware Graphic Layout Dataset (CGL-Dataset), consisting of 60,548 paired inpainted posters with annotations and 121,000 clean product images. The inpainting artifacts introduce a domain gap between the inpainted posters and clean images. To bridge this gap, we design two GAN-based models. The first model, CGL-GAN, uses Gaussian blur on the inpainted regions to generate layouts. The second model combines unsupervised domain adaptation by introducing a GAN with a pixel-level discriminator (PD), abbreviated as PDA-GAN, to generate image-aware layouts based on the visual texture of input images. The PD is connected to shallow-level feature maps and computes the GAN loss for each input-image pixel. Additionally, we propose three novel content-aware metrics to assess the model's ability to capture the intricate relationships between graphic elements and image content. Quantitative and qualitative evaluations demonstrate that PDA-GAN achieves state-of-the-art performance and generates high-quality image-aware layouts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Content-aware Graphic Layout Dataset (CGL-Dataset) with 60,548 paired inpainted posters and 121,000 clean product images to support image-conditioned layout generation for advertising posters. It proposes CGL-GAN (using Gaussian blur on inpainted regions) and PDA-GAN (an unsupervised domain-adaptation GAN with a pixel-level discriminator attached to shallow feature maps that computes per-pixel GAN loss). Three new content-aware metrics are defined to evaluate how well generated layouts respect image content, and the authors claim that PDA-GAN achieves state-of-the-art quantitative and qualitative results on clean product images.
Significance. If the domain adaptation demonstrably aligns the inpainted training distribution with clean test images and the new metrics are shown to be non-redundant with existing layout metrics, the work would provide a practical advance for automated poster design and a useful public dataset. The explicit handling of the inpainting artifact gap is a relevant engineering contribution in conditional layout generation.
Major comments (3)
- [Method / PDA-GAN architecture description] The central claim that PDA-GAN produces image-aware layouts on clean inputs rests on the pixel-level discriminator successfully bridging the inpainting domain gap. No domain-discrepancy metrics (e.g., MMD, Fréchet distance on feature distributions, or discriminator accuracy on held-out clean vs. inpainted pairs), t-SNE visualizations, or ablation removing the PD are reported. Without such evidence the performance gains on clean images could be attributable to the base conditional generator rather than adaptation.
- [Evaluation / Metrics section] The three proposed content-aware metrics are introduced to quantify relationships between graphic elements and image content, yet the manuscript provides neither their exact mathematical definitions nor a correlation analysis showing they capture information orthogonal to standard layout metrics (e.g., IoU, overlap, or alignment scores). This weakens the assertion that PDA-GAN is superior specifically in content awareness.
- [Experiments / Quantitative comparison] Table of quantitative results (presumably in §5) reports SOTA numbers for PDA-GAN, but the baselines listed do not include any other domain-adaptation or inpainting-robust layout models. Consequently it is impossible to isolate whether the reported improvement stems from the pixel discriminator or from other architectural choices.
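The domain-discrepancy evidence the first major comment asks for could be as simple as a Maximum Mean Discrepancy (MMD) estimate between pooled shallow features of inpainted and clean images. The sketch below uses a Gaussian kernel and toy feature vectors; the feature extraction step and kernel bandwidth are assumptions, not details from the paper:

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(source, target, gamma=1.0):
    """Biased squared Maximum Mean Discrepancy between two sets of
    feature vectors. Near zero when the two feature distributions match,
    positive when they differ."""
    def mean_k(xs, ys):
        return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))
    return mean_k(source, source) + mean_k(target, target) - 2.0 * mean_k(source, target)

# Toy 2-D features standing in for pooled shallow-layer activations.
inpainted = [[0.9, 0.1], [1.1, 0.0], [1.0, 0.2]]
clean     = [[0.2, 0.8], [0.1, 1.0], [0.3, 0.9]]
gap = mmd2(inpainted, clean)  # shrinking gap after adaptation would support the claim
```

Reporting this quantity before and after training with the PD (alongside a no-PD ablation) would directly address the attribution concern.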
Minor comments (2)
- [Abstract] The abstract introduces the abbreviation PDA-GAN without first spelling out “Pixel-level Discriminator Adaptation GAN.”
- [Method] Notation for the per-pixel GAN loss (how the discriminator output is aggregated over pixels) is not defined in the provided description of the PD attachment to shallow feature maps.
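The aggregation the second minor comment asks about is not specified in the excerpt. One conventional formulation — a sketch under assumed notation, not the paper's confirmed definition — averages a per-pixel binary cross-entropy over the spatial grid of the shallow feature map:

```latex
\mathcal{L}_{\mathrm{PD}}
  = -\frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W}
    \Big[\, y \log D_{ij}\big(F(x)\big)
         + (1 - y) \log\big(1 - D_{ij}(F(x))\big) \Big]
```

Here $F(x)$ denotes the $H \times W$ shallow feature map of image $x$, $D_{ij}$ the discriminator's sigmoid output at position $(i, j)$, and $y \in \{0, 1\}$ the domain label (clean vs. inpainted). Spelling out whichever aggregation the paper actually uses would resolve the comment.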
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our manuscript. We address each major point below, providing clarifications and committing to revisions where appropriate to strengthen the evidence for our domain adaptation approach and metrics.
Point-by-point responses
Referee: The central claim that PDA-GAN produces image-aware layouts on clean inputs rests on the pixel-level discriminator successfully bridging the inpainting domain gap. No domain-discrepancy metrics (e.g., MMD, Fréchet distance on feature distributions, or discriminator accuracy on held-out clean vs. inpainted pairs), t-SNE visualizations, or ablation removing the PD are reported. Without such evidence the performance gains on clean images could be attributable to the base conditional generator rather than adaptation.
Authors: We appreciate this observation. The superior quantitative and qualitative results of PDA-GAN over CGL-GAN on clean images provide supporting evidence that the pixel discriminator contributes to bridging the domain gap. To directly address the concern and strengthen the claim, we will add an ablation study comparing PDA-GAN with and without the pixel discriminator, along with domain-discrepancy metrics such as MMD on feature distributions and t-SNE visualizations of inpainted vs. clean image features in the revised manuscript.
Revision: yes
Referee: The three proposed content-aware metrics are introduced to quantify relationships between graphic elements and image content, yet the manuscript provides neither their exact mathematical definitions nor a correlation analysis showing they capture information orthogonal to standard layout metrics (e.g., IoU, overlap, or alignment scores). This weakens the assertion that PDA-GAN is superior specifically in content awareness.
Authors: The exact mathematical definitions of the three content-aware metrics are detailed in Section 4.3 of the manuscript. We agree that an explicit correlation analysis would better demonstrate their value. In the revised version, we will add a correlation study (including Pearson or Spearman coefficients) between our metrics and standard layout metrics such as IoU, overlap, and alignment scores to show they capture orthogonal information related to content awareness.
Revision: yes
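The promised correlation study amounts to computing pairwise correlations between per-layout scores. A minimal Pearson sketch is shown below; the score lists are hypothetical illustrations, not the paper's data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two per-layout score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-layout scores: a content-aware metric vs. plain IoU.
content_metric = [0.31, 0.58, 0.44, 0.72, 0.66]
iou            = [0.50, 0.52, 0.49, 0.55, 0.51]
r = pearson(content_metric, iou)  # low |r| would suggest orthogonal information
```

A low absolute correlation with position-only metrics is what would substantiate the claim that the new metrics measure something genuinely different.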
Referee: Table of quantitative results (presumably in §5) reports SOTA numbers for PDA-GAN, but the baselines listed do not include any other domain-adaptation or inpainting-robust layout models. Consequently it is impossible to isolate whether the reported improvement stems from the pixel discriminator or from other architectural choices.
Authors: We note that, to the best of our knowledge at submission time, no prior domain-adaptation methods existed specifically for image-conditioned layout generation on inpainted advertising posters. The comparison between CGL-GAN and PDA-GAN is designed to isolate the effect of the pixel discriminator. We will expand the related work and discussion sections to cover relevant domain adaptation techniques from other vision tasks and clarify the rationale for our baseline selection in the revision.
Revision: partial
Circularity Check
No significant circularity in the empirical GAN training and evaluation chain.
Full rationale
The paper presents a standard empirical pipeline: creation of a paired inpainted/clean image dataset, training of CGL-GAN and PDA-GAN variants (with pixel-level discriminator attached to shallow features for unsupervised domain adaptation), and evaluation via three proposed content-aware metrics plus quantitative/qualitative results. No derivation step reduces a claimed prediction or result to its inputs by construction, no fitted parameters are relabeled as independent predictions, and no load-bearing self-citations or uniqueness theorems are invoked to force the architecture or metrics. The central claims rest on experimental outcomes rather than tautological redefinitions.
Axiom & Free-Parameter Ledger
Free parameters (1)
- GAN training hyperparameters
Axioms (1)
- Domain assumption: GANs with pixel-level discriminators can adapt layouts to image visual texture.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · tag: unclear
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Linked passage: "PDA-GAN... pixel-level discriminator (PD) attached to shallow-level feature maps... computes the GAN loss for each input-image pixel"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.