pith. machine review for the scientific record.

arxiv: 2510.20299 · v3 · submitted 2025-10-23 · 💻 cs.LG · cs.AI

DB-FGA-Net: Dual Backbone Frequency Gated Attention Network for Multi-Class Brain Tumor Classification with Grad-CAM Interpretability

Pith reviewed 2026-05-18 04:04 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords: brain tumor classification · MRI · deep learning · attention mechanism · Grad-CAM · multi-class · medical imaging · interpretability

The pith

A dual-backbone network with frequency-gated attention classifies brain tumors from MRI at up to 99.24 percent accuracy without data augmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a deep learning model that merges VGG16 and Xception networks through a frequency-gated attention block to sort MRI brain scans into two, three, or four tumor categories. It stresses that its results are obtained without data augmentation, which the authors present as evidence of robustness to datasets of varying size and distribution. The design reaches 99.24 percent accuracy on the main 7000-image collection (7K-DS) for the four-class task and 95.77 percent on a separate 3000-image collection (3K-DS). Grad-CAM maps show the image regions that drive each prediction, improving transparency for medical users. A graphical interface provides real-time classification and tumor localization.

Core claim

The central claim is that combining VGG16 and Xception backbones with a Frequency-Gated Attention Block lets the system capture both fine local details and broader context in MRI images. Without data augmentation, the model reports 99.24 percent accuracy on four-class brain tumor classification, 98.68 percent on the three-class version, and 99.85 percent on the two-class version of the same data. It reaches 95.77 percent accuracy when tested on an independent collection and also supplies Grad-CAM visualizations of the relevant tumor areas.
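A minimal sketch of how features from two backbones could be fused for classification, assuming global average pooling followed by concatenation and a linear softmax head. The random arrays stand in for real VGG16 and Xception outputs (512 and 2048 channels are the usual final channel counts of those networks; the 7x7 grid is an assumption), and `fuse_and_classify` is a hypothetical helper, not the authors' code.

```python
import numpy as np

def global_average_pool(feature_map):
    """Collapse an (H, W, C) feature map to a length-C vector."""
    return feature_map.mean(axis=(0, 1))

def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def fuse_and_classify(feat_a, feat_b, weights, bias):
    """Pool both backbone outputs, concatenate them, apply a linear softmax head."""
    fused = np.concatenate([global_average_pool(feat_a),
                            global_average_pool(feat_b)])
    return softmax(fused @ weights + bias)

rng = np.random.default_rng(0)
# Stand-ins for the final convolutional outputs of the two backbones (assumed shapes).
vgg_features = rng.standard_normal((7, 7, 512))
xception_features = rng.standard_normal((7, 7, 2048))

num_classes = 4  # glioma, meningioma, no tumor, pituitary
weights = rng.standard_normal((512 + 2048, num_classes)) * 0.01  # untrained placeholder
bias = np.zeros(num_classes)

probs = fuse_and_classify(vgg_features, xception_features, weights, bias)
print(probs.shape, float(probs.sum()))
```

Concatenation keeps the two feature sets separate until the classifier head, so each backbone can contribute what the other misses; other fusion rules (summation, learned gating) would be equally plausible readings of the summary.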

What carries the argument

The Frequency-Gated Attention Block, which applies frequency-domain processing to selectively emphasize complementary features extracted from the paired VGG16 and Xception backbones.
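The referee report notes that the gating operation is described in prose only, so the following is an illustrative guess at what "frequency-domain processing" could look like, not the paper's formulation. It transforms each channel with a 2-D FFT, scales every frequency by a sigmoid gate, and transforms back. The function `frequency_gate` and the `gate_logits` parameter are hypothetical names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def frequency_gate(feature_map, gate_logits):
    """Reweight the 2-D spectrum of each channel with a gate in (0, 1).

    feature_map: real array of shape (H, W, C).
    gate_logits: real array of shape (H, W, 1) or (H, W, C); in a trained
                 model these would be learned (assumption).
    """
    spectrum = np.fft.fft2(feature_map, axes=(0, 1))
    gated = spectrum * sigmoid(gate_logits)
    # The gate need not be conjugate-symmetric, so keep only the real part.
    return np.fft.ifft2(gated, axes=(0, 1)).real

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 3))

# Gate fully open: every frequency passes, so the input is recovered.
open_gate = np.full((8, 8, 1), 50.0)
passthrough = frequency_gate(x, open_gate)

# Gate open only at the zero frequency: each channel collapses to its mean.
dc_only = np.full((8, 8, 1), -50.0)
dc_only[0, 0, 0] = 50.0
smoothed = frequency_gate(x, dc_only)

print(np.allclose(passthrough, x), np.allclose(smoothed, x.mean(axis=(0, 1))))
```

The two extreme gates show the range of behavior: an open gate leaves features untouched, while a low-pass gate keeps only coarse structure. A learned gate would sit between these, which is one way to read "selectively emphasize".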

If this is right

  • Classification accuracy remains high across the two-class, three-class, and four-class tumor typing problems on the primary dataset.
  • The model maintains competitive performance when applied to an independent dataset, exceeding several baseline approaches under identical conditions.
  • Grad-CAM heatmaps make the regions used for each decision visible, supporting clinical review of the output.
  • A graphical user interface delivers real-time classification together with tumor localization maps for immediate use.
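The Grad-CAM heatmaps mentioned above follow a short computation once the activations of the last convolutional layer and the gradients of the class score with respect to them are available. The sketch below uses small hand-built arrays in place of a real network; `grad_cam` is an illustrative helper and the inputs are assumptions, not outputs of DB-FGA-Net.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations: (H, W, K) feature maps of the last convolutional layer.
    gradients:   (H, W, K) gradients of the target class score with
                 respect to those feature maps.
    """
    # Channel importance: spatial average of the gradients.
    channel_weights = gradients.mean(axis=(0, 1))
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum((activations * channel_weights).sum(axis=-1), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input: channel 0 supports the class at pixel (1, 2),
# channel 1 argues against it at pixel (3, 0).
activations = np.zeros((4, 4, 2))
activations[1, 2, 0] = 3.0
activations[3, 0, 1] = 5.0
gradients = np.stack([np.ones((4, 4)), -np.ones((4, 4))], axis=-1)

heatmap = grad_cam(activations, gradients)
hot_pixel = np.unravel_index(heatmap.argmax(), heatmap.shape)
print(hot_pixel, heatmap.max(), heatmap[3, 0])
```

In practice the map is upsampled to the input resolution and overlaid on the MRI slice; the ReLU is what restricts the overlay to regions that raise the predicted class score.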

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Success without augmentation suggests that frequency-domain attention can reduce dependence on large-scale labeled medical image collections.
  • The same dual-backbone and gating pattern may transfer to classification tasks involving other scan types such as CT or ultrasound.
  • The localization maps could let physicians quickly compare the model's focus area against their own visual assessment during diagnosis.
  • Prospective evaluation on live patient streams from multiple centers would be a direct way to check readiness for everyday hospital deployment.

Load-bearing premise

The two MRI collections used for development and testing already contain enough of the image variability, scanner differences, and tumor appearances that occur in routine clinical neuro-oncology.

What would settle it

Testing the trained model on a fresh set of MRI scans gathered from additional hospitals or different scanner models and observing a substantial drop in accuracy would directly test whether the reported generalization holds.

Figures

Figures reproduced from arXiv: 2510.20299 by MD. Abu Ismail Siddique, Saraf Anzum Shreya, Sharaf Tasnim.

Figure 1: Sample MRI images from the 7K-DS dataset (Glioma, Meningioma, No Tumor and Pituitary).
Figure 2: Sample MRI images from the 3K-DS dataset (Glioma, Meningioma, No Tumor and Pituitary).
Figure 3: Visual representation of the Workflow of the proposed approach.
Figure 4: Baseline architecture: input images are processed through a CNN backbone, followed by Global Average
Figure 5: Detailed structure of the FGA block, illustrating the sequential channel and spatial attention mechanisms
Figure 6: Integrated Base+FGA architecture, showing the CNN backbone enhanced with FGA, followed by Global
Figure 7: Dual-backbone architecture combining FGA-enhanced VGG16 and Xception.
Figure 8: 4-Class ROC and Confusion Matrix for the proposed model on 7K-DS.
Figure 9: 3-Class ROC and Confusion Matrix for the proposed model on 7K-DS.
Figure 10: 2-Class ROC and Confusion Matrix for the proposed model on 7K-DS.
Figure 11: ROC curve and confusion matrix for the proposed model validated on 3K-DS, illustrating discriminative
Figure 12: Grad-CAM comparison of the CBAM and FGA integrated models on 7K-DS (4-class)
Figure 13: Grad-CAM visualization and analysis of tumor classes for the proposed DB-FGA-Net model
Figure 14: Graphical User Interface (GUI) of DB-FGA-Net. The interface demonstrates predictions and Grad-CAM
Original abstract

Brain tumors are a challenging problem in neuro-oncology, where early and precise diagnosis is important for successful treatment. Deep learning-based brain tumor classification methods often rely on heavy data augmentation which can limit generalization and trust in clinical applications. In this paper, we propose a double-backbone network integrating VGG16 and Xception with a Frequency-Gated Attention (FGA) Block to capture complementary local and global features. Our model achieves highly competitive performance without augmentation which demonstrates robustness to variably sized and distributed datasets. For further transparency, Grad-CAM is integrated to visualize the tumor regions based on which the model is giving prediction, bridging the gap between model prediction and clinical interpretability. The proposed framework achieves 99.24% accuracy on the 7K-DS dataset for the 4-class setting, along with 98.68% and 99.85% in the 3-class and 2-class settings, respectively. On the independent 3K-DS dataset, the model generalizes with 95.77% accuracy, outperforming several baseline methods under the same experimental setting. To further support clinical usability, we developed a graphical user interface (GUI) that provides real-time classification and Grad-CAM-based tumor localization. These findings suggest that augmentation-free, interpretable, and deployable deep learning models such as DB-FGA-Net hold strong potential for reliable clinical translation in brain tumor diagnosis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes DB-FGA-Net, a dual-backbone architecture combining VGG16 and Xception with a Frequency-Gated Attention (FGA) block to classify brain tumors from MRI scans into 2-, 3-, or 4-class settings. The central claims are that the model attains 99.24% accuracy (4-class), 98.68% (3-class), and 99.85% (2-class) on the 7K-DS dataset without data augmentation, generalizes to 95.77% accuracy on an independent 3K-DS dataset while outperforming baselines, and provides clinical utility via Grad-CAM visualizations and a real-time GUI.

Significance. If the experimental claims hold after clarification, the work would demonstrate that dual-backbone designs with frequency-gated attention can deliver competitive augmentation-free performance on brain-tumor MRI tasks, reducing reliance on augmentation that may introduce artifacts. Explicit credit is due for the integration of Grad-CAM for decision visualization and the provision of a deployable GUI, both of which directly support interpretability and potential clinical translation.

major comments (2)
  1. §4.1 (Datasets): The descriptions of 7K-DS and 3K-DS provide no metadata on scanner field strength, acquisition parameters, slice thickness, multi-center sourcing, or patient demographics. This information is load-bearing for the no-augmentation generalization claim, because the 95.77% accuracy on the independent set could reflect limited distribution shift rather than robustness to real clinical variability.
  2. §5 (Experimental Results): Reported accuracies are given as single point estimates without the number of independent runs, standard deviations, or statistical significance tests against baselines. This undermines the claim that DB-FGA-Net 'outperforms several baseline methods under the same experimental setting,' as the magnitude and reliability of improvement cannot be assessed.
minor comments (2)
  1. Figure 13 (Grad-CAM examples): Only a small number of slices are shown; adding representative cases from each tumor class and from both datasets would strengthen the interpretability section.
  2. §3.2 (Notation): The frequency gating operation inside the FGA block is described in prose only; a compact equation or pseudocode would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments on our manuscript. We have carefully considered each point and provide detailed responses below, along with indications of revisions to the manuscript.

Point-by-point responses
  1. Referee: §4.1 (Datasets): The descriptions of 7K-DS and 3K-DS provide no metadata on scanner field strength, acquisition parameters, slice thickness, multi-center sourcing, or patient demographics. This information is load-bearing for the no-augmentation generalization claim, because the 95.77% accuracy on the independent set could reflect limited distribution shift rather than robustness to real clinical variability.

    Authors: We agree that additional metadata on the datasets would provide better context for interpreting the generalization results. The 7K-DS and 3K-DS are standard public datasets used in brain tumor classification literature, but they do not include detailed scanner field strength, acquisition parameters, or patient demographics in their releases. In the revised manuscript, we have expanded Section 4.1 to explicitly state the available information about the datasets and added a discussion on the limitations regarding distribution shift. This clarifies that while the independent dataset demonstrates generalization beyond the training distribution, further validation on multi-center data with full metadata would be valuable for clinical deployment.

    revision: partial

  2. Referee: §5 (Experimental Results): Reported accuracies are given as single point estimates without the number of independent runs, standard deviations, or statistical significance tests against baselines. This undermines the claim that DB-FGA-Net 'outperforms several baseline methods under the same experimental setting,' as the magnitude and reliability of improvement cannot be assessed.

    Authors: We acknowledge that single-point estimates do not fully convey the reliability of the results. To address this, we have re-run the experiments with 5 independent trials using different random seeds for initialization and data shuffling. The revised Section 5 now includes mean accuracy, precision, recall, and F1-score with standard deviations for DB-FGA-Net and all baselines. We have also performed statistical significance testing using paired t-tests, reporting p-values to confirm that the improvements are statistically significant. Updated tables and text reflect these changes.

    revision: yes
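The multi-seed comparison described in the second response can be sketched as follows. The accuracy lists are PLACEHOLDER values invented for illustration and are not results from the paper; `paired_t_statistic` is a hypothetical helper. The critical value 2.776 is the two-sided 5 percent threshold of the t distribution with 4 degrees of freedom, which matches 5 paired runs.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

# PLACEHOLDER accuracies for 5 seeds; each position is the same seed and split.
model_a = [0.91, 0.93, 0.92, 0.94, 0.90]
model_b = [0.89, 0.90, 0.91, 0.91, 0.89]

for name, scores in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: mean={statistics.mean(scores):.4f} "
          f"std={statistics.stdev(scores):.4f}")

t_stat, dof = paired_t_statistic(model_a, model_b)
T_CRIT_5PCT_DF4 = 2.776  # two-sided, alpha = 0.05, 4 degrees of freedom
significant = abs(t_stat) > T_CRIT_5PCT_DF4
print(f"t={t_stat:.3f} dof={dof} significant_at_5pct={significant}")
```

A paired test is appropriate only when both models are evaluated on the same seeds and splits; with as few as 5 runs, reporting the per-seed differences alongside the test keeps the comparison easy to audit.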

Circularity Check

0 steps flagged

Empirical ML paper with independent held-out evaluation

This is a standard supervised deep-learning application paper that trains the proposed DB-FGA-Net on the 7K-DS dataset and measures accuracy on held-out test splits plus an independent 3K-DS corpus. No derivation chain, equations, or first-principles results are offered; the headline numbers (99.24% 4-class, 95.77% on 3K-DS) are direct empirical measurements rather than quantities obtained by fitting parameters that are then renamed as predictions. The architecture description, Grad-CAM usage, and GUI are likewise implementation details, not circular reductions. The paper is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on standard supervised learning assumptions plus the untested premise that the two datasets capture real clinical variability. No new physical entities or mathematical axioms beyond ordinary neural network training are introduced.

axioms (1)
  • [domain assumption] The 7K-DS and 3K-DS MRI datasets are representative of clinical brain tumor imaging variability and distribution shifts.
    Invoked implicitly when claiming generalization without augmentation and clinical translation potential.
invented entities (1)
  • Frequency-Gated Attention (FGA) Block [no independent evidence]
    purpose: To capture complementary local and global features by gating frequency information between the two backbones.
    New architectural component introduced in the paper; no independent evidence outside the reported experiments is provided.

pith-pipeline@v0.9.0 · 5804 in / 1506 out tokens · 52246 ms · 2026-05-18T04:04:03.745526+00:00 · methodology
