pith. machine review for the scientific record.

arxiv: 2506.00239 · v5 · submitted 2025-05-30 · 💻 cs.AI

Recognition: unknown

SmellNet: A Large-scale Dataset for Real-world Smell Recognition

classification 💻 cs.AI
keywords scentformer, smellnet, smell, ability, across, data, sensor-based, achieves

The ability of AI to sense and identify various substances based on their smell alone can have profound impacts on allergen detection (e.g., smelling gluten or peanuts in a cake), monitoring the manufacturing process, and sensing hormones that indicate emotional states, stress levels, and diseases. Despite these broad impacts, there are few standardized datasets, and therefore little progress, for training and evaluating AI systems' ability to 'smell' in the real world. In this paper, we use small gas and chemical sensors to create SmellNet, a comparatively large dataset for sensor-based machine olfaction that digitizes a diverse range of smells in the natural world. SmellNet contains about 828,000 time-series data points and 68 hours of collected data across 50 substances, spanning nuts, spices, herbs, fruits, and vegetables, and 43 mixtures among them with fixed ingredient volumetric ratios. Using SmellNet, we developed ScentFormer, a Transformer-based architecture combining temporal differencing and sliding-window augmentation for smell data. For the SmellNet-Base classification tasks, ScentFormer achieves 63.3% Top-1 accuracy with GC-MS supervision, and for the SmellNet-Mixture distribution prediction tasks, ScentFormer achieves 50.2% Top-1@0.1 on the test-seen split. ScentFormer's ability to generalize across conditions and capture transient chemical dynamics demonstrates the promise of temporal modeling in sensor-based olfactory AI. SmellNet and ScentFormer lay the groundwork for sensor-based olfactory applications across healthcare, food and beverage, environmental monitoring, manufacturing, and entertainment.
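The abstract names temporal differencing and sliding-window augmentation but does not say how either is configured. The following is a minimal sketch of what these two preprocessing steps commonly look like for multichannel sensor time series, assuming first-order differences and fixed-length overlapping windows. The function names and the `lag`, `window`, and `stride` parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# One time step: one reading per sensor channel (illustrative type alias).
Reading = Sequence[float]


def temporal_difference(series: Sequence[Reading], lag: int = 1) -> List[List[float]]:
    """Per-channel difference series[t] - series[t - lag] for every t >= lag.

    Assumption: a plain lagged difference; the paper may use another variant.
    """
    if lag < 1:
        raise ValueError("lag must be >= 1")
    return [
        [cur - prev for cur, prev in zip(series[t], series[t - lag])]
        for t in range(lag, len(series))
    ]


def sliding_windows(
    series: Sequence[Reading], window: int, stride: int = 1
) -> List[List[Reading]]:
    """Cut a series into overlapping fixed-length windows.

    Each window can serve as a separate training example, which is how
    sliding windows act as augmentation. Returns an empty list when the
    series is shorter than one window.
    """
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be >= 1")
    return [
        list(series[start:start + window])
        for start in range(0, len(series) - window + 1, stride)
    ]


# Toy example: 4 time steps from 2 hypothetical sensor channels.
readings = [[1.0, 10.0], [2.0, 12.0], [4.0, 11.0], [7.0, 15.0]]
diffs = temporal_difference(readings)        # 3 difference vectors
windows = sliding_windows(diffs, window=2)   # 2 overlapping windows
```

Differencing removes slowly varying sensor baselines and emphasizes how readings change over time, while overlapping windows turn one long recording into many shorter examples. Whether ScentFormer applies the steps in this order or with these settings is not stated in the abstract.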

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. AromaGen: Interactive Generation of Rich Olfactory Experiences with Multimodal Language Models

    cs.HC 2026-04 unverdicted novelty 7.0

    AromaGen generates real-time custom aromas from free-form text or visual inputs via multimodal LLM mapping to 12 odorants, matching or exceeding human mixtures after iterative refinement in a 26-person study.