SMILES, a chemical language and information system

David Weininger · 1988 · Journal of Chemical Information and Computer Sciences · DOI 10.1021/ci00057a005

9 Pith papers cite this work, alongside 6,035 external citations (source: Crossref). Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

From Syntax to Semantics: Unveiling the Emergence of Chirality in SMILES Translation Models

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

Chirality emerges in SMILES translation models through an abrupt encoder-centered reorganization of representations after a long plateau, identified via checkpoint analysis and ablation.

Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models

cs.LG · 2026-04-14 · unverdicted · novelty 7.0

SCPT creates similarity-constrained preference triplets from scaffolds to train LLMs as conditional molecular editors that improve properties while keeping scaffolds intact.

Molecules Meet Language: Confound-Aware Representation Learning and Chemical Property Steering in Transformer-VAE Latent Spaces

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Chemically meaningful steering for properties like cLogP and TPSA emerges in entangled Transformer-VAE latent spaces only after controlling for SELFIES representation confounds through residualization and decoded traversals.

FRIGID: Scaling Diffusion-Based Molecular Generation from Mass Spectra at Training and Inference Time

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

FRIGID scales a diffusion-based model for de novo molecular structure generation from mass spectra, reaching over 18% top-1 accuracy on MassSpecGym and tripling prior bests on NPLIB1 via large unlabeled training and inference-time fragmentation refinement with log-linear compute scaling.

Bolek: A Multimodal Language Model for Molecular Reasoning

cs.LG · 2026-05-04 · unverdicted · novelty 5.0

Bolek injects Morgan fingerprint embeddings into an instruction-tuned text model, then fine-tunes on molecular alignment and synthetic chain-of-thought tasks to improve performance and grounding on 15 TDC binary classification endpoints while generalizing to unseen tasks.

Heterogeneous Scientific Foundation Model Collaboration

cs.AI · 2026-04-30 · unverdicted · novelty 5.0

Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.

Galactica: A Large Language Model for Science

cs.CL · 2022-11-16 · unverdicted · novelty 5.0

Galactica, a science-specialized LLM, reports higher scores than GPT-3, Chinchilla, and PaLM on LaTeX knowledge, mathematical reasoning, and medical QA benchmarks while outperforming general models on BIG-bench.

Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction

cs.LG · 2026-04-29 · unverdicted · novelty 4.0

Large benchmark shows classical ML and GNNs outperform pretrained large models on most of 22 drug-discovery endpoints under strict cross-validation.

Advancing Ligand-based Virtual Screening and Molecular Generation with Pretrained Molecular Embedding Distance

cs.LG · 2026-04-27 · unverdicted · novelty 4.0

Pretrained molecular embedding distances provide an effective similarity metric for ligand-based virtual screening and molecular generation without task-specific training.

citing papers explorer

Showing 9 of 9 citing papers.

From Syntax to Semantics: Unveiling the Emergence of Chirality in SMILES Translation Models

cs.LG · 2026-05-11 · unverdicted · none · ref 5

Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models

cs.LG · 2026-04-14 · unverdicted · none · ref 25

Molecules Meet Language: Confound-Aware Representation Learning and Chemical Property Steering in Transformer-VAE Latent Spaces

cs.LG · 2026-05-07 · unverdicted · none · ref 4

FRIGID: Scaling Diffusion-Based Molecular Generation from Mass Spectra at Training and Inference Time

cs.LG · 2026-04-17 · unverdicted · none · ref 13

Bolek: A Multimodal Language Model for Molecular Reasoning

cs.LG · 2026-05-04 · unverdicted · none · ref 65

Heterogeneous Scientific Foundation Model Collaboration

cs.AI · 2026-04-30 · unverdicted · none · ref 11

Galactica: A Large Language Model for Science

cs.CL · 2022-11-16 · unverdicted · none · ref 249

Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction

cs.LG · 2026-04-29 · unverdicted · none · ref 31

Advancing Ligand-based Virtual Screening and Molecular Generation with Pretrained Molecular Embedding Distance

cs.LG · 2026-04-27 · unverdicted · none · ref 5
