BERTimbau: pretrained BERT models for Brazilian Portuguese. Lecture Notes in Computer Science, 12319:403–417, 2020

Fábio Souza, Rodrigo Nogueira, Roberto Lotufo · 2020 · DOI 10.1007/978-3-030-61377-8_28

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Toten: A Knowledge-Based System For Structure-Preserving Representation Of Physical Quantities And Technical Notation In Brazilian Portuguese

cs.AI · 2026-06-17 · unverdicted · novelty 5.0

TOTEN is a knowledge-based system for structure-preserving representation of physical quantities and technical notation in Brazilian Portuguese using an ontology of engineering entities and external authorities, outperforming statistical baselines in atomicity and reconstruction.

AI-PAVE-Br: Leveraging Large Language Models for Enhanced Product Attribute Value Extraction through a Golden Set Approach

cs.CL · 2026-06-23 · unverdicted · novelty 4.0

AI-PAVE-Br applies LLMs with prompt engineering to outperform NER baselines on Portuguese product attribute extraction and releases the Golden Set as a new benchmark dataset.

IHUBERT: Vector-Based Semantic Deduplication and Domain-Balanced Pretraining for Persian Resources

cs.CL · 2026-06-18 · unverdicted · novelty 4.0

Trains a 125M-parameter Persian PLM on a curated 45GB corpus using vector semantic deduplication for domain balance, topping QA and NLI benchmarks while remaining competitive on NER and classification.

citing papers explorer

Showing 3 of 3 citing papers.

Toten: A Knowledge-Based System For Structure-Preserving Representation Of Physical Quantities And Technical Notation In Brazilian Portuguese
cs.AI · 2026-06-17 · unverdicted · none · ref 38

AI-PAVE-Br: Leveraging Large Language Models for Enhanced Product Attribute Value Extraction through a Golden Set Approach
cs.CL · 2026-06-23 · unverdicted · none · ref 32

IHUBERT: Vector-Based Semantic Deduplication and Domain-Balanced Pretraining for Persian Resources
cs.CL · 2026-06-18 · unverdicted · none · ref 13
