BERTimbau: pretrained BERT models for Brazilian Portuguese. Lecture Notes in Computer Science, 12319:403–417, 2020.
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Citing papers
- Toten: A Knowledge-Based System For Structure-Preserving Representation Of Physical Quantities And Technical Notation In Brazilian Portuguese
  TOTEN is a knowledge-based system for structure-preserving representation of physical quantities and technical notation in Brazilian Portuguese using an ontology of engineering entities and external authorities, outperforming statistical baselines in atomicity and reconstruction.
- AI-PAVE-Br: Leveraging Large Language Models for Enhanced Product Attribute Value Extraction through a Golden Set Approach
  AI-PAVE-Br applies LLMs with prompt engineering to outperform NER baselines on Portuguese product attribute extraction and releases the Golden Set as a new benchmark dataset.
- IHUBERT: Vector-Based Semantic Deduplication and Domain-Balanced Pretraining for Persian Resources
  Trains a 125M-parameter Persian PLM on a curated 45GB corpus using vector semantic deduplication for domain balance, topping QA and NLI benchmarks while remaining competitive on NER and classification.