InsightGen uses thematic clustering and graph neighborhood selection to generate diverse, relevant insights for open-ended, document-grounded questions, and releases the SCOpE-QA dataset of 3,000 questions.
APACrefauthors \ 1987
39 Pith papers cite this work, alongside 16,729 external citations.
representative citing papers
Direct fixed-weight solver for free-support Wasserstein medians relocates atoms using OT barycentric projections and inverse-distance weights, achieving monotone descent on smoothed objectives with fewer subproblems than nested Weiszfeld baselines.
A cycle-consistent MT pipeline generates and similarity-weights training data for coreference resolution, producing gains on four low-resource languages and enabling the task where no corpora existed.
DiffCodeGen clusters code candidates by behavioral similarity from fuzzing-synthesized inputs and selects the largest cluster's medoid, matching or exceeding prior test-time scaling methods with far less token and time cost.
Soft-MSM is a smooth, gradient-enabled version of the context-aware MSM distance for time series alignment that outperforms Soft-DTW alternatives in clustering and nearest-centroid classification.
BISN achieves 0.93 mean leave-one-batch-out accuracy on 2,700 NIR spectra from three insect species across three batches, outperforming baselines by 4% while its decisions align with lipid and protein absorption regions.
SOLAR aligns soft-token probability mixtures across languages in embedding space during SFT and raises multilingual reasoning accuracy by up to 17.7 points over the base model.
FOSC-X uses bounded dynamic programming to compute top-M optimal non-horizontal cuts from clustering hierarchies in linear time, with or without cluster-count constraints.
Bucket-Level MOO reformulates multilingual fine-tuning as localized multi-objective optimization and proves it enforces a tighter Pareto stationarity condition while improving cross-lingual performance on four LLMs.
Deep UCSL uses a contrastive EM loss on patient-control labels to isolate disease-driven subgroups in medical imaging by suppressing shared healthy variability.
UniTrans pretrains a bank of translator experts and learns combination coefficients from modality mappings in a scene-invariant latent space to enable zero-shot any-to-any feature translation for heterogeneous collaborative perception.
A quantum prototype learning scheme encodes class representatives as generative matrix product states and performs classification and clustering via geometric measures in Hilbert space, outperforming classical prototypes on Fashion-MNIST and ECG data.
A single commercial LLM can cheaply generate large populations of behaviorally equivalent yet structurally diverse malware payloads.
GCD-FGL mitigates neighborhood absorption and global semantic inconsistency in federated generalized category discovery, delivering +4.86 average HRScore gain over baselines on five graph datasets.
CADI quantifies the preservation of relative cluster angles in low-dimensional projections using internal angles from point triples.
A broad empirical benchmark shows how 15 existing test selection metrics perform for fault detection, performance estimation, and retraining under corrupted, adversarial, temporal, natural, and label shifts across image, text, and Android data.
Machine learning clustering of meteor observations produces a new hardness classification H_class that refines traditional Kb models using more parameters and reveals compositional structure in meteoroid populations.
ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
AFGNN detects API misuses in Java code more effectively than prior methods by representing usage as graphs and clustering learned embeddings from self-supervised training.
OSS4SG projects retain contributors at 2.2× higher rates with 19.6% higher core status probability than conventional OSS, and a late-spike temporal pattern enables faster core achievement (21 weeks) than early intensive contributions.
LandSegmenter creates a task-specific foundation model for LULC mapping using weak labels from existing products, an RS adapter, text encoder, and confidence-guided fusion to achieve competitive zero-shot performance across modalities and taxonomies.
Presents the bixplot as an extension of the boxplot incorporating contiguous clustering to visualize bimodality and multimodality while displaying individual data points, with Python and R implementations.
A nonparametric framework detects repeated spatial patterns via constrained clustering followed by MMD-based reassignment and block permutation under stationarity and mixing conditions.
GPS tracking across theme parks shows visitor movement forms a continuum rather than discrete types, diverges from self-reports, and reverses feature relationships from site to site, requiring local calibration.
citing papers explorer
Multilingual Fine-Tuning via Localized Gradient Conflict Resolution