X-VC achieves zero-shot streaming voice conversion via one-step codec-space conversion with a dual-conditioning acoustic converter and role-assignment training on generated paired data.
Perez , author F
Tool reference. 83% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.
representative citing papers
Generative sequence models for physical tasks exhibit physical misgeneralization, where local prediction errors propagate through physical measurements to distort aggregate distributions over quantities like distance or energy; a data deviation kernel explains and predicts the shifts and supports a …
SkyPart achieves state-of-the-art single-pass cross-view geo-localization on SUES-200, University-1652, and DenseUAV by using prototype-based part discovery, altitude-conditioned modulation, and Kendall-weighted loss, with widening gains under weather corruptions.
Independent quantum signal injection into graph DEQs yields higher test accuracy and fewer solver iterations than state-dependent or backbone-dependent injection and classical equilibrium models on NCI1, PROTEINS, and MUTAG benchmarks.
Conditional neural fields combined with LSTM networks predict aircraft ditching loads accurately across heterogeneous spatial discretizations using fewer parameters than convolutional autoencoders.
A cVAE plus flow-matching model generates realistic complex-valued brain MRI that preserves phase coherence above 0.997 and yields synthetic data that trains abnormality classifiers to 0.880 AUROC, beating the 0.842 real-data baseline on fastMRI.
NeuVolEx extracts robust spatial features from INR training via a structural encoder and multi-task scheme to enable accurate ROI classification with limited supervision and unsupervised viewpoint clustering in volume exploration.
PREFAB applies preference learning grounded in the peak-end rule to let users annotate only key affective change segments while interpolating the rest, reducing workload and improving confidence in a 25-participant study.
CodecSep performs prompt-driven universal sound separation directly in neural audio codec latents by combining a frozen DAC backbone with a lightweight FiLM-conditioned Transformer masker driven by CLAP embeddings, yielding efficiency gains over AudioSep.
Knowledge distillation from a hybrid CNN-Transformer teacher to a depth-wise separable CNN student, combined with realistic motion and environmental augmentation, produces a 15x smaller EDA denoiser that cuts underwater reconstruction error from 2.809 to 0.215 MAE and raises downstream CNS-OT AUROC.
Empirical study of DP transfer learning reveals that larger clipping bounds outperform under tight privacy and cumulative DP noise explains batch-size effects better than existing heuristics.
The survey frames VLA models as pipelines that generate progressively grounded action tokens and classifies those tokens into eight types to guide future development.
citing papers explorer
- Conditional Neural Field based Reduced Order Model for Dynamic Ditching Load Prediction