SegFS is a dual-path architecture that uses sparse keyframe open-vocabulary predictions to condition a fast feature-space network for efficient temporal instance segmentation in videos.
arXiv preprint arXiv:2312.09579 (2023)
6 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (1)
Citation polarities: background (1)
Representative citing papers
DL-SLAM uses dual-level (pixel and object) dynamic probabilities from semantic-geometric fusion to produce artifact-free static maps and up to 13% better tracking accuracy in dynamic scenes.
RS4D distills ViT knowledge into SSM backbones for remote sensing instance segmentation, delivering 8x fewer parameters and 9x fewer FLOPs than ViT methods while matching or exceeding accuracy on SSDD, WHU, and NWPU datasets.
PhysGraph reconstructs object-centric 3D geometry from RGB-D, decomposes objects into parts, infers materials and articulations via visual reasoning, and reports state-of-the-art results on semantic segmentation, multi-object mass estimation, and articulation prediction.
Permutation-COMQ is a new post-training quantization algorithm that reorders weights within layers and uses only dot-product and rounding steps to deliver the highest reported accuracy for 2-, 4-, and 8-bit medical foundation models.