Tracing representation progression: Analyzing and enhancing layer-wise similarity

Jiachen Jiang, Jinxin Zhou, Zhihui Zhu · 2024 · arXiv 2406.14479

6 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background: 1 · other: 1

citation-polarity summary

background: 1 · unclear: 1

representative citing papers

A Comparative analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

cs.CL · 2026-03-08 · unverdicted · novelty 7.0

Diffusion language models form more global representations with early-layer redundancy compared to autoregressive models, allowing layer skipping for up to 18.75% FLOP savings while maintaining over 90% performance.

Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

LRD framework with Frenet, NRS, and GFMI metrics shows layer-wise structure in 31 models provides usable signal for model selection and pruning on MTEB tasks.

CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure

cs.LG · 2025-09-23 · unverdicted · novelty 6.0

CR-Net uses cross-layer low-rank residuals in a dual-path network plus specialized recomputation to outperform prior low-rank methods on 60M-7B model pre-training while using less compute and memory.

Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding

cs.AI · 2026-05-10 · unverdicted · novelty 3.0

Advanced language representations shape LLMs' schemas to improve knowledge activation and problem-solving.

Rethinking Cross-Layer Information Routing in Diffusion Transformers

cs.CV · 2026-05-20

Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale

cs.CV · 2026-04-20

citing papers explorer

A Comparative analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs
cs.CL · 2026-03-08 · unverdicted · none · ref 6
Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs
cs.LG · 2026-05-12 · unverdicted · none · ref 29
CR-Net: Scaling Parameter-Efficient Training with Cross-Layer Low-Rank Structure
cs.LG · 2025-09-23 · unverdicted · none · ref 31
Shaping Schema via Language Representation as the Next Frontier for LLM Intelligence Expanding
cs.AI · 2026-05-10 · unverdicted · none · ref 134
Rethinking Cross-Layer Information Routing in Diffusion Transformers
cs.CV · 2026-05-20 · unreviewed · ref 22
Back into Plato's Cave: Examining Cross-modal Representational Convergence at Scale
cs.CV · 2026-04-20 · unreviewed · ref 45
