John, and Jaydeep P

SACHI: A Stationarity-Aware, All-Digital, Near-Memory, Ising Architecture · 2024 · arXiv 7654.2024

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2 · baseline 1

citation-polarity summary

background 2 · baseline 1

representative citing papers

Enhancing Instruction Prefetching via Cache and TLB Management

cs.AR · 2026-05-12 · unverdicted · novelty 7.0

IP-CaT jointly optimizes TLB and cache management for L1I prefetching via a translation prefetch buffer and trimodal replacement policy, yielding 8.7% geomean speedup over EPI across 105 server workloads.

Sublime: Sublinear Error & Space for Unbounded Skewed Streams

cs.DS · 2026-03-15 · unverdicted · novelty 7.0

Sublime generalizes Count-Min and Count Sketch with dynamically elongating counters and expanding counter arrays to deliver sublinear error growth and lower memory use on skewed unbounded streams.

CLIPGen: A Chiplet Link IP Modeling and Generation Framework for 2.5D Architecture Exploration

cs.AR · 2026-05-26 · unverdicted · novelty 6.0

CLIPGen is a framework for automated generation of chiplet interconnect IP with PPA estimates to support 2.5D SiP architecture exploration.

TLX: Hardware-Native, Evolvable MIMW GPU Compiler for Large-scale Production Environments

cs.AR · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

TLX introduces MIMW-based extensions to Triton that let developers orchestrate warp-group execution and asynchronous hardware features while preserving blocked programming productivity, with kernels deployed in large-scale training and inference.

Affinity Tailor: Dynamic Locality-Aware Scheduling at Scale

cs.OS · 2026-04-30 · unverdicted · novelty 6.0

Affinity Tailor improves per-CPU throughput by 12% on chiplet systems and 3% on non-chiplet systems over Linux CFS by using dynamic compact affinity hints derived from online demand estimates.

ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling

cs.DC · 2026-06-09 · unverdicted · novelty 5.0

ASTRA-sim 3.0 introduces cache-line load-store simulation, a detailed GPU execution model, and InfraGraph to support high-fidelity distributed machine learning infrastructure simulations.

Taking Cryptography Out of the Data Path via Near-Memory Processing in DRAM

cs.CR · 2026-05-19 · unverdicted · novelty 5.0

Real-world PIM on UPMEM accelerates cryptographic algorithms when computation is distributed across multiple DRAM ranks, outperforming CPUs at full scale.

A complete discussion on fully reconfigurable, digital, scalable, graph and sparsity-aware near-memory accelerator for graph neural networks

cs.AR · 2026-05-19 · unverdicted · novelty 5.0 · 2 refs

NEM-GNN proposes a scalable DAC/ADC-less PIM architecture for GNNs with early termination and CAR execution, claiming 80-230x performance and 850-1134x energy gains over prior accelerators.

ROA-Based Subharmonic Injection Locking for Oscillator-Based Ising Machines

cs.AR · 2026-05-18 · unverdicted · novelty 5.0 · 2 refs

ROA brick topology supplies PVT-robust 2.31 GHz SHIL that preserves 93-97% accuracy in 324-node OIM max-cut while ROSC-SHIL loses locking.

PIM-CACHE: High-Efficiency Content-Aware Copy for Processing-In-Memory

cs.ET · 2026-03-24 · unverdicted · novelty 5.0

PIM-CACHE reduces mandatory coarse-grained transfers in UPMEM-style PIM by dynamically staging only non-redundant data via content-aware copy that exploits workload similarity.

Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference

cs.AR · 2025-09-11 · unverdicted · novelty 5.0

PLENA introduces a co-designed system with three optimization pathways for long-context agentic LLM inference, claiming up to 2.23x throughput over A100 and 4.04x energy efficiency.

Energy-Aware Computing in the Year 2026

cs.DC · 2026-05-23 · unverdicted · novelty 2.0

The paper reviews energy-aware computing literature and constructs a taxonomy organized by hardware/software aspects, measurement, optimizations, scheduling, scaling, consolidation, federated learning, and cooling.

citing papers explorer

Enhancing Instruction Prefetching via Cache and TLB Management

cs.AR · 2026-05-12 · unverdicted · none · ref 59

Sublime: Sublinear Error & Space for Unbounded Skewed Streams

cs.DS · 2026-03-15 · unverdicted · none · ref 13

CLIPGen: A Chiplet Link IP Modeling and Generation Framework for 2.5D Architecture Exploration

cs.AR · 2026-05-26 · unverdicted · none · ref 1

TLX: Hardware-Native, Evolvable MIMW GPU Compiler for Large-scale Production Environments

cs.AR · 2026-05-11 · unverdicted · none · ref 9 · 2 links

Affinity Tailor: Dynamic Locality-Aware Scheduling at Scale

cs.OS · 2026-04-30 · unverdicted · none · ref 20

ASTRA-sim 3.0: Next-Level Distributed Machine Learning Simulations via High-Fidelity GPU and Infrastructure Modeling

cs.DC · 2026-06-09 · unverdicted · none · ref 26

Taking Cryptography Out of the Data Path via Near-Memory Processing in DRAM

cs.CR · 2026-05-19 · unverdicted · none · ref 32

A complete discussion on fully reconfigurable, digital, scalable, graph and sparsity-aware near-memory accelerator for graph neural networks

cs.AR · 2026-05-19 · unverdicted · none · ref 30 · 2 links

ROA-Based Subharmonic Injection Locking for Oscillator-Based Ising Machines

cs.AR · 2026-05-18 · unverdicted · none · ref 32 · 2 links

PIM-CACHE: High-Efficiency Content-Aware Copy for Processing-In-Memory

cs.ET · 2026-03-24 · unverdicted · none · ref 32

Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference

cs.AR · 2025-09-11 · unverdicted · none · ref 32

Energy-Aware Computing in the Year 2026

cs.DC · 2026-05-23 · unverdicted · none · ref 273
