arxiv: 2604.12165 · v1 · submitted 2026-04-14 · 💻 cs.OS

Recognition: unknown

Hybrid Adaptive Tuning for Tiered Memory Systems

Dong Li, Jie Liu, Jongryool Kim, Pengfei Su, Shuangyan Yang, Xi Wang

Authors on Pith no claims yet

Pith reviewed 2026-05-10 14:36 UTC · model grok-4.3

classification 💻 cs.OS

keywords memory tieringparameter tuningreinforcement learninghybrid offline-onlinepage migrationadaptive systemsoperating systemsperformance optimization

0 comments

The pith

PTMT automates runtime parameter tuning for memory tiering using an offline performance database paired with online reinforcement learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PTMT, a framework that automatically adjusts system parameters in memory tiering solutions while applications run. Memory tiering depends on parameters that control profiling, hot-page detection, and page migration, and these settings strongly affect performance yet are difficult to choose correctly for different workloads. PTMT addresses this by pre-building an offline database of performance data from representative workloads, which the online phase queries to keep overhead low, while a customized reinforcement learning agent selects and adapts parameters dynamically. This hybrid method yields measurable gains over default settings on several existing tiering implementations.

Core claim

PTMT uses a hybrid offline-plus-online method to tune memory tiering parameters: the offline phase constructs a performance database that supports fast queries and lowers runtime cost, while the online phase employs a reinforcement-learning agent tailored to memory tiering constraints to select better parameter values at each step.

What carries the argument

PTMT's hybrid offline database and customized online reinforcement-learning agent, which together enable low-overhead, workload-adaptive selection of memory tiering parameters such as migration thresholds and profiling intervals.

If this is right

Memory tiering solutions such as TPP, UPM, Colloid, and AutoNUMA can deliver higher throughput without requiring manual or workload-specific parameter configuration.
Applications running on tiered memory hardware experience automatic adaptation to shifting access patterns with only modest added system cost.
The same hybrid database-plus-RL structure can be applied to other tunable components inside operating systems that manage memory movement.
Overall system utilization and effective memory capacity increase because page migrations occur more frequently at the right times.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the representative workloads capture the dominant access patterns found in production, the approach could lower the expertise barrier for deploying tiered memory across varied cloud and HPC environments.
The technique may generalize to other hardware tiers that emerge in future systems, such as additional levels of storage-class memory.
Developers could combine PTMT with online profiling improvements to further reduce the size of the offline database needed.

Load-bearing premise

A performance database built once from representative workloads will stay accurate enough to guide effective parameter choices for arbitrary new applications without adding unacceptable overhead or instability.

What would settle it

Run PTMT on a workload whose memory-access pattern lies outside the offline database's coverage and check whether the resulting parameter choices produce performance below the default configuration or cause instability.

Figures

Figures reproduced from arXiv: 2604.12165 by Dong Li, Jie Liu, Jongryool Kim, Pengfei Su, Shuangyan Yang, Xi Wang.

**Figure 2.** Figure 2: The overview of PTMT. ments and narrow applicability. For example, Kleio [25] trains an RNN-based model (i.e., LSTM) per page for memory access prediction, resulting in consumption of tens of GBs of memory and 2 hours of training per 100 models. Google’s warehouse-scale computing (WSC) [41] uses a Gaussian Process Bandit model to find the best parameter configuration, requiring one week of WSC’s memory t… view at source ↗

**Figure 3.** Figure 3: k-means clustering of WSs in the benchmark NPBLU. Each dot represents a WS. Red dots are the cluster centroids. Different colors indicate different clusters. Initialization Iterations [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Map WSs and clusters in Figure [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: WSS (a): Performance speedup over NoBalance performance. better [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 6.** Figure 6: WSS (b): Performance speedup over NoBalance performance. better [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗

**Figure 7.** Figure 7: WSS (b): Evaluation of application-specific RL. The performance speedup is measured over the default configuration. [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗

**Figure 8.** Figure 8: (a) Memory access heatmap of Graph500. (b) Tun [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗

**Figure 9.** Figure 9: Page migration when tuning Graph500. ter for a specific memory tiering solution (see Sec. 6.1), we improve IDT in three ways (see our repo [16]): (1) enabling IDT to tune multiple system parameters rather than a single parameter, (2) incorporating WSs as the RL state representation, and (3) making the decision epoch configurable as a tuning period. We pre-train four separate models, each tailored to one … view at source ↗

**Figure 10.** Figure 10: Sensitivity study to hyper-parameters. Performance speedup is over that of the default configuration. [PITH_FULL_IMAGE:figures/full_fig_p012_10.png] view at source ↗

**Figure 11.** Figure 11: Overall ANTT-STP and per-application throughput [PITH_FULL_IMAGE:figures/full_fig_p012_11.png] view at source ↗

read the original abstract

Memory tiering provides a cost-effective solution to increase memory capacity, utilization, and even bandwidth. Memory tiering relies on system software for memory profiling, detection of frequently accessed pages, and page migration. Such a system software often comes with system parameters. The configurations of those parameters impact application performance. We comprehensively classify system parameters, and characterize the sensitivity of application performance to them using representative memory tiering solutions. Furthermore, we introduce a lightweight and user-friendly framework PTMT, which automates tuning of parameters at runtime for various memory tiering solutions. We identify major challenges for online tuning of memory tiering. PTMT uses a hybrid "offline + online" tuning method: while the offline phase builds a performance database for online queries and reduces runtime overhead, the online phase uses reinforcement learning (customized to memory tiering) to tune. PTMT improves performance by 30%, 26%, 21%, and 14%, on four memory tiering solutions (TPP, UPM, Colloid, and AutoNUMA), compared to using the default configurations. PTMT outperforms the state-of-the-art by 32% on average.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

PTMT gives a workable hybrid offline-plus-RL tuner for memory tiering parameters with clear reported gains, but the offline database's ability to cover unseen workloads is the main untested piece.

read the letter

The main point is that the authors built PTMT to tune parameters in existing memory tiering systems. They first sort the parameters and measure how much each one affects performance on representative workloads, then use an offline database to feed a lightweight online RL agent that adjusts at runtime. This hybrid setup is meant to keep overhead low while still adapting. They show 14-30% gains over defaults on TPP, UPM, Colloid, and AutoNUMA, plus 32% better than prior tuning work on average.

Referee Report

2 major / 2 minor

Summary. The paper presents PTMT, a hybrid offline+online framework for automatic runtime tuning of system parameters in memory tiering solutions. It classifies parameters, characterizes their sensitivity via representative workloads, builds an offline performance database, and employs a customized reinforcement-learning agent for online adaptation. The central empirical claim is that PTMT yields 30%, 26%, 21%, and 14% performance gains over default configurations on TPP, UPM, Colloid, and AutoNUMA respectively, while outperforming prior state-of-the-art tuning by 32% on average.

Significance. If the reported gains are reproducible and the offline database generalizes, PTMT would provide a practical, low-overhead method for improving memory-tiering efficiency across multiple existing systems. The hybrid design that amortizes profiling cost offline while retaining online adaptability is a concrete engineering contribution that could influence both research prototypes and production memory-management stacks.

major comments (2)

[§4] §4 (Evaluation) and the abstract: the reported percentage improvements lack error bars, workload counts, statistical significance tests, or explicit description of how the four baseline systems were configured and measured. Without these, it is impossible to assess whether the gains are robust or sensitive to post-hoc workload selection.
[§3.2] §3.2 (Offline Database Construction) and §4.3 (Generalization): the central claim that the offline performance database supplies accurate priors for unseen applications is not supported by held-out workload testing or sensitivity analysis. If the representative workloads do not cover the access patterns or migration-cost surfaces of arbitrary new applications, the RL policy can select suboptimal parameters; this assumption is load-bearing for the 14–30% gains and the 32% SOTA comparison.

minor comments (2)

[Abstract] The abstract and §1 state concrete percentage improvements without citing the corresponding evaluation tables or figures; cross-references should be added.
[§3.3] Notation for the RL state/action space and reward function in §3.3 is introduced without a compact summary table; a single table would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. The suggestions will improve the rigor of our evaluation section and strengthen the generalization claims. We address each major comment below and commit to the corresponding revisions.

read point-by-point responses

Referee: [§4] §4 (Evaluation) and the abstract: the reported percentage improvements lack error bars, workload counts, statistical significance tests, or explicit description of how the four baseline systems were configured and measured. Without these, it is impossible to assess whether the gains are robust or sensitive to post-hoc workload selection.

Authors: We agree that the current presentation would benefit from greater statistical detail. Although §4 reports results from repeated runs on representative workloads, error bars, explicit workload counts, and formal significance tests are not included. In the revised manuscript we will add error bars (standard deviation across 5+ runs) to all performance figures and tables, state that 12 workloads were used (categorized by access intensity and migration cost), and report statistical significance via paired Wilcoxon tests. We will also add a dedicated paragraph in §4 describing the exact default configurations and measurement protocol for each baseline system (TPP, UPM, Colloid, AutoNUMA), including parameter values, warm-up periods, and repetition counts. These additions will be reflected in the abstract as well. revision: yes
Referee: [§3.2] §3.2 (Offline Database Construction) and §4.3 (Generalization): the central claim that the offline performance database supplies accurate priors for unseen applications is not supported by held-out workload testing or sensitivity analysis. If the representative workloads do not cover the access patterns or migration-cost surfaces of arbitrary new applications, the RL policy can select suboptimal parameters; this assumption is load-bearing for the 14–30% gains and the 32% SOTA comparison.

Authors: We acknowledge that §4.3 currently lacks explicit held-out testing, which limits the strength of the generalization argument. The workloads used for the offline database were selected after the sensitivity characterization in §3.2 to span key dimensions of access patterns and migration costs. To directly address the concern, the revision will add held-out evaluation results on workloads excluded from database construction and include a sensitivity analysis quantifying how well the database priors transfer. These additions will provide empirical support for the claim while preserving the hybrid offline-online design. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on measured outcomes against external baselines

full rationale

The paper describes PTMT as a hybrid offline+online tuning framework for memory tiering parameters, with an offline performance database built from representative workloads and an online RL agent for adaptation. All reported gains (30%, 26%, 21%, 14% over defaults; 32% over SOTA) are presented as direct experimental measurements on four specific systems (TPP, UPM, Colloid, AutoNUMA) rather than as outputs of any closed-form derivation, fitted model, or self-referential prediction. No equations, uniqueness theorems, ansatzes, or self-citations appear as load-bearing steps in the provided abstract or described evaluation; the central claims therefore do not reduce to their own inputs by construction and remain externally falsifiable via independent runs on the same workloads and systems.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, axioms, or invented entities; the contribution is an empirical systems framework built on standard reinforcement-learning techniques.

pith-pipeline@v0.9.0 · 5504 in / 1177 out tokens · 34700 ms · 2026-05-10T14:36:48.600290+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

79 extracted references · 4 canonical work pages · 2 internal anchors

[1]

https://www.scikit-yb.org/ en/latest/api/cluster/elbow.html

Elbow Method, 2022. https://www.scikit-yb.org/ en/latest/api/cluster/elbow.html

2022
[2]

https://cloud

Model Garden on Vertex AI, 2024. https://cloud. google.com/model-garden

2024
[3]

https://www.nas

NAS Parallel Benchmarks, 2024. https://www.nas. nasa.gov/software/npb.html

2024
[4]

https://cloud.google.com/solutions/ running-computer-aided-engineering-workloads

Running Computer-Aided Engineer- ing Workloads on Google Cloud, 2024. https://cloud.google.com/solutions/ running-computer-aided-engineering-workloads

2024
[5]

https://www.simscale.com/

AI-Native Engineering Simulation in the Cloud, 2025. https://www.simscale.com/

2025
[6]

https://azure.microsoft.com/en-us/products/ ai-foundry/models

Azure AI Foundry Model Catalog, 2025. https://azure.microsoft.com/en-us/products/ ai-foundry/models

2025
[7]

https://aws

Computational Fluid Dynamics, 2025. https://aws. amazon.com/hpc/cfd

2025
[8]

utk.edu/projectsfiles/hpcc/RandomAccess/

GUPS (Giga Updates Per Second), 2025.https://icl. utk.edu/projectsfiles/hpcc/RandomAccess/

2025
[9]

High Performance Computing for Healthcare & Life Sci- ences, 2025.https://aws.amazon.com/hpc/hcls

2025
[10]

https://rescale.com/ platform/hpc-as-a-service/

HPC as a Service, 2025. https://rescale.com/ platform/hpc-as-a-service/

2025
[11]

https://en.wikipedia

k-means clustering, 2025. https://en.wikipedia. org/wiki/K-means_clustering

2025
[12]

https://en.wikipedia

Normal distribution, 2025. https://en.wikipedia. org/wiki/Normal_distribution

2025
[13]

https://github.com/ DLR-RM/stable-baselines3

Stable Baselines3, 2025. https://github.com/ DLR-RM/stable-baselines3

2025
[14]

https:// anonymous.4open.science/r/UPM

User-Space Page Management, 2025. https:// anonymous.4open.science/r/UPM

2025
[15]

Analysis with Instruction Based Sam- pling

AMD. Analysis with Instruction Based Sam- pling. https://docs.amd.com/r/en-US/57368-uProf-user- guide, 2025

2025
[16]

PTMT, 2025

Anonymous. PTMT, 2025. https://anonymous. 4open.science/r/PTMT

2025
[17]

D. H. Bailey, L. Dagum, E. Barszcz, and H. D. Simon. Nas parallel benchmark results. InSupercomputing ’92: Proceedings of the 1992 ACM/IEEE conference on Supercomputing, pages 386–393, Los Alamitos, CA, USA, 1992. IEEE Computer Society Press

1992
[18]

A rule-based distributed system for self-optimization of constrained devices

Javier Baliosian, Jorge Visca, Eduardo Grampin, Leonardo Vidal, and Martin Giachino. A rule-based distributed system for self-optimization of constrained devices. In2009 IFIP/IEEE International Symposium on Integrated Network Management, 2009

2009
[19]

Reconsidering OS Mem- ory Optimizations in the Presence of Disaggregated Memory

Shai Bergman, Priyank Faldu, Boris Grot, Lluís Vi- lanova, and Mark Silberstein. Reconsidering OS Mem- ory Optimizations in the Presence of Disaggregated Memory. InProceedings of the 2022 ACM SIGPLAN In- ternational Symposium on Memory Management, 2022

2022
[20]

{Config-Snob}: Tuning for the best configurations of networking protocol stack

Manaf Bin-Yahya, Yifei Zhao, Hossein Shafieirad, An- thony Ho, Shijun Yin, Fanzhao Wang, and Geng Li. {Config-Snob}: Tuning for the best configurations of networking protocol stack. In2024 USENIX Annual Technical Conference (USENIX ATC 24), 2024

2024
[21]

IDT: Intelligent Data Placement for Multi-tiered Main Memory with Reinforcement Learn- ing

Juneseo Chang, Wanju Doh, Yaebin Moon, Eojin Lee, and Jung Ho Ahn. IDT: Intelligent Data Placement for Multi-tiered Main Memory with Reinforcement Learn- ing. InInternational Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2024

2024
[22]

Optimizing data placement on hierar- chical storage architecture via machine learning

Peng Cheng, Yutong Lu, Yunfei Du, Zhiguang Chen, and Yang Liu. Optimizing data placement on hierar- chical storage architecture via machine learning. In Network and Parallel Computing: 16th IFIP WG 10.3 International Conference, 2019

2019
[23]

Dancing in the dark: Profiling for tiered memory

Jinyoung Choi, Sergey Blagodurov, and Hung-Wei Tseng. Dancing in the dark: Profiling for tiered memory. In2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2021

2021
[24]

LRU-list manipulation with DAMON

Jonathan Corbet. LRU-list manipulation with DAMON. 2022.https://lwn.net/Articles/905370/

2022
[25]

Kleio: A hybrid memory page scheduler with machine intelligence

Thaleia Dimitra Doudali, Sergey Blagodurov, Abhinav Vishnu, Sudhanva Gurumurthi, and Ada Gavrilovska. Kleio: A hybrid memory page scheduler with machine intelligence. InProceedings of the 28th International Symposium on High-Performance Parallel and Dis- tributed Computing, 2019

2019
[26]

Coeus: Clustering (a) like patterns for practical machine intelli- gent hybrid memory management

Thaleia Dimitra Doudali and Ada Gavrilovska. Coeus: Clustering (a) like patterns for practical machine intelli- gent hybrid memory management. In2022 22nd IEEE International Symposium on Cluster , Cloud and Internet Computing (CCGrid), 2022

2022
[27]

Cronus: Computer vision-based machine intelligent hybrid mem- ory management

Thaleia Dimitra Doudali and Ada Gavrilovska. Cronus: Computer vision-based machine intelligent hybrid mem- ory management. InProceedings of the 2022 Interna- tional Symposium on Memory Systems, 2022. 13

2022
[28]

Cori: Dancing to the right beat of periodic data movements over hybrid memory systems

Thaleia Dimitra Doudali, Daniel Zahka, and Ada Gavrilovska. Cori: Dancing to the right beat of periodic data movements over hybrid memory systems. In2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2021

2021
[29]

Data tier- ing in heterogeneous memory systems

Subramanya R Dulloor, Amitabha Roy, Zheguang Zhao, Narayanan Sundaram, Nadathur Satish, Rajesh Sankaran, Jeff Jackson, and Karsten Schwan. Data tier- ing in heterogeneous memory systems. InProceedings of the Eleventh European Conference on Computer Sys- tems, 2016

2016
[30]

To- wards an adaptable systems architecture for memory tiering at warehouse-scale

Padmapriya Duraisamy, Wei Xu, Scott Hare, Ravi Ra- jwar, David Culler, Zhiyi Xu, Jianing Fan, Christopher Kennelly, Bill McCloskey, Danijela Mijailovic, et al. To- wards an adaptable systems architecture for memory tiering at warehouse-scale. InProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Oper...

2023
[31]

Adaptive software cache management

Gil Einziger, Ohad Eytan, Roy Friedman, and Ben Manes. Adaptive software cache management. InPro- ceedings of the 19th International Middleware Confer- ence, 2018

2018
[32]

System-level per- formance metrics for multiprogram workloads.IEEE micro, 28(3):42–53, 2008

Stijn Eyerman and Lieven Eeckhout. System-level per- formance metrics for multiprogram workloads.IEEE micro, 28(3):42–53, 2008

2008
[33]

Liblinear: A library for large linear classification.Journal of machine Learning research, 9:1871–1874, 2008

Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang- Rui Wang, and Chih-Jen Lin. Liblinear: A library for large linear classification.Journal of machine Learning research, 9:1871–1874, 2008

2008
[34]

Hetero- visor: Exploiting resource heterogeneity to enhance the elasticity of cloud platforms.ACM SIGPLAN Notices, 50(7):79–92, 2015

Vishal Gupta, Min Lee, and Karsten Schwan. Hetero- visor: Exploiting resource heterogeneity to enhance the elasticity of cloud platforms.ACM SIGPLAN Notices, 50(7):79–92, 2015

2015
[35]

Adaptive page migration policy with huge pages in tiered memory systems.IEEE Transac- tions on Computers, 71(1):53–68, 2020

Taekyung Heo, Yang Wang, Wei Cui, Jaehyuk Huh, and Lintao Zhang. Adaptive page migration policy with huge pages in tiered memory systems.IEEE Transac- tions on Computers, 71(1):53–68, 2020

2020
[36]

autonuma: Optimize page place- ment for memory tiering system, 2020

Ying Huang. autonuma: Optimize page place- ment for memory tiering system, 2020. https: //patchwork.kernel.org/project/linux-mm/ patch/20201027063217.211096-2-ying.huang@ intel.com/

work page arXiv 2020
[37]

Intel® Performance Counter Monitor (Intel® PCM).https://github.com/intel/pcm

Intel. Intel® Performance Counter Monitor (Intel® PCM).https://github.com/intel/pcm
[38]

Heteroos: Os design for heterogeneous memory management in datacenter

Sudarsun Kannan, Ada Gavrilovska, Vishal Gupta, and Karsten Schwan. Heteroos: Os design for heterogeneous memory management in datacenter. InProceedings of the 44th Annual International Symposium on Computer Architecture, 2017

2017
[39]

Exploring the design space of page management for {Multi-Tiered} memory systems

Jonghyeon Kim, Wonkyo Choe, and Jeongseob Ahn. Exploring the design space of page management for {Multi-Tiered} memory systems. In2021 USENIX Annual Technical Conference (USENIX ATC 21), 2021

2021
[40]

hats: A heterogeneity-aware tiered storage for hadoop

KR Krish, Ali Anwar, and Ali R Butt. hats: A heterogeneity-aware tiered storage for hadoop. In2014 14th IEEE/ACM International Symposium on Cluster , Cloud and Grid Computing, 2014

2014
[41]

Software-defined far memory in warehouse-scale computers

Andres Lagar-Cavilla, Junwhan Ahn, Suleiman Souhlal, Neha Agarwal, Radoslaw Burny, Shakeel Butt, Jichuan Chang, Ashwin Chaugule, Nan Deng, Junaid Shahid, et al. Software-defined far memory in warehouse-scale computers. InProceedings of the Twenty-F ourth Inter- national Conference on Architectural Support for Pro- gramming Languages and Operating Systems, 2019

2019
[42]

Memtis: Efficient memory tiering with dynamic page classification and page size deter- mination

Taehyung Lee, Sumit Kumar Monga, Changwoo Min, and Young Ik Eom. Memtis: Efficient memory tiering with dynamic page classification and page size deter- mination. InProceedings of the 29th Symposium on Operating Systems Principles, 2023

2023
[43]

Qtune: A query-aware database tuning system with deep reinforcement learning.Proceedings of the VLDB En- dowment, 12(12):2118–2130, 2019

Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. Qtune: A query-aware database tuning system with deep reinforcement learning.Proceedings of the VLDB En- dowment, 12(12):2118–2130, 2019

2019
[44]

Capes: Unsupervised storage performance tuning using neural network-based deep reinforcement learning

Yan Li, Kenneth Chang, Oceane Bel, Ethan L Miller, and Darrell DE Long. Capes: Unsupervised storage performance tuning using neural network-based deep reinforcement learning. InProceedings of the inter- national conference for high performance computing, networking, storage and analysis, 2017

2017
[45]

{AutoSys}: The design and operation of {Learning-Augmented} sys- tems

Chieh-Jan Mike Liang, Hui Xue, Mao Yang, Lidong Zhou, Lifei Zhu, Zhao Lucis Li, Zibo Wang, Qi Chen, Quanlu Zhang, Chuanjie Liu, et al. {AutoSys}: The design and operation of {Learning-Augmented} sys- tems. In2020 USENIX Annual Technical Conference (USENIX ATC 20), 2020

2020
[46]

An energy-efficient tuning method for cloud servers combining dvfs and parameter optimization.IEEE Transactions on Cloud Computing, 2023

Weiwei Lin, Xiaoxuan Luo, ChunKi Li, Jiechao Liang, Guokai Wu, and Keqin Li. An energy-efficient tuning method for cloud servers combining dvfs and parameter optimization.IEEE Transactions on Cloud Computing, 2023

2023
[47]

Tiered memory management beyond hotness

Jinshu Liu, Hamid Hadian, Hanchen Xu, and Huaicheng Li. Tiered memory management beyond hotness. In 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25), 2025. 14

2025
[48]

Learning-based memory allocation for c++ server workloads

Martin Maas, David G Andersen, Michael Isard, Mo- hammad Mahdi Javanmard, Kathryn S McKinley, and Colin Raffel. Learning-based memory allocation for c++ server workloads. InProceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, 2020

2020
[49]

Multi- clock: Dynamic tiering for hybrid memory systems

Adnan Maruf, Ashikee Ghosh, Janki Bhimani, Daniel Campello, Andy Rudoff, and Raju Rangaswami. Multi- clock: Dynamic tiering for hybrid memory systems. In2022 IEEE International Symposium on High- Performance Computer Architecture (HPCA), 2022

2022
[50]

TPP: Transparent page placement for CXL-enabled tiered-memory

Hasan Al Maruf, Hao Wang, Abhishek Dhanotia, Jo- hannes Weiner, Niket Agarwal, Pallab Bhattacharya, Chris Petersen, Mosharaf Chowdhury, Shobhit Kanaujia, and Prakash Chauhan. TPP: Transparent page placement for CXL-enabled tiered-memory. InProceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating...

2023
[51]

A feature-weighted rule for the k-nearest neighbor

Tsvetelina Mladenova. A feature-weighted rule for the k-nearest neighbor. InInternational Symposium on Mul- tidisciplinary Studies and Innovative Technologies (ISM- SIT), 2021

2021
[52]

Introducing the graph 500.Cray Users Group (CUG), 19(45-74):22, 2010

Richard C Murphy, Kyle B Wheeler, Brian W Barrett, and James A Ang. Introducing the graph 500.Cray Users Group (CUG), 19(45-74):22, 2010

2010
[53]

Tmc: Near-optimal resource allocation for tiered- memory systems

Yuanjiang Ni, Pankaj Mehra, Ethan Miller, and Heiner Litz. Tmc: Near-optimal resource allocation for tiered- memory systems. InProceedings of the 2023 ACM Symposium on Cloud Computing, 2023

2023
[54]

Maphea: A lightweight memory hierarchy-aware profile-guided heap allocation framework

Deok-Jae Oh, Yaebin Moon, Eojin Lee, Tae Jun Ham, Yongjun Park, Jae W Lee, and Jung Ho Ahn. Maphea: A lightweight memory hierarchy-aware profile-guided heap allocation framework. InProceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems, 2021

2021
[55]

Daos: Data access-aware operating system

SeongJae Park, Madhuparna Bhowmik, and Alexandru Uta. Daos: Data access-aware operating system. In Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, 2022

2022
[56]

Hemem: Scalable tiered memory management for big data applications and real nvm

Amanda Raybuck, Tim Stamler, Wei Zhang, Mattan Erez, and Simon Peter. Hemem: Scalable tiered memory management for big data applications and real nvm. In Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021

2021
[57]

Machine learning-guided memory optimization for dlrm infer- ence on tiered memory

Jie Ren, Bin Ma, Shuangyan Yang, Benjamin Francis, Ehsan K Ardestani, Min Si, and Dong Li. Machine learning-guided memory optimization for dlrm infer- ence on tiered memory. In2025 IEEE International Symposium on High Performance Computer Architec- ture (HPCA), pages 1631–1647. IEEE, 2025

2025
[58]

MTM: Rethinking Memory Profiling and Migration for Multi-Tiered Large Memory Systems

Jie Ren, Dong Xu, Junhee Ryu, Kwangsik Shin, Daewoo Kim, and Dong Li. MTM: Rethinking Memory Profiling and Migration for Multi-Tiered Large Memory Systems. InEuropean Conference on Computer Systems, 2024

2024
[59]

Archivist: A machine learning assisted data placement mechanism for hybrid storage systems

Jinting Ren, Xianzhang Chen, Yujuan Tan, Duo Liu, Moming Duan, Liang Liang, and Lei Qiao. Archivist: A machine learning assisted data placement mechanism for hybrid storage systems. In2019 IEEE 37th Interna- tional Conference on Computer Design (ICCD), 2019

2019
[60]

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimiza- tion algorithms.arXiv preprint arXiv:1707.06347, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[61]

Automating the application data placement in hybrid memory systems

Harald Servat, Antonio J Peña, Germán Llort, Estanis- lao Mercadal, Hans-Christian Hoppe, and Jesús Labarta. Automating the application data placement in hybrid memory systems. In2017 IEEE International Confer- ence on Cluster Computing (CLUSTER), 2017

2017
[62]

Hybridtier: an adaptive and lightweight cxl-memory tiering system

Kevin Song, Jiacheng Yang, Zixuan Wang, Jishen Zhao, Sihang Liu, and Gennady Pekhimenko. Hybridtier: an adaptive and lightweight cxl-memory tiering system. In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2025

2025
[63]

Workload-aware performance tuning for multimodel databases based on deep reinforcement learning.International Journal of Intelligent Systems, 2023(1):8835111, 2023

Jun Sun, Feng Ye, Nadia Nedjah, Ming Zhang, and Dong Xu. Workload-aware performance tuning for multimodel databases based on deep reinforcement learning.International Journal of Intelligent Systems, 2023(1):8835111, 2023

2023
[64]

arXiv preprint arXiv:1805.01954 , year=

Faraz Torabi, Garrett Warnell, and Peter Stone. Be- havioral cloning from observation.arXiv preprint arXiv:1805.01954, 2018

work page arXiv 2018
[65]

Speedy transactions in multicore in-memory databases

Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. Speedy transactions in multicore in-memory databases. InProceedings of the Twenty-F ourth ACM Symposium on Operating Systems Principles, 2013

2013
[66]

Automatic database management system tuning through large-scale machine learning

Dana Van Aken, Andrew Pavlo, Geoffrey J Gordon, and Bohan Zhang. Automatic database management system tuning through large-scale machine learning. InPro- ceedings of the 2017 ACM international conference on management of data, 2017. 15

2017
[67]

Hybrid2: Combining caching and migration in hybrid memory systems

Evangelos Vasilakis, Vassilis Papaefstathiou, Pedro Trancoso, and Ioannis Sourdis. Hybrid2: Combining caching and migration in hybrid memory systems. In 2020 IEEE International Symposium on High Perfor- mance Computer Architecture (HPCA), 2020

2020
[68]

Tiering-0.8

Vishal Verma. Tiering-0.8. 2022. https: //git.kernel.org/pub/scm/linux/kernel/ git/vishal/tiering.git/log/?h=tiering-0.8

2022
[69]

Tiered mem- ory management: Access latency is the key! InProceed- ings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

Midhul Vuppalapati and Rachit Agarwal. Tiered mem- ory management: Access latency is the key! InProceed- ings of the ACM SIGOPS 30th Symposium on Operating Systems Principles, 2024

2024
[70]

Performance characteri- zation of cxl memory and its use cases

Xi Wang, Jie Liu, Jianbo Wu, Shuangyan Yang, Jie Ren, Bhanu Shankar, and Dong Li. Performance characteri- zation of cxl memory and its use cases. In2025 IEEE International Parallel and Distributed Processing Sym- posium (IPDPS), pages 1048–1061. IEEE, 2025

2025
[71]

cmpi: Using cxl memory sharing for mpi one-sided and two-sided inter-node communica- tions

Xi Wang, Bin Ma, Jongryool Kim, Byungil Koh, Hoshik Kim, and Dong Li. cmpi: Using cxl memory sharing for mpi one-sided and two-sided inter-node communica- tions. InProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2025

2025
[72]

Tmo: Transparent memory offloading in datacenters

Johannes Weiner, Niket Agarwal, Dan Schatzberg, Leon Yang, Hao Wang, Blaise Sanouillet, Bikash Sharma, Tejun Heo, Mayank Jain, Chunqiang Tang, et al. Tmo: Transparent memory offloading in datacenters. InPro- ceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022

2022
[73]

Enabling and exploiting flexible task assignment on gpu through sm-centric program trans- formations

Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, and Jeffrey Vetter. Enabling and exploiting flexible task assignment on gpu through sm-centric program trans- formations. InProceedings of the 29th ACM on Inter- national Conference on Supercomputing, 2015

2015
[74]

Unimem: Run- time data managementon non-volatile memory-based heterogeneous main memory

Kai Wu, Yingchao Huang, and Dong Li. Unimem: Run- time data managementon non-volatile memory-based heterogeneous main memory. InProceedings of the International Conference for High Performance Com- puting, Networking, Storage and Analysis, 2017

2017
[75]

Nomad: Non- Exclusive Memory Tiering via Transactional Page Mi- gration

Lingfeng Xiang, Zhen Lin, Weishu Deng, Hui Lu, Jia Rao, Yifan Yuan, and Ren Wang. Nomad: Non- Exclusive Memory Tiering via Transactional Page Mi- gration. InUSENIX Symposium on Operating Systems Design and Implementation (OSDI), 2024

2024
[76]

CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling

Dong Xu, Han Meng, Xinyu Chen, Dengcheng Zhu, Wei Tang, Fei Liu, Liguang Xie, Wu Xiang, Rui Shi, Yue Li, et al. Cccl: Node-spanning gpu collectives with cxl memory pooling.arXiv preprint arXiv:2602.22457, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[77]

FlexMem: Adaptive Page Profiling and Migration for Tiered Memory

Dong Xu, Junhee Ryu, Jinho Baek, Kwangsik Shin, Pengfei Su, and Dong Li. FlexMem: Adaptive Page Profiling and Migration for Tiered Memory. In30th USENIX Annual Technical Conference (ATC), 2024

2024
[78]

Parameters tuning of multi-model database based on deep reinforcement learning.Jour- nal of Intelligent Information Systems, 61(1):167–190, 2023

Feng Ye, Yang Li, Xiwen Wang, Nadia Nedjah, Peng Zhang, and Hong Shi. Parameters tuning of multi-model database based on deep reinforcement learning.Jour- nal of Intelligent Information Systems, 61(1):167–190, 2023

2023
[79]

An end-to-end automatic cloud database tuning system using deep reinforcement learning

Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, et al. An end-to-end automatic cloud database tuning system using deep reinforcement learning. In Proceedings of the 2019 international conference on management of data, 2019. 16

2019