


ACM Transactions on Architecture and Code Optimization, Volume 21
Volume 21, Number 1, March 2024
- Longfei Luo, Dingcui Yu, Yina Lv, Liang Shi: Critical Data Backup with Hybrid Flash-Based Consumer Devices. 1:1-1:23
- Peng Chen, Hui Chen, Weichen Liu, Linbo Long, Wanli Chang, Nan Guan: DAG-Order: An Order-Based Dynamic DAG Scheduling for Real-Time Networks-on-Chip. 2:1-2:24
- Zhang Jiang, Ying Chen, Xiaoli Gong, Jin Zhang, Wenwen Wang, Pen-Chung Yew: JiuJITsu: Removing Gadgets with Safe Register Allocation for JIT Code Generation. 3:1-3:26
- Hayfa Tayeb, Ludovic Paillat, Bérenger Bramas: Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations. 4:1-4:25
- Xueying Wang, Guangli Li, Zhen Jia, Xiaobing Feng, Yida Wang: Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs. 5:1-5:26
- Hao Fan, Yiliang Ye, Shadi Ibrahim, Zhuo Huang, Xingru Li, Weibin Xue, Song Wu, Chen Yu, Xuanhua Shi, Hai Jin: QoS-pro: A QoS-enhanced Transaction Processing Framework for Shared SSDs. 6:1-6:25
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Libo Huang, Yi Dai: SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs. 7:1-7:26
- Tong-Yu Liu, Jianmei Guo, Bo Huang: Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping. 8:1-8:26
- Lei Liu, Xinglei Dou: QuCloud+: A Holistic Qubit Mapping Scheme for Single/Multi-programming on 2D/3D NISQ Quantum Computers. 9:1-9:27
- Lingxi Wu, Minxuan Zhou, Weihong Xu, Ashish Venkat, Tajana Rosing, Kevin Skadron: Abakus: Accelerating k-mer Counting with Storage Technology. 10:1-10:26
- Seokwon Kang, Jongbin Kim, Gyeongyong Lee, Jeongmyung Lee, Jiwon Seo, Hyungsoo Jung, Yong Ho Song, Yongjun Park: ISP Agent: A Generalized In-storage-processing Workload Offloading Framework by Providing Multiple Optimization Opportunities. 11:1-11:24
- Prasoon Mishra, V. Krishna Nandivada: COWS for High Performance: Cost Aware Work Stealing for Irregular Parallel Loop. 12:1-12:26
- Joongun Park, Seunghyo Kang, Sanghyeon Lee, Taehoon Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh: Hardware-hardened Sandbox Enclaves for Trusted Serverless Computing. 13:1-13:25
- Tyler N. Allen, Bennett Cooper, Rong Ge: Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory. 14:1-14:24
- Zhonghua Wang, Yixing Guo, Kai Lu, Jiguang Wan, Daohui Wang, Ting Yao, Huatao Wu: Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL. 15:1-15:26
- Linbo Long, Shuiyong He, Jingcheng Shen, Renping Liu, Zhenhua Tan, Congming Gao, Duo Liu, Kan Zhong, Yi Jiang: WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDs. 16:1-16:23
- Zhihua Fan, Wenming Li, Zhen Wang, Yu Yang, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An: Improving Utilization of Dataflow Unit for Multi-Batch Processing. 17:1-17:26
- Dunbo Zhang, Qingjie Lang, Ruoxi Wang, Li Shen: Extension VM: Interleaved Data Layout in Vector Memory. 18:1-18:23
- Can Firtina, Kamlesh R. Pillai, Gurpreet S. Kalsi, Bharathwaj Suresh, Damla Senol Cali, Jeremie S. Kim, Taha Shahroodi, Meryem Banu Cavlak, Joël Lindegger, Mohammed Alser, Juan Gómez-Luna, Sreenivas Subramoney, Onur Mutlu: ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis. 19:1-19:29
- Khalid Ahmad, Cris Cecka, Michael Garland, Mary W. Hall: Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs. 20:1-20:20
Volume 21, Number 2, June 2024
- Chandra Sekhar Mummidi, Victor da Cruz Ferreira, Sudarshan Srinivasan, Sandip Kundu: Highly Efficient Self-checking Matrix Multiplication on Tiled AMX Accelerators. 21
- Zhonghua Wang, Chen Ding, Fengguang Song, Kai Lu, Jiguang Wan, Zhihu Tan, Changsheng Xie, Guokuan Li: WIPE: A Write-Optimized Learned Index for Persistent Memory. 22
- Gino A. Chacon, Charles Williams, Johann Knechtel, Ozgur Sinanoglu, Paul V. Gratz, Vassos Soteriou: Coherence Attacks and Countermeasures in Interposer-based Chiplet Systems. 23
- Yan Wei, Xingjun Zhang: A Concise Concurrent B+-Tree for Persistent Memory. 24
- Fareed Qararyah, Muhammad Waqar Azhar, Pedro Trancoso: An Efficient Hybrid Deep Learning Accelerator for Compact and Heterogeneous CNNs. 25
- Fernando Fernandes dos Santos, Luigi Carro, Flavio Vella, Paolo Rech: Assessing the Impact of Compiler Optimizations on GPUs Reliability. 26
- Valentin Isaac-Chassande, Adrian Evans, Yves Durand, Frédéric Rousseau: Dedicated Hardware Accelerators for Processing of Sparse Matrices and Vectors: A Survey. 27
- Benyi Xie, Yue Yan, Chenghao Yan, Sicheng Tao, Zhuangzhuang Zhang, Xinyu Li, Yanzhi Lan, Xiang Wu, Tianyi Liu, Tingting Zhang, Fuxin Zhang: An Instruction Inflation Analyzing Framework for Dynamic Binary Translators. 28
- Samuel Rac, Mats Brorsson: Cost-aware Service Placement and Scheduling in the Edge-Cloud Continuum. 29
- Feng Xue, Chenji Han, Xinyu Li, Junliang Wu, Tingting Zhang, Tianyi Liu, Yifan Hao, Zidong Du, Qi Guo, Fuxin Zhang: Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses. 30
- Kunpeng Xie, Ye Lu, Xinyu He, Dezhi Yi, Huijuan Dong, Yao Chen: Winols: A Large-Tiling Sparse Winograd CNN Accelerator on FPGAs. 31
- Ke Liu, Kan Wu, Hua Wang, Ke Zhou, Peng Wang, Ji Zhang, Cong Li: SLAP: Segmented Reuse-Time-Label Based Admission Policy for Content Delivery Network Caching. 32
- Panagiotis Miliadis, Dimitris Theodoropoulos, Dionisios N. Pnevmatikatos, Nectarios Koziris: Architectural Support for Sharing, Isolating and Virtualizing FPGA Resources. 33
- Haitao Du, Yuhan Qin, Song Chen, Yi Kang: FASA-DRAM: Reducing DRAM Latency with Destructive Activation and Delayed Restoration. 34
- Michael Canesche, Vanderson Martins do Rosário, Edson Borin, Fernando Magno Quintão Pereira: The Droplet Search Algorithm for Kernel Scheduling. 35
- Asmita Pal, Keerthana Desai, Rahul Chatterjee, Joshua San Miguel: Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces. 36
- Chengying Huan, Yongchao Liu, Heng Zhang, Shuaiwen Song, Santosh Pandey, Shiyang Chen, Xiangfei Fang, Yue Jin, Baptiste Lepers, Yanjun Wu, Hang Liu: TEA+: A Novel Temporal Graph Random Walk Engine with Hybrid Storage Architecture. 37
- Soojin Hwang, Daehyeon Baek, Jongse Park, Jaehyuk Huh: Cerberus: Triple Mode Acceleration of Sparse Matrix and Vector Multiplication. 38
- Siddhartha Raman Sundara Raman, Lizy Kurian John, Jaydeep P. Kulkarni: NEM-GNN: DAC/ADC-less, Scalable, Reconfigurable, Graph and Sparsity-Aware Near-Memory Accelerator for Graph Neural Networks. 39
- Yan Chen, Qiwen Ke, Huiba Li, Yongwei Wu, Yiming Zhang: xMeta: SSD-HDD-hybrid Optimization for Metadata Maintenance of Cloud-scale Object Storage. 40
- Vidush Singhal, Laith Sakka, Kirshanthan Sundararajah, Ryan Newton, Milind Kulkarni: Orchard: Heterogeneous Parallelism and Fine-grained Fusion for Complex Tree Traversals. 41
Volume 21, Number 3, September 2024
- Hajar Falahati, Mohammad Sadrosadati, Qiumin Xu, Juan Gómez-Luna, Banafsheh Saber Latibari, Hyeran Jeon, Shaahin Hessabi, Hamid Sarbazi-Azad, Onur Mutlu, Murali Annavaram, Massoud Pedram: Cross-core Data Sharing for Energy-efficient GPUs. 42:1-42:32
- Ching-Jui Lee, Tsung Tai Yeh: ReSA: Reconfigurable Systolic Array for Multiple Tiny DNN Tensors. 43:1-43:24
- Ziheng Wang, Xiaoshe Dong, Yan Kang, Heng Chen, Qiang Wang: An Example of Parallel Merkle Tree Traversal: Post-Quantum Leighton-Micali Signature on the GPU. 44:1-44:25
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao: Knowledge-Augmented Mutation-Based Bug Localization for Hardware Design Code. 45:1-45:26
- Chen Ding, Jian Zhou, Kai Lu, Sicen Li, Yiqin Xiong, Jiguang Wan, Ling Zhan: D2Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage. 46:1-46:22
- Zhuohao Wang, Lei Liu, Limin Xiao: iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments. 47:1-47:24
- Junkaixuan Li, Yi Kang: GraphSER: Distance-Aware Stream-Based Edge Repartition for Many-Core Systems. 48:1-48:25
- Ke Wu, Dezun Dong, Weixia Xu: COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign. 49:1-49:26
- Qunyou Liu, Darong Huang, Luis Costero, Marina Zapater, David Atienza: Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads. 50:1-50:23
- Dongmoon Min, Ilkwon Byun, Gyu-hyeon Lee, Jangwoo Kim: CoolDC: A Cost-Effective Immersion-Cooled Datacenter with Workload-Aware Temperature Scaling. 51:1-51:27
- Hai Zhou, Dan Feng: Stripe-schedule Aware Repair in Erasure-coded Clusters with Heterogeneous Star Networks. 52:1-52:24
- Bobin Deng, Bhargava Nadendla, Kun Suo, Chloe Yixin Xie, Dan Chia-Tien Lo: Fixed-point Encoding and Architecture Exploration for Residue Number Systems. 53:1-53:27
- Yizhuo Wang, Fangli Chang, Bingxin Wei, Jianhua Gao, Weixing Ji: Optimization of Sparse Matrix Computation for Algebraic Multigrid on GPUs. 54:1-54:27
- Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang: Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access. 55:1-55:28
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Dongsheng Li: SAL: Optimizing the Dataflow of Spin-based Architectures for Lightweight Neural Networks. 56:1-56:27
- Kai Lu, Siqi Zhao, Haikang Shan, Qiang Wei, Guokuan Li, Jiguang Wan, Ting Yao, Huatao Wu, Daohui Wang: Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory. 57:1-57:26
- Wangqi Peng, Yusen Li, Xiaoguang Liu, Gang Wang: Lavender: An Efficient Resource Partitioning Framework for Large-Scale Job Colocation. 58:1-58:23
- Feng Zhang, Fulin Nan, Binbin Xu, Zhirong Shen, Jiebin Zhai, Dmitrii I. Kaplun, Jiwu Shu: Achieving Tunable Erasure Coding with Cluster-Aware Redundancy Transitioning. 59:1-59:24
- Ataberk Olgun, F. Nisa Bostanci, Geraldo Francisco de Oliveira Junior, Yahya Can Tugrul, Rahul Bera, Abdullah Giray Yaglikçi, Hasan Hassan, Oguz Ergin, Onur Mutlu: Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture. 60:1-60:29
- Xiaohui Wei, Chenyang Wang, Hengshan Yue, Jingweijia Tan, Zeyu Guan, Nan Jiang, Xinyang Zheng, Jianpeng Zhao, Meikang Qiu: ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error Detection. 61:1-61:26
- Qiao Li, Yu Chen, Guanyu Wu, Yajuan Du, Min Ye, Xinbiao Gan, Jie Zhang, Zhirong Shen, Jiwu Shu, Chun Xue: Characterizing and Optimizing LDPC Performance on 3D NAND Flash Memories. 62:1-62:26
- Jiahong Xu, Haikun Liu, Zhuohui Duan, Xiaofei Liao, Hai Jin, Xiaokang Yang, Huize Li, Cong Liu, Fubing Mao, Yu Zhang: ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators. 63:1-63:26
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao: Time-Aware Spectrum-Based Bug Localization for Hardware Design Code with Data Purification. 64:1-64:25
Volume 21, Number 4, December 2024
- Zhuoran Song, Zhongkai Yu, Xinkai Song, Yifan Hao, Li Jiang, Naifeng Jing, Xiaoyao Liang: Environmental Condition Aware Super-Resolution Acceleration Framework in Server-Client Hierarchies. 65:1-65:26
- Georgia Antoniou, Davide B. Bartolini, Haris Volos, Marios Kleanthous, Zhe Wang, Kleovoulos Kalaitzidis, Tom Rollet, Ziwei Li, Onur Mutlu, Yiannakis Sazeides, Jawad Haj-Yahya: Agile C-states: A Core C-state Architecture for Latency Critical Applications Optimizing both Transition and Cold-Start Latency. 66:1-66:26
- Xinbiao Gan, Tiejun Li, Feng Xiong, Bo Yang, Xinhai Chen, Chunye Gong, Shijie Li, Kai Lu, Qiao Li, Yiming Zhang: MST: Topology-Aware Message Aggregation for Exascale Graph Processing of Traversal-Centric Algorithms. 67:1-67:22
- Yujie Cui, Wei Chen, Xu Cheng, Jiangfang Yi: Hyperion: A Highly Effective Page and PC Based Delta Prefetcher. 68:1-68:27
- Jianhua Gao, Weixing Ji, Yizhuo Wang: Optimization of Large-Scale Sparse Matrix-Vector Multiplication on Multi-GPU Systems. 69:1-69:24
- Zhengding Hu, Jingwei Sun, Zhongyang Li, Guangzhong Sun: AG-SpTRSV: An Automatic Framework to Optimize Sparse Triangular Solve on GPUs. 70:1-70:25
- Wenbo Zhang, Yiqi Liu, Tianhao Zang, Zhenshan Bao: EA4RCA: Efficient AIE accelerator design framework for regular Communication-Avoiding Algorithm. 71:1-71:24
- Arun Thangamani, Vincent Loechner, Stéphane Genaud: A Survey of General-purpose Polyhedral Compilers. 72:1-72:26
- Junqing Lin, Jingwei Sun, Xiaolong Shi, Honghe Zhang, Xianzhi Yu, Xinzhi Wang, Jun Yao, Guangzhong Sun: LO-SpMM: Low-cost Search for High-performance SpMM Kernels on GPUs. 73:1-73:25
- Chenglong Yi, Jintong Liu, Shenggang Wan, Juntao Fang, Bin Sun, Liqiang Zhang: Data Deduplication Based on Content Locality of Transactions to Enhance Blockchain Scalability. 74:1-74:24
- Joshua Dennis Booth, Phillip Allen Lane: A NUMA-Aware Version of an Adaptive Self-Scheduling Loop Scheduler. 75:1-75:22
- Yu Tang, Qiao Li, Lujia Yin, Dongsheng Li, Yiming Zhang, Chenyu Wang, Xingcheng Zhang, Linbo Qiao, Zhaoning Zhang, Kai Lu: DELTA: Memory-Efficient Training via Dynamic Fine-Grained Recomputation and Swapping. 76:1-76:25
- Zhenhua Tan, Linbo Long, Jingcheng Shen, Renping Liu, Congming Gao, Kan Zhong, Yi Jiang: Optimizing Garbage Collection for ZNS SSDs via In-storage Data Migration and Address Remapping. 77:1-77:25
- Xiang Li, Qiong Chang, Aolong Zha, Shijie Chang, Yun Li, Jun Miyazaki: An Optimized GPU Implementation for GIST Descriptor. 78:1-78:24
- Xiaobo Lu, Jianbin Fang, Lin Peng, Chun Huang, Zidong Du, Yongwei Zhao, Zheng Wang: Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product. 79:1-79:25
- Yu Feng, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Han Zhao, Xiaofeng Hou, Jieru Zhao, Yuhao Zhu: Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture. 80:1-80:25
- Changxi Liu, Alen Sabu, Akanksha Chaudhari, Qingxuan Kang, Trevor E. Carlson: Pac-Sim: Simulation of Multi-threaded Workloads using Intelligent, Live Sampling. 81:1-81:26
- Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, P. Sadayappan: CoNST: Code Generator for Sparse Tensor Networks. 82:1-82:24
- Danlin Jia, Geng Yuan, Yiming Xie, Xue Lin, Ningfang Mi: A Data-Loader Tunable Knob to Shorten GPU Idleness for Distributed Deep Learning. 83:1-83:25
- Shaobu Wang, Guangyan Zhang, Junyu Wei, Yang Wang, Jiesheng Wu, Qingchao Luo: Understanding Silent Data Corruption in Processors for Mitigating its Effects. 84:1-84:27
- Yen-Yu Lu, Chin-Hsien Wu, Shih-Jen Li, Cheng-Tze Lee, Cheng-Yen Wu: A Stable Idle Time Detection Platform for Real I/O Workloads. 85:1-85:23
- Lingyu Sun, Xiaofeng Hou, Chao Li, Jiacheng Liu, Xinkai Wang, Quan Chen, Minyi Guo: A2: Towards Accelerator Level Parallelism for Autonomous Micromobility Systems. 86:1-86:20
- Manojna Sistla, Yiding Liu, Xin Fu: Towards High Performance QNNs via Distribution-Based CNOT Gate Reduction. 87:1-87:22
- Fubing Mao, Xu Liu, Yu Zhang, Haikun Liu, Xiaofei Liao, Hai Jin, Wei Zhang, Jian Zhou, Yufei Wu, Longyu Nie, Yapu Guo, Zihan Jiang, Jingkang Liu: PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs. 88:1-88:25
- Wentong Li, Yina Lv, Longfei Luo, Yunpeng Song, Liang Shi: Access Characteristic-Guided Remote Swapping Across Mobile Devices. 89:1-89:25
- Yinan Zhang, Shun Yang, Huiqi Hu, Chengcheng Yang, Peng Cai, Xuan Zhou: SuccinctKV: a CPU-efficient LSM-tree Based KV Store with Scan-based Compaction. 90:1-90:26
- Siyuan Ma, Kaustubh Manohar Mhatre, Jian Weng, Bagus Hanindhito, Zhengrong Wang, Tony Nowatzki, Lizy K. John, Aman Arora: PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation. 91:1-91:27
