Computer Architecture

Publication title Conference/Journal Authors
Date 
Further resources
Eliminating Fences via ISA Support for Instruction-Level Execution Dependencies
ISCA 2021
T. Shull,
I. Vougioukas,
N. Nikoleris,
W. Elsasser,
J. Torrellas
June 2021  
Speculative Vectorization with Selective Replay
ISCA 2021
P. Sun,
G. Gabrielli,
T. Jones
June 2021  
Bandwidth Utilization Side-Channel on ML Inference Accelerators
ISCA 2021
S. Banerjee,
S. Wei,
P. Ramrakhyani,
M. Tiwari
June 2021  
Re-establishing Fetch-Directed Instruction Prefetching: An Industry Perspective
ISPASS 2021
Y. Ishii,
J. Lee,
K. Nathella,
D. Sunwoo
March 2021 Watch presentation
BBB: Simplifying Persistent Programming using Battery-Backed Buffers
HPCA 2021
M. Alshboul,
P. Ramrakhyani,
W. Wang,
J. Tuck,
Y. Solihin
February 2021 Read blog
Whitepaper: Understanding Write Combining on Arm
Arm Whitepaper
P. Shamis
November 2020  
Energy-aware HW/SW Co-modeling of Batteryless Wireless Sensor Nodes
ENSsys
S. Wong
S. Sliper
W. Wang 
A. Weddell
S. Gauthier
G. Merrett
November 2020  
AOS: Hardware-based Always-On Heap Memory Safety
MICRO
Y. Kim
J. Lee
H. Kim
October 2020  
Towards Data-Flow Parallelization for Adaptive Mesh Refinement Applications
CLUSTER K. Sala
A. Rico
V. Beltran
September 2020 Read blog, Watch presentation
Securing Branch Predictors With Two-Level Encryption   J. Lee
Y. Ishii
D. Sunwoo
July 2020  
Relaxed Persist Ordering Using Strand Persistency   V. Gogte
W. Wang
S. Diestelhorst
P. Chen
S. Narayanasamy
T. Wenisch
June 2020 Read blog
Implementation of a flexible cache coherency protocol for the Ruby memory system   T. Muck
P. Benedicte
May 2020  
DRAM Memory Controller Updates   W. Elsasser
N. Nikoleris
May 2020 Watch presentation
The Non-Uniform Compute Device (NUCD) Architecture for Lightweight Accelerator Offload
PDP M. Asri
C. Dunham
R. Rusitoru
A. Gerstlauer
J. Beard
March 2020  
Shredder: Learning Noise Distributions to Protect Inference Privacy   P. Ramrakhyani
H. Esmaeilzadeh
F. Mireshghallah
March 2020 Watch presentation
Rebasing Instruction Prefetching: An Industry Perspective
CAL Y. Ishii
J. Lee
K. Nathella
D. Sunwoo
February 2020
Fused: Closed-loop Performance and Energy Simulation of Embedded Systems   S. T. Sliper
W. Wang
N. Nikoleris
A. S. Weddell
G. V. Merrett
January 2020  
Temporal Prefetching Without the Off-Chip Metadata   H. Wu
K. Nathella
J. Pusdesris
D. Sunwoo
A. Jain
C. Lin
October 2019  
Directed Statistical Warming through Time Traveling   N. Nikoleris
L. Eeckhout
E. Hagersten
T. E. Carlson
October 2019  
DynaSprint: Microarchitectural Sprints with Dynamic Utility and Thermal Management   Z. Huang
J. Joao
A. Rico
A. Hilton
B. Lee
October 2019  
SPiDRE: Accelerating Sparse Memory Access Patterns
PACT A. Barredo
J. Beard
M. Moreto
September 2019 See PDF
Multi-spectral Reuse Distance: Divining Spatial Information from Temporal Data
HPEC A. Cabrera
R. Chamberlain
J. Beard
September 2019
Reducing Data Movement and Energy in Multilevel Cache Hierarchies Without Losing Performance: Can you have it all?   J. Wang
P. Ramrakhyani
W. Elsasser
L. K. John
September 2019  
Efficient Metadata Management for Irregular Data Prefetching   H. Wu
K. Nathella
D. Sunwoo
A. Jain
C. Lin
June 2019  
Sampled Simulation of Task-Based Programs   T. Grass
T. Carlson
A. Rico
G. Ceballos
E. Ayguadé
M. Casas
M. Moreto
February 2019  
BRB: Mitigating Branch Predictor Side-Channels   I. Vougioukas
N. Nikoleris
A. Sandberg
S. Diestelhorst
G. V. Merrett
B. M. Al-Hashimi
February 2019 Read blog
Whitepaper: Introducing the new Armv8.1-M architecture   T. Grocutt February 2019 Read blog
Nucleus: Finding the Sharing Limit of Heterogeneous Cores   I. Vougioukas
A. Sandberg
B. M. Al-Hashimi
G. V. Merrett
October 2017  
A Triple Core Lock-Step (TCLS) ARM® Cortex®-R5 Processor for Safety-Critical and Ultra-Reliable Applications   X. Iturbe
B. Venu
E. Ozer
S. Das
October 2017  
The Arm Scalable Vector Extension   N. Stephens
S. Biles
M. Boettcher
J. Eapen
M. Eyole
G. Gabrielli
M. Horsnell
G. Magklis
A. Martinez
M. Premillieu
A. Reid
A. Rico
P. Walker
April 2017 Read blog

Devices, Circuits, and Materials

Publication title Conference/Journal Authors
Date 
Further resources
A Fokker-Planck Solver to Model MTJ Stochasticity
ESSDERC 2021 F. García-Redondo,
P. Prabhat,
M. Bhargava
September 2021  
A Compact Model for Scalable MTJ Simulation
SMACD 2021 F. García-Redondo,
P. Prabhat,
M. Bhargava,
C. Dray
July 2021  
A Natively Flexible 32-bit Arm Microprocessor
Nature J. Biggs,
J. Myers,
J. Kufel,
E. Ozer,
S. Craske,
A. Sou,
C. Ramsdale,
K. Williamson,
R. Price,
S. White
July 2021  
A High-Density Logic-On 3DIC Design Using Face-to-Face Hybrid Wafer-Bonding on 12nm FinFET Process
IEDM S. Sinha et al. December 2020  
Buried Bitline for sub-5nm SRAM Design
IEDM R. Mathur et al. December 2020  
Plasmonics: Breaking the Barriers of Silicon Photonics for High-Performance Chip-to-Chip Interconnects
IEDM 2020 C. Lin,
D. Prasad,
S. Sinha,
B. Cline,
A. S. Helmy
December 2020  
A Supply Voltage Control Method for Performance Guaranteed Ultra-Low-Power Microcontroller
IEEE Journal of Solid-State Circuits
B. Labbé
P. Fan
T. Achuthan
P. Prabhat
G. Knight
J. Myers
September 2020 Read blog
A Spike-Latency Transceiver with Tuneable Pulse Control for Low-Energy Wireless 3D Integration   B. Fletcher
S. Das
T. Mak
September 2020  
Bespoke Machine Learning Processor Development Framework on Flexible Substrates
IEEE FLEPS
Emre Ozer
Jedrzej Kufel
John Biggs
Gavin Brown
James Myers
Anjit Rana
Antony Sou
Catherine Ramsdale
July 2020
A hardwired machine learning processing engine fabricated with sub-micron metal-oxide thin-film transistors on a flexible substrate   E. Ozer
J. Kufel
J. Myers
J. Biggs
G. Brown
A. Rana
A. Sou
C. Ramsdale
S. White
July 2020 Read blog
Enhanced 3D Implementation of an Arm® Cortex®-A   X. Xu
M. Bhargava
S. Moore
S. Sinha
B. Cline
July 2020  
GeST: An automatic framework for generating CPU stress-tests
ISPASS
Z. Hadjilambrou
S. Das
P. N. Whatmough
D. Bull
Y. Sazeides
March 2020  
M0N0: A Performance-Regulated 0.8-to-38MHz DVFS ARM Cortex-M33 SIMD MCU with 10nW Sleep Power
  Pranay Prabhat
Benoît Labbe
Graham Knight
Anand Savanth
Jonas Svedas
Matthew J Walker
Supreet Jeloka
Philex Ming-Yan Fan
Fernando García-Redondo
Thanusree Achuthan
James Myers
February 2020 Read blog
Buried Power Rails and Back-side Power Grids: Arm® CPU Power Delivery Network Design Beyond 5nm   D. Prasad
S. S. T. Nibhanupudi
S. Das
O. Zografos
B. Chehab
S. Sarkar
R. Baert
A. Robinson
A. Gupta
A. Spessot
P. Debacker
D. Verkest
J. Kulkarni
B. Cline
S. Sinha
December 2019 Read blog
Error Correlation Prediction in Lockstep Processors for Safety-Critical Systems   E. Ozer
B. Venu
X. Iturbe
S. Das
S. Lyberis
J. Biggs
P. Harrod
J. Penton
October 2018 Read blog
Standard Cell Library Design and Optimization Methodology for ASAP7 PDK   X. Xu
N. Shah
A. Evans
S. Sinha
B. Cline
G. Yeric
July 2018 Read blog

HPC

Publication title Conference/Journal Authors
Date 
Further resources
PLANAR: a programmable accelerator for near-memory data rearrangement
ICS 2021 A. Barredo,
A. Armejach,
J. Beard,
M. Moreto
June 2021  
Asvie: A Timing-Agnostic SVE Optimization Methodology   M. T. Cruz
D. Ruiz
R. Rusitoru
November 2019  
Cache Line Sharing and Communication in ECP Proxy Applications   J. Randall
A. Rico
J. A. Joao
September 2019 Read blog  
On the Benefits of Tasking with OpenMP   A. Rico
I. S. Barrera
J. A. Joao
J. Randall
M. Casas
M. Moretó
September 2019 Read blog
On the Maturity of Parallel Applications for Asymmetric Multi-core Processors   K. Chronaki
M. Moretó
M. Casas
A. Rico
R. M. Badia
E. Ayguadé
M. Valero
May 2019  

IoT

Publication title Conference/Journal Authors
Date 
Further resources
Reliability-Driven Deployment in Energy-Harvesting Sensor Networks 
CNSM X. Yu
X. Song
L. Cherkasova
T. Simunic Rosing
November 2020 Slides
Optimizing Sensor Deployment and Maintenance Costs for Large-Scale Environmental Monitoring ISSS X. Yu
K. Ergun
L. Cherkasova
T. S. Rosing
September 2020  
Combining Individual and Joint Networking Behavior for Intelligent IoT Analytics ICIOT J. V. Jeyakumar
L. Cherkasova
S. Lajevardi
M. Allan
Y. Zhao
J. Fry
M. Srivastava
August 2020  
RelIoT: Reliability Simulator for IoT Networks ICIOT K. Ergun
X. Yu
N. Nagesh
L. Cherkasova
P. Mercati
R. Ayoub
T. Rosing
August 2020  
Simulating Reliability of IoT Networks with RelIoT DSN-S K. Ergun
X. Yu
N. Nagesh
L. Cherkasova
P. Mercati
R. Ayoub
T. Rosing
July 2020  
IOTGAZE: IoT Security Enforcement via Wireless Context Analysis INFOCOM T. Gu
Z. Fang
A. Abhishek
H. Fu
P. Hu
P. Mohapatra
July 2020  
Applications of Computation-In-Memory Architectures based on Memristive Devices DATE Said Hamdioui
Hoang Anh Du Nguyen
Mottaqiallah Taouil
Abu Sebastian
Manuel Le Gallo
Sandeep Pande
Siebren Schaafsma
Francky Catthoor
Shidhartha Das
Fernando García-Redondo
G. Karunaratne
Abbas Rahimi
Luca Benini
March 2020  
Analysis and Demand Forecasting of Residential Energy Consumption at Multiple Time Scales   P. Amin
L. Cherkasova
R. Aitken
V. Kache
November 2019  
A Time-Domain Current-Mode MAC Engine for Analogue Neural Networks in Flexible Electronics   Matt Douthwaite
Fernando García-Redondo
Pantelis Georgiou
Shidhartha Das
October 2019  
A 0.98-nW/kHz 33-kHz Fully Integrated Subthreshold-Region Operation RC Oscillator With Forward-Body-Biasing   P. Fan
A. Savanth
B. Labbé
P. Prabhat
J. Myers
September 2019 Read blog
A 10.8pJ/bit Pulse-Position Inductive Transceiver for Low-Energy Wireless 3D Integration   B. Fletcher
S. Das
T. Mak
September 2019  
A Low-Energy Inductive Transceiver using Spike-Latency Encoding for Wireless 3D Integration   B. Fletcher
S. Das
T. Mak
July 2019  
A 65nm switched source line sub-threshold ROM using data encoding, with 0.3V Vmin and 47fJ/b access energy   Supreet Jeloka
Pranay Prabhat
Graham Knight
James Myers
July 2019  
Automating Energy Demand Modeling and Forecasting Using Smart Meter Data   P. Amin
L. Cherkasova
R. Aitken
V. Kache
July 2019  
Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications   D. Gope
G. Dasika
M. Mattina
March 2019  
Integrated Reciprocal Conversion with Selective Direct Operation for Energy Harvesting Systems   A. Savanth
A. Weddell
J. Myers
D. Flynn
B. M. Al-Hashimi
September 2017 Read blog

Machine Learning

Publication title Conference/Journal Authors
Date  
Further resources
Keyword Transformer: A Self-Attention Model for Keyword Spotting
INTERSPEECH 2021 A. Berg,
M. O'Connor,
M. Tairum-Cruz
August 2021  
Strong Data Processing Inequality in Neural Networks with Noisy Neurons and its Implications
ISIT 2021 C. Zhou,
Q. Zhuang,
M. Mattina,
P. N. Whatmough
July 2021  
Federated Learning Based on Dynamic Regularization
ICLR 2021 D. A. E. Acar,
Y. Zhao,
R. Matas,
M. Mattina,
P. Whatmough,
V. Saligrama
May 2021  
MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers
MLSys 2021
C. Banbury
C. Zhou
I. Fedorov
R. M. Navarro
U. Thakker
D. Gope
V. J. Reddi
M. Mattina
P. Whatmough
February 2021  
Stochastic-YOLO: Efficient Probabilistic Object Detection under Dataset Shifts
NeurIPS 2020
T. Azevedo
R. De Jong
M. Mattina
P. Maji
December 2020
Understanding the Impact of Dynamic Channel Pruning on Conditionally Parameterized Convolutions
AIChallengeIoT
R. Raju
D. Gope
U. Thakker
J. Beu
November 2020  
Pushing the Envelope of Dynamic Channel Spatial Gating Technologies
AIChallengeIoT
X. Huang
U. Thakker
D. Gope
J. Beu
November 2020  
Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration
  Z. Liu
P. Whatmough
M. Mattina
September 2020  
Efficient Residue Number System Based Winograd Convolution
Z. Liu
M. Mattina
August 2020 Read blog
Tiny LSTMs: Efficient Neural Speech Enhancement for Hearing Aids
INTERSPEECH 2020
I. Fedorov
M. Stamenovic
C. Jensen
L. Yang
A. Mandell
Y. Gan
M. Mattina
P. Whatmough
May 2020  
Mango: A Python Library for Parallel Hyperparameter Tuning   S. Sandha
M. Aggarwal
I. Fedorov
M. Srivastava
May 2020  
Benchmarking TinyML Systems: Challenges and Direction   C. Banbury
V. Reddi
M. Lam
W. Fu
A. Fazel
J. Holleman
X. Huang
R. Hurtado
D. Kanter
A. Lokhmotov
D. Patterson
D. Pau
J. Seo
J. Sieracki
U. Thakker
M. Verhelst
P. Yadav
March 2020 Workshop
Compressing Language Models using Doped Kronecker Products   U. Thakker
P. Whatmough
M. Mattina
J. Beu
March 2020 Workshop
Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference   Z-G. Liu
P. Whatmough
M. Mattina
March 2020  
Aggressive Compression of MobileNets Using Hybrid Ternary Layers   D. Gope
J. Beu
U. Thakker
M. Mattina
February 2020  
Improving Accuracy of Neural Networks Compressed using Fixed Structures via Doping   U. Thakker
P. Whatmough
M. Mattina
J. Beu
February 2020 Summit
Compressing RNNs for IoT devices by 15-38x using Kronecker Products   U. Thakker
J. Beu
D. Gope
C. Zhou
I. Fedorov
G. Dasika
M. Mattina
January 2020  
Run-Time Efficient RNN Compression for Inference on Edge Devices   U. Thakker
J. Beu
D. Gope
G. Dasika
M. Mattina
January 2020  
SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers   I. Fedorov
R. P. Adams
M. Mattina
P. Whatmough
December 2019 Read blog
RNN Compression using Hybrid Matrix Decomposition   U. Thakker
J. Beu
D. Gope
C. Zhou
I. Fedorov
G. Dasika
M. Mattina
December 2019 Workshop
Skipping RNN State Updates without Retraining the Original Model   J. Tao
U. Thakker
G. Dasika
J. Beu
November 2019 Read blog
A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning   N. Ardalani
U. Thakker
A. Albarghouthi
K. Sankaralingam
June 2019 Workshop
2019 Evolutionary Algorithms Review   A. Sloss
S. Gustafson
June 2019 Read book
Learning low-precision neural networks without Straight-Through Estimator (STE)   Z-G. Liu
M. Mattina
May 2019 Read blog
RNN Compression using Hybrid Matrix Decomposition   U. Thakker
J. Beu
D. Gope
G. Dasika
M. Mattina
March 2019 Summit
Measuring scheduling efficiency of RNNs for NLP applications   U. Thakker
G. Dasika
J. Beu
M. Mattina
March 2019 Workshop
SCALE-Sim: Systolic CNN Accelerator Simulator   A. Samajdar
Y. Zhu
P. Whatmough
M. Mattina
T. Krishna
February 2019  
FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning   P. Whatmough
C. Zhou
P. Hansen
S. Kolala Venkataramanaiah
J-S. Seo
M. Mattina
February 2019 Read blog
Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning   P. Whatmough
C. Zhou
P. Hansen
M. Mattina
February 2019  
Efficient and Robust Machine Learning for Real-World Systems   F. Pernkopf
W. Roth
M. Zoehrer
L. Pfeifenberger
G. Schindler
H. Froening
S. Tschiatschek
R. Peharz
M. Mattina
Z. Ghahramani
December 2018  
DNN Engine: A 28-nm Timing-Error Tolerant Sparse Deep Neural Network Processor for IoT Applications   P. Whatmough
S. K. Lee
D. Brooks
G. Wei
September 2018  
Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision   Y. Zhu
A. Samajdar
M. Mattina
P. Whatmough
March 2018 Read blog
Mobile Machine Learning Hardware at Arm: A Systems-on-Chip (SoC) Perspective   Y. Zhu
M. Mattina
P. Whatmough
February 2018  

Security

Publication title Conference/Journal Authors
Date  
Further resources
Talk: It’s time to tame the wild west of IoT device security
HOST
R. Aitken
December 2020  
Whitepaper: Post-Quantum Cryptography
Arm Whitepaper
H. Becker
October 2020  
Talk: How can hardware security contribute to the fight against Covid-19 and post-pandemic life?
HOST
R. Aitken
July 2020  
ISA Semantics for ARMv8-A, RISC-V, and CHERI-MIPS   A. Armstrong
T. Bauereiss
B. Campbell
A. Reid
K. E. Gray
R. M. Norton
P. Mundkur
M. Wassell
J. French
C. Pulte
S. Flur
I. Stark
N. Krishnaswami
P. Sewell
January 2019  
BRB: Mitigating Branch Predictor Side-Channels   I. Vougioukas
N. Nikoleris
A. Sandberg
S. Diestelhorst
B. M. Al-Hashimi
G. V. Merrett
November 2018 Read blog
The semantics of transactions and weak memory in x86, Power, ARM, and C++   N. Chong
T. Sorensen
J. Wickerson
June 2018 Read blog

Software and Services

Publication title Conference/Journal Authors
Date  
Further resources
Virtual-Link: A Scalable Multi-Producer, Multi-Consumer Message Queue Architecture for Cross-Core Communication
IPDPS 2021
Q. Wu
J. Beard
A. Ekanayake
A. Gerstlauer
L. John
January 2021  
IoTSPY: Uncovering Human Privacy Leakage in IoT Networks via Mining Wireless Context
PIMRC
T. Gu
Z. Fang
A. Abhishek
P. Mohapatra
September 2020  
eWASM: Practical Software Fault Isolation for Reliable Embedded Devices   G. Peach
R. Pan
Z. Qu
G. Parmer
C. Haster
L. Cherkasova
September 2020 Watch presentation
OpenSHMEM I/O Extensions for Fine-grained Access to Persistent Memory Storage   M. Grodowitz
P. Shamis
S. Poole
August 2020  
Sledge: A Serverless-First, Light-Weight Wasm Runtime for the Edge
Middleware
P. K. Gadepalli
S. McBride
G. Peach
L. Cherkasova
G. Parmer
July 2020  
Talk: Bending the climate curve: Enabling Sustainable Growth of Big Data, AI, and Cloud Computing
SEMICON West
R. Aitken
July 2020 Read blog
Towards Learning-automation IoT Attack Detection through Reinforcement Learning
WoWMoM
T. Gu
Z. Fang
A. Abhishek
P. Mohapatra
June 2020 Read blog
Using Arm Scalable Vector Extension to Optimize Open MPI   D. Zhong
P. Shamis
Q. Cao
G. Bosilca
S. Sumimoto
K. Miura
J. Dongarra
May 2020  
SMARTER: Experiences with Cloud Native on Edge   A. Ferreira
E. V. Hensbergen
C. Adeniyi-Jones
E. Grimley-Evans
J. Minor
M. Nutter
L. E. Peña
K. Agarwal
J. Hermes
April 2020  
Talk: Is an Open-source Hardware Revolution on the Horizon?
ISSCC 2020
R. Aitken
February 2020  
Challenges and Opportunities for Efficient Serverless Computing at the Edge   P.K. Gadepalli
G. Peach
L. Cherkasova
R. Aitken
G. Parmer
October 2019  
Breaking Band: A Breakdown of High-performance Communication   R. Zambre
M. Grodowitz
A. Chandramowlishwaran
P. Shamis
August 2019  
Analysis and Demand Forecasting of Residential Energy Consumption at Multiple Time Scales
IM 2019
P. Amin
L. Cherkasova
R. Aitken
V. Kache
April 2019  
Stretch: Balancing QoS and Throughput for Colocated Server Workloads on SMT Cores   A. Margaritov
S. Gupta
R. Gonzalez-Alberquilla
B. Grot
February 2019 Read blog
Open-Source Shared Memory implementation of the HPCG benchmark: analysis, improvements and evaluation on Cavium ThunderX2   D. Ruiz
F. Mantovani
M. Casas
F. Spiga
J. Labarta
July 2018  
Persistency for Synchronization-Free Regions   V. Gogte
S. Diestelhorst
W. Wang
S. Narayanasamy
P. M. Chen
T. F. Wenisch
June 2018 Read blog
SynchroTrace: Synchronization-aware Architecture-agnostic Traces for Light-Weight Multicore Simulation of CMP and HPC Workloads   K. Sangaiah
M. Lui
R. Jagtap
S. Diestelhorst
S. Nilakantan
A. More
B. Taskin
M. Hempstead
March 2018 Read blog
Crossing the Architectural Barrier: Evaluating Representative Regions of Parallel HPC Applications   A. Ferrerón
R. Jagtap
S. Bischoff
R. Ruşitoru
March 2018 Read blog
Integrating DRAM Power-Down Modes in gem5 and Quantifying their Impact   R. Jagtap
M. Jung
W. Elsasser
C. Weis
A. Hansson
N. Wehn
March 2018 Read blog