The 6th Workshop on Machine Learning and Systems (EuroMLSys)

co-located with EuroSys '26

April 27, 2026, Edinburgh, United Kingdom


The recent wave of research on machine learning and artificial intelligence and their applications has been fuelled by hardware improvements, ML compilers, and deep learning frameworks that simplify the design, training, and inference of neural models. Over the past five years, EuroMLSys has been held in conjunction with EuroSys (https://www.euromlsys.eu). The program has included topics such as neural model and inference performance optimisation, and compilation for ML workloads. Past workshops have been successful, with high-quality papers and sizeable audiences.

This year’s workshop will provide a platform for discussing emerging trends in building frameworks, programming models, optimisation algorithms, and software engineering to support AI/ML applications, as well as the use of ML to build such frameworks and optimisation tools. The recent emergence of LLMs and agentic systems brings substantial computational requirements, making optimisation in every part of the system important. EuroMLSys aims to bridge the gap between AI research and practice through a technical program of fresh ideas on software infrastructure, tools, design principles, and theory/algorithms, from a systems perspective. We will also explore potential applications that take advantage of ML.

News

  • The keynote speaker has been announced! Laurent Bindschaedler (Max Planck Institute for Software Systems) will give a talk on "What Survives When Code Doesn't?".
  • The workshop program is up! It will start at 9:00 am.

Key dates

  • Paper submission deadline: February 24, 2026 (23:59 AoE) (extended from February 15, 2026)
  • Acceptance notification: March 22, 2026 (extended from March 17, 2026)
  • Final paper due: April 10, 2026 (extended from April 2, 2026)
  • Workshop: April 27, 2026 (full-day workshop)

Past Editions

Call for Papers

A growing area of interest in machine intelligence is at the intersection of AI/ML and systems design. At the same time, applications of ML are growing in complexity, and so is the volume of data they produce and consume. For computer systems to scale, new learning approaches and advanced optimisation techniques are needed. We also need a better understanding of current AI/ML frameworks, in terms of their functionality, limitations, and target applications; this will clarify desired functions and future architectures. Novel machine learning methods to optimise and accelerate software and hardware systems must also be developed.

EuroMLSys is an interdisciplinary workshop that brings together researchers in computer architecture, systems and machine learning, along with practitioners who are active in these emerging areas.

Topics of interest include, but are not limited to, the following:

  • Scheduling algorithms for data processing clusters
  • Custom hardware for machine learning
  • Hardware-efficient ML methods
  • Accelerators/GPU optimisation
  • LLM-based hardware design or system optimisation techniques
  • Programming languages for machine learning
  • Benchmarking systems (for machine learning algorithms)
  • Synthetic input data generation for training
  • Systems for training and serving machine learning models at scale
  • Graph neural networks
  • Neural network compression and pruning in systems
  • Large scale distributed learning algorithms in practice
  • Database systems for large scale learning
  • Systems for model-free and model-based Reinforcement Learning
  • Optimisation in end-to-end deep learning
  • System optimisation using Bayesian Optimisation
  • Use of probabilistic models in ML/AI application
  • Analysis of distributed ML algorithms
  • Probabilistic modelling for distributed ML algorithms
  • Synchronisation and state control of distributed ML algorithms
  • ML Compiler Optimisation
  • Optimisation in Large Language Models (LLMs)
  • Agentic Systems

Accepted papers will be published in the ACM Digital Library (authors may opt out).

Program

Program timezone is BST (UTC+1).

08:50 Opening
09:00 Session 1: LLM 1: Inference, Memory, Agents (11 mins x 8) Chair (Eiko Yoneki - University of Cambridge)
ClawVM: Harness-Managed Virtual Memory for Stateful Tool-Using LLM Agents Mofasshara Rafique (Ferring/MPI)
Sampling Where It Matters: Predicting LLM Serving Performance Emile Aydar (IBM Research)
PROTEUS: SLA-Aware Routing via Lagrangian RL for Multi-LLM Serving Systems Amit Singh Bhatti (Phi Labs, Quantiphi)
Accuracy Is Speed: Towards Long-Context-Aware Routing for Distributed LLM Serving Takeshi Yoshimura (IBM Research Tokyo)
Dealing With The Elephant in the KV Cache: Video Frame Sampling for Multimodal LLM Inference Konstantinos Papaioannou (U. Madrid & IMDEA)
Towards a Solution to the Management Scaling Paradox in Distributed LLM Inference Amir Noohi (U. of Edinburgh)
Asynchronous Verified Semantic Caching for Tiered LLM Architectures Asmit Kumar Singh (Apple)
Pooling Engram Conditional Memory in Large Language Models using CXL Ruiyang Ma (Peking University)
10:30 Coffee Break / Poster Session (Browsing)
11:00 Session 2: LLM 2: Inference, Hardware (11 mins x 8) Chair (Paul Patras - University of Edinburgh)
SwiftNPU: Scalable Shape-Flexible Allocation for Inter-Core Connected NPUs Gangmin Lee (KAIST)
Cost-Efficient Training and Checkpointing for Large Models on Preemptible Cloud VMs Omkar Desai (Syracuse Univ.)
Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching Hiroki Matsutani (Keio University)
GPU Memory and Utilization Estimation for Training-Aware Resource Management: Opportunities and Limitations Ehsan Yousefzadeh-Asl-Miandoab (IT Univ. Copenhagen)
GPU Acceleration of Sparse Fully Homomorphic Encrypted DNNs Lara D’Agata (U. Glasgow)
Bridging CPU and GPU Autoscaling for Cost-Efficient Inference Serving Mehran Salmani (TU Ilmenau)
Where the Time Goes: Analysis of a Public LLM Serving System Büsra Kataray Demiray (HES-SO)
Characterizing Energy and Performance for Distributed Training of Large Language Models Zhendong Zhang (ETH)
12:30 Introduction to Poster Exhibition (Full day)
12:30 Lunch Break / Poster Session (Browsing)
14:00 Keynote: What Survives When Code Doesn't? Laurent Bindschaedler (MPI)
14:55 Session 3: Security, Privacy (11 mins x 3) Chair (TBD)
Systems-Level Attack Surface of Edge Agent Deployments on IoT Zhonghao Zhan (Imperial College London)
Peeling the Layers of Privacy-Utility Onion on Tabular Data Jiawei Wang (U. Southampton)
Towards Practically-Secure Tools for AI Agents Justus Adam (Brown University)
15:35 Coffee Break / Poster Session
16:00 Session 4: RL, Post-Training, Edge (11 mins x 7) Chair (Luo Mai - University of Edinburgh)
EARL: Efficient Agentic RL Post-Training for LLMs under Dynamic Context Lengths Zheyue Tan (Aalto University)
Why Smaller Is Slower? Dimensional Misalignment in Compressed LLMs Jihao Xin (KAUST)
LayoutBench: Performance Benchmarking of Cloud Storage Layouts for Multimedia Data Debopam Sanyal (Georgia Tech)
Hardware-Aware Co-Design of Multi-Chip LLM Serving via Performance Modeling Suyeol Lee (FuriosaAI)
STEER: Software Toolkit for Edge Efficient Retraining Konstantina Orfanou (Univ. Crete)
From Code to Execution: Multi-Agent GraphRAG for Automated Artifact Generation Amirhossein Layegh (KTH)
SHARD: A Compatibility Framework for Deploying Transformer Models on Edge NPUs Adhitya Mohan (U. Colorado Boulder)
17:17 Closing
Poster Session
Opinion Depolarization in Social Networks using GNNs
DisCEdge: Distributed Context Management for Large Language Models at the Edge
GeoServe: Leveraging Disaggregated Data Processing for Scalable Geospatial Model Serving
Towards On-the-Fly Snapshot Memory Compression for Low-Latency Elastic Inference Serving Systems
Probabilistic Sampling-Enhanced Temporal-Spatial GCN: A Scalable Framework for Transaction Anomaly Detection in Ethereum Networks
Robust Ultra Low-Bit Post-Training Quantization via Stable Diagonal Curvature Estimate
A Case for a Simulation-Driven Exploration of Distributed GenAI Platforms
Reducing Language Model Inference Latency using CPU-Assisted Serving
Towards Graph-Based Detection of Jailbreak and Prompt-Leakage Attacks in LLMs
All is Not Lost: LLM Recovery without Checkpoints
Both Ends Count! Just How Good are LLM Agents at Text-to-"Big SQL"?
Block-Aware Distributed Data Pipelines for Out-of-Core Tabular Machine Learning
With a Hop, Skip, and a Prefill: How Benchmark Volatility Distorts the Accuracy of Long-Context Benchmarks and How To Combat It
ClawMobile: Rethinking Smartphone-Native Agentic Systems
DFS: Dynamic Flow Spraying with Bounded Reordering for AI Training Clusters
Orbit: Efficient Agentic Inference using Priority Scheduling
Revisiting Disaggregated Large Language Model Serving for Performance and Energy Implications
Harnessing Idle Compute at the Edge for Foundation Model Training
Cost-Aware Model Orchestration for LLM-based Systems
LLM-based AIOps via Log Prioritization in Air-Gapped Systems
Balancing Compute in LLM Inference: Model Selection, Quantization, and Test-Time Scaling
Scalable Federated Learning for Scientific Foundation Models on Leadership-Class Systems
Before the First Token: Benchmarking Data Preprocessing in Vision-Language Models
OpenMCP: an open-source self-hosted benchmarking harness for MCP-enabled computer use agents
The Cost of Expertise: Understanding MoE Decode Performance
AgenTEE: Confidential LLM Agent Execution on Edge Devices
SCALER: Sensitivity-Centric Adaptive Layer Execution & Runtime Mapping for Hybrid Analog-Digital Accelerator
Dynamically Adaptable Ensemble Proxies for Training-Free Neural Architecture Search
RIVA: Leveraging LLM Agents for Reliable Configuration Drift Detection

Submission

Papers must be submitted electronically as PDF files, formatted for 8.5x11-inch paper. Submissions may be up to 6 pages long, including figures and tables, in 10-point font and a two-column format. Bibliographic references are not included in the 6-page limit. Submitted papers must use the official SIGPLAN LaTeX or MS Word templates.

Submissions will be single-blind. You may include an appendix (there is no page limit); it is optional, and we cannot guarantee that reviewers will read it. The appendix should be included in the same PDF as the main paper, starting on a new page immediately after the bibliography.

Submit your paper at: https://euromlsys26.hotcrp.com/paper/new

Keynote

  • Laurent Bindschaedler

    14:00 Laurent Bindschaedler, Research Group Leader at the Max Planck Institute for Software Systems

    What Survives When Code Doesn't?

    Large language models have significantly reduced the cost of code generation. An increasing share of code is now produced with AI assistance, and developers increasingly treat implementations as disposable rather than precious. Code is moving away from its traditional role as the primary durable artifact in software development. Yet code was never purely about implementation. The maintained codebase has historically served as the concrete foundation for four essential guarantees: the system’s intended behavior (intent), what it carries forward across executions (state), how behavior is organized at runtime (composition), and what it may change in the external environment (effect). A skilled developer using AI still upholds these guarantees through review and expertise. But as human oversight diminishes, whether through autonomous agents, no-code tools, or contexts where no expert review layer exists, the guarantees become fragmented and difficult to enforce. Current agent frameworks seldom provide unified contracts for them. This talk argues that these four guarantees must be made explicit as enforceable contracts, forming the outline of a computational model for agentic software. The ML-systems community, with its roots in operating systems, databases, and distributed computing, is well positioned to lead this effort, and relevant components are already emerging across the field. Many of the pieces exist. The blueprint does not.

    Bio: Laurent Bindschaedler is a Research Group Leader at the Max Planck Institute for Software Systems, where he leads the Data Systems Group. His research sits at the intersection of operating systems, databases, and machine learning, with recent work on abstractions for long-horizon LLM agents, transactional semantics for agent tool use, and benchmarks for agentic workflows. He holds a PhD from EPFL and was a postdoctoral fellow at MIT CSAIL. His work has been published at SOSP, ASPLOS, EuroSys, EMNLP, and NDSS.

Sponsors


Committees

Workshop and TPC Chairs

Technical Program Committee

  • Aaron Zhao, Imperial College London
  • Abhishek Dharmaratnakar, Google
  • Ahmed Sayed, Queen Mary University of London
  • Alec Diallo, University of Edinburgh
  • Alexandros Koliousis, Northeastern University London and Institute for Experiential AI
  • Amir Payberah, KTH
  • Amitabha Roy, Google
  • Andy Twigg, Google
  • Bo Zhao, Aalto University
  • Chi Zhang, Meta
  • Christos Bouganis, Imperial College London
  • Chunwei Xia, University of Leeds
  • Daniel Goodman, Oracle
  • Daniel Mendoza, Stanford University
  • Dawei Li, Amazon
  • Debanshu Das, Google
  • Deepak George Thomas, Tulane University
  • Dimitris Chatzopoulos, University College Dublin
  • Fiodar Kazhamiaka, Microsoft
  • Guilherme H. Apostolo, Vrije Universiteit Amsterdam
  • Jiayi Nie, University of Cambridge
  • Jiwon Seo, Seoul National University
  • Joana Tirana, University College Dublin
  • Jon Crowcroft, University of Cambridge
  • Jose Cano Reyes, University of Glasgow
  • Laurent Bindschaedler, MPI
  • Luo Mai, University of Edinburgh
  • Mengying Zhou, Shanghai University of Finance and Economics
  • Nikolas Ioannou, Google
  • Pedro Gimenes, Imperial College London
  • Pedro Silvestre, Imperial College London
  • Peter Pietzuch, Imperial College London
  • Peter Triantafillou, University of Warwick
  • Pinar Tözün, IT University of Copenhagen
  • Pouya Hamadanian, MIT
  • Sam Ainsworth, University of Edinburgh
  • Sami Alabed, DeepMind
  • Smit Hinsu, Google
  • Srivaths Ranganathan, Google
  • Swapnil Gandhi, Stanford University
  • Taiyi Wang, University of Cambridge
  • Thaleia Dimitra Doudali, IMDEA
  • Tobias Grosser, University of Cambridge
  • Valentin Radu, University of Sheffield
  • Veljko Pejovic, University of Ljubljana
  • Xupeng Miao, Peking University
  • Youhe Jiang, University of Cambridge
  • Zheng Wang, University of Leeds
  • Zhihao Jia, CMU
  • Zhiqiang Xie, Stanford University

Web Chair

  • Alexis Duque, Net AI

Contact

For any questions related to EuroMLSys 2026, please contact the TPC Chairs Eiko Yoneki and Paul Patras.

Follow us on Twitter: @euromlsys