Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/35102

Title: Zeroing memory deallocator to reduce checkpoint sizes in virtualized HPC environments
Authors: Gad, Ramy
Pickartz, Simon
Suss, Tim
Nagel, Lars
Lankes, Stefan
Monti, Antonello
Brinkmann, Andre
Keywords: Virtualization
Checkpoint/restart
Migration
HPC
Issue Date: 2018
Publisher: © Springer
Citation: GAD, R. ... et al, 2018. Zeroing memory deallocator to reduce checkpoint sizes in virtualized HPC environments. The Journal of Supercomputing, 74 (11), pp.6236–6257.
Abstract: Virtualization has become an indispensable tool in data centers and cloud environments to flexibly assign virtual machines (VMs) to resources. Virtualization also becomes more and more attractive for high-performance computing (HPC). This is mainly due to the strong isolation of VMs which enables: (1) the sharing of cluster nodes and optimization of the system’s overall utilization; (2) load balancing by means of migrations due to the reduction of residual dependencies; and (3) the creation of system-level checkpoints increasing the fault tolerance in an application-transparent way. On the downside, the additional virtualization layer conceals information that is only available on the process level. This information has a direct influence on the checkpoint size which should be kept as small as possible. In this paper, we propose a novel technique for checkpoint size reduction in virtualized environments. We exploit the fact that the hypervisor detects zero pages which are omitted when capturing a checkpoint. Moreover, compression techniques are applied for a further reduction of the checkpoint size. We therefore fill freed memory regions with zeros supporting both the zero-page detection and the compression. We evaluate our approach by taking the example of HPC applications. The results reveal a reduction of the checkpoint size by up to 9% when compression is disabled in the hypervisor and up to 49% with compression enabled. Furthermore, memory zeroing is able to reduce VM migration time by up to 10% when compression is disabled and by up to 60% when compression is enabled.
Description: This paper is closed access until 25th August 2019.
Sponsor: This research and development was supported by the Federal Ministry of Education and Research (BMBF) under Grant 01IH13004 (Project FAST) and Grant 01IH16010B (Project Envelope).
Version: Accepted for publication
DOI: 10.1007/s11227-018-2548-6
URI: https://dspace.lboro.ac.uk/2134/35102
Publisher Link: https://doi.org/10.1007/s11227-018-2548-6
ISSN: 0920-8542
Appears in Collections: Closed Access (Computer Science)
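
The mechanism summarized in the abstract (zero pages are omitted from a checkpoint, and zero-filled memory compresses well) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' implementation and not how any particular hypervisor works: the 4096-byte page size, the helper names (`make_memory`, `free_region`, `checkpoint_size`), and the use of `zlib` as the compressor are all hypothetical choices made for illustration.

```python
import random
import zlib

PAGE_SIZE = 4096  # assumed page size; chosen only for this illustration


def make_memory(n_pages, seed=0):
    """Build a toy 'guest memory' filled with pseudo-random (incompressible) bytes."""
    rng = random.Random(seed)
    return bytearray(rng.getrandbits(8) for _ in range(n_pages * PAGE_SIZE))


def pages(memory):
    """Split the memory into fixed-size pages."""
    return [bytes(memory[off:off + PAGE_SIZE])
            for off in range(0, len(memory), PAGE_SIZE)]


def count_zero_pages(memory):
    """Count pages that consist only of zero bytes."""
    zero_page = bytes(PAGE_SIZE)
    return sum(page == zero_page for page in pages(memory))


def free_region(memory, start, length, zero):
    """Toy deallocator: a plain free leaves stale data, a zeroing free overwrites it."""
    if zero:
        memory[start:start + length] = bytes(length)


def checkpoint_size(memory, compress=False):
    """Toy checkpoint: skip all-zero pages, optionally compress what remains."""
    zero_page = bytes(PAGE_SIZE)
    kept = b"".join(page for page in pages(memory) if page != zero_page)
    return len(zlib.compress(kept)) if compress else len(kept)


if __name__ == "__main__":
    for zero in (False, True):
        mem = make_memory(64)
        # Free 16.5 pages: 16 whole pages plus half of the following page.
        free_region(mem, 8 * PAGE_SIZE, 16 * PAGE_SIZE + PAGE_SIZE // 2, zero)
        print("zeroing" if zero else "plain  ",
              "zero pages:", count_zero_pages(mem),
              "raw:", checkpoint_size(mem),
              "compressed:", checkpoint_size(mem, compress=True))
```

In this model, fully freed pages disappear from the checkpoint once they are zeroed, while a partially freed page is still stored but shrinks under compression. The pseudo-random fill is a worst case for the compressor, so the numbers it prints say nothing about the percentages reported in the paper.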

Files associated with this item:

File        Description        Size        Format
paper.pdf   Accepted version   649.97 kB   Adobe PDF

 

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.