
dc.date.accessioned2024-04-12T11:16:54Z
dc.date.available2024-04-12T11:16:54Z
dc.date.issued2024-01-18
dc.identifierdoi:10.17170/kobra-202404109953
dc.identifier.urihttp://hdl.handle.net/123456789/15657
dc.description.sponsorshipFunded as part of Projekt DEAL
dc.language.isoeng
dc.rightsAttribution 4.0 International*
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/*
dc.subjectMalleable runtime systemeng
dc.subjectMalleable job schedulingeng
dc.subjectAPGAS Introductioneng
dc.subject.ddc004
dc.titleOn the Performance of Malleable APGAS Programs and Batch Job Schedulerseng
dc.typeArticle
dcterms.abstractMalleability, the ability of applications to dynamically adjust their resource allocations at runtime, presents great potential to enhance the efficiency and resource utilization of modern supercomputers. However, applications are rarely capable of growing and shrinking their number of nodes at runtime, and batch job schedulers provide only rudimentary support for such features. While numerous approaches have been proposed to enable application malleability, these typically focus on iterative computations and require complex code modifications. This amplifies the challenges for programmers, who already wrestle with the complexity of traditional MPI inter-node programming. Asynchronous Many-Task (AMT) programming presents a promising alternative. In AMT, computations are split into many fine-grained tasks, which are processed by workers. This makes transparent task relocation via the AMT runtime system possible, thus offering great potential for enabling efficient malleability. In this work, we propose an extension to an existing AMT system, namely APGAS for Java. We provide easy-to-use malleability programming abstractions, requiring only minor application code additions from programmers. Runtime adjustments, such as process initialization and termination, are automatically managed by our malleability extension. We validate our malleability extension by adapting a load balancing library handling multiple benchmarks. We show that both shrinking and growing operations incur low execution time overhead. In addition, we demonstrate compatibility with potential batch job schedulers by developing a prototype batch job scheduler that supports malleable jobs. Through extensive real-world executions of job batches on up to 32 nodes, involving rigid, moldable, and malleable programs, we evaluate the impact of deploying malleable APGAS applications on supercomputers.
Using scheduling algorithms such as FCFS, Backfilling, Easy-Backfilling, and one that exploits malleable jobs, the experimental results highlight a significant improvement regarding several metrics for malleable jobs. We show a 13.09% makespan reduction (the time needed to schedule and execute all jobs), a 19.86% increase in node utilization, and a 3.61% decrease in job turnaround time (the time a job takes from its submission to completion) when using 100% malleable jobs in combination with our prototype batch job scheduler compared to the best-performing scheduling algorithm with 100% rigid jobs.eng
dcterms.accessRightsopen access
dcterms.creatorFinnerty, Patrick
dcterms.creatorPosner, Jonas
dcterms.creatorBürger, Janek
dcterms.creatorTakaoka, Leo
dcterms.creatorKanzaki, Takuma
dc.relation.doidoi:10.1007/s42979-024-02641-7
dc.subject.swdArbeitsplanungger
dc.subject.swdFormänderungsvermögenger
dc.subject.swdFlexibilitätger
dc.subject.swdLaufzeitsystemger
dc.type.versionpublishedVersion
dcterms.source.identifiereissn:2661-8907
dcterms.source.issueIssue 4
dcterms.source.journalSN Computer Scienceeng
dcterms.source.volumeVolume 5
kup.iskupfalse
dcterms.source.articlenumberArticle: 349



Except where otherwise noted, this item's license is described as Attribution 4.0 International