A runtime estimation framework for ALICE

Journal article



Publication Details

Author list: Pumma S., Feng W.-C., Phunchongharn P., Chapeland S., Achalakul T.

Publisher: Elsevier

Publication year: 2017

Journal: Future Generation Computer Systems: The International Journal of eScience (0167-739X)

Volume number: 72

Start page: 65

End page: 77

Number of pages: 13

ISSN: 0167-739X

eISSN: 1872-7115

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85014604122&doi=10.1016%2fj.future.2017.02.040&partnerID=40&md5=274445d2d39d91217e8507e518ece8d9

Languages: English-Great Britain (EN-GB)




Abstract

The European Organization for Nuclear Research (CERN) is the largest research organization for particle physics. ALICE, short for A Large Ion Collider Experiment, serves as one of the main detectors at CERN and produces approximately 15 petabytes of data each year. The computing associated with an ALICE experiment consists of both online and offline processing. An online cluster retrieves data while an offline cluster farm performs a broad range of data analysis. Online processing occurs as collision events are streamed from the detector to the online cluster. This process compresses and calibrates the data before storing it in a data storage system for subsequent offline processing, e.g., event reconstruction. Due to the large volume of stored data to process, offline processing seeks to minimize execution time and data-staging time of the applications via a two-tier offline cluster: the Event Processing Node (EPN) as the first tier and the Worldwide LHC Computing Grid (WLCG) as the second tier. This two-tier cluster requires a smart job scheduler to efficiently manage the running of the applications. Thus, we propose a runtime estimation method for this offline processing in the ALICE environment. Our approach exploits application profiles to predict the runtime of a high-performance computing (HPC) application without the need for any additional metadata. To evaluate our proposed framework, we performed experiments on actual ALICE applications. In addition, we tested the efficacy of our runtime estimation method in predicting the runtimes of HPC applications on the Amazon EC2 cloud. The results show that our approach generally delivers accurate predictions, i.e., low error percentages. © 2017 Elsevier B.V.


Keywords

ALICE experiment; Berkeley Dwarfs; Offline scheduling; Runtime estimation; Scheduler


Last updated on 2023-09-25 at 07:35