Failure prediction of data centers using time series and Fault Tree Analysis
Conference proceedings article
Publication Details
Author list: Chalermarrewong T., Achalakul T., See S.C.W.
Publication year: 2012
Start page: 794
End page: 799
Number of pages: 6
ISBN: 9780769549033
ISSN: 1521-9097
eISSN: 1521-9097
Languages: English-Great Britain (EN-GB)
Abstract
This paper proposes a framework for online failure prediction of data centers. A data center often has a high failure rate as it features a number of servers and components. Moreover, long running applications and intensive workloads are common in such facilities. Performance of the system depends on the availability of the machines, which can be easily compromised if failure cannot be handled gracefully. The main idea of this paper is to create an effective prediction model focusing on hardware failure. Accurate prediction may enhance the overall system performance. In this work, we employ two methods, namely, ARMA (Auto Regressive Moving Average) and Fault Tree Analysis. Experiments were then performed on a simulated cluster built based on Simics platform. The results show prediction accuracy of 97%, which is very high. We thus believe that our framework is practical and can be adapted to use in data centers in the future. © 2012 IEEE.
Keywords
Fault management, Fault Tree Analysis, Performance enhancement, Time series prediction