Peer-to-peer fault tolerance framework for a grid computing system

Conference proceedings article




Publication Details

Author list: Tangmankhong T., Siripongwutikorn P., Achalakul T.

Publisher: Hindawi

Publication year: 2012

Start page: 379

End page: 384

Number of pages: 6

ISBN: 9781467319218

ISSN: 0146-9428

eISSN: 1745-4557

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84866384303&doi=10.1109%2fJCSSE.2012.6261983&partnerID=40&md5=5730abab8180cbe12fe7177b51e07b39

Languages: English-Great Britain (EN-GB)




Abstract

A grid computing system provides high-performance computing power, large storage space, or high communication bandwidth to suit user requirements. The major concern in a grid computing system is reliability, as a single node failure causes all applications running on that node to fail. We propose a fault-tolerance framework to improve the reliability of a grid system. The proposed framework is novel in that it uses a peer-to-peer replication model instead of a traditional client-server replication model, which reduces the replication time overhead and provides a better degree of resiliency. Essentially, the checkpoint data file is split into chunks and distributed among a number of backup peers in parallel such that each chunk is replicated at two backup nodes. Moreover, if any one of the backup nodes in the group fails, the backup survives and its data redundancy is maintained. Detailed algorithms for the modules of the complete framework are provided, including group forming, fault detection, replication, and fault recovery. A comparative performance evaluation of the replication time of the proposed peer-to-peer model and the client-server model was conducted by simulation over a wide range of chunk sizes and checkpoint data sizes. Our results show that, for a large enough chunk size, the replication time of the peer-to-peer replication model is reduced by half compared to that of the client-server model. © 2012 IEEE.
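
The chunk-replication idea in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' algorithm: the round-robin placement policy, the function names (split_into_chunks, assign_replicas, replicate, recover), and the peer names are all assumptions made for illustration. The sketch shows only that, when every chunk is stored on two distinct backup peers, the checkpoint can still be reassembled after any single peer is lost.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split checkpoint data into fixed-size chunks (the last may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def assign_replicas(num_chunks: int, peers: List[str]) -> Dict[int, List[str]]:
    """Map each chunk index to two distinct peers.

    Round-robin placement is an assumed policy, not taken from the paper.
    """
    if len(peers) < 2:
        raise ValueError("need at least two backup peers")
    n = len(peers)
    return {i: [peers[i % n], peers[(i + 1) % n]] for i in range(num_chunks)}


def replicate(data: bytes, chunk_size: int,
              peers: List[str]) -> Dict[str, Dict[int, bytes]]:
    """Return a per-peer store holding the chunks assigned to that peer."""
    chunks = split_into_chunks(data, chunk_size)
    placement = assign_replicas(len(chunks), peers)
    stores: Dict[str, Dict[int, bytes]] = {p: {} for p in peers}
    for index, chunk in enumerate(chunks):
        for peer in placement[index]:
            stores[peer][index] = chunk
    return stores


def recover(stores: Dict[str, Dict[int, bytes]], failed_peer: str,
            num_chunks: int) -> bytes:
    """Reassemble the checkpoint from every peer except the failed one."""
    surviving: Dict[int, bytes] = {}
    for peer, store in stores.items():
        if peer != failed_peer:
            surviving.update(store)
    missing = [i for i in range(num_chunks) if i not in surviving]
    if missing:
        raise RuntimeError(f"chunks lost: {missing}")
    return b"".join(surviving[i] for i in range(num_chunks))


# Hypothetical checkpoint and peer group, for demonstration only.
checkpoint = bytes(range(256)) * 4
backup_peers = ["peer0", "peer1", "peer2", "peer3"]
peer_stores = replicate(checkpoint, 100, backup_peers)
total_chunks = len(split_into_chunks(checkpoint, 100))
```

Calling recover(peer_stores, "peer2", total_chunks) returns the original checkpoint bytes, and the same holds for any other single failed peer. The sketch stores chunks sequentially in local dictionaries; the parallel transfer to backup peers, group forming, and fault detection described in the abstract are outside its scope.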


Keywords

Fault tolerance; Grid reliability; P2P backup

