The φ accrual failure detector

Authors: N. Hayashibara, X. Defago, R. Yared, T. Katayama

DOI: 10.1109/RELDIS.2004.1353004

Keywords: Flexibility (engineering), Binary number, Service (systems architecture), Accrual, Computer science, Block (data storage), Reliability engineering, Abstraction (linguistics), Detector, Principal (computer security)

Abstract: The detection of failures is a fundamental issue for fault-tolerance in distributed systems. Recently, many people have come to realize that failure detection ought to be provided as some form of generic service, similar to IP address lookup or time synchronization. However, this has not been successful so far; one of the reasons being the fact that classical failure detectors were not designed to satisfy several application requirements simultaneously. We present a novel abstraction, called accrual failure detectors, that emphasizes flexibility and expressiveness and can serve as a basic building block for implementing failure detectors in distributed systems. Instead of providing information of a binary nature (trust vs. suspect), accrual failure detectors output a suspicion level on a continuous scale. The principal merit of this approach is that it favors a nearly complete decoupling between application requirements and the monitoring of the environment. In this paper, we describe an implementation of such an accrual failure detector, which we call the φ failure detector. The particularity of the φ failure detector is that it dynamically adjusts to current network conditions the scale on which the suspicion level is expressed. We analyzed the behavior of our φ failure detector over an intercontinental communication link over a week. Our experimental results show that it performs equally well as other known adaptive failure detection mechanisms, with improved flexibility.
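The abstract does not spell out how the suspicion level is computed. As a minimal sketch, assume that heartbeat inter-arrival times are modelled by a normal distribution estimated over a sliding window, and that φ is the negative base-10 logarithm of the probability that the next heartbeat is still to arrive. The class name, the window size, and the `min_std` floor below are illustrative assumptions, not values taken from this record.

```python
import math
from collections import deque
from statistics import mean, pstdev


class PhiAccrualDetector:
    """Illustrative accrual failure detector (names and defaults are assumptions)."""

    def __init__(self, window_size=1000, min_std=0.01):
        # Sliding window of recent heartbeat inter-arrival times, in seconds.
        self.intervals = deque(maxlen=window_size)
        self.last_arrival = None
        # Floor on the standard deviation so perfectly regular heartbeats
        # do not cause a division by zero.
        self.min_std = min_std

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Return the suspicion level at time `now`; 0.0 until two intervals are known."""
        if len(self.intervals) < 2:
            return 0.0
        mu = mean(self.intervals)
        sigma = max(pstdev(self.intervals), self.min_std)
        elapsed = now - self.last_arrival
        # Probability, under the normal model, that a heartbeat arrives
        # later than `elapsed` seconds after the previous one.
        p_later = 0.5 * math.erfc((elapsed - mu) / (sigma * math.sqrt(2)))
        p_later = max(p_later, 1e-300)  # clamp to avoid log10(0)
        return -math.log10(p_later)
```

Under this sketch, the monitoring side only produces φ, and each application compares it with a threshold of its own choosing (for example, a cautious application might act at a higher φ than an aggressive one), which is the decoupling the abstract describes.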

References (27)
Xavier Defago, Naohiro Hayashibara, Peter Urban, Takuya Katayama. On Accrual Failure Detectors. The annual research report, vol. 2004, pp. 1-14 (2004).
François JN Cosquer, Luís ET Rodrigues, Paulo Veríssimo. Using Tailored Failure Suspectors to Support Distributed Cooperative Applications. Parallel and Distributed Computing and Systems, pp. 352-358 (1995).
I. Sotoma, E.R. Mauro Madeira. ADAPTATION - Algorithms to Adaptive Fault Monitoring and their implementation on CORBA. International Symposium on Distributed Objects and Applications, pp. 219-228 (2001). 10.1109/DOA.2001.954087
Roy Friedman. Fuzzy group membership. Lecture Notes in Computer Science, pp. 114-118 (2003). 10.1007/3-540-37795-6_21
Francis Chu. Reducing Ω to ◊W. Information Processing Letters, vol. 67, pp. 289-293 (1998). 10.1016/S0020-0190(98)00122-7
Robbert van Renesse, Yaron Minsky, Mark Hayden. A gossip-style failure detection service. Middleware '98: Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing, pp. 55-70 (2009). 10.1007/978-1-4471-1283-9_4
A. Casimiro, P. Verissimo. Using the timely computing base for dependable QoS adaptation. Symposium on Reliable Distributed Systems, pp. 208-217 (2001). 10.1109/RELDIS.2001.970771
P. Stelling, I. Foster, C. Kesselman, C. Lee, G. Von Laszewski. A fault detection service for wide area distributed computations. High Performance Distributed Computing, pp. 268-278 (1998). 10.1109/HPDC.1998.709981