Subtleties in tolerating correlated failures in wide-area storage systems

Authors: Srinivasan Seshan, Suman Nath, Haifeng Yu, Phillip B. Gibbons

Abstract: High availability is widely accepted as an explicit requirement for distributed storage systems. Tolerating correlated failures is a key issue in achieving high availability in today's wide-area environments. This paper systematically revisits previously proposed techniques for addressing correlated failures. Using several real-world failure traces, we qualitatively answer four important questions regarding how to design systems to tolerate such failures. Based on our results, we identify a set of design principles that system builders can use to tolerate correlated failures. We show how these lessons can be effectively used by incorporating them into IrisStore, a distributed read-write storage layer that provides high availability. Our results using IrisStore on PlanetLab over an 8-month period demonstrate its ability to withstand large correlated failures and meet preconfigured availability targets.
