MBFS: a parallel metadata search method based on Bloomfilters using MapReduce for large-scale file systems

Authors: Zhisheng Huo, Limin Xiao, Qiaoling Zhong, Shupan Li, Ang Li

DOI: 10.1007/S11227-015-1464-2

Abstract: Metadata search is an important way to access and manage file systems. Many solutions have been proposed to tackle the performance issue of metadata search. However, the existing solutions either build a separate metadata index inside or outside the file system using a related data structure or database, use file semantics and an event-notification method to construct the index structure, or utilize a sampling-based method to conduct direct search on the file system namespace. They face problems such as high I/O overhead for maintaining consistency between indexes and metadata, enormous space overhead for storing indexes, low accuracy of search results, and so on. To address these problems, this paper presents MBFS, a fast, accurate, and lightweight metadata search method based on multi-dimensional Bloomfilters. We create a multi-dimensional Bloomfilter on the basis of each directory entry that can prune sub-trees and narrow the search scope of the namespace. MBFS is capable of producing fast and accurate answers for a class of complex searches over the file system after consuming a small number of disk accesses. MBFS, residing in the file system, does not need additional I/O overhead to maintain consistency. MBFS consists of multi-dimensional Bloomfilters, which are composed of bits, so it consumes marginal space overhead. Moreover, MBFS employs MapReduce for speeding up search in an environment with multiple servers. Extensive experiments are conducted to prove the effectiveness of MBFS. The experimental results show that MBFS can achieve excellent performance, not only in latency, but also with time
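
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "multi-dimensional" can be approximated by one Bloom filter per metadata attribute, attached to each directory and summarizing that directory's whole subtree. The names `BloomFilter`, `Directory`, `build_filters`, and `search`, the filter size, and the sample attributes `owner` and `ext` are all hypothetical, and the MapReduce distribution step is omitted.

```python
import hashlib
from dataclasses import dataclass, field


class BloomFilter:
    """Fixed-size Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):  # sizes are arbitrary assumptions
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        # Derive k bit positions by salting a single hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


@dataclass
class Directory:
    name: str
    files: list = field(default_factory=list)    # metadata dicts, one per file
    subdirs: list = field(default_factory=list)  # child Directory objects
    filters: dict = field(default_factory=dict)  # attribute name -> BloomFilter


def build_filters(directory, dimensions):
    """Give each directory one filter per attribute, covering its whole subtree."""
    directory.filters = {dim: BloomFilter() for dim in dimensions}
    for meta in directory.files:
        for dim in dimensions:
            directory.filters[dim].add(meta[dim])
    for child in directory.subdirs:
        build_filters(child, dimensions)
        for dim in dimensions:
            # OR-ing two equal-sized filters yields the filter of the union.
            directory.filters[dim].bits |= child.filters[dim].bits


def search(directory, query, path=""):
    """Return paths of files matching every (attribute, value) pair in query."""
    here = f"{path}/{directory.name}"
    # Prune: if any queried value is definitely absent, skip the whole subtree.
    if not all(directory.filters[d].might_contain(v) for d, v in query.items()):
        return []
    # An exact check on the real metadata removes Bloom-filter false positives.
    hits = [f"{here}/{m['name']}" for m in directory.files
            if all(m[d] == v for d, v in query.items())]
    for child in directory.subdirs:
        hits.extend(search(child, query, here))
    return hits


# Hypothetical two-directory namespace used only for illustration.
root = Directory("home", subdirs=[
    Directory("alice", files=[{"name": "a.txt", "owner": "alice", "ext": "txt"}]),
    Directory("bob", files=[{"name": "b.log", "owner": "bob", "ext": "log"}]),
])
build_filters(root, ["owner", "ext"])
matches = search(root, {"owner": "bob", "ext": "log"})
```

In this sketch a subtree is skipped as soon as one queried attribute value is definitely absent from its filter, which is the sense in which the filters narrow the search scope. Because the filters are only bit arrays stored with the directories, the extra space is small, at the cost of occasional false positives that the exact metadata check then discards.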
