Abstract: The Big Data problem is characterized by the so-called 3V features: Volume (a huge amount of data), Velocity (a high data ingestion rate), and Variety (a mix of structured, semi-structured, and unstructured data). State-of-the-art solutions to the Big Data problem are largely based on the MapReduce framework (and its open-source implementation, Hadoop). Although Hadoop handles the data volume challenge successfully, it does not deal with data variety well, since its programming interfaces and associated data processing model are inconvenient and inefficient for handling structured data and graph data. This paper presents epiC, an extensible system that tackles Big Data's data variety challenge. epiC introduces a general Actor-like concurrent programming model, independent of the data processing models, for specifying parallel computations. Users process multi-structured datasets with appropriate epiC extensions: an implementation of the data processing model best suited to the data type, plus auxiliary code for mapping that data processing model into epiC's concurrent programming model. As in Hadoop, programs written in this way can be automatically parallelized, and the runtime system takes care of fault tolerance and inter-machine communications. We present the design and implementation of epiC's concurrent programming model. We also present two customized data processing models built on top of epiC: an optimized MapReduce extension and a relational model. Experiments demonstrate the effectiveness and efficiency of the proposed system.