Authors: Zachary G. Ives, Val Tannen, Todd J. Green
DOI:
Keywords: Relational database, Information retrieval, Materialized view, World Wide Web, Data modeling, Query optimization, Data exchange, XML, Computer science, Data sharing, Data integration
Abstract: A key challenge in science today involves integrating data from databases managed by different collaborating scientists. In this dissertation, we develop the foundations and applications of collaborative data sharing systems (CDSSs), which address this challenge. A CDSS allows collaborators to define loose confederations of heterogeneous databases, relating them through schema mappings that establish how data should flow from one site to the next. In addition to simply propagating data along the mappings, it is critical to record data provenance (annotations describing where data originated) and to support policies allowing scientists to specify whose data they trust, and when. Since a large confederation is certain to evolve over time, the CDSS must also efficiently handle incremental changes to data, schemas, and mappings. We focus in this dissertation on the formal foundations of CDSSs, as well as practical issues of its implementation in a prototype called Orchestra. We propose a novel model of provenance appropriate for CDSSs, based on a framework of semiring-annotated relations. This framework elegantly generalizes a number of other important database semantics involving annotated relations, including ranked results, prior provenance models, and probabilistic databases. We describe the design and implementation of the Orchestra prototype, which supports update propagation across schema mappings while maintaining data provenance and filtering data according to trust policies. We investigate fundamental questions of query containment and equivalence in the context of provenance information. We use the results of these investigations to develop approaches to efficiently propagating changes to data and mappings in a CDSS. Our approaches highlight unexpected connections between the two problems, and with the problem of optimizing queries using materialized views. Finally, we show that semiring annotations also make sense for XML and nested relational data, paving the way towards a future extension of CDSSs to these richer data models.