Authors: Martin Grohe, Peter Lindner, Nofar Carmeli, Christoph Standke
DOI:
Keywords:
Abstract: Probabilistic databases (PDBs) are probability spaces over database instances. They provide a framework for handling uncertainty in databases, as occurs due to data integration, noisy data, data from unreliable sources, or randomized processes. Most of the existing theory literature investigated finite, tuple-independent PDBs (TI-PDBs), where the occurrences of tuples are independent events. Only recently, Grohe and Lindner (PODS '19) introduced independence assumptions for PDBs beyond the finite domain assumption. In the finite setting, a major argument for discussing the theoretical properties of TI-PDBs is that they can be used to represent any finite PDB via views. This is no longer the case once the number of tuples is countably infinite. In this paper, we systematically study the representability of infinite PDBs in terms of TI-PDBs and the related block-independent disjoint PDBs. The central question is which infinite PDBs are representable as first-order views over tuple-independent PDBs. We give a necessary condition for the representability of PDBs and provide a sufficient criterion for representability in terms of the distribution of a PDB. With various examples, we explore the limits of our criteria. We show that conditioning on first-order properties yields no additional power in terms of expressivity. Finally, we discuss the relation between purely logical and arithmetic reasons for (non-)representability.