PhD Thesis Title:
Towards a Generalized Theory for Deductive Databases with Uncertainty
Abstract:
Uncertainty management has been identified as one of the important
future challenges in database research:
``Further research in uncertainty is essential, as we must learn not only
to cope with data of limited reliability , but do so efficiently , with
massive amounts of data"[Silberschatz et al '91].
Practical considerations
dictate that the
framework used for management of uncertain knowledge
admit efficient implementation and efficient computations. Logic database
programming, with its advantages of modularity and its powerful top-down
and bottom-up query processing techniques has attracted the attention of
researchers, and numerous frameworks
have been proposed in recent years for deductive databases with uncertainty.
These frameworks differ in
(i) their underlying notion of uncertainty,
(ii) the way in which uncertainties are manipulated, and
(iii) the way in which uncertainty is associated with the facts and
rules of a program.
On the basis of (iii), we can classify these
frameworks into implication based (IB)
and annotation based (AB) frameworks. In this research, we propose a generic
framework called the parametric framework as a unifying umbrella for known
IB frameworks. We developed the declarative, fixpoint, and proof-theoretic
semantics of programs in the parametric framework and showed their equivalence.
The semantics developed uses multisets as the underlying structure. Using
the parametric framework as a basis, we studied query optimization problem
of containment of conjunctive queries in this framework, and established
necessary and sufficient conditions for containment for several classes
of parametric conjunctive queries . Our results yield tools for use in
the query optimization for large classes of query programs in known IB
deductive databases with uncertainty. We also used this language as a basis
for studying termination and data complexity of evaluating programs in
the parametric framework. We have identified large classes of programs
whose evaluation is similar to datalog programs,
i.e. PTIME in the size of the input database.
In terms of expressive power , we compare the two approaches to uncertainty
classified in the above. It is shown that the annotated logics are in general
more powerful than the IB frameworks. For instance, the generalized annotated
logic programming framework (GAP) proposed by Kifer and Subrahmanian '92
can simulate the language proposed by van Emden '86, through the annotations.
As the semantics developed for GAP is based on (ideal of) lattices, we
note that it cannot simulate an IB framework
whose semantics is based on multisets. We also note that the parametric
framework with its unifying capability lacks two useful operations of selection
by certainty and join by certainty . To incorporate these operations, we
observed that allowing certainty constraints is all that is needed. We
introduce the extended parametric framework (EP, for short). We established
in [7] that the EP framework is equivalent to a generic AB framework (GAB)
extended with constraints, provided annotation functions allowed in GAB
are subject to certain postulates similar to those imposed on the combination
functions in the (extended) parametric framework. The proof is based on
simulation. We remark that the GAB framework introduced is strictly more
powerful than the existing AB frameworks. This is because of the stronger
form of the selection by join operation provided by the certainty
constraints.
Finally, we have studied efficient fixpoint computations with uncertainty. We
proposed a semi-naive method and established its equivalence with the naive method.
August 1997 -- Concordia University, Dept. of Computer Science