Knowledge Representation Languages
Representing Boolean functions
A Propositional Knowledge Base
is a subset of
.
This is a Boolean function:
How can we represent Boolean functions?
The SAT Problem
CNF formulas are extremely simple yet can encode many interesting
problems.
Cook, Levin, 1971: The problem SAT of deciding
whether a CNF formula is satisfiable is NP-complete.
Valiant 1979: The problem #SAT of counting the
satisfying assignment of a CNF formula is #P-complete.
- Very unlikely that efficient algorithms exists for solving SAT /
#SAT
- Thriving community nevertheless addresses this problem in
practice
- SAT Solver very efficient in many applications
Circuit Based Representations
Research has focused on factorized representation.
Taken from SMBC Comics
Carl von LinnΓ© (1707-1778)
An example
Data structure based on decision nodes to represent
β
is evenβ.
Path for
,
and
is accepting.
OBDDs
Previous data structure are Ordered Binary Decision
Diagrams.
- Directed Acyclic graphs with one source
- Sinks are labeled by
or
- Internal nodes are decision nodes on a variable in
- Variables tested in order.
Row of 1
Letβs draw an OBDD that detects whether a matrix
with
has a row full of
.
Row of 1 (Continued)
How many
-matrices
have a row full of ones?
- Case Analysis:
:
matrices
:
matrices
matrices
Total:
Tractability of OBDDs
This idea can be generalized to any OBDDs:
Let
be a function computed by an OBDD having
edges. We can compute
with
arithmetic operations.
Generalises to many tasks:
- Evaluate
if probabilities
are given for each
- Enumerate
- Find the
element of
in lexicographical orderβ¦
Good candidate for representing Boolean functions!
Limits of OBDDs
Orders of variables matters a lot:
Every OBDD computing
has size
.
FBDD
Same as OBDD but variables may be tested in different order on
different path as long as they are tested at most once on every
path.
Advantages: more succinct
Drawbacks:
- cannot be minimized canonically, nor applied etc.
- actually, not that powerful:
cannot be represented by polynomial size FBDDs.
One Minute to Cool Down
Wrap up:
CNF :
are natural, powerful but not tractable
OBDD are more tractable but may be
exponentially large for simple function
FBDD are more succinct than OBDD but are
less tractable
Relational Databases and queries
Data stored as relations (tables):
|
People
|
Id
|
Name
|
City
|
|
|
1
|
Alice
|
Paris
|
|
2
|
Bob
|
Lens
|
|
|
3
|
Carole
|
Lille
|
|
|
4
|
Djibril
|
Berlin
|
|
Capital
|
City
|
Country
|
|
|
Berlin
|
Germany
|
|
|
Paris
|
France
|
|
|
Roma
|
Italy
|
Query language:
SELECT * FROM People
JOIN Capital ON People.City=Capital.City
|
Results
|
Id
|
Name
|
City
|
Country
|
|
|
1
|
Alice
|
Paris
|
France
|
|
|
4
|
Djibril
|
Berlin
|
Germany
|
SELECT COUNT(*) FROM People
WHERE City NOT IN (SELECT City FROM Capital)
Conjunctive queries
SQL is a full fledge language, hard to study.
Large class of queries are expressed by a smaller class:
conjunctive queries.
|
People
|
Id
|
Name
|
City
|
|
|
1
|
Alice
|
Paris
|
|
2
|
Bob
|
Lens
|
|
|
3
|
Carole
|
Lille
|
|
|
4
|
Djibril
|
Berlin
|
|
Capital
|
City
|
Country
|
|
|
Berlin
|
Germany
|
|
|
Paris
|
France
|
|
|
Roma
|
Italy
|
-
because
-
because:
-
AND
- .
Conjunctive Queries (continued)
Conjunctive queries are queries of the form:
where
-
is a tuple of variables from
-
are relation symbols
Database
:
list of relations
filled with values in domain
Defines a new table
where
if each part of
on variables
are in
.
CQ correspond to doing JOIN queries in SQL.
Hardness of solving conjunctive queries
Bad new: given a conjunctive query
and a database
,
it is NP-complete to decide whether
!
And yet databases systems solve this kind of queries all the
time!
- Query
is usually small wrt
- Join tables following an optimized query plan
- Leverage clever indexing algorithm
- Use clever heuristics based on statistics gathered
earlier
Acyclic queries
Central class of conjunctive queries because of their
tractability.
Every CQ is not acyclic
Yannakakis Algorithm
Start from the join tree
Load data
Filter every tuple in a relation that
cannot be extended to a solution below
Proceed bottom up
Treat every edge of the join
tree
Tuples in the root can be extended to
full solutions
Reconstructing solution x=1, y=1, z=1,
u=0, t=0, v=0
Twisting Yannakakis for Counting
Start from the join tree and
data
Bottom up computation of the number of
extensions
Total of
solutions.
Factorized Databases
Datastructures known as βFactorized Databasesβ.
For every acyclic query
and database
,
one can build a decision-DNNF computing
of size
.
Knowledge compilation style approach. One can efficently:
- decide whether
- compute
- enumerate
Unify existing results and push the hardness in the compilation
part.
Going further
This compilation results can be used to recover many other
results:
- Ranked access: given
and some order on
,
output
in time
- Optimization: find the tuple of
that maximizes a linear function
- Aggregation over a semi-ring where
Knowledge Compilation meets Optimization
Boolean Optimization Problem
BPO problem:
where
is a polynomial.
Observation:
may be assumed to be multilinear since
over
where
Example
is maximal.
Algebraic Model Counting
Semi ring:
-
commutative, associative
- ,
- .
Boolean function and
:
AMC Examples
- If
on
:
- Arctic semi-ring
Allows to encode optimization problems on Boolean functions.
BPO and Boolean Functions
For
define:
where
encodes
!
and
on
as:
-
and
-
for
.
Encoding BPO as Boolean function: an example
Example:
- ,
and
.
BPO as a Boolean Function
where
.
Try using Algebraic Model Counting for BPO:
- compile
into, e.g., OBDD
- compute
in time
.
Rich connection
- Good practical results (e.g.Β using
D4)
- Leverage known tractable classes
of CNF to BPO
- Allows for solving more complex
optimization problems
Example: solve
such that
:
How?
- Construct OBDD
that computes
- Transform
into
so that it computes
- Compute