Florent Capelli
UniversitΓ© dβArtois - CRIL
Back to School Conference
October 06, 2023
Specialized in AI, from GOFAI to ML.
http://www.cril.univ-artois.fr/
Interns, PhD students welcome!
Reach out at capelli@cril.fr.
Lens is roughly the same distance from Paris Gare du Nord than Saclay in the SNCF topology.
Knowledge is a central notion for AI:
Rich fields of research:
Data + Knowledge = Knowledge Base
Propositional Knowledge Bases:
Reasoning tasks of various nature:
This talk is about a specific topic tagged as AI but which has almost nothing to do with ChatGPT.
While ChatGPT represents knowledge and reasons on it in a way, it is merely an illusion:
Knowledge base $\mathcal{K} \subseteq 2^\mathcal{P}$ for a finite set of propositions $\mathcal{P}$.
Implicit representation
Explicit representation
Looking for tradeoffs
60s
Wrap up:
A Propositional Knowledge Base $\mathcal{K}$ is a subset of $2^{\mathcal{P}}$.
This is a Boolean function: $\{0,1\}^{\mathcal{P}} \rightarrow \{0,1\}$
How can we represent Boolean functions?
$F = \bigwedge (\bigvee \ell)$ where $\ell$ is a literal $x$ or $\neg x$ for some variable $x$.
Examples:
$F_1=(x \vee \neg y) \wedge (\neg x \vee y)$
$x$ | $y$ | $F_1$ |
$0$ | $0$ | $1$ |
$0$ | $1$ | $0$ |
$1$ | $0$ | $0$ |
$1$ | $1$ | $1$ |
$F_2=(x \vee \neg z) \wedge (\neg x \vee y) \wedge (x \vee y \vee z)$
$x$ | $y$ | $z$ | $F_2$ |
$1$ | $1$ | $1$ | $1$ |
$0$ | $1$ | $0$ | $1$ |
$1$ | $1$ | $0$ | $1$ |
$*$ | $*$ | $*$ | $0$ |
CNF formulas are extremely simple yet can encode many interesting problems.
Cook, Levin, 1971: The problem SAT of deciding whether a CNF formula is satisfiable is NP-complete.
Valiant 1979: The problem #SAT of counting the satisfying assignment of a CNF formula is #P-complete.
Not very interesting for reasoning tasks.
Research has focused on factorized representation.
Data structure based on decision nodes to represent β$(x+y+z)$ is evenβ.
Path for $x=1$, $y=0$ and $z=1$ is accepting.
Previous data structure are Ordered Binary Decision Diagrams.
Letβs draw an OBDD that detects whether a matrix $x_{i,j}$ with $1 \leq i, j \leq 3$ has a row full of $1$.
How many $3 \times 3$ $\{0,1\}$-matrices have a row full of ones?
$Row_1=111$: $2^6=64$ matrices
$Row_1 \neq 111, Row_2=111$: $(2^3-1) \times 2^3=56$ matrices
$Row_1 \neq 111, Row_1 \neq 111, Row_3=111$ $(2^3-1) \times (2^3-1) = 49$ matrices
Total: $169$
This idea can be generalized to any OBDDs:
Let $f \subseteq \{0,1\}^X$ be a function computed by an OBDD having $E$ edges. We can compute $\#f$ with $O(E)$ arithmetic operations.
Generalises to many tasks:
Good candidate for representing Boolean functions!
Orders of variables matters a lot:
$f_n(M,s) = (s \wedge ROW_n(M)) \vee (\neg s \wedge COL_n(M))$
Every OBDD computing $f_n$ has size $\geq 2^{O(n)}$.
Same as OBDD but variables may be tested in different order on different path as long as they are tested at most once on every path.
Advantages: more succinct
Drawbacks:
60s
Wrap up:
CNF : $\bigwedge \bigvee \ell$ are natural, powerful but not tractable
Knowledge compilation: amortize the compilation (offline) phase during the query (online) phase
Many choices are possible: OBDD, FBDD, and many many others. Depends on what we want to do.
Notation | Query | Explanation |
---|---|---|
CO | Consistency check | Is D satisfiable? |
VA | Validity check | Is D a tautology? |
CE | Clause entailment | does D[Ο] is sat? |
SE | Sentential entailment | does D_{1}βββD_{2}? |
CT | Model counting | how many solutions has D? |
ME | Model enumeration | Enumerate the solutions of D. |
Β | CO | VA | CE | SE | CT | ME |
---|---|---|---|---|---|---|
DNNF | β | Γ | β | Γ | Γ | β |
d-DNNF | β | β | β | Γ | β | β |
dec-DNNF | β | β | β | Γ | β | β |
FBDD | β | β | β | Γ | β | β |
OBDD | β | β | β | β | β | β |
Exhaustive DPLL with Caching based on Shannon Expansion:
$F = (x \vee y \vee z) \wedge (x \vee \neg y \vee \neg z) \wedge (\neg x \vee \neg y \vee \neg z) \wedge (\neg x \vee y \vee z)$
This scheme is parameterized by:
For many tasks, such as model counting, it is interesting to detect syntactic decomposable part of the formula, that is:
$F(X) = G(Y) \wedge H(Z)$ and $Y \cap Z = \emptyset$
Cachet
SharpSAT
DSharp
D4
SDD
c2d
CUDD
for manipulating Decision Diagrams.ADDMC
D4
is a top-down compiler as shown earlier:
Is it useful to have $\wedge$-gates in practice?
Yes, exponential gain in circuit size on some instances:
There is a family $(f_n)_{n \in \mathbb{N}}$ of Boolean functions such that any FBDD computing $f_n$ has size at least $2^{n}$ but $f_n$ can be computed by a $poly(n)$-sized dec-DNNF.
60s
Wrap up:
Data structure used in KC can be used in other areas of computer science to leverage existing results.
Data stored as relations (tables):
People | Id | Name | City |
---|---|---|---|
1 | Alice | Paris | |
2 | Bob | Lens | |
3 | Carole | Lille | |
4 | Djibril | Berlin |
Capital | City | Country |
---|---|---|
Berlin | Germany | |
Paris | France | |
Roma | Italy |
SQL is a full fledge language, hard to study.
Large class of queries are expressed by a smaller class: conjunctive queries.
People | Id | Name | City |
---|---|---|---|
1 | Alice | Paris | |
2 | Bob | Lens | |
3 | Carole | Lille | |
4 | Djibril | Berlin |
Capital | City | Country |
---|---|---|
Berlin | Germany | |
Paris | France | |
Roma | Italy |
$Q(Id, Name, City, Country) = People(Id, Name, City) \wedge Capital(City, Country)$
Conjunctive queries are queries of the form: $Q(X)=\bigwedge_i R_i(\vec{x_i})$ where
$People(Id, Name, City) \wedge Capital(City, Country)$
Database $\mathbb{D}$: list of relations $R_1^\mathbb{D}\subseteq D^{\vec{x_1}}, \dots, R_p^\mathbb{D}\subseteq D^{\vec{x_p}}$ filled with values in domain $D$
$People^\mathbb{D}= \{(Id: 1,Name: Alice,City: Paris), (Id: 2,Name: Bob,City: Lens)\}$
$City^\mathbb{D}= \{(City: Paris, Country: France), (City: Berlin, Country: Germany)\}$
Defines a new table $Q(\mathbb{D}) \subseteq D^X$ where $\tau \in Q(\mathbb{D})$ if each part of $\tau$ on variables $\vec{x_i}$ are in $R_i^\mathbb{D}$.
$Q(\mathbb{D}) = \{(Id: 1, Name: Alice, City: Paris, Country: France)\}$
CQ correspond to doing JOIN queries in SQL.
Bad new: given a conjunctive query $Q$ and a database $\mathbb{D}$, it is NP-complete to decide whether $Q(\mathbb{D}) \neq \emptyset$!
And yet databases systems solve this kind of queries all the time!
Central class of conjunctive queries because of their tractability.
$R_1(x,y,z) \wedge R_2(x,z,u) \wedge R_3(x,y,t) \wedge R_4(y,t) \wedge R_5(y,v)$
$R(x,y) \wedge S(y,z) \wedge T(z,x)$
Total of $4$ solutions.
The trace of Yannakakis Algorithm on acyclic CQ is a decision-DNNF (non Boolean domain) of size linear in the data.
Datastructures known as βFactorized Databasesβ.
For every acyclic query $Q$ and database $\mathbb{D}$, one can build a decision-DNNF computing $Q(\mathbb{D})$ of size $O(poly(|Q|)|\mathbb{D}|)$.
Knowledge compilation style approach. One can efficently:
Unify existing results and push the hardness in the compilation part.
This compilation results can be used to recover many other results:
BPO problem: $\max_{x_1,\dots,x_n \in \{0,1\}^n} P(x_1,\dots,x_n)$
where $P$ is a polynomial.
Observation: $P$ may be assumed to be multilinear since $x^2 = x$ over $\{0,1\}$
$P = \sum_{e \in E} \alpha_e \prod_{i \in e} x_i$
where $E \subseteq 2^V$
$P(x_1,x_2,x_3) = x_1x_2x_3 - 2x_1x_3 + 3x_1$
$P(1, 0, 0) = 3$ is maximal.
Semi ring: $\mathbb{K} = (K,\oplus, \otimes, 0_\oplus, 1_\otimes)$
$f \subseteq \{0,1\}^X$ Boolean function and $w : X \times \{0,1\} \rightarrow \mathbb{K}$:
$w(f) = \bigoplus_{\tau \in f} \bigotimes_{x \in X} w(x, \tau(x))$
Allows to encode optimization problems on Boolean functions.
For $P := \sum_{e \in E} \alpha_e \prod_{i \in e} x_i$ define: $f_P := \bigwedge_{e \in E} C_e$ where $C_e := Y_e \Leftrightarrow \bigwedge_{i \in e} X_i$
$C_e$ encodes $y_e = \prod_{i \in e} x_i$!
and $w_P$ on $(\mathbb{Q}, \max, +, -\infty, 0)$ as:
Example: $P(x_1,x_2,x_3) = x_1x_2x_3 - 2x_1x_3 + 3x_1$
$\begin{align*} w_P(f_P) & = \max_{\tau \in f_P} w_P(f_\tau) \\ & = w_P(Y_1=0, Y_2=0, Y_3=1, X_1=1, X_2=0, X_3=0) \\ & = 3 \\ & = P(1,0,0) \end{align*}$
$w_P(f_P) = \max P(x_1,\dots,x_n)$ where $f_P = \bigwedge_{e \in E} C_e$.
Try using Algebraic Model Counting for BPO:
Example: solve $\max P(x)$ such that $L < \sum_{x \in X} x < U$:
How?