Florent Capelli
CRIL, Université d’Artois
ICDT 2026
24 March 2026
Relation : set of tuples, over domain and variables .
| 1 | 2 | 1 |
| 2 | 0 | 1 |
| 0 | 0 | 1 |
| 1 | 2 | 0 |
Table representation
Explicit representation
| 1 | 2 |
| 2 | 0 |
| 0 | 0 |
| 2 | 1 |
| 0 | 1 |
| 2 | 0 |
Result of a (join) query
Implicit
Representation
A relation represents data we want to use.
Stream every tuple to perform an action.
Compute statistics on , compute the size , ratios , aggregate with semiring operation etc.
(Uniform) sampling, enumerate representative tuples etc.
Quickly output the tuple (assuming an order on ), find median tuple etc.
Everything is “easy” if is given as a table, but is it necessary to materialize it?
Complexity varies depending on the representation of .
| Task | Table | Join Query |
|---|---|---|
| Output one tuple in | NP-complete | |
| Output “largest” tuple in | Linear | NP-complete |
| Compute | #P-complete |
We are looking for succinctness/tractability tradeoffs!
| R | x | y | z |
|---|---|---|---|
| 0 | 0 | 0 | |
| 0 | 1 | 0 | |
| 0 | 0 | 1 | |
| 1 | 1 | 1 |
| S | x | y | a |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 2 | |
| 0 | 1 | 1 |
| T | x | z | b |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 3 | |
| 0 | 1 | 2 | |
| 1 | 1 | 1 |
A Join Query is acyclic iff it has a join tree.
At most per tuple in .
At most one tree per tuple in .
At most one tree per tuple in .
| R | x | y | z |
|---|---|---|---|
| 0 | 0 | 0 | |
| 0 | 1 | 0 | |
| 0 | 0 | 1 | |
| 1 | 1 | 1 |
| S | x | y | a |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 2 | |
| 0 | 1 | 1 |
| T | x | z | b |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 3 | |
| 0 | 1 | 2 | |
| 1 | 1 | 1 |
Given an acyclic join query and a join tree for :
Introduced by Dan Olteanu and Jakub Závodný under the name d-representation.
✅ Find a tuple
✅ Find a tuple
✅ Find “optimal” tuple
✅ Enumerate (delay ).
❌ Counting
❌ Many Aggregation tasks.
Unions have disjoint tuples (sometimes called deterministic, or unambiguous unions).
✅ Find a tuple
✅ Enumerate: delay depth of .
We can do better, check [Amarilli, Bourhis, Jachiet, Mengel].
✅ Counting
✅ Counting with semiring weights
❌ coNP-hard to check.
Multi-domain generalization of well-studied Boolean circuits.
| Gates | Boolean Domain | Enum | Count |
|---|---|---|---|
| NNF | ❌ | ❌ | |
| DNNF | ☑ | ❌ | |
| d-DNNF | ☑ | ✅ | |
| dec-DNNF | ✅ | ✅ | |
| FBDD | ✅ | ✅ |
For convenience, we may allow constants:
Design algorithm on the circuit directly:
Consequence for Acyclic Join Queries: counting in linear time (data complexity), enumeration in constant time after linear preprocessing.
Circuits useful to design algorithms, but also for lower bounds.
The answers of a JQ can be enumerated with linear preprocessing and constant delay for every DB
iff is acyclic assuming the Boolean Matrix Multiplication Conjecture.
[Bagan, Durand, Grandjean] Conditional lower bound, for any algorithm
vs
If is not acyclic, there exists s. t. for every , there is a database of size , such that any circuit computing the answers of over is of size at least .
Unconditional lower bound, but targeting a large class of algorithms.
| R | x | y | z |
|---|---|---|---|
| 0 | 0 | 0 | |
| 0 | 1 | 0 | |
| 0 | 0 | 1 | |
| 1 | 1 | 1 |
| S | x | y | a |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 2 | |
| 0 | 1 | 1 |
| T | x | z | b |
|---|---|---|---|
| 0 | 0 | 1 | |
| 0 | 0 | 3 | |
| 0 | 1 | 2 | |
| 1 | 1 | 1 |
Start fresh
Set to the first tuple in
On the remaining variables, has two connected components. Proceed with before
Pick value for and continue with .
Backtrack and pick for , continue with .
Do the same with .
Backtrack and set .
Recursive call on . AGAIN??! Better reuse!
Construct .
Draw the rest of the owl!
Use caching to “reverse” the dynamic programming :
Wait, do we even need the join tree? No! Only an order
Fix an order on :
builds a circuit of size at most for some .
In data complexity, but polynomial in combined complexity.
Similar result using Yannakakis:
Existentially project variables:
If at the end of the order:
d4,
sharpsat-td, ExactMC etc.
Negation interpreted over a given domain :
| 0 | 1 | 0 |
| 0 | 0 | 0 |
| 0 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 0 | 0 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
| 1 | 1 | 1 |
, domain .
Positive encoding: preprocessing
| 0 | 0 | 0 |
linear sized circuit
non-linear sized circuit ()
Subqueries may be harder to solve than the query itself!
Equivalent to if
non-linear circuit
Circuit construction for implies circuit construction for for every !
Exhaustive DPLL works out of the box in this case, but can be improved:
| 0 | 0 | |
| 1 | 0 | |
| 1 | 1 |
| 0 | 0 | |
| 1 | 0 | |
| 1 | 1 |
| 0 | 0 | |
| 1 | 0 | |
| 1 | 1 |
| 1 | 1 | 1 | 1 |
SJQ. Fix an order on :
builds a circuit of size at most for some .
In data complexity but polynomial wrt combined complexity.
See survey Tractable Circuits in Database Theory, Amarilli and C. shameless plug
Thank you! Enjoy ICDT26!