Abstractions and Pattern Databases

Source slides: E3 Abstractions, E4 Abstraction Heuristics, E5 Orthogonality and Additivity, E6 Pattern Databases.

Delete relaxation simplified planning by changing the operators. Abstraction uses a different idea: keep the transition system shape, but deliberately forget some distinctions between states. If the abstract system is small enough, we can compute exact goal distances there and use them as heuristic estimates in the original system.

Pattern databases are the standard planning version of this idea. They choose a subset of variables, project every state onto those variables, precompute the abstract distances, and then answer heuristic queries by table lookup.

The Question Abstraction Answers

The concrete state space of a planning task is often too large to solve directly. But not every detail of a state is equally relevant to every estimate. If a delivery task has truck locations, package locations, fuel levels, and road conditions, one abstraction might look only at package locations. Another might look only at truck locations. Each abstraction ignores information, but it may still provide a useful lower bound on the real remaining cost.

The key question is:

If we merge several concrete states into one abstract state, why is the resulting distance still safe to use as a heuristic?

The answer is that abstraction preserves concrete transitions. Every concrete move has a corresponding abstract move. Merging states can create extra abstract possibilities, but extra possibilities can only make the abstract problem easier, not harder.

Abstraction Mappings

An abstraction starts with a mapping from concrete states to abstract states.

Definition: Abstraction mapping

An abstraction mapping is a surjective function
$\alpha : S \to S'$
from the concrete state set $S$ onto an abstract state set $S'$ . States $s_1$ and $s_2$ with $\alpha(s_1)=\alpha(s_2)$ are indistinguishable in the abstraction.

Surjective means that every abstract state represents at least one concrete state. The mapping may merge many concrete states into one abstract state.

A simple example is a robot navigation task where concrete states record both location and battery charge:

s = (\mathit{loc}, \mathit{battery}).

An abstraction might keep only the location:

\alpha(s)=\mathit{loc}.

All states with the same location but different battery levels are merged.

The Abstract Transition System

Let the concrete transition system be

\mathcal{T} = \langle S,L,T,s_0,S_G\rangle.

Given an abstraction mapping $\alpha : S \to S'$ , the abstract transition system is

\mathcal{T}^{\alpha} = \langle S',L,T',s'_0,S'_G\rangle.

Its components are defined by projecting the concrete system:

the abstract initial state is $s'_0=\alpha(s_0)$ ;
an abstract state $s'$ is a goal if $s'=\alpha(s)$ for some concrete goal state $s \in S_G$ ;
an abstract transition $(\alpha(s),\ell,\alpha(t))$ exists whenever the concrete transition $(s,\ell,t)$ exists.

This definition is the technical heart of abstraction heuristics. It guarantees that every concrete path maps to an abstract path with the same sequence of labels, possibly with some states collapsed together.

The Abstraction Heuristic

The heuristic value of a concrete state is the optimal distance from its abstract image to an abstract goal.

Definition: Abstraction heuristic

For an abstraction mapping $\alpha$ , define
$h^\alpha(s)=h^*_{\mathcal{T}^{\alpha}}(\alpha(s)).$
Here $h^*_{\mathcal{T}^{\alpha}}$ is the optimal goal distance in the abstract transition system.

The admissibility argument is short but important. Take any concrete plan from $s$ to a goal. By the construction of $\mathcal{T}^{\alpha}$ , the same sequence of transitions exists from $\alpha(s)$ to an abstract goal. Therefore the abstract optimum cannot be more expensive than the concrete plan. Since this holds for every concrete plan, it also holds for the cheapest one:

h^\alpha(s) \le h^*(s).

So every abstraction heuristic constructed this way is admissible.

How to Read the Lower Bound

An abstract distance is not saying, “the real task can be solved this cheaply.” It is saying, “even after forgetting some constraints, at least this much work remains.”

In the robot example, if location-only abstraction says it takes three moves to reach the goal location, then the real task also needs at least three move actions. The real task might need more because of battery constraints that the abstraction ignored. It cannot need fewer, because any real route also changes location through the same projected movement structure.

This is the same lower-bound logic as straight-line distance. The abstract problem is easier because some details have been removed.

Practical Requirements

An abstraction is useful only if two operations are cheap enough:

compute $\alpha(s)$ for a generated search state $s$ ;
retrieve $h^*_{\mathcal{T}^{\alpha}}(\alpha(s))$ quickly.

The usual solution is precomputation. Before search begins, build the abstract transition system and compute all abstract goal distances. During search, heuristic evaluation is then just projection plus lookup.

Pattern databases are exactly this precompute-and-lookup scheme for projection abstractions.

Patterns and Projections

In finite-domain representation, a state assigns values to variables. A projection abstraction chooses some variables to keep and forgets the rest.

Definition: Pattern and projection

A pattern $P$ is a subset of the state variables $V$ . The projection $\pi_P$ maps a state $s$ to its restriction to the variables in $P$ :
$\pi_P(s)=s|_P.$

If $V=\{v_1,v_2,v_3\}$ and $P=\{v_1,v_3\}$ , then the projected state records only the values of $v_1$ and $v_3$ . Any difference in $v_2$ is ignored.

The projected transition system is the abstract transition system induced by $\pi_P$ . Operators may become simpler there: two concrete transitions that differ only in forgotten variables may collapse into the same abstract transition.

Pattern Database Heuristics

A pattern database stores exact goal distances in a projected transition system.

Definition: Pattern database heuristic

For a pattern $P$ , the PDB heuristic is the abstraction heuristic induced by projection:
$h^P(s)=h^*_{\mathcal{T}^{\pi_P}}(\pi_P(s)).$

The name “database” refers to the implementation. The planner performs a backward search from the abstract goal states and stores the computed distance for each abstract state. Later, during forward search in the real task, it projects the current concrete state onto $P$ and looks up the distance in the table.

A small example illustrates the idea. Suppose a puzzle state has variables for several tiles. A pattern $P$ keeps only three tile-position variables and ignores the others. The PDB answers: how many moves are needed, in the projected puzzle, to place those three tiles correctly? The real puzzle must do at least that much work, because any complete solution also solves the projected subproblem.

Constructing a Pattern Database

The basic construction has three steps:

choose a pattern $P \subseteq V$ ;
construct the projected transition system $\mathcal{T}^{\pi_P}$ ;
run a complete backward search from the abstract goal states, storing distances.

Backward search is natural because the table should contain distances to the goal for many possible abstract states. Once built, the table can be reused for every heuristic evaluation in the same planning task.

The central tradeoff is pattern size. If $P$ is small, the database is cheap but the heuristic may be weak. If $P$ is large, the abstraction remembers more of the task and usually gives stronger values, but the number of abstract states grows exponentially with the variables in $P$ .

Combining Abstractions with Maximum

A single pattern may ignore too much. If we have several abstraction heuristics $h^{\alpha_1},\ldots,h^{\alpha_k}$ , the safest general combination is the maximum:

h^{\max}(s)=\max_{i=1}^{k} h^{\alpha_i}(s).

The maximum of admissible heuristics is admissible. If each estimate is a lower bound on $h^*(s)$ , then the largest of them is still a lower bound. This is always valid, regardless of whether the abstractions overlap.

The limitation is that maximum discards complementary information. If one abstraction says that package movement costs at least $4$ , and another says that truck repositioning costs at least $3$ , taking the maximum gives $4$ , not $7$ .

When Sums Are Safe

Summing heuristic estimates is tempting, but it can double-count work. If one real action contributes to two abstractions, charging its full cost in both abstractions can make the sum larger than the real optimal cost.

The course introduces one important case where summation is safe: orthogonal abstractions with unit costs.

Definition: Orthogonal abstractions

Two projection abstractions are orthogonal if they depend on disjoint sets of state variables.

For patterns, this means $P_1 \cap P_2 = \emptyset$ . The abstractions observe different parts of the state.

Theorem: Additivity for orthogonal abstractions

If $\alpha_1$ and $\alpha_2$ are orthogonal abstractions and all actions have unit cost, then
$h^{\alpha_1}(s)+h^{\alpha_2}(s) \le h^*(s).$

The intuition is that, under the theorem’s assumptions, the two abstract estimates account for disjoint parts of the work. Since the abstractions do not look at the same variables, their required changes do not count the same projected progress twice.

For arbitrary action costs, orthogonality alone is not enough. A single costly action might affect variables in both patterns. To sum admissibly in that setting, planners use cost partitioning, where each operator’s cost is split among abstractions so that the total charged cost never exceeds the real operator cost.

The Canonical Heuristic

Given a collection of patterns, we want the strongest admissible estimate allowed by the additive groups. The canonical heuristic takes the maximum over sums of mutually orthogonal pattern heuristics.

Definition: Canonical heuristic

For a pattern collection $\mathcal{C}$ ,
$h^{\mathcal{C}}(s) = \max_{\substack{\mathcal{A} \subseteq \mathcal{C}\\ \mathcal{A} \text{ pairwise orthogonal}}} \sum_{P \in \mathcal{A}} h^P(s).$

Each inner sum is admissible because it uses mutually orthogonal patterns. The outer maximum is also admissible because it chooses the strongest lower bound among admissible sums.

This definition also explains why we cannot simply sum all pattern heuristics. If two patterns overlap, their estimates may count the same work twice. The canonical heuristic sums only compatible groups, then takes the best such sum.

Choosing Patterns

Pattern choice controls the quality and cost of the heuristic. Very small patterns are fast but may forget the interactions that make the task hard. Very large patterns may be too expensive to store or precompute.

Manual pattern design does not scale well, so planners often select patterns automatically. A typical strategy is:

start with small seed patterns, often single variables;
grow a pattern by adding one variable at a time;
keep growth steps that improve estimated heuristic quality enough to justify the larger database;
stop when a memory or time budget is reached.

Adding variables refines the abstraction: it distinguishes states that were previously merged. This can maintain or improve heuristic values, but it also increases the database size. Good pattern selection is therefore a resource-allocation problem.

Common Mistakes

One common mistake is to think that abstraction means deleting transitions. In this framework, abstraction projects concrete transitions into abstract transitions. Merging states may introduce extra abstract paths, but it does not remove the projected image of any concrete path.

Another mistake is to interpret a PDB value as a plan length for the original task. It is a plan length in the projected task. Its usefulness comes from being a lower bound, not from being an executable concrete plan.

A third mistake is to sum PDBs just because each one is admissible. Admissibility is preserved by maximum automatically, but summation requires an additivity argument such as orthogonality or cost partitioning.

Chapter Summary

Abstraction heuristics merge concrete states into smaller abstract state spaces. Because every concrete path projects to an abstract path, optimal abstract goal distance is an admissible estimate of concrete goal distance. Pattern databases instantiate this idea by projecting an FDR task onto a subset of variables, precomputing all abstract goal distances, and using table lookup during search. Multiple PDBs can always be combined by maximum. They can be summed only under additional conditions, such as orthogonality with unit costs or suitable cost partitioning. The canonical heuristic takes the maximum over admissible sums of mutually orthogonal pattern heuristics.

Check Your Understanding

Why does merging states tend to make the abstract problem easier rather than harder?
What information is kept and what is forgotten by a projection $\pi_P$ ?
Why is a PDB usually computed by backward search from abstract goal states?
Why is the maximum of admissible abstraction heuristics always admissible?
What can go wrong if two overlapping pattern databases are simply added together?