Skip to content

Abstractions and Pattern Databases

Source slides: E3 Abstractions, E4 Abstraction Heuristics, E5 Orthogonality and Additivity, E6 Pattern Databases.

Delete relaxation simplified planning by changing the operators. Abstraction uses a different idea: keep the transition system shape, but deliberately forget some distinctions between states. If the abstract system is small enough, we can compute exact goal distances there and use them as heuristic estimates in the original system.

Pattern databases are the standard planning version of this idea. They choose a subset of variables, project every state onto those variables, precompute the abstract distances, and then answer heuristic queries by table lookup.

The concrete state space of a planning task is often too large to solve directly. But not every detail of a state is equally relevant to every estimate. If a delivery task has truck locations, package locations, fuel levels, and road conditions, one abstraction might look only at package locations. Another might look only at truck locations. Each abstraction ignores information, but it may still provide a useful lower bound on the real remaining cost.

The key question is:

If we merge several concrete states into one abstract state, why is the resulting distance still safe to use as a heuristic?

The answer is that abstraction preserves concrete transitions. Every concrete move has a corresponding abstract move. Merging states can create extra abstract possibilities, but extra possibilities can only make the abstract problem easier, not harder.

An abstraction starts with a mapping from concrete states to abstract states.

Definition: Abstraction mapping

An abstraction mapping is a surjective function

α:SS\alpha : S \to S'

from the concrete state set SS onto an abstract state set SS'. States s1s_1 and s2s_2 with α(s1)=α(s2)\alpha(s_1)=\alpha(s_2) are indistinguishable in the abstraction.

Surjective means that every abstract state represents at least one concrete state. The mapping may merge many concrete states into one abstract state.

A simple example is a robot navigation task where concrete states record both location and battery charge:

s=(loc,battery).s = (\mathit{loc}, \mathit{battery}).

An abstraction might keep only the location:

α(s)=loc.\alpha(s)=\mathit{loc}.

All states with the same location but different battery levels are merged.

Let the concrete transition system be

T=S,L,T,s0,SG.\mathcal{T} = \langle S,L,T,s_0,S_G\rangle.

Given an abstraction mapping α:SS\alpha : S \to S', the abstract transition system is

Tα=S,L,T,s0,SG.\mathcal{T}^{\alpha} = \langle S',L,T',s'_0,S'_G\rangle.

Its components are defined by projecting the concrete system:

  • the abstract initial state is s0=α(s0)s'_0=\alpha(s_0);
  • an abstract state ss' is a goal if s=α(s)s'=\alpha(s) for some concrete goal state sSGs \in S_G;
  • an abstract transition (α(s),,α(t))(\alpha(s),\ell,\alpha(t)) exists whenever the concrete transition (s,,t)(s,\ell,t) exists.

This definition is the technical heart of abstraction heuristics. It guarantees that every concrete path maps to an abstract path with the same sequence of labels, possibly with some states collapsed together.

The heuristic value of a concrete state is the optimal distance from its abstract image to an abstract goal.

Definition: Abstraction heuristic

For an abstraction mapping α\alpha, define

hα(s)=hTα(α(s)).h^\alpha(s)=h^*_{\mathcal{T}^{\alpha}}(\alpha(s)).

Here hTαh^*_{\mathcal{T}^{\alpha}} is the optimal goal distance in the abstract transition system.

The admissibility argument is short but important. Take any concrete plan from ss to a goal. By the construction of Tα\mathcal{T}^{\alpha}, the same sequence of transitions exists from α(s)\alpha(s) to an abstract goal. Therefore the abstract optimum cannot be more expensive than the concrete plan. Since this holds for every concrete plan, it also holds for the cheapest one:

hα(s)h(s).h^\alpha(s) \le h^*(s).

So every abstraction heuristic constructed this way is admissible.

An abstract distance is not saying, “the real task can be solved this cheaply.” It is saying, “even after forgetting some constraints, at least this much work remains.”

In the robot example, if location-only abstraction says it takes three moves to reach the goal location, then the real task also needs at least three move actions. The real task might need more because of battery constraints that the abstraction ignored. It cannot need fewer, because any real route also changes location through the same projected movement structure.

This is the same lower-bound logic as straight-line distance. The abstract problem is easier because some details have been removed.

An abstraction is useful only if two operations are cheap enough:

  1. compute α(s)\alpha(s) for a generated search state ss;
  2. retrieve hTα(α(s))h^*_{\mathcal{T}^{\alpha}}(\alpha(s)) quickly.

The usual solution is precomputation. Before search begins, build the abstract transition system and compute all abstract goal distances. During search, heuristic evaluation is then just projection plus lookup.

Pattern databases are exactly this precompute-and-lookup scheme for projection abstractions.

In finite-domain representation, a state assigns values to variables. A projection abstraction chooses some variables to keep and forgets the rest.

Definition: Pattern and projection

A pattern PP is a subset of the state variables VV. The projection πP\pi_P maps a state ss to its restriction to the variables in PP:

πP(s)=sP.\pi_P(s)=s|_P.

If V={v1,v2,v3}V=\{v_1,v_2,v_3\} and P={v1,v3}P=\{v_1,v_3\}, then the projected state records only the values of v1v_1 and v3v_3. Any difference in v2v_2 is ignored.

The projected transition system is the abstract transition system induced by πP\pi_P. Operators may become simpler there: two concrete transitions that differ only in forgotten variables may collapse into the same abstract transition.

A pattern database stores exact goal distances in a projected transition system.

Definition: Pattern database heuristic

For a pattern PP, the PDB heuristic is the abstraction heuristic induced by projection:

hP(s)=hTπP(πP(s)).h^P(s)=h^*_{\mathcal{T}^{\pi_P}}(\pi_P(s)).

The name “database” refers to the implementation. The planner performs a backward search from the abstract goal states and stores the computed distance for each abstract state. Later, during forward search in the real task, it projects the current concrete state onto PP and looks up the distance in the table.

A small example illustrates the idea. Suppose a puzzle state has variables for several tiles. A pattern PP keeps only three tile-position variables and ignores the others. The PDB answers: how many moves are needed, in the projected puzzle, to place those three tiles correctly? The real puzzle must do at least that much work, because any complete solution also solves the projected subproblem.

The basic construction has three steps:

  1. choose a pattern PVP \subseteq V;
  2. construct the projected transition system TπP\mathcal{T}^{\pi_P};
  3. run a complete backward search from the abstract goal states, storing distances.

Backward search is natural because the table should contain distances to the goal for many possible abstract states. Once built, the table can be reused for every heuristic evaluation in the same planning task.

The central tradeoff is pattern size. If PP is small, the database is cheap but the heuristic may be weak. If PP is large, the abstraction remembers more of the task and usually gives stronger values, but the number of abstract states grows exponentially with the variables in PP.

A single pattern may ignore too much. If we have several abstraction heuristics hα1,,hαkh^{\alpha_1},\ldots,h^{\alpha_k}, the safest general combination is the maximum:

hmax(s)=maxi=1khαi(s).h^{\max}(s)=\max_{i=1}^{k} h^{\alpha_i}(s).

The maximum of admissible heuristics is admissible. If each estimate is a lower bound on h(s)h^*(s), then the largest of them is still a lower bound. This is always valid, regardless of whether the abstractions overlap.

The limitation is that maximum discards complementary information. If one abstraction says that package movement costs at least 44, and another says that truck repositioning costs at least 33, taking the maximum gives 44, not 77.

Summing heuristic estimates is tempting, but it can double-count work. If one real action contributes to two abstractions, charging its full cost in both abstractions can make the sum larger than the real optimal cost.

The course introduces one important case where summation is safe: orthogonal abstractions with unit costs.

Definition: Orthogonal abstractions

Two projection abstractions are orthogonal if they depend on disjoint sets of state variables.

For patterns, this means P1P2=P_1 \cap P_2 = \emptyset. The abstractions observe different parts of the state.

Theorem: Additivity for orthogonal abstractions

If α1\alpha_1 and α2\alpha_2 are orthogonal abstractions and all actions have unit cost, then

hα1(s)+hα2(s)h(s).h^{\alpha_1}(s)+h^{\alpha_2}(s) \le h^*(s).

The intuition is that, under the theorem’s assumptions, the two abstract estimates account for disjoint parts of the work. Since the abstractions do not look at the same variables, their required changes do not count the same projected progress twice.

For arbitrary action costs, orthogonality alone is not enough. A single costly action might affect variables in both patterns. To sum admissibly in that setting, planners use cost partitioning, where each operator’s cost is split among abstractions so that the total charged cost never exceeds the real operator cost.

Given a collection of patterns, we want the strongest admissible estimate allowed by the additive groups. The canonical heuristic takes the maximum over sums of mutually orthogonal pattern heuristics.

Definition: Canonical heuristic

For a pattern collection C\mathcal{C},

hC(s)=maxACA pairwise orthogonalPAhP(s).h^{\mathcal{C}}(s) = \max_{\substack{\mathcal{A} \subseteq \mathcal{C}\\ \mathcal{A} \text{ pairwise orthogonal}}} \sum_{P \in \mathcal{A}} h^P(s).

Each inner sum is admissible because it uses mutually orthogonal patterns. The outer maximum is also admissible because it chooses the strongest lower bound among admissible sums.

This definition also explains why we cannot simply sum all pattern heuristics. If two patterns overlap, their estimates may count the same work twice. The canonical heuristic sums only compatible groups, then takes the best such sum.

Pattern choice controls the quality and cost of the heuristic. Very small patterns are fast but may forget the interactions that make the task hard. Very large patterns may be too expensive to store or precompute.

Manual pattern design does not scale well, so planners often select patterns automatically. A typical strategy is:

  1. start with small seed patterns, often single variables;
  2. grow a pattern by adding one variable at a time;
  3. keep growth steps that improve estimated heuristic quality enough to justify the larger database;
  4. stop when a memory or time budget is reached.

Adding variables refines the abstraction: it distinguishes states that were previously merged. This can maintain or improve heuristic values, but it also increases the database size. Good pattern selection is therefore a resource-allocation problem.

One common mistake is to think that abstraction means deleting transitions. In this framework, abstraction projects concrete transitions into abstract transitions. Merging states may introduce extra abstract paths, but it does not remove the projected image of any concrete path.

Another mistake is to interpret a PDB value as a plan length for the original task. It is a plan length in the projected task. Its usefulness comes from being a lower bound, not from being an executable concrete plan.

A third mistake is to sum PDBs just because each one is admissible. Admissibility is preserved by maximum automatically, but summation requires an additivity argument such as orthogonality or cost partitioning.

Abstraction heuristics merge concrete states into smaller abstract state spaces. Because every concrete path projects to an abstract path, optimal abstract goal distance is an admissible estimate of concrete goal distance. Pattern databases instantiate this idea by projecting an FDR task onto a subset of variables, precomputing all abstract goal distances, and using table lookup during search. Multiple PDBs can always be combined by maximum. They can be summed only under additional conditions, such as orthogonality with unit costs or suitable cost partitioning. The canonical heuristic takes the maximum over admissible sums of mutually orthogonal pattern heuristics.

  1. Why does merging states tend to make the abstract problem easier rather than harder?
  2. What information is kept and what is forgotten by a projection πP\pi_P?
  3. Why is a PDB usually computed by backward search from abstract goal states?
  4. Why is the maximum of admissible abstraction heuristics always admissible?
  5. What can go wrong if two overlapping pattern databases are simply added together?