Landmarks and Constraints

Source slides: F1 Heuristic Properties · F2 RTG Landmarks · F3 MHS Heuristic · F4 LM-Cut · F5 LP and IP · F6 Cost Partitioning · F7 Saturated Cost Partitioning

Many planning heuristics work by simplifying the state space and solving the simplified problem. Constraint-based heuristics take a different route. They ask: what must be true of every real solution plan? If every valid plan must achieve some fact, use some action, cross some cut, or satisfy some balance equation, then that necessary condition can become a lower bound on plan cost.

This chapter introduces landmarks first, then shows how landmarks lead to hitting-set and cut-based heuristics. It then connects these ideas to linear programming and cost partitioning, which are the tools used to combine several lower-bound sources without double-counting action costs.

The Constraint View

A constraint-based heuristic does not need to describe a complete plan. It only needs constraints that every complete plan satisfies. The more constraints we add, the more expensive the cheapest “possible plan summary” becomes, and the larger the admissible heuristic can be.

The main examples in this chapter are:

landmarks, which are facts or actions no plan can avoid;
minimum hitting sets, which combine disjunctive action landmarks;
LM-Cut, which extracts disjunctive action landmarks from relaxed task graphs;
cost partitioning, which safely adds several heuristic estimates by splitting operator costs.

The common pattern is always the same: find necessary conditions, optimize under those conditions, and interpret the optimum as a lower bound.

Fact Landmarks

A fact landmark is a proposition that every solution plan must make true at some point.

Definition: Fact landmark

Given a planning task $\Pi$ with initial state $s_0$ and goal $G$ , a fact $f$ is a fact landmark of $\Pi$ if every plan $\pi = \langle a_1,\ldots,a_n\rangle$ for $\Pi$ reaches a state in which $f$ is true. Formally, for every valid plan $\pi$ , there is an index $0 \le i \le n$ such that $f$ holds after applying $\langle a_1,\ldots,a_i\rangle$ from $s_0$ .

The index $i=0$ is allowed. Therefore any fact already true in the initial state is technically a landmark. Goal facts are also landmarks, because a solution must end in a goal state. These landmarks are correct but often not very informative. The useful landmarks are usually intermediate facts that every plan must pass through on the way to the goal.

As a small example, suppose a robot must deliver a package from behind a locked door. If every plan must first have the key, then has-key is a fact landmark. The final goal may not mention the key, but the task structure forces the plan to make that fact true.

Action Landmarks

Landmarks can also refer to actions. An action landmark says that a particular operator must occur in every plan.

Definition: Action landmark

An operator $o$ is an action landmark of $\Pi$ if every valid plan for $\Pi$ contains $o$ .

Action landmarks are directly useful for cost lower bounds. If $o$ must occur and has cost $c(o)$ , then every plan has cost at least $c(o)$ . However, requiring one specific action is often too strong. In many tasks, several alternative actions can serve the same role.

This motivates disjunctive action landmarks.

Definition: Disjunctive action landmark

A set $L \subseteq O$ is a disjunctive action landmark if every valid plan applies at least one operator from $L$ . Equivalently, for every plan $\pi$ , $\pi \cap L \ne \emptyset$ .

A singleton set $\{o\}$ is an ordinary action landmark. Larger sets express alternatives: perhaps one of several doors must be opened, or one of several trucks must move a package into the target city.

Causal and Incidental Landmarks

Not every landmark explains why the plan works. A fact may become true in every plan merely as a side effect. Such a fact is still a landmark, but it may be less useful for search guidance.

Definition: Causal formula landmark

A formula $\lambda$ is a causal formula landmark if, in every plan, $\lambda$ must be true either because it is part of the goal or because it is a precondition of some operator used in the plan.

Causal landmarks are tied to the dependency structure of the task. They are the landmarks we expect to find by reasoning backward from the goal through operator preconditions. Incidental landmarks are unavoidable in the actual executions but are not needed as goal facts or preconditions. They can be real, but they are harder to discover from the causal graph of the task and often less useful for the heuristics in this chapter.

The Relaxed Task Graph

The relaxed task graph, or RTG, is an AND/OR graph built from the delete-relaxed task $\Pi^+$ . Delete relaxation removes negative effects, so once a fact becomes true it stays true. This makes causal reachability easier to analyze.

For a STRIPS task, the simplified RTG contains:

a fact node for each proposition;
an operator node for each action;
edges connecting facts to operators that require them;
edges connecting operators to facts they add;
a dummy goal operator whose preconditions are the goal facts.

The graph is read backward when computing landmarks for the goal. Operator nodes are AND-like: to apply an operator, all its preconditions are needed. Fact nodes are OR-like: to obtain a fact, it may be enough to choose one achiever.

That AND/OR distinction is the reason intersections appear in landmark equations. A landmark of all alternatives survives an OR choice only if it is common to every alternative. A landmark of an AND requirement must account for the requirements of all required parts.

Computing RTG Landmarks

The RTG landmark computation assigns each node $n$ a set $\mathit{LM}(n)$ of nodes that are unavoidable on the way to $n$ . The exact equations depend on the node type in the simplified graph, but the guiding principle is:

when all predecessors must be used, combine their landmark requirements;
when one predecessor may be chosen among alternatives, keep only what all alternatives share;
include the node itself when reaching it is part of the requirement being summarized.

For example, if a goal operator has two goal facts as preconditions, then both goal facts are relevant to the landmark set for the goal operator. If one of those facts can be achieved by several different actions, only landmarks common to all those achievers can be concluded from that fact alone.

The result is computed as a fixed point because landmark information depends recursively on other landmark information. We start with conservative sets and repeatedly apply the equations until no set changes.

Theorem: Landmarks from relaxation are sound

Every causal landmark found for the relaxed task $\Pi^+$ is also a landmark of the original task $\Pi$ .

The converse is not guaranteed. Delete relaxation may introduce new relaxed plans that avoid something every real plan must do. Therefore RTG landmarks are sound but incomplete: what they report is safe to use, but they may miss real landmarks.

Reading RTG Results Correctly

RTG landmark computation uses special nodes such as the initial node and the dummy goal node to organize the fixed-point calculation. These nodes participate in the computation, but they should not automatically be read as ordinary planning-task landmarks in the final report. The useful output is the facts, operators, or disjunctive action sets that correspond to necessary conditions in the original task.

A common mistake is to confuse “appears in the graph computation” with “is an interesting landmark.” Initial facts are landmarks in the formal sense because they hold at time $0$ , but they usually do not give a helpful lower bound. Likewise, the dummy goal operator is an analysis device, not an action from the original domain.

Minimum Hitting Set

Suppose we have a collection of disjunctive action landmarks

\mathcal{L} = \{L_1,\ldots,L_k\}.

Every valid plan must choose at least one action from each $L_i$ . If we select a set of actions that intersects every landmark set, we have a hitting set.

Definition: Minimum hitting set heuristic

A set $H \subseteq O$ is a hitting set for $\mathcal{L}$ if $H \cap L_i \ne \emptyset$ for all $i$ . The minimum hitting set heuristic is
$h^{\text{MHS}}(s) = \min \left\{ \sum_{o \in H} c(o) \;\middle|\; H \text{ is a hitting set for } \mathcal{L} \right\}.$

This is admissible because every real plan contains actions that hit all landmark sets. The cheapest hitting set can only be cheaper than, or equal to, the action set used by a real plan.

The computational issue is that minimum hitting set is NP-hard. In planning heuristics it is often expressed as an integer program, then approximated by dropping integrality and solving the LP relaxation. The relaxed optimum is still a lower bound, so it remains admissible.

LM-Cut

LM-Cut is a landmark heuristic that repeatedly extracts disjunctive action landmarks from the relaxed task graph. It is important because it is admissible, polynomial-time, and usually much stronger than the simple max heuristic $h^{\max}$ .

The intuition is graph-theoretic. In the relaxed graph, the initial facts are the source side and the dummy goal is the target side. A cut separates them. Any real relaxed path from the initial facts to the goal must cross the cut, so the actions on the cut form a disjunctive action landmark.

The algorithm follows this rhythm:

compute relaxed reachability information under the current operator costs;
find a cut that separates reachable zero-cost structure from the goal;
record the cut actions as a disjunctive action landmark;
add the minimum cut-action cost to the heuristic value;
subtract that amount from the costs of the cut actions;
repeat until the relaxed goal has zero remaining cost.

The cost subtraction is what prevents double-counting. Once a piece of operator cost has been charged to one cut, that same cost cannot be charged again later.

Theorem: LM-Cut

The LM-Cut heuristic is admissible and computable in polynomial time. It dominates $h^{\max}$ .

When reading LM-Cut, focus on two guarantees. First, each cut is a disjunctive action landmark. Second, the cost reductions partition operator costs across the cuts that are found. Together these make the accumulated value a valid lower bound.

Linear and Integer Programming

Landmark heuristics often lead to optimization problems. The minimum hitting set formulation is the simplest example.

A linear program minimizes or maximizes a linear objective subject to linear constraints. A standard minimization form is

\min c^\top x \quad\text{subject to}\quad Ax \le b,\; x \ge 0.

A integer program additionally requires some or all variables to be integers. Integer programs can encode discrete choices directly but are NP-hard in general. If we drop the integrality requirement, we get an LP relaxation. For minimization encodings of plan cost, the LP relaxation can only make the optimum smaller, so it gives an admissible lower bound.

This is the main reason LPs appear repeatedly in planning heuristics. They provide a practical way to combine many necessary conditions while preserving admissibility.

Cost Partitioning

If we have several admissible heuristics, we cannot always add them directly. Two heuristics may both charge the same operator cost, and summing them would double-count.

Cost partitioning solves this by giving each heuristic its own share of the operator costs.

Definition: Cost partitioning

Given heuristics $h_1,\ldots,h_n$ and original operator cost function $c$ , a cost partitioning is a family of non-negative cost functions $c_1,\ldots,c_n$ such that
$\sum_{i=1}^{n} c_i(o) \le c(o) \quad\text{for every operator } o.$
The combined heuristic is
$h^{\text{CP}}(s) = \sum_{i=1}^{n} h_i^{c_i}(s),$
where $h_i^{c_i}$ means that $h_i$ is evaluated using the cost function $c_i$ .

The inequality is the whole safety argument. Across all heuristics, the total charged cost of any operator is never more than its real cost. Therefore the sum remains admissible.

Optimal cost partitioning chooses the split that maximizes the combined value:

h^{\text{OCP}}(s) = \max_{c_1,\ldots,c_n} \sum_{i=1}^{n} h_i^{c_i}(s) \quad \text{subject to} \quad \sum_i c_i(o) \le c(o).

When the component heuristics have suitable LP encodings, this can itself be solved by linear programming.

Saturated Cost Partitioning

Optimal cost partitioning can be too expensive when many heuristics are combined. Saturated cost partitioning is a greedy alternative.

Choose an order $h_1,\ldots,h_n$ . Then process the heuristics one by one. For each heuristic, assign it as much of the remaining operator cost as it can use to preserve its current estimate. The unused cost is passed on to later heuristics.

The method is fast and admissible because it still obeys the cost-partitioning inequality. It is not generally optimal, and the order matters. Different orders can produce different estimates, so planners may try several orders and take the maximum of the resulting admissible values.

Network Flow Constraints

Landmarks are not the only source of constraints. Network flow constraints reason about how often facts must be produced and consumed.

For a fact $f$ , a valid plan must balance the number of times $f$ is made true with the number of times it is needed and then undone, adjusted for whether $f$ is initially true or required in the goal. In simplified form, this idea is:

\text{productions of } f - \text{consumptions of } f = \begin{cases} 1 & \text{if } f \text{ is needed in the goal and not initially supplied},\\ 0 & \text{otherwise, after the appropriate boundary adjustment.} \end{cases}

Each production or consumption count can be written as a linear expression over action-count variables. These equations become part of an LP whose objective minimizes total action cost. The optimum is a lower bound because every real plan’s action counts satisfy the balance equations.

This prepares the ground for the next chapter, where operator counting makes this style of reasoning explicit and general.

Chapter Summary

Constraint-based heuristics derive lower bounds from necessary conditions on plans. Fact landmarks must become true, action landmarks must be used, and disjunctive action landmarks require at least one action from a set. RTG landmark computation extracts sound causal landmarks from the delete-relaxed task. Minimum hitting set combines disjunctive landmarks, while LM-Cut repeatedly extracts cut landmarks and partitions their costs. Linear programming provides a way to optimize under these constraints, and cost partitioning safely combines multiple heuristics by ensuring that no operator’s cost is charged more than once.

Check Your Understanding

Why are initial facts technically landmarks, and why are they often not useful heuristically?
What is the difference between an action landmark and a disjunctive action landmark?
Why are landmarks found in $\Pi^+$ safe to use for the original task $\Pi$ ?
How does a hitting set turn disjunctive action landmarks into a lower bound?
What role does cost reduction play in LM-Cut?
Why can cost-partitioned heuristics be added without losing admissibility?
How do network flow constraints differ from landmark constraints?