Search Algorithms
Preface: these notes are primarily based on portions of Chapter 6 of Ertel's text.
Motivation
We wish to consider planning algorithms, starting with discrete,
deterministic, fully-observable environments. Classic examples are
one-player puzzles such as the "8-puzzle" (sliding tiles),
"Rush Hour" style block-sliding games, Rubik's Cube, and
so on.
Search Space Definitions
- State
- A complete description of the domain world at an instant in time.
- Search Space
- The set of all possible states.
- Start State
- The initial state in which the search agent begins.
- Goal State
- One of possibly many states that represent a
successful completion if reached.
- Action
- A transition that leads from one state to another.
- Cost
- Each action has an associated cost value. (If not otherwise
stated, we assume a uniform cost metric with value 1 for each action.)
- Solution
- An ordered sequence of states (and transitional actions) leading from the initial
state to a goal state.
As noted before, we begin by considering the following restrictions:
- Discrete
- States are countably enumerable (that is, there are either
finitely many, or a countably infinite collection that can be
enumerated computationally). (i.e., no tight-rope walking)
- Deterministic
- There is a unique state s' that results from applying
action a from state s. (i.e., no rolling dice)
- Fully-Observable
- The search agent is fully aware of its current state. (i.e., no
hidden cards, nor being dropped into an unknown maze).
Search Tree
The search space is commonly modeled (implicitly or explicitly) as a
search tree, with states associated with nodes, and actions
associated with directed edges between nodes. The search tree is
rooted at the initial state, with edges directed away from the root. A solution can be
represented as a path in the tree from root to a goal state.
For a specific state s, we let Successors(s) be the
set of states that can be reached by a direct action from s.
A node associated with s will have children in the search
tree associated with each state of Successors(s).
Branching factor, b(s), of a node s is the number of
successor states. In some models, there is a constant branching factor,
b, for all states. In most models, the branching factor
varies. But we can define the effective branching factor for a
tree with n nodes and depth d as the value b that (approximately)
satisfies 1 + b + b^2 + ... + b^d = n; roughly, b ≈ n^(1/d).
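As a concrete illustration (the tuple encoding and the function name below are my own choices, not from the text), here is a short Python sketch of Successors(s) for the 8-puzzle, where a state is a tuple of nine entries with 0 marking the blank; its branching factor b(s) varies from 2 (blank in a corner) to 4 (blank in the center).

    def successors(state):
        """Return the 8-puzzle states reachable from `state` in one move.

        A state is a tuple of 9 entries (row-major 3x3 board), with 0 as the blank.
        """
        result = []
        blank = state.index(0)                  # position of the blank
        row, col = divmod(blank, 3)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            r, c = row + dr, col + dc
            if 0 <= r < 3 and 0 <= c < 3:
                other = 3 * r + c               # slide that tile into the blank
                board = list(state)
                board[blank], board[other] = board[other], board[blank]
                result.append(tuple(board))
        return result

    print(len(successors((0, 1, 2, 3, 4, 5, 6, 7, 8))))   # blank in a corner: 2
    print(len(successors((1, 2, 3, 4, 0, 5, 6, 7, 8))))   # blank in the center: 4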
Uninformed Search
We will consider the following strategies:
-
Breadth-First Search
Start from root, expanding level-by-level.
Advantages:
- finds optimal solution when costs are uniform
Disadvantage:
- needs O(b^d) memory for the frontier
(although that space can also be used to cache visited states and do
graph search rather than tree search)
-
Uniform-Cost Search
With a nonuniform cost function, rather than performing straight
BFS, we keep a priority queue of frontier states, and always expand
the frontier state with the least accrued cost from the initial state.
Advantages:
- Discovers states in order of cost, so finds optimal solution
Disadvantage:
-
A more complex data structure (a priority queue) is needed for managing the frontier
-
Depth-First Search
Advantages:
-
requires only O(b·d) memory usage
Disadvantages:
-
Incomplete if there are infinite paths; the first solution found
may not be optimal
-
Bidirectional BFS (if we can determine the actions leading
to a state, so that we can also search backward from the goal).
Advantages:
- two trees of height d/2 are much smaller than one
tree of height d.
Disadvantages:
- still requires O(b^(d/2)) memory for the
frontiers
-
With nonuniform costs, more care is needed, as the first collision of
the frontiers may not correspond to an optimal solution.
Additional Comments:
-
Note that the bidirectional approach is incompatible with DFS, because
detecting overlap of the frontiers requires caching the states of at
least one of the frontiers.
-
DFS with Iterative Deepening
Advantages:
- As with DFS, uses only O(b·d) memory.
- With uniform costs, the first solution found is optimal.
Disadvantages:
- Slight waste of time due to the passes prior to the successful one
(but the repeated work forms a geometric series, so it increases the total time by only a constant factor)
Additional Comments:
-
Can be adapted for nonuniform costs by placing the threshold on
accrued cost rather than on the number of steps. (A code sketch of BFS
and depth-first iterative deepening follows.)
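As a rough sketch of two of these strategies (function names and signatures are my own; the problem is supplied through is_goal and successors callbacks such as the 8-puzzle function sketched earlier):

    from collections import deque

    def breadth_first_search(start, is_goal, successors):
        """BFS with a cache of seen states (graph search).

        Returns a shortest path (a list of states) or None.  The frontier
        can grow to O(b^d) entries.
        """
        frontier = deque([[start]])
        seen = {start}
        while frontier:
            path = frontier.popleft()
            state = path[-1]
            if is_goal(state):
                return path
            for nxt in successors(state):
                if nxt not in seen:            # cache avoids re-expanding states
                    seen.add(nxt)
                    frontier.append(path + [nxt])
        return None

    def iterative_deepening_dfs(start, is_goal, successors, max_depth=50):
        """Depth-first iterative deepening: O(b*d) memory; with uniform
        costs the first solution found is optimal."""
        def depth_limited(path, limit):
            state = path[-1]
            if is_goal(state):
                return path
            if limit == 0:
                return None
            for nxt in successors(state):
                if nxt not in path:            # avoid cycles along the current path
                    found = depth_limited(path + [nxt], limit - 1)
                    if found is not None:
                        return found
            return None

        for depth in range(max_depth + 1):     # repeat DFS with a growing depth bound
            found = depth_limited([start], depth)
            if found is not None:
                return found
        return None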
Informed (Heuristic) Search
With the uninformed search model, we presume that the only things an
algorithm can do with a state are to test if it is a goal, test if it is
equivalent to a known state, or to determine the actions that can be
performed to produce successors. In particular, there is no notion of
a state being "close" to a goal. But we will have knowledge of
g(s) : the actual accrued cost in going from the initial
state to s along the given tree path
For many problems, we might be able to estimate the distance from a
state to a potential goal. In particular, we might define
h(s) : an estimate of the cost from s to the
nearest goal
(not that we can necessarily assume that the heuristic is a
good estimate!)
We can refine a priority-based search strategy as follows:
We maintain a priority queue of "frontier" states, with each state
s having some assigned priority f(s).
Initially, the start state is the only entry in the queue, having cost 0.
We remove the state from the PQ with lowest priority, and we "process"
it by finding all of its successors, and inserting each of them into
the queue with appropriate priority.
Depending on how we define the priority function f(s), we can
get any of the following algorithms (a code sketch of this generic
scheme follows the list):
-
Uniform-Cost Search
If we define the priority based on accrued cost,
f(s) = g(s),
we get this classic (uninformed) search algorithm.
Furthermore, if costs are all uniform,
then we have precisely BFS.
-
Greedy Best-first Search
If we define the priority based only on the heuristic estimate,
f(s) = h(s), we call this a greedy search.
However, even if the heuristic were a good estimate,
this disregards the different costs that have already
been incurred to reach the various intermediate states.
So the first solution found may not be the best (although we
might do better in minimizing computational time if all we are
interested in is finding some path to a goal).
-
A* Search
We choose the next node to expand based on the function
f(s) = g(s) + h(s)
We report the first goal state to be processed (i.e.,
removed from the queue, not merely inserted into the queue).
Of course the quality of the A* search algorithm depends greatly
on the properties of the heuristic function. Of particular note,
a heuristic function is admissible if
0 ≤ h(s) ≤ actual cost from s to the nearest goal.
That is, h(s) is a valid lower bound on the
true cost.
Note that we could select h(s) = 0 as a trivially
admissible heuristic, but then A* reverts to the "uniform cost"
uninformed search algorithm.
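The three algorithms above differ only in the priority function. A minimal Python sketch of the common scheme (the names are my own; the duplicate-state pruning via best_g assumes, in the A* case, a consistent heuristic):

    import heapq
    import itertools

    def best_first_search(start, is_goal, successors, cost, f):
        """Generic priority-queue search with priority f(g, s):

            f = lambda g, s: g           -> uniform-cost search
            f = lambda g, s: h(s)        -> greedy best-first search
            f = lambda g, s: g + h(s)    -> A* search

        Returns (cost, path) for the first goal removed from the queue,
        or None if the queue empties.
        """
        tie = itertools.count()          # tie-breaker so states are never compared
        frontier = [(f(0, start), next(tie), 0, [start])]
        best_g = {start: 0}              # cheapest accrued cost seen for each state
        while frontier:
            _, _, g, path = heapq.heappop(frontier)
            state = path[-1]
            if is_goal(state):
                return g, path           # report the goal when it is processed
            for nxt in successors(state):
                g2 = g + cost(state, nxt)
                if g2 < best_g.get(nxt, float("inf")):
                    best_g[nxt] = g2
                    heapq.heappush(frontier,
                                   (f(g2, nxt), next(tie), g2, path + [nxt]))
        return None

With f = lambda g, s: g and uniform step costs, states are processed level by level, recovering BFS as noted above.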
Theorem: A* is optimal with an admissible heuristic
Proof:
-
When a state s is added to the queue, it is assigned priority
f(s) = g(s) + h(s), where g(s) is the accrued cost of the tree path
by which it was discovered.
-
When a state is removed from the queue (processed), its f value is
the minimum among all states currently in the frontier.
-
When goal state s* is processed, it must be that
f(s*) = g(s*), as
h(s*) = 0 since it is a goal (an admissible heuristic must be 0 at a goal).
-
Any goal state s that has not yet been processed when
s* is removed must have an actual cost
g(s) ≥ g(s*).
(Consider a best path from the start to s, and the node p of that
path currently in the frontier: by admissibility, f(p) = g(p) + h(p) ≤ g(s),
since the remaining cost of that path is an upper bound on h(p); and since
s* was removed before p, g(s*) = f(s*) ≤ f(p) ≤ g(s).)
Hence the first goal processed is an optimal one.
Defining Admissible Heuristics
So the A* algorithm is foundational for informed search, but we need
to define an admissible heuristic. Let's look at some examples:
-
For driving directions, can use straight-line
distance as an estimate.
-
For the 8-puzzle, we can look at different measures based on tiles
that are out of place (code sketches of both measures follow the list).
-
h(s) = number of tiles that are out of place
This is admissible, since only one tile moves per step,
and so with k tiles out of place, we need at least k more steps.
-
h(s) = sum of Manhattan distances of all tiles from their
goal positions.
This is admissible, since a tile that has Manhattan
distance d will need to be moved in at least d steps, and
those moves are independent of steps in which other tiles
are moved.
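As a sketch (the tuple encoding and the particular goal configuration below are illustrative assumptions), both heuristics are straightforward to compute; note that the Manhattan-distance value is always at least as large as the misplaced-tile count, while both remain admissible.

    GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)        # illustrative goal: blank (0) first

    def misplaced_tiles(state, goal=GOAL):
        """Number of non-blank tiles not in their goal position."""
        return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

    def manhattan_distance(state, goal=GOAL):
        """Sum of Manhattan distances of non-blank tiles from their goal cells."""
        home = {tile: divmod(i, 3) for i, tile in enumerate(goal)}
        total = 0
        for i, tile in enumerate(state):
            if tile != 0:
                r, c = divmod(i, 3)
                gr, gc = home[tile]
                total += abs(r - gr) + abs(c - gc)
        return total

    example = (1, 0, 2, 3, 4, 5, 6, 7, 8)     # one slide away from GOAL
    print(misplaced_tiles(example), manhattan_distance(example))   # 1 1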
IDA* (Iterative Deepening A*)
As great as A* is, it still requires that we maintain the frontier in
a priority queue, and that memory usage can be burdensome (as well as
the additional data structure support for managing priorities).
We can adapt A* as a form of DFS, with only O(b·d) memory usage, by
placing a threshold on the value f(s) = g(s) + h(s) that we are willing
to consider, and truncating the DFS on any branch once that threshold
is exceeded. We then use iterative deepening, repeating
the search from scratch with a higher threshold.
If we have integral costs and heuristic values, and we increase our
threshold by one in each round, then the first solution found will be
optimal. We can do slightly better by setting the new threshold to be
the minimal f(s) value that was pruned in the previous round (since any
smaller increase would not allow the search to proceed any farther).
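A minimal Python sketch along these lines (the function names and the (cost, path) return convention are my own; h is assumed admissible):

    import math

    def ida_star(start, is_goal, successors, cost, h):
        """Iterative-deepening A*: cost-bounded DFS on f = g + h.

        Each round prunes any branch whose f value exceeds the current
        threshold; the next threshold is the smallest pruned f value.
        Returns (solution cost, path), or None if no goal is reachable.
        """
        def bounded_dfs(path, g, bound):
            state = path[-1]
            f = g + h(state)
            if f > bound:
                return f, None               # pruned; report f for the next threshold
            if is_goal(state):
                return g, path
            smallest = math.inf
            for nxt in successors(state):
                if nxt not in path:          # avoid cycles along the current path
                    value, found = bounded_dfs(path + [nxt],
                                               g + cost(state, nxt), bound)
                    if found is not None:
                        return value, found
                    smallest = min(smallest, value)
            return smallest, None

        bound = h(start)
        while True:
            value, found = bounded_dfs([start], 0, bound)
            if found is not None:
                return value, found
            if value == math.inf:            # nothing was pruned: no goal reachable
                return None
            bound = value                    # raise threshold to the smallest pruned f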
Michael Goldwasser
Last modified: Monday, 30 September 2013