
A study on n-puzzle solving search algorithms*

Kadir Firat Uyanik


Electrical and Electronics Engineering Department,
Middle East Technical University
kadir@ceng.metu.edu.tr

Abstract— Classical AI search algorithms have been tested on various problems, such as 8-queens, the traveling salesman problem, automatic assembly sequencing, and even robot navigation. In this study, several search algorithms are compared on the n-puzzle solving problem in terms of their computational complexity and memory usage for different board configuration scenarios (e.g. board size, tile placement, etc.).

I. INTRODUCTION

It is an intriguing problem to design a software agent that can find a reasonable sequence of actions to reach a particular goal state from an initial state. In general, search is the process of examining different possible sequences of actions that lead to a desired state, and of choosing the best action sequence to execute when a similar query is made in the future.
Almost all search algorithms possess the following elements and functionalities: an initial state, a goal state, a successor function, a goal test, and a path cost. These are the common attributes and functionalities shared by the algorithms tested in this study, namely depth-first search (memoizing and iterative-deepening variants), breadth-first search, and A* search (with the Manhattan distance, Euclidean distance, and misplaced-tiles heuristics).
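These shared elements can be pictured as a common interface, roughly as sketched below. The class and method names here are my own assumptions, since the study's actual search class is only described in Fig. 2.

#include <vector>

// Illustrative interface capturing the elements shared by all of the
// tested algorithms: initial state, goal test, successor function,
// and path cost.
template <typename State>
class SearchProblem {
public:
    virtual ~SearchProblem() = default;
    virtual State initialState() const = 0;
    virtual bool  goalTest(const State& s) const = 0;
    virtual std::vector<State> successors(const State& s) const = 0;
    virtual int   stepCost(const State& from, const State& to) const = 0;
};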
II. EXPERIMENTAL SETUP
In order to represent the state of the puzzle, a generic Node data structure is utilized (Fig. 1). As in the Node data structure, object-oriented programming principles were applied in the design of the search algorithms as well (Fig. 2). The whole software setup consists of more than 3000 lines of C++ code and around 10 classes. It was very important to consider several performance issues during the implementation, such as const-correctness and de-allocation.
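As an illustration, a minimal sketch of such a templated node is given below. The member and parameter names are my own assumptions; the report only states that nodes are templated and, in this problem, hold a Board composed of Tile structures (Fig. 1).

// Illustrative sketch of a templated search node (the actual layout
// of the report's Node class is not shown). Any state type, here a
// Board, can be stored inside the node.
template <typename State>
struct Node {
    State state;     // e.g. a Board composed of Tile structures
    Node* parent;    // backpointer used to reconstruct the solution path
    int   pathCost;  // g(n): actual cost accumulated from the initial state
    int   depth;     // depth of this node in the search tree

    explicit Node(const State& s, Node* p = nullptr, int cost = 0, int d = 0)
        : state(s), parent(p), pathCost(cost), depth(d) {}
};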
Since nodes are allocated dynamically from the heap, it is very important to release the memory acquired for a particular search operation back to the system.
Const-correctness is one of the most important good practices when coding in C++. If a variable is passed by value to a function, a new copy of that variable is created inside the function's local scope; passing the variable by reference avoids this copy. However, one should be very careful when objects or data are passed by reference, since the value stored at that memory location can easily be changed. To avoid this, the const keyword can be added so that the compiler understands that the value of that variable will not be changed in that particular scope, and it emits a compile-time error whenever the designer tries to change it.
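A small illustration of the idea (the type and function names are hypothetical, not taken from the report's codebase):

#include <vector>

struct Board { std::vector<int> tiles; };

// Pass by value: a full copy of the Board is created in the local scope.
int countTilesByValue(Board board) {
    return static_cast<int>(board.tiles.size());
}

// Pass by const reference: no copy is made, and any attempt to modify
// the Board inside the function is rejected at compile time.
int countTiles(const Board& board) {
    // board.tiles.clear();  // would not compile: board is const
    return static_cast<int>(board.tiles.size());
}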


Fig. 1: Nodes are the structures that the search algorithms make use of. Since nodes are templated structures, any kind of data can be stored inside them. In this problem, nodes store Boards, which are themselves composed of Tile structures.

Fig. 2: All of the search algorithms share similar functionalities, which are parts of the search class.

*This study is part of the EE586 Artificial Intelligence course offered by Dr. Afsar Saranli.

III. EXPERIMENTS
Several tools were used to measure memory usage, such as valgrind with massif, and the Linux system clock, which is precise at the nanosecond scale. In order to show how the different search algorithms make use of the heap and stack regions of memory across time, several experiments were conducted on the board configuration given in the homework sheet, that is:

              3 4 6                1 2 3
    initial = 1 0 8         goal = 4 5 6
              7 2 5                7 8 0
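The report does not show its timing code, but nanosecond-resolution wall-clock measurements on Linux follow a pattern like the sketch below; the search call is a placeholder.

#include <cstdio>
#include <ctime>

// Wall-clock timing with nanosecond resolution. CLOCK_MONOTONIC is
// used because it is not affected by system clock adjustments.
int main() {
    timespec start, stop;
    clock_gettime(CLOCK_MONOTONIC, &start);

    // ... run the search algorithm under test here ...

    clock_gettime(CLOCK_MONOTONIC, &stop);
    double elapsed = (stop.tv_sec - start.tv_sec)
                   + (stop.tv_nsec - start.tv_nsec) / 1e9;
    std::printf("elapsed: %.9f s\n", elapsed);
    return 0;
}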
In general, an algorithm expands a node according to some selection criteria and decides what to do next by evaluating this node. The way nodes are evaluated is the most critical part. For instance, the depth-first search (DFS) algorithm tries to expand nodes so that the depth of the search tree increases at each iteration, whereas breadth-first search (BFS) tries to expand through all the successors of a particular node.
Since depth-first search goes deep down the solution tree, if a duplicate-state check is not done, DFS will most likely get stuck in an infinite loop and never find the goal. To avoid this problem, all expanded nodes are stored in an expanded list, and each candidate node is first checked against this list (though the very first check is whether it is the goal). A minimal sketch of this bookkeeping is shown below.
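The sketch assumes board states are comparable and kept in a std::set; the report does not name its actual container choice.

#include <set>
#include <vector>

using Board = std::vector<int>;  // flat tile layout, 0 denotes the blank

// A candidate state is expanded only if it has not been seen before;
// this breaks the cycles that would otherwise trap DFS.
bool tryExpand(const Board& candidate, std::set<Board>& expanded) {
    if (expanded.count(candidate) > 0)
        return false;            // duplicate state: skip it
    expanded.insert(candidate);  // remember it for future checks
    return true;                 // caller may now generate successors
}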
Since BFS expands the nodes that are closest to the starting node, it is guaranteed to come up with one of the optimal solutions (there might be several goals at the same depth). However, DFS is not an optimal algorithm in most cases. In order to make DFS optimal, it is modified by limiting the maximum search depth at each iteration, which is the so-called iterative-deepening depth-first search (IDDFS) algorithm. IDDFS works similarly to both DFS and BFS, and it finds one of the optimal solutions, like BFS. Another modification is to expand the whole graph, save all solution alternatives, and pick the shortest one after expanding the whole search tree, which is what memoizing DFS does.
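The iterative-deepening idea can be sketched as follows: a depth-limited DFS is restarted with an increasing limit, so the shallowest solution depth is the first one at which the goal is found. The helper functions below are illustrative 8-puzzle implementations, not the report's own code (duplicate checking is omitted for brevity).

#include <utility>
#include <vector>

using Board = std::vector<int>;  // flat 3x3 layout, 0 denotes the blank
const int N = 3;

bool isGoal(const Board& b) {
    for (int i = 0; i < N * N - 1; ++i)
        if (b[i] != i + 1) return false;
    return b[N * N - 1] == 0;    // goal: tiles 1..8, then the blank
}

// Successors: slide one of the (up to four) neighboring tiles into
// the blank cell.
std::vector<Board> successors(const Board& b) {
    int blank = 0;
    while (b[blank] != 0) ++blank;
    const int row = blank / N, col = blank % N;
    const int dr[] = {-1, 1, 0, 0}, dc[] = {0, 0, -1, 1};
    std::vector<Board> next;
    for (int k = 0; k < 4; ++k) {
        const int r = row + dr[k], c = col + dc[k];
        if (r < 0 || r >= N || c < 0 || c >= N) continue;
        Board nb = b;
        std::swap(nb[blank], nb[r * N + c]);  // slide the tile
        next.push_back(nb);
    }
    return next;
}

// Depth-limited DFS: true if the goal is reachable within `limit` moves.
bool depthLimitedDfs(const Board& state, int limit) {
    if (isGoal(state)) return true;
    if (limit == 0)    return false;
    for (const Board& next : successors(state))
        if (depthLimitedDfs(next, limit - 1))
            return true;
    return false;
}

// IDDFS: the first limit at which the goal is found is the optimal depth.
int iddfs(const Board& start, int maxDepth) {
    for (int limit = 0; limit <= maxDepth; ++limit)
        if (depthLimitedDfs(start, limit))
            return limit;
    return -1;  // not found within maxDepth
}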
The search algorithms mentioned above consider only the actual path cost paid until reaching a particular node. They expand nodes without considering how much closer they are getting to the goal state. Informed search methods were developed to take this into account. One of the most famous informed search algorithms is A* (pronounced A-star). It takes into account not only the actual distance of a node from the initial state but also an estimate of the cost remaining until the goal state is reached. If this estimate, i.e. the heuristic, is admissible (it never overestimates the distance between a specific node and the goal node), the algorithm will find the optimal path. In this study, the Manhattan distance, Euclidean distance, and number-of-misplaced-tiles heuristics are used.
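In A*, nodes are ordered by f(n) = g(n) + h(n), where g is the path cost so far and h is the heuristic estimate. A minimal sketch of the Manhattan-distance and misplaced-tiles heuristics for the goal layout used here (tiles 1..n*n-1 followed by the blank) is given below; the names are illustrative, not the report's own.

#include <cstdlib>
#include <vector>

using Board = std::vector<int>;  // flat n*n layout, 0 denotes the blank

// Manhattan distance: for every tile, the sum of its horizontal and
// vertical offsets from its goal cell. The blank is skipped, which
// keeps the heuristic admissible (it never overestimates).
int manhattan(const Board& b, int n) {
    int dist = 0;
    for (int i = 0; i < n * n; ++i) {
        if (b[i] == 0) continue;
        const int goalIndex = b[i] - 1;          // tile t belongs at t-1
        dist += std::abs(i / n - goalIndex / n)  // row offset
              + std::abs(i % n - goalIndex % n); // column offset
    }
    return dist;
}

// Misplaced tiles: the number of tiles that are not in their goal cell.
int misplaced(const Board& b, int n) {
    int count = 0;
    for (int i = 0; i < n * n; ++i)
        if (b[i] != 0 && b[i] != i + 1) ++count;
    return count;
}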
Although the Euclidean and Manhattan distance heuristics work in almost the same way, they both outperform the number-of-misplaced-tiles heuristic (A* w/ misplaced). This is due to the fact that the more informative a heuristic is, the better the algorithm will perform and the more reasonable decisions it will make. It is therefore expected that A* w/ manhattan will perform better than A* w/ misplaced.

Memory usage and computation time of the search algorithms are given in the following figures. Please notice that when valgrind is used to measure the memory usage (with a configuration of at most 100 snapshots at 10 Hz resolution), the execution time increases to approximately 50 times the actual time. However, the actual execution times are also given in the figures, in addition to the number of nodes expanded and the number of nodes on the optimal path, which is found by the BFS or A* search algorithms, both known to be optimal.
IV. DISCUSSION

In this study, several Monte Carlo simulations were conducted. Randomly generated board configurations (with an a priori known true distance to the goal) were tested on all of the algorithms. The optimal path length is shown in Fig. 9, the average time a solution takes in Fig. 10, and the total number of expanded nodes (= moves) in Fig. 11.

100 experiments were conducted for each board configuration (true distance in the range 2-14). The results show that the complexity of the informed algorithms (A* w/ manhattan and A* w/ misplaced) grows almost linearly, whereas the uninformed search algorithms blow up exponentially due to the branching factor of the problem. In the n-puzzle the average branching factor is around 3; uninformed search algorithms would be rather useless for problems with a larger branching factor, such as chess playing.

To sum up, A* w/ manhattan is the best algorithm in terms of execution time, memory usage, and the shortest-path-length criterion. To see how it performs for different board sizes, it was tested on 3x3, 5x5, and 7x7 boards with an initial state 19 steps away from the goal state. The results show that A* is not affected much by the board size, since the most critical factor is how far a particular state is from the goal state. According to the simulation results, A* expands only about 300-400 nodes to reach a goal state that is 19 steps away from the initial state. It still finds the optimal solution, and the execution time is less than 15 seconds. To be more specific, the execution times for the 3x3, 5x5, and 7x7 board configurations are 1.17 s, 5.2 s, and 15 s, whereas the average numbers of opened nodes are 298.8, 312.95, and 330.34, respectively.
Fig. 3: BFS algorithm. Actual solution time: 0.981 s, #nodes expanded: 2233, #nodes expanded on the optimal path: 12.

Fig. 4: IDDFS algorithm. Actual solution time: 0.257 s, #nodes expanded: 318, #nodes expanded on the optimal path: 12.

Fig. 5: DFS algorithm. Actual solution time: 6457.16 s, #nodes expanded: 168788, #nodes expanded on the optimal path: 9, but the solution is found at the 67562nd level. Please notice that the execution was killed in the valgrind case, because it takes about two hours even when valgrind is not activated; we can therefore estimate that this execution would take approximately 3 or 4 days.

Fig. 6: A* algorithm with Manhattan distance heuristic. Actual solution time: 0.016 s, #nodes expanded: 31, #nodes expanded on the optimal path: 12.

Fig. 7: A* algorithm with Euclidean distance heuristic. Actual solution time: 0.018 s, #nodes expanded: 30, #nodes expanded on the optimal path: 12.

Fig. 8: A* algorithm with total number of misplaced tiles heuristic. Actual solution time: 0.0298 s, #nodes expanded: 113, #nodes expanded on the optimal path: 12.
Fig. 9: All of the algorithms except DFS find the optimal path. DFS is not included in these experiments, since it would take days to simulate DFS even for a problem of true distance 4.

Fig. 10: Uninformed algorithms require much more time to solve problems with larger depth values. It is worth noting that IDDFS requires much more time than the BFS algorithm. Although they should theoretically perform similarly, restarting the search from the very beginning (in the case of IDDFS) requires releasing memory and allocating it again, which is a serious overhead. This problem can be overcome by storing nodes in a hash table, or by keeping track of already-allocated nodes to avoid the de-allocation and re-allocation procedure.

Fig. 11: The difference between IDDFS and BFS is not as conceivable as it is in Fig. 10. It is expected to shrink as the number of simulations is increased.

V. CONCLUSIONS AND FUTURE WORK

In this study, I have investigated several search algorithms in the n-puzzle problem domain. DFS turned out to be the worst and the most unreliable among the algorithms, whereas the A* algorithm with the Manhattan distance heuristic performed best with respect to the memory used and the execution time required. The algorithms were implemented in C++ on a Linux box. The memory usage vs. execution time figures were obtained with the open-source Valgrind tool and its massif extension. The graphs were drawn with Octave's plot utilities. All of the source code can be obtained from my personal code repository.

As future work, the GUI implementation that I have started with the Qt framework can be finalized, and the search algorithms can be optimized by carefully investigating heap usage, especially when doing Monte Carlo simulations. Another important point is that the search algorithms, especially depth-first and breadth-first, can easily be parallelized: the search programs can be designed so that each subtree of the search tree is solved in a different thread. This is best done by creating thousands of threads on a graphics processing unit. During this study, I had the chance to investigate several parallelization methods implemented with CUDA on NVidia graphics cards.
