Paper Read: Case-Based Reasoning- Foundational Issues, Methodological Variations, and System Approaches

Agnar Aamodt, Enric Plaza. 1994. Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches. In AI COMMUNICATIONS.

The paper presents

  • Overview of the field in terms of its underlying foundations, its current state-of the art, future trends.
  • The description of CBR principles, methods and systems in analytical scheme.

Definitions

  • Case Based Reasoning (CBR):
    • CBR is a problem solving paradigm that instead of relying on general knowledge about the problem domain, CBR is able to utilize cases.
    • A new problem is solved by finding a similar past case, and reusing it in the new problem situation.
    • CBR is an approach to incremental learning, since new experience is retained each time a problem has been solved, making it available for future problems.
  • Case:
    • Case is a specific knowledge of previously experienced concrete problem situation.
    • In CBR terms, it denotes a problem situation.
    • A previously experienced situation, which has been captured and learned in a way that it can be reused in solving future problems.
    • Previous experience is known as retained case, solved case, past case.
    • New problem to be solved is called new case, or unsolved case.

Case-based problem solving

  • Reasoning by re-using past cases is powerful and frequently applied way to solve problems for humans.
  • Several studies have given evidence for the dominating role of specific, previously experienced situations in human problem solving.
  • It has been shown that human uses past-cases as models when learning to solve problems, in early learning.
  • Other results indicate that the use of past-cases isa predominant problem solving method among any domain experts as well.
  • Problem solving is not necessarily the finding of a concrete solution to an application problem.
  • The problem can be put by the user, such as criticize user solutions, generate set of solutions, interpret a problem situation, generate expectations in observable data.

Learning in Case-based reasoning

  • CBR is coupled to learning, it is regarded as a subfield of machine learning.
  • CBR denotes a machine learning paradigm that enables sustained and incremental learning by updating the case-base after a problem has been solved.
  • Learning in CBR occurs as a natural by product of problem solving.
  • When an attempt to solve a problem fails, the reason for the failure is identified and remembered in order to avoid the same mistake in the future.
  • CBR favor learning from experience, because it is easier to learn by retaining a concrete problem solving experience, than generalizing from it.
  • Effective learning in CBR requires a well worked methods in order to:
    • Extract relevant knowledge from the experience.
    • Integrate a case into an existing knowledge structure.
    • Index the case for later matching with a similar case.

Combining cases with other knowledge

  • Human learning and problem solving in general are processes that involve the representation and utilization of several types of knowledge, and the combination of several reasoning methods.

Fundamentals of case-based reasoning methods

  • Central tasks to all CBR methods:
    • Identify the current problem situation.
    • Find a past case similar to the new one.
    • Use the case to suggest a solution to the current problem.
    • Evaluate the proposed solution.
    • Update the system by learning from this experience.

Main types of CBR methods

  • The CBR paradigm covers a rage of different methods for organizing, retrieving, and indexing the knowledge retained in the past cases.
  • Exemplar-based reasoning:
    • The term is derived from a concept view (conceptual modeling) called “exemplar view”
    • Exemplar is a specific instance of a certain category, which is used to represent the category.
    • A concept is defined extensionally, as the set of its exemplars.
    • CBR methods that learn concepts are called exemplar-based.
    • Solving a problem is a classification problem by finding the right class for the unclassified exemplar.
    • The class of the most similar past case becomes the solution for the classification problem.
  • Instance-based reasoning:
    • A specialization of the exemplar-based reasoning into a highly syntactic CBR-approach.
    • To overcome the lack of guidance from general background knowledge, a large number of instances are used to go close to the concept definition.
  • Memory-based reasoning:
    • This approach emphasizes a collection of cases as a large memory, and reasoning as a process of accessing and searching this memory.
    • The utilization of parallel processing techniques is a characteristic of this method.
  • Case-based reasoning:
    • Case-based reasoning is sometimes used as a generic term to such systems. A typical case-based reasoning methods have some characteristics that distinguish them from other methods.
      • A typical case is usually assumed to have a certain degree of richness of information contained in it.
      • A certain complexity with respect to its internal organization.
      • Case-based methods are able to modify, or adapt a retrieved solution when applied in different problem solving context.
  • Analogy-based reasoning:
    • The analogy-based reasoning is sometimes used as a synonym to typical case-based reasoning.
    • The term can be used to characterize methods that solve new problems based on past cases from a different domain. While typical case-based methods focus on using past cases from the same domain.
    • Research on analogy reasoning is therefore a subfield concerned with mechanisms for identification and utilization of cross-domain analogies.
    • The major focus of study has been on the reuse of past, that is the mapping of a solution from an identified analogy (source) to the present problem (target).

A descriptive framework

Description of the CBR methods and system has two main parts:

  • A process model of the CBR cycle
    • A dynamic model that identifies the main sub-process of a CBR cycle, their interdependency and products
  • A task-method structure for CBR
    • A task-oriented view, where a task decomposition and related problem solving methods are described.

The CBR Cycle

  1. Retrieve the most similar case or cases.
  2. Reuse the information and knowledge in that case to solve the problem.
  3. Revise the proposed solution.
  4. Retain the parts of this experience likely to be useful for future problem solving.

 

  • An initial description of a problem defines a new case.
  • This new case is used to retrieve a case from the previous cases.
  • The new case and the retrieved case are combined in a solved case through reuse which forms an initial solution to the problem.
  • Through revise the solution tested for success by being tested in the real environment, and repaired if failed.
  • During retain the useful experience is retained for future use, and the case base is updated with the new learned case, or a modification of the existing cases.
  • General domain-dependant knowledge supports the CBR process, this support vary from very weak (may be none) to very strong depending on the CBR method used.

A hierarchy of CBR tasks

  • This is a task-oriented view to the CBR model which gives a description for the detailed mechanisms from the CBR perspective, while the process model gives a global, external view to what is happening.
  • A system is viewed as agent which has goals, and means to achieve this goals.
  • A description to the system can be made from 3 perspectives: tasks, methods, and domain knowledge model.
  • Tasks are set up by the system goals, and a task is performed by applying one or more methods.
  • For a method to accomplish a task, it needs a general knowledge about the application domain as well as information about the current problem and its context.
  • A method is an algorithm that identifies and controls the execution of subtasks, it accesses and utilize the information needed to accomplish this.

CBR Problem Areas

  • In AI, there are no universal CBR methods for all application domains.
  • The challenge in CBR is to come up with methods that are suited for problem solving and learning in a subject domain (e.g Medical), and for a particular application environments (e.g Heart failure diagnosis).
  • Core problems address by CBR research can be grouped into 5 areas, a solution to them constitute a CBR method:
    • Knowledge representation.
    • Retrieval methods.
    • Reuse methods.
    • Revise methods.
    • Retain methods.

Representation of Cases

  • A case-based reasoned is heavily dependent on the structure and content of its collection of cases, refereed as case memory.
  • Problem solving is achieved by recalling a previous experience suitable for solving the current new problem, so the case matching and retaining process should be effective and efficient enough.
  • Problems in case representation:
    • Deciding what to store in a case.
    • Finding an appropriate structure for describing the case content.
    • Deciding how the case memory (case-base) should be organized and indexed for fast retrieval and reuse.
    • How to integrate the case memory into a model of general domain knowledge.

The Dynamic Memory Model

  • In dynamic memory model, the case memory is organized in hierarchical structure of ‘episodic memory organization packets’ also known as generalized episodes.
  • The basic idea is to organize specific cases which share similar properties under a more general structure (generalized episodes).
  • A generalized episodes (GE) contains 3 different objects:
    • Norms: features common to all cases indexed under the GE.
    • Indices: features which discriminate between GE`s cases.
    • Cases
  • An index may point to a more specific GE or to a case.

 

  • Case retrieval
    • When a new case description is given, the input case structure is pushed down the network structure starting from the root.
    • When one or more features of a case match the norms of a GE, the case is further discriminated based on the remaining features.
    • The case with the most common features with the input case is used.
    • As overall:
      • Case retrieval is done by finding the GE with the norms in common with problem description.
      • Indices under the GE are then traversed in order to find the case which contains most of the additional problem features not found in the GE norms.
  • Case retaining
    • It is the same as retrieval in the search method.
    • During storing a new case, when a feature of the new case matches the feature of an existing case, a GE is created and the 2 cases are discriminated under the created GE with different indices.
    • So if during case storage two cases or GEs end under the same index, a new GE is created for them.
    • As overall:
      • Storing a new case is done as the case retrieval, with the additional process of dynamically creating GEs.
  • So the memory structure is dynamic in the sense that similar features of two cases are generalized in a GE, and the cases are indexed under the GE by their different features.
  • The index structure may lead to an explosive growth of indices with increased number of cases.
  • The primary role of a generalized episode is an indexing structure for matching and retrieval of cases.
  • This knowledge organization structure is suitable for learning general and specific knowledge, and it is a simplified model of human reasoning and learning.

The Category and Exemplar Model

  • Cases are referred to as exemplars.
  • The view of this method is that the ‘real world’ natural concept should be defined extensionally.
  • Different features are assigned different importance in describing a case membership to a category.
  • The case-memory is structured as a network of categories, cases, and index pointers.
  • Each case is associated with a category, and each case is stored according to its prototypicality strength to its category.
  • Features are key-value pair.
  • Indices are of three kinds:
    • Feature links: pointing from a feature to cases or categories.
    • Difference links: pointing from a case to its neighbor that differs in only a few numbers of features.
    • Case links: pointing from a category to its associated case (called exemplar links).

 

  • In this organization, the categories are linked within a semantic network, which also contains features and intermediate states.
  • Case retrieval
    • Finding a case in the case base that matches an input description is done by combining the input features of the problem case into a pointer to a case or category that shares most of the features.
    • If a pointer points to a category, the links to its strongest prototypical exemplar are traversed, and these cases are returned.
    • General domain knowledge is used to match cases semantically.
  • Case retaining
    • A new case is stored by searching for a matching case.
    • If a case is found with only minor differences to the input case, the new case may not be retained, or the two cases may be merged.

Case Retrieval

  • The retrieval task starts with a problem description, and ends when a best matching previous case has been found.
  • Subtasks of case retrieval:
    • Identify Features: it is about coming up with a set of relevant problem descriptors.
    • Initially match: return a set of cases that are sufficiently similar to the new cases, given a threshold for case similarity.
    • Search
    • Select: works on the initial set of cases to find out the best match.
  • Some case-based reasoners retrieve a previous case based on superficial(apparent), syntactical similarities among problem descriptors(features). Others retrieve a previous cases based on features that have deeper, semantical similarities.
  • Semantical similarities:
    • In order to retrieve a case based on semantic similarities and relative importance of features, an extensive body of general domain knowledge is required to say how 2 cases match and how strong is the match.
    • It is referred as “knowledge extensive” approach, and it is able to use the contextual meaning for the problem description in matching.
    • For domains where general domain knowledge is available.
  • Syntactical similarities:
    • It is referred as “knowledge poor” approach.
    • For domains where general domain knowledge is very difficult or impossible.
  • It is important to know the purpose of the retrieval task, If the purpose is to be adapted for reuse then this can be accounted for in the retrieval method.
  • Identify Feature
    • Can be done by simply noticing the input features, but a more elaborate way is to understand the problem within its context.
    • Unknown features may be disregarded or requested for explanation by the user.
    • Understanding a problem involves:
      • Filtering noisy problem features (ignore ineffective features).
      • Infer relevant problem features (values of effective features).
      • Check whether the feature values make sense within the problem context.
      • Expect feature values not provided as input.
    • Inferring the values of not provided features can be done by using general knowledge model, or by retrieving a similar case from the case-base and use its features as the expected features.
      • Example on such feature expectation: previous case-1 has the value of feature A, B and C. I have inferred values A and B for the new case, and if the new case and case-1 match in A, and B then I can infer that the value of C in the new case is as of case-1.
  • Initially Match
    • The task of finding a good match is about finding a set of good candidate cases, then from this set of cases, find the best matching one (Select task).
    • The idea of matching is to use the input features as indices to the case-memory in a direct or indirect way. (there should be a mapping function).
    • The 3 principles of indexing:
      • Follow a direct index pointer from the input features to the case-base. (direct mapping).
      • Searching an index structure.
      • Search in a model of general domain knowledge. (search is guided by the underlying model of the domain).
    • Some systems combine between the 3 principles of indexing to achieve a certain goal.
    • Cases may be retrieved directly from input features, or from inferred features from the input.
    • Cases with the most matching features are a good candidates, but sometimes cases matching a fraction of features may be retrieved depending on the retrieval strategy.
    • If case is retrieved on the basis of a subset of features, a relevance test should be done. This test is about finding whether the retrieved solution is of the type as the expected solution for the new problem.
    • Similarity assessment can be more knowledge-intensive by understand the problem in its context more deeply and use the goals, constraints, etc to guide the matching.
    • Another option is to weight the features according to their importance in categorizing the problem.
  • Select
    • The best matching is done by evaluating the degree of initial match more closely.
    • This is done by attempt to generate explanations to justify non-identical features based on the knowledge in the semantic network.
    • If a match is not strong enough, the difference link between cases is followed to reach the best matching case.
    • The selection typical process is to generate consequences and expectations from each retrieved case, and attempts to evaluate consequences and justify expectations.
    • This generation, evaluation and justification of consequences and expectation are based on model of general domain knowledge, or by asking the user.
    • Rankings are made for the cases using some metric or ranking criteria.
    • Knowledge-intensive selection methods generate explanations that support this ranking process.
    • The case with the strongest explanation for being similar to the new case is selected as a solution to the new problem.
    • Other properties in the case that are considered in some CBR systems:
      • Prototypicality of case with its assigned class.
      • Difference links to related cases.
      • Relative importance and discriminately strengths of features.

Case Reuse

  • The retrieval of a previous case solution to be used in the new case context focus on 2 aspects:
    • What parts of the retrieved case should be transferred to the new case.
    • The difference among the past and the current case.
  • Copy
    • In a simple classification task, the differences are abstracted away (ignored), and the similarities are considered relevant.
    • The solution class of the retrieved case is transferred to the new case as its solution.
    • Other systems adapt the retrieved case to the new case context by taking into account the differences, and adaptation of the solution to account for those differences.
  • Adapt
    • There are 2 ways to reuse past cases:
      • Transformational reuse
        • Reuse the past case solution.
        • It doesn’t look how a problem was solved, it focus on the equivalence of solutions.
        • The past case solution is not directly a solution, there exist some knowledge in the form of transformational operator {T} that when applied to the past case solution transform it into a solution for the new case.
        • This requires a strong domain-dependant model in the form of the {T} operator.
      • Derivational reuse
        • Reuse the past methods that constructed the solution.
        • It looks how the problem was solved in the retrieved case.
        • The retrieved method holds information about the methods used for solving the retrieved problem including operators used, sub goals considered, alternatives generated, failed search path, etc…
        • Derivational reuse re-instantiate the retrieved methods to the new case and replays the old plan in the new case context.
        • During replay, the successful alternatives, paths and operators will be explored first, while unsuccessful ones will be avoided.

Case Revision

  • When a case solution generated by the reuse phase is not correct, an opportunity for learning from failure arise.
  • The revision task is divided in 2 subtasks:
    • Evaluate the case solution generated by reuse.
    • If success, then learn and retain it (a separate phase called Retain).
    • If failure, then repair the case solution using domain-specific knowledge.
  • Evaluate Solution
    • Takes the result from applying the reused case in the real environment.
    • Applying a case is a task outside the CBR, this can be done by asking a teacher or by performing it directly in the real world.
    • Results of applying the suggested solution may take time depending on the type of application:
      • In Medical CBR, in diagnosis, a case result may appear after hours, or may be months, so the case must be marked as unevaluated, so that it can be evaluated later when its results are out.
      • A solution may be applied to a model of the domain, using simulation program.
  • Repair Fault
    • Involves detecting the errors of the current solution and retrieving or generating explanation for them.
    • Failure detection in CHEF system:
      • Casual knowledge is used to generate explanations of why goals of the solution plan where not achieved.
      • Learns the general situations that will cause the failures using explanation-based system.
      • These learnt general situations are used during the reuse phase, to predict the shortcomings of using a certain solution plan.
      • This allows for early detection of errors so that they can be predicted, avoided and handled.
    • Solution repair task:
      • uses the explanations to modify the solution in such a way that failures don’t occur.
    • Solution repair in CHEF system:
      • Failed plan is corrected by a repair module, that adds steps to the plan that will assure that the causes of the errors will not occur.
      • The repair module possesses domain-knowledge about how to compensate or disable the cause of errors in the domain.
    • The revised plan can be retained directly if approved to be successful by the revise module or it can be repaired and revised again.

Case Retainment – Learning

  • It is the process of incorporating what is useful to retain from the new problem solving episode in the existing knowledge.
  • The learning from success or failure from the proposed solution is triggered by the result of the evaluation and possible repairs.
  • Involves selecting which information from the case to retain, in what form to retain it, how to index the case for later retrieval for similar problems, and how to integrate the new case in the memory structure.
  • Extract
    • In CBR the case base is updated no matter how the problem was solved.
    • If it was solved by the use of a previous case, then a new case may be built, or the used case for solution can be generalized to include the new case.
    • If the problem was solved without using a previous case, by asking the user may be, then an entirely new case should be created.
    • It is important to know what to use as the source of learning?
      • Relevant problem descriptors (features) and problem solutions are good candidates.
      • Sometimes it is required to include in the new case the explanation of why a solution is a solution for a problem.
      • In some systems:
        • Explanations are included in the retained case, and they are used in later modification of the solution.
        • The explanation structure is used to search up the model for other states that explains the input data of the new case, and look for the causes of this states as answer to the new problem.
        • This focuses the search and speeds up the explanation process, compared to searching in the entire domain model.
      • The last thing that can be extracted for learning is the problem solving method.
    • Failures recognized from the revise task, can be extracted and retained, either as a separate failure case, or within total problem cases.
    • When a failure is encountered, the system gets a reminding to a similar failure case, and uses it to improve its understanding of the failure causes.
  • Index
    • It is a central and much focused problem in case-based reasoning.
    • It is about deciding what type of indices to use for future retrieval, and how to structure the search space of indices.
    • Direct indices skips the structuring of the indices search space, however, the type of indices used should be identified.
    • A trivial solution is to use all the input features as indices.
    • In memory-based reasoning, all the cases are matched in parallel on the relevant features, the cases with the few features in common are filtered out.
  • Integrate
    • It is the final step of updating the knowledge base with new case knowledge.
    • Modifying the indexing of the current cases improves the CBR system, and allows it to become a better similarity assessor.
    • Index strengths or importance relevant to the case are adjusted due to the success or failure of using the case to solve the input problem.
    • Features that are judged as relevant to the case their association with the case is strengthened, while weaken of those that lead to unsuccessful cases.
    • In PATDEX system:
      • There is a relevance matrix that links features to the diagnosis (problem solutions) for which they are relevant, this matrix represents the weights of such links, these weights are strengthened or weakened based on the feedback of success or failure.
    • In knowledge intensive-CBR approaches:
      • Learning can take place within the general conceptual knowledge model.
      • The system incrementally refines and extends its general domain knowledge model.
    • The case just learnt can be tested by re-entering initial problem to the system, and finds whether the system will behave as expected.

Integrated approaches

  • Most CBR systems use general domain knowledge in addition to the knowledge represented by cases.
  • The overall architecture of CBR system has to decide interactions between the CBR method and the other components.
  • CASEY system:
    • It integrates model-based casual reasoning in the diagnosis, when the case-based method fail to find a correct solution, the mode-based method is executed to find a solution, the solution is then stored as a new case for future use. Because the model-based method is complex and slow, the case-based method represents a speed up in reasoning when it can.
  • BOLERO system:
    • It integrates a rule-based system (base-level) with a case-based planner(meta-level).
    • The base-level contains domain knowledge, while the meta-level contains strategic knowledge.
    • The case-based planner is used to control the space searched by the base-level achieving a speed up.
    • The control is passed to the meta-level whenever a new information is know by the base-level assuming that the system will able to deduce a better strategic plan based on the new information.

Example applications and tools

  • Application
    • The CBR used because the knowledge needed to perform the task resides in the head of few people.
    • There is no theory, and very few generally applicable schemes for doing this job.
    • Building up experience in the form of previously successful and unsuccessful situations is important.


Advertisements

Problem Solving Techniques part2

In part1 we have introduced the problem formulation and how we judge on a certain search strategy, This article covers 5 uniformed search strategies in which all we know about the problem is its definition and no additional information about its states beyond those told in the problem definition.

Uniformed search strategies are very systematic, that all they can do expand nodes using the successor function and check whether state is a goal state or not.

Search strategies that we will talk about are:

  1. Breadth-first search
  2. Uniform-cost search
  3. Depth-first search and Depth-limited search
  4. Iterative deepening depth-first search
  5. Bidirectional search

Breadth-first search (BFS)

  • Description
    • A simple strategy in which the root is expanded first then all the root successors are expanded next, then their successors.
    • We visit the search tree level by level that all nodes are expanded at a given depth before any nodes at the next level are expanded.
    • Order in which nodes are expanded.

300px-Breadth-first-tree.svg

  • Performance Measure:
    • Completeness:
      • it is easy to see that breadth-first search is complete that it visit all levels given that d factor is finite, so in some d it will find a solution.
    • Optimality:
      • breadth-first search is not optimal until all actions have the same cost.
    • Space complexity and Time complexity:
      • Consider a state space where each node as a branching factor b, the root of the tree generates b nodes, each of which generates b nodes yielding b2 each of these generates b3 and so on.
      • In the worst case, suppose that our solution is at depth d, and we expand all nodes but the last node at level d, then the total number of generated nodes is: b + b2 + b3 + b4 + bd+1 – b = O(bd+1), which is the time complexity of BFS.
      • As all the nodes must retain in memory while we expand our search, then the space complexity is like the time complexity plus the root node = O(bd+1).
  • Conclusion:
    • We see that space complexity is the biggest problem for BFS than its exponential execution time.
    • Time complexity is still a major problem, to convince your-self look at the table below.

image


Uniform-cost search (UCS)

  • Description:
    • Uniform-cost is guided by path cost rather than path length like in BFS, the algorithms starts by expanding the root, then expanding the node with the lowest cost from the root, the search continues in this manner for all nodes.
    • Hints about UCS implementation can be found here.
    • You should not be surprised that Dijkstra’s algorithm, which is perhaps better-known, can be regarded as a variant of uniform-cost search, where there is no goal state and processing continues until the shortest path to all nodes has been determined.
  • Performance Measure:
    • Completeness:
      • It is obvious that UCS is complete if the cost of each step exceeds some small positive integer, this to prevent infinite loops.
    •  Optimality:
      • UCS is always optimal in the sense that the node that it always expands is the node with the least path cost.
    • Time Complexity:
      • UCS is guided by path cost rather than path length so it is hard to determine its complexity in terms of b and d, so if we consider C to be the cost of the optimal solution, and every action costs at least e, then the algorithm worst case is O(bC/e).
    • Space Complexity:
      • The space complexity is O(bC/e) as the time complexity of UCS.
  • Conclusion:
    • UCS can be used instead of BFS in case that path cost is not equal and is guaranteed to be greater than a small positive value e.

Depth-first search (DFS)

  • Description:
    • DFS progresses by expanding the first child node of the search tree that appears and thus going deeper and deeper until a goal node is found, or until it hits a node that has no children. Then the search backtracks, returning to the most recent node it hasn’t finished exploring.
    • Order in which nodes are expanded

 300px-Depth-first-tree.svg

  • Performance Measure:
    • Completeness: 
      • DFS is not complete, to convince yourself consider that our search start expanding the left sub tree of the root for so long path (may be infinite) when different choice near the root could lead to a solution, now suppose that the left sub tree of the root has no solution, and it is unbounded, then the search will continue going deep infinitely, in this case we say that DFS is not complete.
    • Optimality:
      • Consider the scenario that there is more than one goal node, and our search decided to first expand the left sub tree of the root where there is a solution at a very deep level of this left sub tree, in the same time the right sub tree of the root has a solution near the root, here comes the non-optimality of DFS that it is not guaranteed that the first goal to find is the optimal one, so we conclude that DFS is not optimal.
    • Time Complexity:
      • Consider a state space that is identical to that of BFS, with branching factor b, and we start the search from the root.
      • In the worst case that goal will be in the shallowest level in the search tree resulting in generating all tree nodes which are O(bm).
    • Space Complexity:
      • Unlike BFS, our DFS has a very modest memory requirements, it needs to story only the path from the root to the leaf node, beside the siblings of each node on the path, remember that BFS needs to store all the explored nodes in memory.
      • DFS removes a node from memory once all of its descendants have been expanded.
      • With branching factor b and maximum depth m, DFS requires storage of only bm + 1 nodes which are O(bm) compared to the O(bd+1) of the BFS.
  • Conclusion:
      • DFS may suffer from non-termination when the length of a path in the search tree is infinite, so we perform DFS to a limited depth which is called Depth-limited Search.

Depth-limited search (DLS)

  • Description:
    • The unbounded tree problem appeared in DFS can be fixed by imposing a limit on the depth that DFS can reach, this limit we will call depth limit l, this solves the infinite path problem.
  • Performance Measure:
    • Completeness:
      • The limited path introduces another problem which is the case when we choose l < d, in which is our DLS will never reach a goal, in this case we can say that DLS is not complete.
    • Optimality:
      • One can view DFS as a special case of the depth DLS, that DFS is DLS with l = infinity.
      • DLS is not optimal even if l > d.
    • Time Complexity: O(bl)
    • Space Complexity: O(bl)
  • Conclusion:
    • DLS can be used when the there is a prior knowledge to the problem, which is always not the case, Typically, we will not know the depth of the shallowest goal of a problem unless we solved this problem before.

Iterative deepening depth-first search (IDS)

  • Description:
    • It is a search strategy resulting when you combine BFS and DFS, thus combining the advantages of each strategy, taking the completeness and optimality of BFS and the modest memory requirements of DFS.
    • IDS works by looking for the best search depth d, thus starting with depth limit 0 and make a BFS and if the search failed it increase the depth limit by 1 and try a BFS again with depth 1 and so on – first d = 0, then 1 then 2 and so on – until a depth d is reached where a goal is found.
  • Performance Measure:
    • Completeness:
      • IDS is like BFS, is complete when the branching factor b is finite.
    • Optimality:
      • IDS is also like BFS optimal when the steps are of the same cost.
    • Time Complexity:
      • One may find that it is wasteful to generate nodes multiple times, but actually it is not that costly compared to BFS, that is because most of the generated nodes are always in the deepest level reached, consider that we are searching a binary tree and our depth limit reached 4, the nodes generated in last level = 24 = 16, the nodes generated in all nodes before last level = 20 + 21 + 22 + 23= 15
      • Imagine this scenario, we are performing IDS and the depth limit reached depth d, now if you remember the way IDS expands nodes, you can see that nodes at depth d are generated once, nodes at depth d-1 are generated 2 times, nodes at depth d-2 are generated 3 times and so on, until you reach depth 1 which is generated d times, we can view the total number of generated nodes in the worst case as:
        • N(IDS) = (b)d + (d – 1)b2 + (d – 2)b3 + …. + (2)bd-1 + (1)bd = O(bd)
      • If this search were to be done with BFS, the total number of generated nodes in the worst case will be like:
        • N(BFS) = b + b2 + b3 + b4 + …. bd + (bd + 1 – b) = O(bd + 1)
      • If we consider a realistic numbers, and use b = 10 and d = 5, then number of generated nodes in BFS and IDS will be like
        • N(IDS) = 50 + 400 + 3000 + 20000 + 100000 = 123450
        • N(BFS) = 10 + 100 + 1000 + 10000 + 100000 + 999990 = 1111100
        • BFS generates like 9 time nodes to those generated with IDS.
    • Space Complexity:
      • IDS is like DFS in its space complexity, taking O(bd) of memory.
  • Conclusion:
    • We can conclude that IDS is a hybrid search strategy between BFS and DFS inheriting their advantages.
    • IDS is faster than BFS and DFS.
    • It is said that “IDS is the preferred uniformed search method when there is a large search space and the depth of the solution is not known”.

Bidirectional search

  • Description:
    • As the name suggests, bidirectional search suggests to run 2 simultaneous searches, one from the initial state and the other from the goal state, those 2 searches stop when they meet each other at some point in the middle of the graph.
    • The following pictures illustrates a bidirectional search:

image

  • Performance Measure:
    • Completeness:
      • Bidirectional search is complete when we use BFS in both searches, the search that starts from the initial state and the other from the goal state.
    • Optimality:
      • Like the completeness, bidirectional search is optimal when BFS is used and paths are of a uniform cost – all steps of the same cost.
      • Other search strategies can be used like DFS, but this will sacrifice the optimality and completeness, any other combination than BFS may lead to a sacrifice in optimality or completeness or may be both of them.
    • Time and Space Complexity:
      • May be the most attractive thing in bidirectional search is its performance, because both searches will run the same amount of time meeting in the middle of the graph, thus each search expands O(bd/2) node, in total both searches expand O(bd/2 + bd/2) node which is too far better than the O(bd + 1) of BFS.
      • If a problem with b = 10, has a solution at depth d = 6, and each direction runs with BFS, then at the worst case they meet at depth d = 3, yielding 22200 nodes compared with 11111100 for a standard BFS.
      • We can say that the time and space complexity of bidirectional search is O(bd/2).
  • Conclusion:
    • Bidirectional search seems attractive for its O(bd/2) performance, but things are not that easy, especially the implementation part.
    • It is not that easy to formulate a problem such that each state can be reversed, that is going from the head to the tail is like going from the tail to the head.
    • It should be efficient to compute the predecessor of any state so that we can run the search from the goal.

Search Strategies’ Comparison:

Here is a table that compares the performance measures of each search strategy.

image


Wiki Links:

    1. Breadth-first search
    2. Uniform-cost search
    3. Depth-first search
    4. Depth-limited search
    5. Iterative deepening depth-first search
    6. Bidirectional search

References:

Artificial Intelligence: A Modern Approach (2nd Edition) by Stuart Russell and Peter Norvig, Chapter 3: Solving problems by searching.

Problem Solving Techniques part1

We know that the simple reflex agent is one of the simplest agents in design and implementation, it is based on a kind of tables that map a certain action to a corresponding state the environment will be in, this kind of mapping could not be applicable in large and complex environments in which storing the mapping and learning it could consume too much (e.i: Automated taxi driver environment),  Such environments may be a better place for goal-based agents to arise, that is because such agents consider future actions and their expected outcome.

This article is about giving a brief about a kind of goal-based agent called a problem-solving agent.

We start here by defining precisely the elements of a problem and its solution, we also have some examples that illustrate more how to formulate a problem, next article will be about a general purpose algorithms that can be used to solve these problems, this algorithms are search algorithms categorized as the following:

  • Uniformed search (Blind search): when all we know about a problem is its definition.
  • Informed search (Heuristic search): beside the problem definition, we know that a certain action will make us more close to our goal than other action.


image

In this section we will use a map as an example, if you take fast look you can deduce that each node represents a city, and the cost to travel from a city to another is denoted by the number over the edge connecting the nodes of those 2 cities.

In order for an agent to solve a problem it should pass by 2 phases of formulation:

  • Goal Formulation:
    • Problem solving is about having a goal we want to reach, (e.i: I want to travel from ‘A’ to ‘E’).
    • Goals have the advantage of limiting the objectives the agent is trying to achieve.
    • We can say that goal is a set of environment states in which our goal is satisfied.
  • Problem Formulation:
    • A problem formulation is about deciding what actions and states to consider, we will come to this point it shortly.
    • We will describe our states as “in(CITYNAME)” where CITYNAME is the name of the city in which we are currently in.

Now suppose that our agent will consider actions of the from “Travel from city A to City B”. and is standing in city ‘A’ and wants to travel to city ‘E’, which means that our current state is in(A) and we want to reach the state in(E).

There are 3 roads out of A, one toward B, one toward C and one toward D, none of these achieves the goal and will bring our agent to state in(E), given that our agent is not familiar with the geography of our alien map then it doesn`t know which road is the best to take, so our agent will pick any road in random.

Now suppose that our agent is updated with the above map in its memory, the point of a map that our agent now knows what action bring it to what city, so our agent will start to study the map and consider a hypothetical journey through the map until it reaches E from A.

Once our agent has found the sequence of cities it should pass by to reach its goal it should start following this sequence.

The process of finding such sequence is called search, a search algorithm is like a black box which takes problem as input returns a solution, and once the solution is found the sequence of actions it recommends is carried out and this is what is called the execution phase.

We now have a simple (formulate, search, execute) design for our problem solving agent, so lets find out precisely how to formulate a problem.


Formulating problems

A problem can be defined formally by 4 components:

  1. Initial State:
    • it is the state from which our agents start solving the problem {e.i: in(A)}.
  2. State Description:
    • a description of the possible actions available to the agent, it is common to describe it by means of a successor function, given state x then SUCCESSOR-FN(x) returns a set of ordered pairs <action, successor> where action is a legal action from state x and successor is the state in which we can be by applying action.
    • The initial state and the successor function together defined what is called state space which is the set of all possible states reachable from the initial state {e.i: in(A), in(B), in(C), in(D), in(E)}.
  3. Goal Test:
    • we should be able to decide whether the current state is a goal state {e.i: is the current state is in(E)?}.
  4. Path cost:
    • a function that assigns a numeric value to each path, each step we take in solving the problem should be somehow weighted, so If I travel from A to E our agent will pass by many cities, the cost to travel between two consecutive cities should have some cost measure, {e.i: Traveling from ‘A’ to ‘B’ costs 20 km or it can be typed as c(A, 20, B)}.

A solution to a problem is path from the initial state to a goal state, and solution quality is measured by the path cost, and the optimal solution has the lowest path cost among all possible solutions.


Example Problems

image

Vacuum world

  1. Initial state:
    • Our vacuum can be in any state of the 8 states shown in the picture.
  2. State description:
    • Successor function generates legal states resulting from applying the three actions {Left, Right, and Suck}.
    • The states space is shown in the picture, there are 8 world states.
  3. Goal test:
    • Checks whether all squares are clean.
  4. Path cost:
    • Each step costs 1, so the path cost is the sum of steps in the path.

image

8-puzzle

  1. Initial state:
    • Our board can be in any state resulting from making it in any configuration.
  2. State description:
    • Successor function generates legal states resulting from applying the three actions {move blank Up, Down, Left, or Right}.
    • State description specifies the location of each of the eight titles and the blank.
  3. Goal test:
    • Checks whether the states matches the goal configured in the goal state shown in the picture.
  4. Path cost:
    • Each step costs 1, so the path cost is the sum of steps in the path.


Searching

After formulating our problem we are ready to solve it, this can be done by searching through the state space for a solution, this search will be applied on a search tree or generally a graph that is generated using the initial state and the successor function.

Searching is applied to a search tree which is generated through state expansion, that is applying the successor function to the current state, note that here we mean by state a node in the search tree.

Generally, search is about selecting an option and putting the others aside for later in case the first option does not lead to a solution, The choice of which option to expand first is determined by the search strategy used.

image

The structure of a node in the search tree can be as follows:

  1. State: the state in the state space to which this state corresponds
  2. Parent-Node: the node in the search graph that generated this node.
  3. Action: the action that was applied to the parent to generate this node.
  4. Path-Cost: the cost of the path from the initial state to this node.
  5. Depth: the number of steps along the path from the initial state.

It is important to make a distinction between nodes and states, A node in the search tree is a data structure holds a certain state and some info used to represent the search tree, where state corresponds to a world configuration, that is more than one node can hold the same state, this can happened if 2 different paths lead to the same state.


Measuring problem-solving performance

Search as a black box will result in an output that is either failure or a solution, We will evaluate a search algorithm`s performance in four ways:

  1. Completeness: is it guaranteed that our algorithm always finds a solution when there is one ?
  2. Optimality: Does our algorithm always find the optimal solution ?
  3. Time complexity: How much time our search algorithm takes to find a solution ?
  4. Space complexity: How much memory required to run the search algorithm?

Time and Space in complexity analysis are measured with respect to the number of nodes the problem graph has in terms of asymptotic notations.

In AI, complexity is expressed by three factors b, d and m:

  1. b the branching factor is the maximum number of successors of any node.
  2. d the depth of the deepest goal.
  3. m the maximum length of any path in the state space.

The next article will be about uniformed search strategies and will end with a comparison that compares the performance of each search strategy.


References:

%d bloggers like this: