AI Engineer and Entrepreneur

AI: Planning

AI is the study of finding appropriate actions for an agent, so planning is in some sense the core of AI.

Problem solving is search over a state space: given a state space and a problem description, it finds a path to a goal. These approaches work well in a variety of environments, but only when the environment is deterministic and fully observable. In this approach all planning is done ahead of time, before execution.

If we relax those constraints and put an agent in the real world, it won’t be able to complete the task from an up-front plan alone, because it needs feedback from the world as it goes (the hiker example).

The blue hiker could see only one foot in front of him. The yellow hiker could also see only one foot in front of him, but he could see the shadows of the trees as well.

So we cannot come up with the whole plan ahead of time and then simply execute it; we need to get feedback from the environment, planning and executing iteratively.

Why?

Because of properties of environment:

Stochastic – traffic light example

Multiagent – we know the results of other agents’ actions only at execution time (not during planning)

Partial observability - with the plan [A,S,F,B], we cannot predict at the start whether the road is closed at S until we get to state S.

We can also have difficulties because of a lack of knowledge on our own part.

Unknown – some model of the world is unknown (for example, the map in the GPS is incomplete)

Hierarchical - a plan like [A,S,F,B] is very high level, and the agent cannot execute it directly. The agent can execute only low-level actions (press the pedal, turn the steering wheel). When we start, we have only the high-level plan.

Most of these problems can be addressed by changing perspective: instead of planning in the space of world states, we plan in the space of belief states.

To understand that, let’s look at the vacuum world, which has 8 possible states.

Suppose the robot’s sensors break down. This is an unobservable environment. How does the agent now represent the state of the world? It could be in any one of the 8 states. All we can do is assume that the agent is somewhere inside the world, which doesn’t seem very helpful: all we know is that we don’t know anything at all. But the point is that we can search in the space of belief states rather than in the space of actual world states.

So we believe that we are in 1 of these 8 states, and when we execute an action we get to another belief state. This is the belief state space for the sensorless vacuum problem. We start in the top middle box. We don’t know anything about where we are, but the amazing thing is that if we execute actions we can gain knowledge about the world even without sensing. Say we move right: then we know we’re in the right-hand location. We end up in the top right box, and now we know more about the world. We are down to 4 possibilities rather than 8, even though we haven’t observed anything. Note something interesting: in the real world, going left and going right are inverses of each other, but in the belief state world they are not. If we go right and then go left, we don’t end up back where we were, in a state of total uncertainty; rather, going left takes us to the top left box, where we again have 4 states instead of 8.
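The belief-state update described above can be sketched in a few lines of Python. This is only an illustration: the state encoding `(location, dirt_A, dirt_B)`, the location names `"A"` (left) and `"B"` (right), and the helper names `step` and `predict` are assumptions made here, not part of the original notes.

```python
from itertools import product

# Assumed encoding: a world state is (location, dirt_in_A, dirt_in_B),
# where "A" is the left location and "B" is the right one.
ALL_STATES = frozenset(product("AB", (True, False), (True, False)))

def step(state, action):
    """Deterministic transition model for a single world state."""
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return ("A", dirt_a, dirt_b)
    if action == "Right":
        return ("B", dirt_a, dirt_b)
    if action == "Suck":
        # Sucking cleans only the current location.
        return (loc, False, dirt_b) if loc == "A" else (loc, dirt_a, False)
    raise ValueError(f"unknown action: {action}")

def predict(belief, action):
    """Apply the action to every world state the agent might be in."""
    return frozenset(step(s, action) for s in belief)

after_right = predict(ALL_STATES, "Right")
after_right_left = predict(after_right, "Left")

print(len(ALL_STATES), len(after_right), len(after_right_left))  # 8 4 4
print(after_right_left == ALL_STATES)  # False: Left does not undo Right
```

Because a belief state is just a set of world states, moving right collapses the two possible locations into one, and moving left afterwards cannot bring the lost uncertainty back.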

Note that it is possible to form a plan that reaches a goal without ever observing the world. Plans like that are called conformant plans. For example, if the goal is to be in a clean location, all we have to do is suck. So we go from the top middle box to the middle box with 4 states. We don’t know where the agent is, but we know that we have achieved the goal.
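One way to look for a conformant plan is an ordinary breadth-first search whose nodes are belief states instead of world states. The sketch below assumes the same hypothetical encoding `(location, dirt_A, dirt_B)`; the function names and the second goal ("everything clean") are illustrative additions, not taken from the notes.

```python
from collections import deque
from itertools import product

ACTIONS = ("Left", "Right", "Suck")
# Assumed encoding: (location, dirt_in_A, dirt_in_B), "A" = left, "B" = right.
ALL_STATES = frozenset(product("AB", (True, False), (True, False)))

def step(state, action):
    loc, dirt_a, dirt_b = state
    if action == "Left":
        return ("A", dirt_a, dirt_b)
    if action == "Right":
        return ("B", dirt_a, dirt_b)
    # "Suck" cleans only the current location.
    return (loc, False, dirt_b) if loc == "A" else (loc, dirt_a, False)

def predict(belief, action):
    return frozenset(step(s, action) for s in belief)

def conformant_plan(start, goal_test):
    """Breadth-first search over belief states; returns a shortest plan."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        belief, plan = frontier.popleft()
        # The goal must hold in every world state we might be in.
        if all(goal_test(s) for s in belief):
            return plan
        for action in ACTIONS:
            nxt = predict(belief, action)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None

def current_location_clean(state):
    loc, dirt_a, dirt_b = state
    return not (dirt_a if loc == "A" else dirt_b)

def everything_clean(state):
    return not state[1] and not state[2]

print(conformant_plan(ALL_STATES, current_location_clean))  # ['Suck']
print(len(conformant_plan(ALL_STATES, everything_clean)))   # 4
```

The first search reproduces the one-step plan from the text. The second shows that a stricter goal still has a sensorless solution, only a longer one: the agent first has to move so that it knows where it is.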

We’ve been considering sensorless planning in a deterministic world. Now let’s look at a partially observable but still deterministic world. Suppose we have what’s called local sensing: our vacuum can see what location it is in and what’s going on in the current location, that is, whether there is dirt there, but it cannot see whether there is dirt in the other location. Here is a partial diagram of the belief state space for that world, and I want to show how the belief state unfolds as 2 things happen. First we take an action: we start in this state and take the action of going right, and in this case we still go from 2 world states in our belief state to 2 new ones. Then, after the action, we make an observation (this is the act-percept cycle), and once we get the observation we can split our belief state. We can say: “if we observe that we’re in location B and it’s dirty, then we know we are in the top left state, which happens to have exactly 1 world state in it, and if we observe that it’s clean, then we know that we are in the bottom left state, which also happens to have exactly one world state”.

What does the act-observe cycle do to the sizes of the belief states? In a deterministic world, each individual world state within a belief state maps into exactly one other world state. That means the size of the belief state will either stay the same, or decrease if the action happens to map two world states onto the same one.

The observation works in kind of the opposite way: we take our belief state and partition it into pieces. An observation by itself cannot introduce a new state, so it cannot make us more confused than we were before the observation.
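The act-observe cycle with local sensing can be sketched as a prediction step followed by a partition of the belief state by percept. The starting belief below (agent in A, A dirty, B unknown) and all function names are assumptions chosen to mirror the description; the original diagram is not reproduced here.

```python
from collections import defaultdict

# Assumed encoding: (location, dirt_in_A, dirt_in_B), "A" = left, "B" = right.

def move_right(state):
    loc, dirt_a, dirt_b = state
    return ("B", dirt_a, dirt_b)

def percept(state):
    """Local sensing: the agent sees its location and the dirt there only."""
    loc, dirt_a, dirt_b = state
    return (loc, dirt_a if loc == "A" else dirt_b)

def predict(belief, action):
    """Deterministic action: the belief state can only keep its size or shrink."""
    return frozenset(action(s) for s in belief)

def observe(belief):
    """Partition the belief state by percept; no new world states appear."""
    parts = defaultdict(set)
    for s in belief:
        parts[percept(s)].add(s)
    return {p: frozenset(states) for p, states in parts.items()}

belief = frozenset({("A", True, True), ("A", True, False)})
predicted = predict(belief, move_right)
branches = observe(predicted)

print(len(predicted))  # 2
for p, states in sorted(branches.items()):
    print(p, sorted(states))
```

Each branch corresponds to one possible percept, and the pieces together contain exactly the world states of the predicted belief state, which is why observing never adds uncertainty.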

In a stochastic, partially observable environment, actions tend to increase uncertainty, and observations tend to bring that uncertainty back down.

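Instead of writing a plain sequence such as [S,R,S], we use a tree structure with conditionals and loops, for example [S, while A: R, S]. A plan with such a loop has no bounded length; it is guaranteed to reach the goal only in the limit, as the number of steps goes to infinity.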
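A plan such as [S, while A: R, S] can be read as a small program. The sketch below assumes a stochastic world in which moving right sometimes fails and leaves the agent in A; that failure model, the `slips` iterator used to make the run reproducible, and the function name `run_plan` are all illustrative assumptions.

```python
def run_plan(loc, dirt, slips):
    """Execute [Suck, while in A: Right, Suck] and count the actions taken.

    loc   -- starting location, "A" (left) or "B" (right)
    dirt  -- dict such as {"A": True, "B": True}
    slips -- iterator of booleans; True means the Right action failed
    """
    dirt = dict(dirt)
    steps = 0

    dirt[loc] = False          # S: clean the current location
    steps += 1

    while loc == "A":          # while A: keep trying to move right
        if not next(slips):
            loc = "B"
        steps += 1

    dirt[loc] = False          # S: clean the location we ended up in
    steps += 1
    return loc, dirt, steps

# Right fails twice, then succeeds.
print(run_plan("A", {"A": True, "B": True}, iter([True, True, False])))
# ('B', {'A': False, 'B': False}, 5)
```

The number of steps depends on how often the move fails, so no fixed-length sequence such as [S,R,S] can promise success, while the looping plan succeeds whenever the move eventually works.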

Situation Calculus

Situation calculus is regular first-order logic (FOL) with a set of conventions for how to represent states and actions.

    • Actions are represented as objects in FOL, normally by functions (a function term such as Fly(p, x, y) denotes an object)
    • Situations are also objects in the logic, and they correspond not to states but to paths (the sequences of actions that we have in state space search). If we arrive at the same world state by two different sequences of actions, those are considered two different situations in situation calculus. We describe situations by objects, so we usually have an initial situation, S0. We also have a function on situations called Result: Result(s, a) = s’ says that the result of doing action a in situation s is another situation s’.
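To make the difference between a situation and a world state concrete, here is a minimal sketch that represents a situation as the tuple of actions taken since S0. The tuple representation and the helper `location_in` are assumptions made for illustration only; situation calculus itself is a logical formalism, not a data structure.

```python
# A situation is modelled as the history of actions since the initial situation.
S0 = ()

def result(situation, action):
    """Result(s, a): the situation reached by doing action a in situation s."""
    return situation + (action,)

def location_in(situation, start="A"):
    """Hypothetical helper: the world location reached by following the path."""
    loc = start
    for action in situation:
        if action == "Right":
            loc = "B"
        elif action == "Left":
            loc = "A"
    return loc

s1 = result(result(S0, "Right"), "Left")
s2 = result(S0, "Left")

print(location_in(s1) == location_in(s2))  # True: same world state
print(s1 == s2)                            # False: different situations
```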

Instead of describing the actions that are applicable in a situation with a function Actions(s), we talk about the actions that are possible in a situation, and we do it with the predicate Poss(a, s).

In general, such an axiom has the form: some precondition on situation s implies that it is possible to do action a in s. Here is an example for the Fly action. If p is a plane in situation s, x is an airport in s, y is also an airport in s, and p is at location x in s, then it is possible to fly p from x to y in s.

This is known as the possibility axiom for the action Fly.
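Written out as a formula, the possibility axiom described above might look as follows. The predicate names Plane, Airport and At are assumed spellings of the conditions stated in the text, and the free variables are taken to be implicitly universally quantified.

```latex
\[
\mathit{Plane}(p, s) \land \mathit{Airport}(x, s) \land \mathit{Airport}(y, s)
\land \mathit{At}(p, x, s)
\;\Rightarrow\; \mathit{Poss}(\mathit{Fly}(p, x, y), s)
\]
```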

Successor-state axiom – a fluent is true after an action if and only if either it wasn’t true before and action a made it true, or it was true before and a didn’t stop it from being true.

Example: suppose it is possible to execute a in situation s. Then the In predicate holds between some cargo c and some plane p in the situation that results from executing a in s if and only if either a was a Load action, or the cargo was already in the plane in situation s and a is not an Unload action.
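The same example can be written as a formula. This is a sketch of the usual form of a successor-state axiom; the third argument x of Load and Unload (the airport where loading happens) is an assumption, since the text does not spell out the arguments of those actions.

```latex
\[
\forall a, s \;\; \mathit{Poss}(a, s) \Rightarrow
\Big[ \mathit{In}(c, p, \mathit{Result}(s, a)) \Leftrightarrow
\big( a = \mathit{Load}(c, p, x) \big) \lor
\big( \mathit{In}(c, p, s) \land a \neq \mathit{Unload}(c, p, x) \big) \Big]
\]
```

The first disjunct covers the case where the action made the fluent true, and the second covers the case where it was already true and the action did not undo it.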


