The following section is a summary of the content presented in the tutorial.
Action Space (A)
The set of all possible actions the agent can perform (sometimes called the action set in the discrete case). An action is denoted Percept Space (P)
The set of all possible things an agent can perceiveState Space (S)
The set of all possible configurations of the world the agent is operating in (sometimes called the set of states in discrete state systems). A state is denoted World Dynamics / Transfer Function
(Percept Function
(Utility Function
(~~Action Space (A)~~
The set of all possible actions the agent can perform (sometimes called the action set in the discrete case). An action is denoted ~~Percept Space (P)~~
The set of all possible things an agent can perceive in a time step. E.g. the set of all possible 100 x 100 RGB images. In a fully observable system, Percept Space = State Space.~~State Space (S)~~
The set of all possible configurations of the world the agent is operating in (sometimes called the set of states in discrete state systems). A state is denoted ~~Percept Function~~
(~~Utility Function~~
(Initial State
and the Goal State
to fully define the Search Problem.A State Graph Representation
is a way to represent a search program concretely in a program
Another way of thinking about the problem
A State Graph is used in problems with continuous or very large state spaces to compactly represent the state space
Formally, a state graph
Vertices (V)
representing states, andEdges (E)
representing world dynamicsEach edge
It may also be labelled by the action to move from state
A solution
is a path from the initial to goal vertices in the state graph.
Cost
The sum of the cost associated with each edge of the path.
optimal solution
is the shortest (lowest cost) path through the state graph.Design a tic-tac-toe or noughts-and-crosses playing agent, using the design components listed above. Assume that a single time step includes a single move by the agent and the immediate move by the opponent. The goal is to win with as few steps as possible.
Given Assumptions
Other Assumptions
X
or O
? (This has implications on the action space)Agent Design Components
State Space
The State Space must allow for all possible states to be denoted.
Note that not all combinations that can be represented in the state space can result from valid gameplay. For example, in this representation, the grid can be filled with all X
or O
which is not possible in a game. A more compact representation (with less possibilities, more strongly enforced constraints) is possible, but this representation has good readability.
We don't need to enumerate over the entire state space (that is, determine every possible permutation of the state space - there are
Action Space
The set of all possible actions that the player can perform.
Placing the player's symbol in any of the available squares (tiles with the _
character).
Not every action is valid in every state - can't place in an already occupied tile. This means that the action space is limited by which tiles have already been populated.
World Dynamics
Upon the agent's move → When the agent performs an action
Upon the opponent's move → One of the available tiles is filled with the opponent's symbol (opponent's choice of tile depends on problem assumptions)
There is an assumption that the opponent is going to behave with perfect rationality.
Not completely deterministic as there are multiple possible squares that the opponent could choose between.
If the agent is playing against a human opponent, we need to model how the opponent will play (not deterministic, or consistent).
Complicated to describe in formal mathematical notation, describe using words instead.
Utility Function
Winning in the smallest number of moves (i.e. with the smallest number of occupied tiles placed
One possible utility function is defined as:
Note that if
The
The number 10 is intentionally chosen as the maximum number of moves is 9
Winning in the maximum number of moves yields
Any win is better than a draw
Any win with less moves is better than a win with more moves.
Percept Space
/ Percept Function
Is the environment fully observable
or partially observable
? The game is fully observable as the agent is able to observe the exact state at every time step.
Therefore, percept space = state space
percept function = identity function
Consider a navigation app, like an app on your smartphone or car that you use to find your way around UQ or other places. This program is essentially a rational agent. Assume that: 1. Its goal is to find the shortest path to a goal location 2. The map used by the agent is 100% up to date 3. The location provided by the GPS is up to datea) How will you design it? Use the design components listed earlier. b) Select the type of environment this agent operates in (i.e. discrete/continuous, deterministic/non-deterministic, fully/partially observable, static/dynamic). Explain your selections, and think of the effect of each assumption above to this type c) Define the search problem and its corresponding state graph representation for this query.
About This Question
Assumptions and Details
Other Details
What is the definition of a state? This depends on the use case of the navigation app
How precise does the navigation need to be?
Does the user always follow the directions correctly? Deviate from the path? Change their mind half way through?
Do we need to account for traffic and pedestrians?
Agent Design Components
State Space
The set of all valid street addresses / set of all landmarks / set of all (latitude, longitude) combinations
What about discretisation (resolution) of our navigation system / map / GPS?
Action Space
- Set of all actions (navigation) that can be performed by the agent
All legal driving manoeuvres
Movement and heading (with constraints for minimum and maximum value) for moving in 2d plane
World Dynamics
Position changes to the next node in the direction the agent selected (if we assume that the user always follows the instruction of the agent).
How does this change if the user doesn't necessarily follow the instruction of the agent / application entity.
Utility Function
Reach the goal state with the minimum distance travelled
Convert the cost to a utility function by multiplying by -1
Location provided by the GPS is correct (and gives enough resolution to fully determine the state of the agent) → Percept exactly reveals the agent's state, so this is a fully observable problem.
Percept Space = State Space
Percept Function = Identity Function
Environment Type
Discrete vs Continuous
Deterministic vs Non-Deterministic
Fully vs Partially Observable
Environment Type
(Static vs Dynamic)
Search Problem & State Graph Representation
Initial State: User's current location Goal State: User's desired location
Landmarks are vertices, paths between landmarks are edges
Street addresses and intersections are vertices road sections are edges, street addresses always have 1 or 2 connections, intersection may have more
Graph with grid structure, where each node has a latitude / longitude corresponding to row and column in the grid.
→ If continuous, complete state graph is impossible as there are
A web crawler is a program that systematically browses and downloads web pages from the internet. This is one of the programs that enables us to search the internet. A web crawler can be viewed as a rational agent. Please design a web crawler agent when the agent lives in a) An idea world where no broken links exist and the internet connection always works b) The real world, where both assumptions above are not valid
Agent Design Components
→ State Space
: Set of all valid web addresses / URLs
This may be hard to represent mathematically, need to build up as we go. Possible next state depends on the number of links on the current page. Impossible to enumerate over each possible sequence of characters that could for a URL.
→ Action Space
: Set of all links which can be followed (changes depending on the current state)
→ World Dynamics
: State changes to the URL of the selected link; in the non-ideal case, the link may be broken or the internet connection may be unavailable, causing the state to return to the previous webpage
→ Utility function
: Number of unique webpages visited (derived from sequence of states, and counting the distinct vertices / sites visited)
→ Percept Space / Percept Function
: Fully observable → State Space / Identity Function
A poker bot is a program that automatically plays poker on the internet. Poker bots are software agents that typically use AI techniques to attempt to beat human poker platers. Think about how to design a poker bot for the version of poker called `Texas hold 'em`, with the following rules → Every player is dealt two cards, for their eyes only → The dealer spreads five cards face up for all to see in `three stages` (i) three at once (ii) a single card (iii) another single card → Before and after the card/s in each stage are revealed, players take turns to bet → The best poker hand wins the pot (all the bets) What complications arise when a poker bot tries to play against more than one poker player?
State Space
: All combinations of:
Action Space
World Dynamics
Utility Function
Percept Space / Percept Function
Cards in the player's own hand are revealed + some of the common cards depending on the current state
Cards not dealt, cards in opponent's hand are not part of the percept.
Partially observable.
Same percept, but underlying state is different