1.0 - Theory

The following section is a summary of the content presented in the tutorial.

1.1 - Agent Design Problem Definitions

  1. Action Space (A) The set of all possible actions the agent can perform (sometimes called the action set in the discrete case). An action is denoted aA.a \in A.
  2. Percept Space (P) The set of all possible things an agent can perceive
  3. State Space (S) The set of all possible configurations of the world the agent is operating in (sometimes called the set of states in discrete state systems). A state is denoted sSs \in S.
  4. World Dynamics / Transfer Function (T:S×AS)T: S \times A\rightarrow S') A function that specifies how the world changes when the agent performs actions in it; a system model. We sometimes write T(s,a)=sT(s,a) = s'. Maps a tuple of states and actions to a single resulting state.
  5. Percept Function (Z:SP)\Z: S \rightarrow P) A function that maps a world state to a perception. For fully observable problems, the percept function is the identity function I(x)\Iota(x).
  6. Utility Function (U:SRU:S \rightarrow \R) A function that maps a state (or sequence of states) to a real number, indicating how desirable it is for an agent to occupy that state / sequence of states. We sometimes write U(s)=U(s) = some cost or reward.
  7. ~~Action Space (A)~~ The set of all possible actions the agent can perform (sometimes called the action set in the discrete case). An action is denoted aA.a \in A.
  8. ~~Percept Space (P)~~ The set of all possible things an agent can perceive in a time step. E.g. the set of all possible 100 x 100 RGB images. In a fully observable system, Percept Space = State Space.
  9. ~~State Space (S)~~ The set of all possible configurations of the world the agent is operating in (sometimes called the set of states in discrete state systems). A state is denoted sSs \in S
  10. ~~Percept Function~~ (Z:SP)\Z: S \rightarrow P) A function that maps a world state to a perception. There could be multiple states that map to the same percept (not injective, could be ambiguity). In a fully observable function, the Percept Function is the Identity Function I(x) = x
  11. ~~Utility Function~~ (U:SRU:S \rightarrow \R) A function that maps a state (or sequence of states) to a real number, indicating how desirable it is for an agent to occupy that state / sequence of states. We sometimes write U(s)=U(s) = some cost or reward.' Can be framed in terms of positive or negative case.

1.1.1 - Why the Agent Design Problem?

1.2 - Search Problem

1.3 - State Graph Representation

2.0 - Exercises

2.1 - Exercise 1.1

Design a tic-tac-toe or noughts-and-crosses playing agent, using the design components listed above. Assume that a single time step includes a single move by the agent and the immediate move by the opponent. The goal is to win with as few steps as possible.

Given Assumptions

  1. A single time step includes a single move by the agent and the immediate move by the opponent. (Since we need to be able to reframe this as a single-agent decision making problem, we represent the behaviour of the opponent in the world dynamics);.
  2. The goal is to win with as few steps as possible.

Other Assumptions

  1. Does the agent always make the first move or the second move?
  2. Does the agent play as X or O? (This has implications on the action space)
  3. How does the opponent behave? Perfectly rational opponent?
  4. Is a draw preferable to a loss? Losing in a greater number of steps preferable to losing in fewer steps.

2.2 - Exercise 1.2

Consider a navigation app, like an app on your smartphone or car that you use to find your way around UQ or other places. This program is essentially a rational agent. Assume that: 1. Its goal is to find the shortest path to a goal location 2. The map used by the agent is 100% up to date 3. The location provided by the GPS is up to date

a) How will you design it? Use the design components listed earlier. b) Select the type of environment this agent operates in (i.e. discrete/continuous, deterministic/non-deterministic, fully/partially observable, static/dynamic). Explain your selections, and think of the effect of each assumption above to this type c) Define the search problem and its corresponding state graph representation for this query.

About This Question

Assumptions and Details

Other Details

2.3 - Exercise 1.3

A web crawler is a program that systematically browses and downloads web pages from the internet. This is one of the programs that enables us to search the internet. A web crawler can be viewed as a rational agent. Please design a web crawler agent when the agent lives in a) An idea world where no broken links exist and the internet connection always works b) The real world, where both assumptions above are not valid

Agent Design ComponentsState Space: Set of all valid web addresses / URLs This may be hard to represent mathematically, need to build up as we go. Possible next state depends on the number of links on the current page. Impossible to enumerate over each possible sequence of characters that could for a URL. → Action Space: Set of all links which can be followed (changes depending on the current state) → World Dynamics: State changes to the URL of the selected link; in the non-ideal case, the link may be broken or the internet connection may be unavailable, causing the state to return to the previous webpage → Utility function: Number of unique webpages visited (derived from sequence of states, and counting the distinct vertices / sites visited) → Percept Space / Percept Function: Fully observable → State Space / Identity Function

2.4 - Exercise 1.4

A poker bot is a program that automatically plays poker on the internet. Poker bots are software agents that typically use AI techniques to attempt to beat human poker platers. Think about how to design a poker bot for the version of poker called `Texas hold 'em`, with the following rules → Every player is dealt two cards, for their eyes only → The dealer spreads five cards face up for all to see in `three stages` (i) three at once (ii) a single card (iii) another single card → Before and after the card/s in each stage are revealed, players take turns to bet → The best poker hand wins the pot (all the bets) What complications arise when a poker bot tries to play against more than one poker player?

State Space: All combinations of:

  1. Player's current hand/cards, opponent's current hand/cards, current dealer cards
  2. Current and previous bets placed by bother players
  3. Number of chips held by booth players and in current pot

Action Space A={Check,Call,Raise,Fold}A = \{Check, Call, Raise, Fold\} Assumes players always raise by the same amount, otherwise 1 action for each raising amount

World Dynamics

Utility Function

Percept Space / Percept Function