Skip to content
UoL CS Notes

Classification of Intelligent Agents

COMP111 Lectures

Intelligent agents are classified depending on how they map their precepts to their actions. They can be classed under:

  • Simple reflex agents
    • Select actions to execute based upon the current percept
    • Don’t take the percept history into account.
    • Very simple IFTTT actions.
    • Very simple to implement but have limited intelligence.
    • They have no memory.
graph TB
    A(Environment)--> |Sensors| B
    B[What the world is like now] --> C
    C[What action I should do now]--> |Actuators| A
    D[Condition-action rules]-->C
  • Model-based reflex agents
    • Maintain an internal state that depends upon the percept history - memory.
    • Helps deal with partial observability.
    • Knows how the world evolves and how its actions affect the world.
graph TB
    A(Environment)--> |Sensors| B
    B[What the world is like now] --> C
    C[What action I should do now]--> |Actuators| A
    D[Condition-action rules]-->C
    E-->B
    B.->E(State)
    F(How the world evolves)-->B
    G(What my actions do)-->B
  • Goal-based agents
    • Select appropriate actions to achieve desirable state of the environment.
    • Knowledge of the current state does not automatically mean that the agent knows what to do.
    • It will have to plan what to do over a long sequence of actions.
graph TB
    A(Environment)  --> |Sensors| B
    B[What the world is like now] --> H
    H[What it will be like if I do action A] --> C
    C[What action I should do now]--> |Actuators| A
    D[Condition-action rules]-->C
    B.->E(State)
    E-->B
    F(How the world evolves)-->B
    F --> H
    G(What my actions do)-->B
    G --> H
  • Utility-based agents
    • Make use of a utility function to compare the desirability of different states that result from actions.
    • Many actions may satisfy the goal but which one is the most desirable?
    • Utility function maps a state, or sequence of states, onto a number to give the degree of usefulness of the state to the agents.
    • Agent maximises the value of its utility function.
graph TB
    A(Environment)--> |Sensors| B
    B[What the world is like now] --> H
    H[What it will be like if I do action A] --> I
    I[How happy I will be in such a state] --> C
    J(Utility)-->I
    C[What action I should do now]--> |Actuators| A
    D[Condition-action rules]-->C
    B.->E(State)
    E-->B
    F(How the world evolves)-->B
    F --> H
    G(What my actions do)-->B
    G --> H
  • Learning agents
    • Learning agents find ways of maximising their utility function based on previous experience
    • The performance element is the same choice mechanism from the previous agents. This is now informed by the learning element.
    • The performance standard informs the learning element how well it’s actions have worked.
graph TB
	A(Environment) --> |Sensors| B[Performance element]
	B --> |Actuators| A
	D[Performance standard] --> C[Critic]
	C --> |Feedback| L[Learning element]
	L --> |Changes| B
	B --> |Knowledge| L
	L --> |Learning Goals| P[Problem Generator]
	P --> B