4 Intelligent Agents – New Technology To The Rescue

We have identified the main problems that must be addressed in order to search as much of the Web as possible, as easily and efficiently as possible. Having looked at the requirements for a new Web search technology, we use this chapter to introduce the relevant aspects of “intelligent agents”, a relatively new technology.

4.1 Intelligent Agents

Until it was discovered that some of the great apes use straws to “fish” for termites, the fact that human beings develop tools to get things done was considered one of the major differences between animals and us. Although we can no longer claim to be the only tool users, it must be admitted that we are pretty good at inventing tools for tasks we could not possibly perform manually. One of the latest tools the computer industry has come up with is intelligent agent technology.

Like many brand new technologies that were not clearly defined from the start, “agent”, originally a theoretical concept borrowed from the Artificial Intelligence community, has become an over-used term. Pattie Maes [Maes, 1997], a central figure in the international agent research community, presents an agent as a computational system which:

- is long-lived, meaning it has a kind of life cycle to go through;
- has goals, sensors and effectors, meaning it has a clearly defined purpose and the means to sense and act in its surroundings;
- decides autonomously which actions to take in any current situation to maximize progress towards its (time-varying) goals.

In other words, according to this definition an agent is software that lets the user state what is wanted, now or eventually, and then works towards that goal, leaving the user with nothing to do but wait for the results of the agent's work.
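
The three properties can be illustrated with a minimal sketch of a sense-decide-act loop. This is only our reading of the definition, not code from [Maes, 1997]; the class `ToyAgent`, the variable `sorted_mails` and the helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    """Hypothetical agent: goals are target values for variables in its surroundings."""
    goals: Dict[str, int]
    sensor: Callable[[], Dict[str, int]]   # reads the current state of the world
    effector: Callable[[str, int], None]   # changes one variable by +1 or -1
    history: List[str] = field(default_factory=list)

    def step(self) -> bool:
        """Run one sense-decide-act cycle; return False when all goals are met."""
        state = self.sensor()
        gaps = {name: target - state[name] for name, target in self.goals.items()}
        # Autonomous decision: work on the variable furthest from its goal.
        name, gap = max(gaps.items(), key=lambda item: abs(item[1]))
        if gap == 0:
            return False
        change = 1 if gap > 0 else -1
        self.effector(name, change)
        self.history.append(f"{name} {change:+d}")
        return True


world = {"sorted_mails": 0}  # assumed stand-in for the agent's surroundings


def read_world() -> Dict[str, int]:
    return dict(world)


def change_world(name: str, change: int) -> None:
    world[name] += change


agent = ToyAgent(goals={"sorted_mails": 3}, sensor=read_world, effector=change_world)
while agent.step():  # the agent keeps cycling until its current goals are reached
    pass
```

Because the goals are read again in every cycle, changing `agent.goals` between cycles is enough to model time-varying goals.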

[Nwana, 1996] is more reluctant to define agents, but uses the word carefully and selectively in various settings: “When we really have to, we define an agent as referring to a component of software and/or hardware which is capable of acting exactingly in order to accomplish tasks on behalf of its user.”

Rather than being a definable term, Nwana says, “agents” should be considered an umbrella term, meta-term or class, covering a range of specific agent types that have been developed. Some agents automate and personalize user interface operations, while others function as guardians that patrol computer networks, fixing errors and reporting those they cannot fix themselves. The computer game industry uses agents to create opponents or partners for game players, and there are also agents that act as assistants to experts in various fields.

4.2 Intelligence

The “intelligent” part of intelligent agents is no longer emphasized so much. Although it may seem as if the software acts intelligently, it is actually just a matter of applying certain rules to the situation in a suitable manner. These rules, the agent’s “brains”, make up the agent’s knowledge base. The agent can acquire its rules, that is, “learn”, in several ways. Let us take a look at how it can be done by what [Maes, 1997] calls a “software agent” and what [Nwana, 1996] calls an “interface agent”.

A software/interface agent is an agent that assists its owner, a user, in the use of a specific application or user interface. The interface agent may observe many user actions, over a long period of time, before deciding to take a single action, or a single user input may launch a series of actions on the part of the agent, also possibly over a long period of time. As shown in the figure below, the agent can “learn” through a variety of processes:

Figure 4-1, Agent learning

- Observing the user’s activity in the application, e.g. identifying patterns of behavior that can be automated and/or used to create a profile for the user.
- Cooperating with the application, e.g. predicting which data future operations will need and retrieving them in advance, so that the application can access them more quickly.
- Cooperating with the user, e.g. asking the user for instructions when in doubt about whether to act or not.
- Being directly instructed by the user, e.g. the user defining new goals for the agent.
- Communicating with other agents, e.g. learning where to look for information regarding specific topics, based on their “experience”.

The apparent intelligence shown by the agent is a result of acting in accordance with the rules in the agent’s knowledge base. These rules may come from:

- Standard background knowledge that the agent is supplied with by the agent manufacturer.
- Personal knowledge that the user can somehow program the agent to know. This is useful both when the agent is new and not yet familiar with its owner, and later when the user’s preferences and behavior change more or less drastically.
- Rules that the agent learns automatically from observation. The agent builds confidence in these rules by suggesting actions to the user and having the user accept, decline or comment on them.
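
How confidence in a rule might be built up through user feedback can be sketched as follows. This is a minimal illustration under our own assumptions: the `Rule` fields, the thresholds `SUGGEST_AT` and `ACT_AT` and the step size are hypothetical, confidence is kept as a whole percentage, and user comments are left out.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    condition: str
    action: str
    source: str      # "background", "user" or "observed"
    confidence: int  # percent, 0..100

SUGGEST_AT = 30  # assumed: below this the agent stays quiet
ACT_AT = 80      # assumed: from here on the agent acts without asking


def apply_feedback(rule: Rule, accepted: bool, step: int = 10) -> None:
    """Raise confidence when the user accepts a suggestion, lower it otherwise."""
    change = step if accepted else -step
    rule.confidence = max(0, min(100, rule.confidence + change))


def decide(rule: Rule) -> str:
    """Return what the agent does when the rule's condition matches."""
    if rule.confidence >= ACT_AT:
        return "act"
    if rule.confidence >= SUGGEST_AT:
        return "suggest"
    return "ignore"


rule = Rule("sender is a mailing list", "move to 'read later'", "observed", 50)
first = decide(rule)                  # the agent only suggests at first
for _ in range(3):
    apply_feedback(rule, accepted=True)
after_accepts = decide(rule)          # three accepted suggestions later
```

With these numbers the agent starts out suggesting the action and begins acting on its own after three accepted suggestions; repeated declines push the rule back below the suggestion threshold.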

4.3 Why Agents Are Used

The main motivation for the introduction of agents is that we are heading for a future where the following will apply:

- An increasing number of our everyday tasks will become computer-based.
- Vast amounts of unstructured, dynamic information will be unleashed.
- More and more users will be introduced to computers and the Internet in their work, many of them without being thoroughly trained for it.

Hence, it is necessary to come up with “something” that makes computer and Net operations easier to perform. Software agents seem to be a good solution, although much remains to be done to make them perform better. Agents must know how their actions can change “the world”, and they must also know as much as possible about their owner’s interests, habits and preferences. With this “in mind”, the agents can make suggestions and/or act on behalf of their owner. They can also operate when the human user is not logged on, and they can typically act faster than their owner when speed is called for. The knowledge necessary for all this is a combination of standard knowledge that many agents have in common and the user-specific information the agent acquires through the channels described above (observation, direct instruction and cooperation/communication). In the end, it all comes down to defining rules for the agent’s activities, and defining goals that can be reached by acting according to these rules.

4.4 An Agent Example: E-mail Information Agent

E-mail applications were among the first to introduce agents, and they provide a suitable example of the various kinds of agent functionality one can imagine. Let us imagine a busy businessperson who receives literally hundreds of e-mails every day, a mixture of personal e-mails and messages from a number of mailing lists she subscribes to. She travels a lot to attend meetings and conferences, and she depends on always having access to the latest information. Her most important tool for staying up to date is her personal e-mail information agent, which may have the following functionality:

Simple agent functionality:

- After having observed how she reads her e-mail, the agent develops an impression of which e-mails she reads instantly, which she reads eventually and which she does not bother to look at at all. Based on who the sender of an e-mail is and what the subject line says, the agent can use this knowledge to sort the incoming, unread e-mail for its owner, so that the most important e-mails are most likely to appear at the top of the list. The owner can also instruct the agent directly, for instance by maintaining a list of e-mail addresses whose senders are always to be considered important.
- The agent can browse through the contents of the e-mails as soon as they arrive, and see whether they contain any references in the form of URLs. If the agent finds such references, it can tell the Web browser (or the agent responsible for handling Web references) to retrieve the referenced pages to the computer’s local Web cache, so that the owner can quickly look up the information if she wants to.
- If the person sees that another person has an agent that seems to act exactly like she wants her own agent to act, she can ask this person for permission to copy the rules used by the other person’s agent. This is based on the idea that “if other users are like me, my agent should act like their agents per default”. Since the agents work using a number of rules, all or some of these rules can be exported to and used by other agents. Typically, people would be interested in importing an efficient set of rules for very specific subjects, such as “Retrieve high quality information about publications concerning intelligent agents”.
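
The sorting of incoming e-mail by sender and subject line can be sketched as follows. This is a minimal illustration, not a description of any existing e-mail agent; the reaction weights, the feature scheme and all addresses are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Set, Tuple


class Mail(NamedTuple):
    sender: str
    subject: str

# Assumed weights for the three observed reactions.
REACTION_WEIGHT = {"instant": 2, "later": 1, "ignored": -1}


def features(mail: Mail) -> List[str]:
    """Describe a mail by its sender and the words of its subject line."""
    return ["from:" + mail.sender] + ["word:" + w for w in mail.subject.lower().split()]


def learn(observations: Iterable[Tuple[Mail, str]]) -> Counter:
    """Build feature scores from (mail, reaction) pairs observed earlier."""
    scores: Counter = Counter()
    for mail, reaction in observations:
        for feature in features(mail):
            scores[feature] += REACTION_WEIGHT[reaction]
    return scores


def sort_inbox(inbox: List[Mail], scores: Counter, important: Set[str]) -> List[Mail]:
    """Most important first: senders listed by the owner, then learned score."""
    def rank(mail: Mail) -> Tuple[bool, int]:
        return (mail.sender in important, sum(scores[f] for f in features(mail)))
    return sorted(inbox, key=rank, reverse=True)


scores = learn([
    (Mail("boss@example.com", "Budget meeting"), "instant"),
    (Mail("list@example.com", "Weekly digest"), "ignored"),
    (Mail("friend@example.com", "Lunch on Friday"), "later"),
])
inbox = [
    Mail("list@example.com", "Weekly digest"),
    Mail("friend@example.com", "Lunch tomorrow"),
    Mail("boss@example.com", "Budget update"),
    Mail("new@example.com", "Hello"),
]
ordered = sort_inbox(inbox, scores, important={"new@example.com"})
```

The directly instructed list of important senders takes precedence over anything learned by observation, which mirrors the way the owner can override the agent’s own impressions.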

Advanced agent functionality:

- The e-mail agent can act as a secretary. If the people the businessperson is to meet also use e-mail agents, the agents can communicate with each other, e.g. compare their owners’ schedules, which may be available from special scheduling agents, and find the best times for meetings. Having found a suitable date and time for a meeting by exchanging a number of e-mails, they communicate this to the scheduling agents, which update the schedules with the new meeting information.
- If the agent gathers from an incoming e-mail that it is very urgent or important, and the agent also knows from her schedule or scheduling agent that its owner is at this moment travelling between her office and a meeting somewhere, it can generate a message for her cellular phone or beeper, telling her to check her e-mail immediately.
- When the agent is in doubt about what to do at some stage, it can take its problem to other, more experienced agents, show them the information it has and ask them for advice. For example, the user may seem to always immediately delete so-called “spam mail”, meaning e-mail advertisements. The agent ideally takes care of this, but in some cases it may be difficult to decide whether an e-mail is spam or not. In these cases it may perform a content analysis of the e-mail, and if necessary even go to other e-mail agents and ask them whether they believe this is spam. If, for example, more than six out of ten agents report that “Yes, I received that e-mail as well, and I considered it to be spam”, our agent can assume that it really is spam, and go ahead and delete it.
- By reading and analyzing all the e-mails and sorting them according to keywords either chosen by the owner or picked by the agent, the agent stores all read e-mails in an indexed mail hierarchy. This lets the owner give the agent instructions like “I want you to list all the e-mails I got about the new economy project in September”, “Please list all mails containing invitations to conferences and seminars taking place in Hawaii this winter”, and so on. This provides a very flexible but also efficient way of looking through old e-mails.
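
The spam example, where the agent asks other agents for advice only when it is in doubt, can be sketched as follows. The function names are hypothetical, and the limit of “more than six out of ten” is taken from the example above and expressed with whole numbers.

```python
from typing import Callable, Optional, Sequence


def peers_agree_on_spam(votes: Sequence[bool], more_than: int = 6, out_of: int = 10) -> bool:
    """True when strictly more than `more_than` out of `out_of` asked agents say spam."""
    if not votes:
        return False
    return sum(votes) * out_of > more_than * len(votes)


def is_spam(own_verdict: Optional[bool], ask_peers: Callable[[], Sequence[bool]]) -> bool:
    """Use the agent's own content analysis; consult other agents only when in doubt."""
    if own_verdict is not None:
        return own_verdict
    return peers_agree_on_spam(ask_peers())


# Assumed replies from ten other e-mail agents: seven consider the mail spam.
replies = [True] * 7 + [False] * 3
delete_it = is_spam(own_verdict=None, ask_peers=lambda: replies)
```

Passing the peer query as a function means the other agents are contacted only when the agent’s own analysis is inconclusive.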

4.5 How Agents Travel and Talk

Many people are confused by the idea of agents moving around on the Net doing their job. However, it is important to keep in mind that “agent” is only a concept; in reality an agent is just a piece of software being run somewhere. It can be difficult to answer the question “What can you do with mobile agents that you cannot do with stationary agents?”, so we will not concentrate on mobile agents in this paper. There are, however, several general reasons for running software remotely instead of locally:

- To reduce network traffic
- To share computer load among several computers
- To move the program to the data if the data cannot come to the program
- To enable the user to get things done even while disconnected from the network

There is an increasing number of script and programming languages, most of them developed fairly recently, that allow programs to run remotely, thus providing agents with “mobility” and the ability to communicate with agents both locally and at other locations:

- Tcl/Tk: Executing scripts remotely.
- Telescript: A script language especially designed for agent operations. Using Telescript, agents “go” to centralized servers, conduct their business, return to their users’ distributed locations with the results and present them to the users.
- Java: Java programs, “applets”, can move across the Internet to distributed, Java-enabled Web browsers, bringing both data and program with them, and execute locally.

In the future, the mobility of agents will probably receive less attention. The most likely scenario is that agents will “stay at home” and conduct their work through communication with other agents elsewhere. A standard language for communication in agent communities has not been chosen yet, but this will probably happen soon, simply because it has to.

There are two popular approaches to the design of a communication language [Genesereth, 1994]. The procedural approach is based on the idea that communication can best be modeled as the exchange of procedural directives. Script languages like the already mentioned Tcl/Tk and Telescript are based on this approach, and offer simple but powerful possibilities. However, there are disadvantages to a purely procedural approach. The flexibility required for bi-directional communication and general non-directed requests is especially difficult to produce with script languages.

The other approach is the declarative one. This is based on the idea that communication is best modeled as the exchange of declarative statements. A declarative language must be sufficiently expressive to communicate a wide range of information, including procedures. The ARPA Knowledge Sharing Effort has outlined the components of an agent communication language (ACL) that belongs to the declarative approach.

ACL consists of three parts:

- A vocabulary where all words have an English description for use by humans in understanding the meaning of the word, and formal annotations for use by programs/agents. In this way an ontology is created, where many different human words can all map to the same formal concept.
- Knowledge Interchange Format (KIF), an inner language which is a prefix version of first-order predicate calculus, with extensions to enhance expressiveness.
- Knowledge Query and Manipulation Language (KQML), an outer language which can be used to form expressions suitable for communication.

To sum up, an ACL message is a KQML expression in which the “arguments” are terms or sentences in KIF formed from words in the ACL vocabulary. These messages need to be understood by a variety of agents and agent types, since many problems cannot be solved by one agent alone, even though many agents provide their owners with valuable functionality on their own.
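
The layering can be illustrated by building such a message as plain text. The sketch below only formats a string in the general shape of a KQML performative with a KIF-style prefix expression as its content; the agent names, the ontology name and the predicate `is-spam` are hypothetical, and the helper `kqml` is our own, not part of any KQML library.

```python
def kqml(performative: str, **params: str) -> str:
    """Format a KQML-style message: (performative :keyword value ...)."""
    # Python identifiers cannot contain "-", so reply_with becomes :reply-with.
    parts = [f":{key.replace('_', '-')} {value}" for key, value in params.items()]
    return f"({performative} " + " ".join(parts) + ")"


message = kqml(
    "ask-one",                    # outer language: the KQML performative
    sender="mail-agent-a",        # hypothetical agent names
    receiver="mail-agent-b",
    language="KIF",               # inner language of the content
    ontology="e-mail",            # hypothetical vocabulary the words are taken from
    reply_with="q1",
    content="(is-spam msg-42)",   # KIF-style prefix expression, hypothetical predicate
)
```

The resulting string is `(ask-one :sender mail-agent-a :receiver mail-agent-b :language KIF :ontology e-mail :reply-with q1 :content (is-spam msg-42))`: the outer expression tells the receiver what kind of message this is, while the content is left to the inner language and the vocabulary.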

4.6 A Look in the Rearview Mirror: Letizia

When people cannot find the information they need and do not know where to look for it, they sometimes try to find it by moving from page to page, following links that seem likely to lead to something they may be interested in. This is what is called a depth-first search or exploration, and is based on the idea that a hyperlink often leads to a different page in the same context as the page the link goes from. However, the technique may take the user to pages containing all kinds of information not necessarily of the kind the user is looking for, and often leads to the user feeling “lost in hyperspace”. So far, no major Web browsers offer navigation possibilities that actively prevent this from happening.

Henry Lieberman [Lieberman, 1997] has introduced Letizia to do something about the problem. Letizia is an agent that aids the user in searching the Web by assisting in the Web browsing process. This autonomous information/interface agent tracks user behavior and attempts to anticipate items of interest by doing a concurrent, autonomous exploration of links from the user’s current position. In other words, it performs a breadth-first exploration of the Web. Letizia can explore search alternatives faster than the user can, and the agent can explore the Web while the user is working, reading or exploring on his own, taking advantage of computer resources that are available anyway. Letizia is one of the first attempts at using agent technology to aid people in searching the Web.
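
A breadth-first exploration from the user’s current page can be sketched as follows. This is a generic illustration of the technique on a toy link graph, not Letizia’s actual implementation; the page names and the `max_pages` limit are assumptions.

```python
from collections import deque
from typing import Dict, List


def breadth_first(links: Dict[str, List[str]], start: str, max_pages: int) -> List[str]:
    """Return up to max_pages pages reachable from start, nearest pages first."""
    seen = {start}
    queue = deque([start])
    found: List[str] = []
    while queue and len(found) < max_pages:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append(target)
                if len(found) == max_pages:
                    break
    return found


# Toy link graph: each page maps to the pages it links to.
links = {
    "current": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": ["deep"],
}
candidates = breadth_first(links, "current", max_pages=4)
```

All pages one link away from the current page are collected before any page two links away, so the candidates stay close to what the user is reading right now, in contrast to a depth-first walk that would follow one chain of links far away from it.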

The Letizia interface is shown in Figure 4-2. The user is browsing as usual in the largest window, to the left. The agent controls the two windows on the right, and the user can choose to ignore these windows completely. The top right window displays search candidates, meaning the pages Letizia is considering for recommendation to the user. The bottom right window displays the pages that actually are recommended to the user for further exploration.

Figure 4-2, Screenshot from the Letizia agent used with Netscape

The work Letizia does is a blend of information retrieval and information filtering. The user saves retrieval time by having the agent retrieve potentially interesting pages while the user reads the current page. Letizia also performs an analysis of what pages the user is most likely to want to read next, and presents the pages to the user in a ranked order. This analysis is based partially on documents the user has read previously, but mainly on the document the user is reading at the moment. Letizia is never in control of the browser; it is for the user to decide whether to use the agent’s suggestions or not. Letizia’s recommendations are most useful when the user is unsure of what to look at next.

Letizia does not have a natural language understanding capability, so its content model of a document is simply a list of keywords, derived from the contents of the Web pages the user seems to show an interest in. The agent remembers and looks out for the interests/keywords that the user expresses through his actions while browsing. Interests that are not repeatedly observed decay over time, so that the agent is not clogged with searching for interests that may have fallen from the user’s attention.
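
The decay of interests can be sketched as follows. How Letizia actually weights and forgets keywords is not described here, so the halving of weights per visited page and the cut-off value are purely our assumptions.

```python
from typing import Dict, Iterable


def update_interests(
    interests: Dict[str, float],
    seen_keywords: Iterable[str],
    decay: float = 0.5,          # assumed: every weight is halved per visited page
    forget_below: float = 0.1,   # assumed: weaker interests are dropped
) -> Dict[str, float]:
    """Fade all interests, reinforce those observed on the current page."""
    updated = {word: weight * decay for word, weight in interests.items()}
    for word in seen_keywords:
        updated[word] = updated.get(word, 0.0) + 1.0
    return {word: weight for word, weight in updated.items() if weight >= forget_below}


interests: Dict[str, float] = {}
# Keywords of five pages the user visits in turn (hypothetical).
for page_keywords in [["agents", "web"], ["agents"], ["mail"], ["mail"], ["mail"]]:
    interests = update_interests(interests, page_keywords)
```

After these five pages “mail” dominates, “agents” has faded but is still remembered, and “web”, which was seen only once, has been forgotten.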

The technique used to compute the content of a document is a simple keyword frequency measure, tf*idf (term frequency times inverse document frequency). This is based on the idea that keywords that are relatively common in the document, but relatively rare in general, are good indicators of the content. We will take a closer look at this technique in the following. It is not 100% accurate, but it has been sufficient for the purpose of Letizia.
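
As a first illustration, the measure can be sketched as follows. The sketch uses one common variant, with the term frequency relative to the document length and the logarithm of the inverse document frequency; whether Letizia uses exactly this variant is not stated, and the small corpus is made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(corpus: List[List[str]], index: int) -> Dict[str, float]:
    """Score every word of corpus[index] by term frequency times inverse document frequency."""
    document = corpus[index]
    counts = Counter(document)
    scores: Dict[str, float] = {}
    for term, count in counts.items():
        tf = count / len(document)
        # The document itself is part of the corpus, so the document frequency is at least 1.
        df = sum(1 for other in corpus if term in other)
        scores[term] = tf * math.log(len(corpus) / df)
    return scores


# Hypothetical corpus: three pages reduced to lists of words.
corpus = [
    ["agent", "web", "agent", "search"],
    ["web", "search", "engine"],
    ["web", "mail"],
]
scores = tf_idf(corpus, 0)
best_keyword = max(scores, key=scores.get)
```

In this corpus “agent” is frequent in the first page and absent elsewhere, so it receives the highest score, while “web” occurs in every page and therefore scores zero no matter how often it appears.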

Visit the author's homepage: http://www.pvv.org/~bct/ E-mail the author, Bjørn Christian Tørrissen: bct@pvv.org