The Evolution of Intelligent Agents on the Web

Copyright (C) 1997 Paco Xander Nathan, Robby Glen Garner

Based on an earlier paper written with Luigi Caputo from ALMA Research Centre

I. Introduction

Since the coming of electronic calculators, Man has longed to create machines with which he could converse freely. In 1950 the English logician Alan Turing hypothesized the near-future existence and reinforcement of the dialectical relationship between man and machine [1]. In support of this, Turing asserted that if a common man, holding a dialogue with a machine hidden from his sight, could even get the impression of talking with another person, then that event would sanction the origin of so-called Artificial Intelligence (AI).

Strategies for substantiating Turing's assertion can be twofold. On one hand, the field of AI feels compelled to demonstrate a medium wherein man can no longer distinguish the real nature of his interlocutor. On the other hand, computer science in general strives to embed forms of knowledge and deductive reasoning within the context of machines typically used for mathematical calculations or control systems.

Experiments performed in these terms will lead to very different results and considerations. The real problem is, however, to ponder whether it is possible to design machines equipped with a certain measure of "personality".

Transferring the question on the subject of "man" to that of "machine", and borrowing ideas from theorists like Minsky and Hofstadter [2], we could identify the human brain as a hardware medium which runs a set of opportunistic "software modules", representing what is usually called "mind".

Similarly convincing motivations exist on the opposite side of this question, like those of the American philosopher Searle [3] in his proposition of the "Chinese Room". According to Searle, if we assume that a computer can reproduce the behaviour of an individual who is a native Chinese speaker, this doesn't imply, from a cognitive point of view, that the computer is identical to that person -- not even if the computer could imitate the person's characteristics perfectly.

Therefore, one need not question whether a machine can be perfected to deceive the humans with whom it conducts dialogue. It suffices that a machine can "place itself in a human's shoes", and at the same time increase its knowledge, thanks to an information exchange which the human interlocutor provides.

II. ELIZA and The Modern Intelligent Agents

Fortunately, these ideas do not belong solely to the world of science fiction or fantasy, not since Joseph Weizenbaum created ELIZA in 1966, as the first software capable of conversing with a human [4].

More recently the so-called "intelligent agents" (or "software robots") have emerged, able to converse about various issues and topics [5] [6]. These systems substantiate the concept of Virtual Personality -- a prerequisite for any machine that would claim to simulate human thought. Mikhail Gorbachev, Dante Alighieri, even Jesus Christ [7] have been "interpreted" by such agents.

Here the dialectic and symbiosis between humans and machines becomes total: not only because man is persuaded (by being deceived), nor simply because the other may exhibit human-like behaviour, but above all because the machine, by means of its virtual personality, can interpret and learn from what the human interlocutor attempts to communicate.

In other words, the dialogue becomes a kind of "interview game", i.e. a conversation based on successive volleys in the form "user question vs. machine answer". Also, an intelligent agent will tend to adapt to questions from the interviewer, modifying its element of humour according to the perceived conversational tone of the interlocutor (friendly, formal, aggressive). In this way the machine can show a personality very close to that of a human.

III. Intelligent Agents for the Web: Presentation

One application of the JFRED intelligent agent software described in this paper stemmed from involvement with a media collective, FringeWare Inc., operating as a business on the Internet. The firm did not have sufficient budget to provide for customer service staff, and was forced to improvise.

Note that realization of customer service online differs significantly from service provided via telephone, both in terms of the structure of the medium and customer expectations. Online, sheer numbers of people can overwhelm a firm's service staff, and the notion of a "busy signal" is rarely tolerated. Whereas a busy signal on a telephone line indicates that the caller should try again later, a "Host not responding" error in a web browser signifies the ambiguous condition that either the address is invalid, or the server is down, or too many people have attempted to contact the server. Attention and demand for an Internet site can force situations where the subject experiences a loss of identity [8] [9].

Consequently the firm's owners decided to use intelligent agents to provide first-tier customer support via the Internet, beginning in late 1992. Many common navigation tools were incorporated into the Web site, but some percentage of incoming customers demand interaction, especially in the form of conversation. Experience has shown that most people who fall into the latter category tend to be those who:

  • are confused about the site's intent
  • prefer to browse and tend not to transact real business
  • just want to chat

    These needs can be served by an intelligent agent, acting as a natural language interface for a local search engine and other navigation tools, especially one that attempts to evoke personality and humour. Additionally, agents log their conversations, which can be reviewed by a human later, as a second tier of customer relations. A human operator might choose to contact a customer directly, e.g. if their questions seem very urgent. Over the course of five years, several methods for hosting conversations between agents and human customers have been investigated, including:

  • automated response to incoming email
  • HTML forms, with the agent running via Common Gateway Interface
  • a Java applet encapsulating a subset of a particular agent
  • as a player within a variant of LambdaMOO
  • a multi-threaded Java application running on a TCP port

    The last method proved most effective, both for reducing and consolidating the source code to maintain, and in terms of response rates.

    An application, written in Java, implements a server which listens on a specific TCP port. Java was selected as the programming language of choice, due to its innate ability to manipulate network connections, ease of program development, and inherent multi-platform source code portability [10].

    One Java class determines the use of grammar within an agent, and can be swapped out to allow for different natural languages, without the need for restructuring the system. Java's object inheritance readily allows for effecting dialects as well.
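A swappable grammar class of this kind might look like the following minimal sketch. It is written in present-day Java rather than the 1997 dialect, and every name in it (Grammar, EnglishGrammar, TexanEnglishGrammar, reflect, greeting) is hypothetical; the paper does not show JFRED's actual class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface: the one place where language-specific behaviour lives.
interface Grammar {
    /** Swaps first and second person so a phrase can be echoed back. */
    String reflect(String phrase);

    /** Opening line used by the agent. */
    String greeting();
}

// Base grammar for one natural language (assumed word list, for illustration only).
class EnglishGrammar implements Grammar {
    protected final Map<String, String> swaps = new LinkedHashMap<>();

    EnglishGrammar() {
        swaps.put("i", "you");
        swaps.put("am", "are");
        swaps.put("my", "your");
        swaps.put("you", "I");
        swaps.put("your", "my");
    }

    @Override
    public String reflect(String phrase) {
        StringBuilder out = new StringBuilder();
        for (String word : phrase.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            // Words without a swap entry pass through unchanged.
            out.append(swaps.getOrDefault(word.toLowerCase(), word));
        }
        return out.toString();
    }

    @Override
    public String greeting() {
        return "Hello";
    }
}

// A dialect inherits everything and overrides only what differs.
class TexanEnglishGrammar extends EnglishGrammar {
    @Override
    public String greeting() {
        return "Howdy";
    }
}
```

Because the rest of the agent would hold only a Grammar reference, substituting another language or dialect would not require restructuring the system, which is the property the paragraph above describes.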

    A simple protocol (similar to that used by email servers [11]) allows the agent to accept data outside its normal stream of conversation. This provides for tracking the identity of the interlocutor between successive web page requests, by using a "cookie" [12]. Persistent data is interjected into conversation via specialized rules, using a simplified version of frame-based learning [13]. Moreover, frames allow for sequencing of responses, so that the agent can maintain the flow of a conversation. Initial settings for many frame variables, e.g. email address, country of origin, etc., can be guessed using international "whois" services to decode TCP addresses [14].
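As a rough illustration of how an out-of-band command could tie a connection to a persistent frame, consider the sketch below. The "COOKIE" command, the "250 OK" reply, and the class names FrameStore and AgentSession are all assumptions made for this example; the paper does not specify JFRED's actual protocol.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store: cookie value -> frame (slot name -> learned value).
class FrameStore {
    private final Map<String, Map<String, String>> frames = new ConcurrentHashMap<>();

    /** Returns the frame for this cookie, creating an empty one on first sight. */
    Map<String, String> frameFor(String cookie) {
        return frames.computeIfAbsent(cookie, key -> new ConcurrentHashMap<>());
    }
}

// One session per TCP connection.
class AgentSession {
    private final FrameStore store;
    // Until a cookie arrives, the session uses a throwaway frame.
    private Map<String, String> frame = new HashMap<>();

    AgentSession(FrameStore store) {
        this.store = store;
    }

    /**
     * Handles protocol commands that sit outside the conversation.
     * Returns a status reply for a command, or null when the line is
     * ordinary conversation and should go to the rule engine instead.
     */
    String handleCommand(String line) {
        if (line.startsWith("COOKIE ")) {
            String cookie = line.substring("COOKIE ".length()).trim();
            frame = store.frameFor(cookie);
            return "250 OK"; // assumed SMTP-style status line
        }
        return null;
    }

    Map<String, String> frame() {
        return frame;
    }
}
```

Two sessions that present the same cookie end up sharing one frame, so a name learned during one page request would still be available on the next.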

    Program execution for JFRED itself is based on a simple algorithm:

    	create an instance of the grammar object
    	load a list of rules files
    	listen on the TCP port
    

    per each TCP connection:
    	fork off a new execution thread
    	initialize rule set
    	accept "cookie" via protocol
    	load frame based on "cookie"
    	transact with interlocutor

    per each input phrase read:
    	fuzzy logic selects rule to best describe input
    	save variables into frame, if any can be parsed
    	remap verb tenses and prepositions, if needed
    	return the response
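The listen/fork/transact outline can be sketched with the standard java.net classes, as below. This is not JFRED's source: the class name AgentServer is hypothetical, the rule engine is reduced to a caller-supplied function from input phrase to response, and the cookie and frame steps are omitted for brevity.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

class AgentServer implements Runnable {
    private final ServerSocket listener;
    private final UnaryOperator<String> responder; // stand-in for the rule engine

    AgentServer(int port, UnaryOperator<String> responder) throws IOException {
        this.listener = new ServerSocket(port); // port 0 asks the OS for any free port
        this.responder = responder;
    }

    int port() {
        return listener.getLocalPort();
    }

    /** Accept loop: one new thread per TCP connection. */
    @Override
    public void run() {
        try {
            while (true) {
                Socket client = listener.accept();
                Thread worker = new Thread(() -> converse(client));
                worker.setDaemon(true);
                worker.start();
            }
        } catch (IOException e) {
            // The listener was closed; stop accepting connections.
        }
    }

    /** Reads one phrase per line and writes one response per line. */
    private void converse(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String phrase;
            while ((phrase = in.readLine()) != null) {
                out.println(responder.apply(phrase));
            }
        } catch (IOException e) {
            // The interlocutor disconnected; nothing more to do for this thread.
        }
    }

    void close() throws IOException {
        listener.close();
    }
}
```

Starting it with new Thread(new AgentServer(port, engine)).start(), where port and engine are supplied by the caller, would let any line-oriented client such as telnet hold a conversation.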

    The steps above describe the framework surrounding the natural language processing. That framework is essential, and building it can become quite a complex task. Consider that the system must run within the context of a distributed, multi-user, client/server, and allegedly "stateless" transaction model required by the World Wide Web.

    Our natural language component itself employs fuzzy logic rules to pair input phrases with candidate responses. Sets of rules, specified in simple text files, determine the behaviour of an agent, apart from its use of grammar. These rules map keyword counts and regular expressions found within an input phrase into fuzzy membership sets used to describe the semantics of the phrase [15]. The processing includes:

  • determining fuzzy sets of keywords
  • parsing grammatical constructs (regular expressions)
  • frame-based knowledge of country, currency, language
  • internal state for tone of the conversation
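One plausible reading of this rule selection is sketched below. The scoring is an assumption chosen for illustration, not the paper's formula: a rule's membership is the fraction of its keywords found in the input, combined by fuzzy OR (maximum) with a 0-or-1 score for its regular expression. The class names FuzzyRule and FuzzyRuleSet are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FuzzyRule {
    final String name;
    final List<String> keywords; // lower-case keywords
    final Pattern pattern;       // optional grammatical construct, may be null
    final String response;

    FuzzyRule(String name, List<String> keywords, String regex, String response) {
        this.name = name;
        this.keywords = keywords;
        this.pattern = (regex == null) ? null : Pattern.compile(regex);
        this.response = response;
    }

    /** Degree, in [0, 1], to which this rule describes the input phrase. */
    double membership(String input) {
        String lower = input.toLowerCase();
        List<String> words = Arrays.asList(lower.split("\\W+"));
        int hits = 0;
        for (String keyword : keywords) {
            if (words.contains(keyword)) {
                hits++;
            }
        }
        double keywordScore = keywords.isEmpty() ? 0.0 : (double) hits / keywords.size();
        double patternScore = (pattern != null && pattern.matcher(lower).find()) ? 1.0 : 0.0;
        return Math.max(keywordScore, patternScore); // fuzzy OR
    }
}

class FuzzyRuleSet {
    /** Returns the rule with the highest membership, or null if none matches at all. */
    static FuzzyRule best(List<FuzzyRule> rules, String input) {
        FuzzyRule best = null;
        double bestScore = 0.0;
        for (FuzzyRule rule : rules) {
            double score = rule.membership(input);
            if (score > bestScore) {
                best = rule;
                bestScore = score;
            }
        }
        return best;
    }
}
```

A fuller version would presumably return an ordered list of candidates rather than a single winner, and would weigh the frame contents and conversational tone as well.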

    Some rules extract data from the conversation, e.g. "What is your name?", others invoke regionally-based insults depending on conversational tone, while other rules resolve fuzzy membership sets into ordered lists of candidate responses, interrogations, and suggested URLs.
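A data-extracting rule can be as small as one regular expression plus a frame update. The sketch below assumes a frame is a plain map of slot names to values; the class name NameExtractionRule and the pattern are hypothetical, while the wording of the reply is taken from Barry's sample conversation in this paper.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameExtractionRule {
    // Assumed pattern; a real rule set would likely accept more phrasings.
    private static final Pattern NAME =
            Pattern.compile("\\bmy name is (\\w+)", Pattern.CASE_INSENSITIVE);

    /**
     * If the input states a name, saves it in the frame and returns a reply.
     * Returns null when the rule does not apply.
     */
    static String apply(String input, Map<String, String> frame) {
        Matcher m = NAME.matcher(input);
        if (!m.find()) {
            return null;
        }
        String raw = m.group(1);
        // Capitalize the first letter, since users often type in lower case.
        String name = Character.toUpperCase(raw.charAt(0)) + raw.substring(1);
        frame.put("name", name);
        return "Yes, I think that your name is " + name + ".";
    }
}
```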

    The approach is arguably simple compared with other works, but that simplicity allows non-technical staff to author additions to the agent. For example, a default rule set might describe customer service responses in general, e.g. where to find sources for more information. Then an employee could create additional rules for the agent to discuss upcoming events, news items, topical personality quirks, etc.

    The natural language agent fits within a context of a larger system architecture being developed through 1997. Another key component is a MOO (an implementation of the popular LambdaMOO system [16]) which provides an object-oriented framework for describing the online space. The MOO server also manages persistent identities (login/password) for staff, agents, and repeat customers. A system of databases drives the design and creation of the Web site, automatically generated HTML pages and VRML objects [17], in turn cross-referenced with content from back issues of the firm's magazine and longstanding email list. Additional intelligent agents serve to:

  • create local search engine indices
  • check consistency for URLs and graphics
  • derive global currency exchanges and postal rates
  • verify customer address information
  • generate site usage reports
  • alert "anti-spam" packet filters
  • intelligently parse email errors

    The last two agents serve as a kind of lymphatic system for the Internet server [18] [19].

    IV. Intelligent Agents for the Web: Results

    The research and development of JFRED represents an ongoing project, spanning the past 16 years [20]. This effort has been reviewed favorably in the context of the many intelligent agents exhibiting personality on the Web [21]. An example conversation with the intelligent agent Barry DeFacto, based on some of the latest JFRED software, is shown below. Note the use of persistent learned data, stored in a frame format:

    	name: Ryan
    	country: United States
    	language: English
    	currency: Dollar
    	city: San Diego
    	province: CA
    	postcode: 92101
    	provider: ElectriCiti, Inc.
    

    	human> HI
    	barry> What do you want?
    	human> whats your name?
    	barry> I'm Barry.
    	human> How old are you?
    	barry> I am 33 earth years old.
    	human> my name is ryan
    	barry> Yes, I think that your name is Ryan.
    	human> your right
    	barry> Are you seeing any bots besides me? is that how they say it in San Diego?
    	human> well, I'm looking for some. but your the best so far

    The "city" variable has been derived from an international database and is used in conversation. The "name" variable is learned, then repeated in conversation.
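Interjecting frame data into a response can be pictured as simple template substitution. In the sketch below, the "${slot}" placeholder syntax and the class name ResponseTemplate are assumptions made for illustration; the paper does not describe how JFRED's rule files mark variables.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResponseTemplate {
    // Hypothetical placeholder syntax: ${slotName}
    private static final Pattern SLOT = Pattern.compile("\\$\\{(\\w+)\\}");

    /**
     * Fills every placeholder from the frame. Returns null if any slot is
     * unknown, so the caller can fall back to a response that needs no data.
     */
    static String fill(String template, Map<String, String> frame) {
        Matcher m = SLOT.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = frame.get(m.group(1));
            if (value == null) {
                return null;
            }
            // quoteReplacement keeps '$' or '\' in the value from being misread.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a frame holding city = San Diego, the template "is that how they say it in ${city}?" yields the line Barry uses in the conversation above; with no city on record, the agent would simply choose a different response.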

    One primary intent for Barry was to explore the use of programmed personalities. One of Barry's authors predicted that, since the host web server was located in Austin, Texas, some people would want to talk about local culture. Here the agent references an album by local artist Daniel Johnston, entitled "Hi, How Are You?":

    	name: Strange person with no name
    	email: user@lily.singnet.com.sg
    	country: Singapore
    	language: Malay
    	currency: Dollar
    	provider: com.sg
    

    	human> Hi how are you?
    	barry> Just fine. You remind me of Daniel Johnston.
    	human> Who is daniel?
    	barry> He is a musician in Austin, Texas.
    	human> I've never been to Texas.

    An ironic result of Barry's use of personality has been the judgement people make about agents' differing levels of intelligence based on their graphic representation. For example, an interlocutor tried both Barry DeFacto and a related agent, Stig, which features different graphics (including some animation) but does not access as much database knowledge as Barry. Regardless, Stig was remarked to be the more intelligent of the two, based on its appearance.

    Likewise, people who converse with an intelligent agent have remarked on an agent's personality without accepting that human-like quality as a measure of intelligence [22]. This speaks to the point of Turing's original criterion, as opposed to the use of personality.

    Other people benefit once they learn how to get along with Barry, i.e. once they adapt to the agent's mode of conversation. This effect provides counterpoint to Turing's charge that a human need be deceived by a computer.

    A magazine reporter in New York recently participated in a staged Turing Test, in which Barry had been one of the finalist competitors. The reporter later wrote about the experience, which prompted lengthy discussions online via the magazine's web site: people would talk with Barry, then discuss issues with Barry's authors [23]. One early detractor reversed his opinion based on adaptation to the agent:

    	AndyHavoc - 10:15pm May 21, 1997 PST (#38 of 40)
    

    I can foresee what Robby (and Paco) are doing now turning into something akin to the interface of the Enterprises computer. You ask a question in plain English. It searches it's database and gives you an answer in plain English.

    I've been playing with Stig and Barry again. Now that I know how and what to ask, I find this technology amazing. And often unexpectedly humorous. I asked Stig what he thought of Bill Clinton. It went "He said he didn't inhale. What a wuss." I asked about Hillary. It said, "Talk about a cover your ass marriage."

    Generally people say things at the beginning of a conversation, especially with a stranger, to establish some kind of posture [24] [22]. With a machine, however, many people begin with an adversarial posture in order to "unmask" the computer -- almost as a defence mechanism against the "deceptive" context imposed by a Turing Test.

    At this point we can classify the general nature of people's reactions (and attitudes) toward a conversant machine:

  • some administer a Turing Test by trying to determine if the agent is actually a computer program
  • others use it as a medium to vent their ideas about how AI's should behave, as if the agent really cared
  • some actually believe they are having a "chat" with a real person
  • a surprising number of people try to get the agent to engage in sex with them

    V. Conclusions

    Overall, JFRED provides a natural language interface for Internet software that can be described as:

  • computer platform independent
  • multi-threaded server, as a Java application
  • fuzzy logic, rule-based AI
  • frames-based learning
  • language/dialect independent

    The server supports a variety of front-end/client interfaces, including direct telnet, HTML/CGI forms, Expect scripts, MOO bots, and Java applets embedded in HTML pages, as well as standard I/O for testing.

    One interesting consequence has been the use of the Barry DeFacto agent as a front-end for a search engine. In conversation, people employ the same nouns that they would use for a search query. We find that fuzzy-logic rules operating on a conversational stream provide a very efficient means of cataloguing a large Web site. The result appears more organized than a keyword search (e.g. Lycos or AltaVista style search engines) and much less labor-intensive than maintaining an ontology (e.g. the Yahoo search engine).

    After observing interactions with several on-line FREDs, it is apparent that the personality of the bot is essential in keeping the user's interest and drawing them back into the conversation. Even the choice of colours/graphics can affect the human's reception of the bot.

    We have experimented with combining the talents of writers, artists and musicians (along with the required programmers) to evoke more empathy toward a constructed "virtual personality". A science fiction writer, Don Webb, was invited to develop a "history" for the agent, one based on many references to pop culture, and which could then be referenced within the agent's conversation [25]. The result combines a background narrative with music (MIDI files) and graphic design to create an aesthetic for the "robot" personality. We are now working to incorporate a speech synthesizer into the generated stream of responses.

    Current examples using the JFRED class library may be found on the FRED home page.

    Acknowledgements

    Special thanks to Jim Thompson and Paul Jimenez of Smallworks, Inc., for their assistance with Java language concepts and network software programming.

    References

    [1] Turing, A., Computing Machinery and Intelligence, Mind 59, 1950. (http://www.wadham.ox.ac.uk/~ahodges/Turing.html)

    [2] Hofstadter, D. R., Metamagical Themas: Questing for the Essence of Mind and Pattern, New York: Basic Books, 1985. (http://www.cs.indiana.edu/people/d/dughof.html)

    [3] Searle, J. R., Minds, Brains, and Programs, in Behavioral and Brain Sciences, 1980. (http://violet.berkeley.edu/~frege/staff-searle-spring96.html)

    [4] Weizenbaum, J., Computer Power and Human Reason, W.H. Freeman and Company, 1976.

    [5] Lenat, D. B., Guha, R. V., Building Large Knowledge-Based Systems: Representation and Inference in the CYC Project, Addison-Wesley, 1989. (http://www.cyc.com/)

    [6] Garner, R., The Idea Of FRED, ALMA - Scores of the Unfinished Thought, Issue 1, January 1996. (http://www.diemme.it/~luigi/fred.html)

    [7] Carr, R., MacJesus Pro Gold software, Lamprey Systems, 1992. (http://users.aol.com/privateida/hot.html#MJPG)

    [8] Stone, A. R., Will The Real Body Please Stand Up?: Boundary Stories About Virtual Cultures, in Michael Benedikt, ed.: Cyberspace: First Steps, MIT Press, 1991. (http://www.actlab.utexas.edu/~sandy/)

    [9] Nathan, P. X., Intelligent Agents of Fortune, in Fringe Ware Review, Issue 2, November 1993. (http://www.fringeware.com/fwr/fwr02.html)

    [10] Gosling, J., Arnold, K., The Java Programming Language, Addison-Wesley, 1996. (http://www.awl.com/cp/arnold-gosling.html)

    [11] Crocker, D. H., RFC 822: Standard For The Format Of ARPA Internet Text Messages, 13 August 1982. (http://www.internic.net/rfc/rfc822.txt)

    [12] Kristol, D., Montulli, L., RFC 2109: HTTP State Management Mechanism, February 1997. (http://www.internic.net/rfc/rfc2109.txt)

    [13] Lenat, D. B., EURISKO: A program that learns new heuristics and domain concepts - Artificial Intelligence, Issue 21, 1983.

    [14] Harrenstien, K., Stahl, M., Feinler, E., RFC 954: NICNAME/WHOIS, October 1985. (http://www.internic.net/rfc/rfc954.txt)

    [15] Kosko, B., Fuzzy Thinking: The New Science Of Fuzzy Logic, Hyperion, 1993. (http://sipi.usc.edu/faculty/kosko.html)

    [16] Curtis, P., LambdaMOO Programmer's Manual, Xerox PARC, May 1996. (ftp://parcftp.xerox.com/pub/MOO/html/ProgrammersManual_toc.html)

    [17] Pesce, M. D., VRML: Browsing and Building Cyberspace, New Riders, 1996. (http://www.hyperreal.com/~mpesce/)

    [18] Nathan, P. X., DIY Infobotics, in Fringe Ware Review, Issue 7, February 1995. (http://www.fringeware.com/fwr/fwr02.html)

    [19] FringeWare Inc., "The World As We Know It" in the Online Services web page, 1996. (http://www.fringeware.com/bot/)

    [20] Garner, R., Why FRED?, May 1997. (http://www.fringeware.com/~robitron/WhyFRED.html)

    [21] Laven, S., Simon Laven Home Page: the Internet's one-stop hot-spot for natural language chatterbots. (http://www.student.toplinks.com/hp/sjlaven/index.htm)

    [22] Whalen, T., How I Lost the Contest and Re-Evaluated Humanity, 1996. (http://debra.dgbt.doc.ca/chat/story95.html)

    [23] Quan, T., Machine Language, in Salon Magazine, 15 May 1997, plus the ongoing Internet discussion: "Conversations with bots", with remarks by some authors of this paper. (http://www.salonmagazine.com/may97/21st/article970515.html)

    [24] Garner, R., The Human Use Of Machines, ALMA - Scores of the Unfinished Thought, Issue 2, February 1996. (http://www.diemme.it/~luigi/humans.html)

    [25] Webb, D., Baru's Life Story, FringeWare, 1996. (http://www.fringeware.com/~robitron/baruhist.html)

    [a] Caputo, L., Virtual Personalities and the Exploration of Knowledge, ALMA - Scores of the Unfinished Thought, Issue 2, February 1996.

    [b] Caputo, L., The Internet and the Evolution of Artificial Intelligence, CAI'96 - Computing and Artificial Intelligence, VI Edition, University of Lodz, Poland, 19-22 December 1996.

    [c] Moor, J. H., An analysis of the Turing test, Philosophical Studies, 30:249--257, 1976.

    [d] Eccles, J. C., The Brain And The Unity of Conscious Experience, Cambridge University Press, 1965.

    [e] Lenat, D. B., Steps to sharing Knowledge, Toward Very Large Knowledge Bases, IOS Press, 1995.