Clippy and the History of the Future of Educational Chatbots

Earlier this year, Microsoft made headlines when it debuted Tay, a new chatbot modeled to speak like a teenage girl, which rather dramatically turned into “a Hitler-loving sex robot within 24 hours” of its release, as The Telegraph put it. The Twitter bot was built to “learn” by parroting the words and phrases from the other Twitter users that interacted with it, and – because, you know, Twitter – those users quickly realized that they could teach Tay to say some really horrible things. Tay soon began responding with increasingly incendiary commentary, denying the Holocaust and linking feminism to cancer, for starters.

Despite the public relations disaster – Microsoft promptly deleted the Tay bot – just a few days later Bloomberg Businessweek pronounced that “The Future of Microsoft Is Chatbots.” “Clippy’s back,” the headline read.

Neither Tay nor Clippy should reassure us all that much, I’d contend, about that future.

User Interface Agent as Pedagogical Agent

Clippy was the user interface agent that came bundled with Microsoft Office starting in 1997. It remains, arguably, the best known and most hated user interface agent in computer history.

The program’s official name was Office Assistant, but the paperclip was the default avatar, and few people changed it. Indeed, almost every early website offering instructions on how to use Microsoft’s software suite contained instructions on how to disable its functionality. (Microsoft turned off the feature by default in Office XP and removed Clippy altogether from Office 2007.)

The Office Assistant can trace its lineage back to Microsoft Bob, which was released in 1995, itself becoming one of the software company’s most storied failures. (TIME named it one of “The 50 Worst Inventions” – “Imagine a whole operating system designed around Clippy, and you get the crux of Microsoft Bob.”) Bob was meant to provide a more user-friendly interface to the Microsoft operating system, functioning in lieu of Windows Program Manager. The challenge – quite similar to the one that Clippy was supposed to tackle – was to make computer software approachable to novice users. In theory at least, this made sense as the number of consumers being introduced to the personal computer was growing rapidly – according to US Census data, in 1993 22.8% of households had computers, a figure that had grown to 42.1% by 1998.

How do you teach novices to use a PC? And more significantly, can the personal computer do the teaching?

Microsoft drew on the work of Stanford professors Clifford Nass and Byron Reeves (who later joined the Bob project as consultants) and their research into human-computer interactions. Nass and Reeves argued that people preferred to interact with computers as “agents” not as tools. That is, computers are viewed unconsciously as social actors, even if consciously people know they’re simply machines. And as such, people respond to computers in social ways and in turn expect computers to follow certain social rules.

“The question for Microsoft was how to make a computing product easier to use and fun,” Reeves said in a Stanford press release timed with the Consumer Electronics Show’s unveiling of Microsoft Bob. “Cliff and I gave a talk in December 1992 and said that they should make it social and natural. We said that people are good at having social relations – talking with each other and interpreting cues such as facial expressions. They are also good at dealing with a natural environment such as the movement of objects and people in rooms, so if an interface can interact with the user to take advantage of these human talents, then you might not need a manual.” If you made the software social, people would find it easier to learn and use.

Microsoft Bob visualized the operating system as rooms in a house, with various icons of familiar household items representing applications – the clock opened the calendar, the pen and paper opened the word processing program.

This new “social interface” was hailed by Bill Gates at CES as “the next major evolutionary step in interface design.” But it was a flop, panned by tech journalists for its child-like visuals, its poor performance, and, perhaps ironically considering Microsoft’s intentions for Bob, its confusing design.

Nevertheless Microsoft continued to build Bob-like features into its software, most notably with Clippy, which offered help to users as they attempted to accomplish various tasks within Office.

Clippy as Pedagogical Agent

Start writing a letter in (pre-Office XP) Microsoft Word. No sooner have you finished typing “Dear” than Clippy appears in the corner of your screen. “It looks like you’re writing a letter,” Clippy observes. The talking paperclip then offers a choice: get help writing the letter or continue without help. Choosing the former opens up the Letter Wizard, a four-step process for formatting layout and style.

Other actions within the Office suite triggered similar sorts of help from Clippy – offering to implement various features or offering advice if, according to the program, it appeared that the user was “stuck” or struggling. The Office Assistant also provided access to the Answer Wizard, offering a series of possible solutions to a user’s help query. And it sometimes appeared as an accompaniment to certain dialog boxes – saving or printing, for example.

In all these instances, Clippy was meant to be friendly and helpful. Instead it was almost universally reviled.

Of course, software can be universally reviled and still marketed as good (ed-)tech. (See, for example, the learning management system.) But it seems doubtful that Clippy was all that effective at helping newcomers to Microsoft learn to use Office’s features, as Luke Swartz found in his study on Clippy and other user interface agents.

Swartz suggests that part of the problem with Clippy was that it was poorly designed and then (mis)applied to the wrong domain. If you follow Nass and Reeves’ theories about humans’ expectations for interactions with computers, it’s clear that Clippy violates all sorts of social norms. The animated paperclip is always watching, always threatening to appear, always interrupting. For users busy with the rather mechanical tasks of typing and data entry, these were never the right situations for a friendly chatbot to interject, let alone to teach software skills.

And yet, despite our loathing and mockery of Clippy, pedagogical agents have been a mainstay in education technology for at least the past forty years – before the infamous Microsoft Office Assistant and since. These agents have frequently been features of intelligent tutoring systems, and by extension then, featured in education research. Like much of ed-tech, that research is fairly inconclusive: depending on their design and appearance… pedagogical agents may or may not be beneficial … to some students… under some conditions… in some disciplines… working with certain content or material.

The History of the Future of Chatbots

This spring, despite the PR disaster of Microsoft’s Tay, the tech industry declared that chatbots would be The Next Big Thing, an assertion bolstered when Mark Zuckerberg announced at Facebook’s developer conference in April that the company’s 10-year roadmap would emphasize artificial intelligence, starting with chatbots on its Messenger platform.

Bots are, in fact, a Very Old Thing traceable to the earliest theorization of computer science – namely, Alan Turing’s “Computing Machinery and Intelligence” published in 1950.

The sudden and renewed interest in bots by tech investors and entrepreneurs, and the accompanying hype by industry storytelling, overlooks the fact that roughly half the traffic on the Internet is already bots. Bots crawl and scrape websites. Bots send spam. Bots spread malware. Bots click on ads. Bots tweet, and bots like. Bots DDOS. Bots monitor for vulnerabilities.

Bots also chat, but, as Clippy demonstrated, not always that effectively. And as one recent TechCrunch opinion writer lamented about the Facebook Messenger platform, “No one actually wants to talk to a bot.” That seems to be rather a crucial observation, often overlooked when hyping the capabilities of artificial intelligence. To be fair, no one actually wants to talk to a human either in many of the scenarios in which bots are utilized – in customer service, for example, where, whether conducted by human or machine, interactions are highly scripted.

The first chatbot was developed at the MIT AI Lab by Joseph Weizenbaum in the mid-1960s. This bot, ELIZA, simulated a Rogerian psychiatrist. “Hello,” you might type. “Hi,” ELIZA responds. “What is your problem?” “I’m angry,” you type. Or perhaps “I’m sad.” “I am sorry to hear you are sad,” ELIZA says. “My dad died,” you continue. “Tell me more about your family,” ELIZA answers. The script always eventually asks about family, no matter what you type. It’s been programmed to do so. That is, ELIZA was programmed to analyze the input for key words and to respond with a number of canned phrases containing therapeutic language.
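To see just how little is going on in an exchange like that, here is a rough sketch in Python of the key-word-and-canned-phrase approach. It is not Weizenbaum’s script: the rules, the fallback line, and the names (RULES, DEFAULT_REPLY, respond) are all invented for illustration, and the real ELIZA’s pattern rules were more elaborate.

```python
import re

# Hypothetical rules for illustration only; this is NOT Weizenbaum's original script.
# Each rule pairs a key-word pattern with a canned reply. Order matters:
# the first rule that matches wins.
RULES = [
    (re.compile(r"\b(mother|father|mom|dad|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
    (re.compile(r"\bi['’]?m (sad|angry)\b", re.IGNORECASE),
     "I am sorry to hear you are {0}."),
    (re.compile(r"\b(hello|hi)\b", re.IGNORECASE),
     "Hi. What is your problem?"),
]

# Assumed fallback for input that contains no known key word.
DEFAULT_REPLY = "Please go on."


def respond(text: str) -> str:
    """Return the canned phrase for the first rule whose key word appears in text."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Slot the matched key word into the reply where the template asks for it.
            return template.format(*(group.lower() for group in match.groups()))
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["Hello", "I'm sad", "My dad died", "The weather is nice"]:
        print(f"> {line}")
        print(respond(line))
```

Nothing here models what the person typing means. The program looks up a word and returns a phrase, which is the whole trick.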

Many of the claims that one hears about “the rise of bots” (now and then and always) focus on AI’s purported advancements – particularly in the area of natural language processing. The field has reached a point where “personal assistant” technologies like Siri and Alexa are now viable – or so we’re told. Commercially viable. (Maybe commercially viable.)

But pedagogically viable? That still remains an open question.

Scripting Pedagogy

“Imagine Discovering That Your Teaching Assistant Really Is a Robot,” The Wall Street Journal wrote in May to describe an experiment conducted on students in an online course taught by Ashok Goel at Georgia Tech. The chatbot TA, “Jill Watson,” would post questions and deadline reminders on the class’s discussion forum and answer students’ routine questions. The surname is a nod to the technology that powered the chatbot – IBM’s Watson. The first name and gendering of the robot? Well, like Tay and ELIZA and Siri and Alexa, these bots are female, as Clifford Nass explained in an interview with The Toronto Star, because of the stereotypes we have about the work – and the gender – of personal assistants, and by extension, perhaps, of teaching assistants.

Artificial intelligence and cognitive science professor Roger Schank, a vocal critic of IBM’s marketing claims about Watson, responded to the Georgia Tech TA bot story:

The artificial TA is not an attempt to understand TA’s, I assume. But, let’s think about the idea that we might actually like to build an AI TA. What would we have to do in order to build one? We would first want to see what good teachers do when presented with problem students are having. The Georgia Tech program apparently was focused on answering student questions about due dates or assignments. That probably is what TA’s actually do which makes the AI TA question a very uninteresting question. Of course, a TA can be simulated if the TA’s job is basically robotic in the first place. [emphasis mine]

But, what about creating a real AI mentor? How would we build such a thing? We would first need to study what kinds of help students seek. Then, we would have to understand how to conduct a conversation. This is not unlike the therapeutic conversation where we try to find out what the student’s actual problem was. What was the student failing to understand? When we try to help the student we would have to have a model of how effective our help was being. Does the student seem to understand something that he or she didn’t get a minute ago? A real mentor would be thinking about a better way to express his advice. More simply? More technically? A real mentor would be trying to understand if simply telling answers to the student made the best sense or whether a more Socratic dialogue made better sense. And a real TA (who cared) would be able to conduct that Socratic dialogue and improve over time. Any good AI TA would not be trying to fake a Rogerian dialogue but would be thinking how to figure out what the student was trying to learn and thinking about better ways to explain or to counsel the student.

Is this possible? Sure. We stopped working on this kind of thing because of the AI winter that followed from the exaggerated claims being made about what expert systems could do in 1984.

Schank’s commentary underscores that, despite all the recent hype about advances in artificial intelligence, we do not have thinking machines. Not even close. We certainly don’t have caring machines. Yet we continue to build teaching machines that reduce pedagogy to its most instrumental form. We continue to build pedagogical agents that reduce helping to the most mechanical and scripted gestures.

“Do pedagogical agents work?” – the question, perhaps unintentionally, underscores the labor of teaching and caring we seem so eager to replace with machines. Instead of relationships, we’ll get “chat.” Instead of people, we’ll have robots.

All this gets to the heart of why Clippy remains (ironically perhaps) so instructive: Clippy was a pedagogical agent that urged Office users to utilize a step-by-step “wizard.” It referred them to the software’s knowledge base. Templated knowledge. Templated writing. Templated and scripted responses based on key words, not on cognition or care. And according to Luke Swartz’s research on Clippy, people preferred asking other people for help with Office – asking co-workers, asking the Web – rather than relying on the machine for guidance or support. It’s not a surprising finding. And yet the tech industry today insists that bots are coming to all sectors, including education. The history of the future...

Written by

Audrey Watters

Credits

Hack Education

The History of the Future of Education Technology