The Design of Future Things Page 5
The twenty-first-century automobile has more and more reflective components: the conscious, reflective parts of the car+driver are being taken over by the car itself. The reflective powers are evident in the adaptive cruise control that continually assesses how close the car is to other vehicles, navigation systems that monitor how well the driver conforms to instructions, and all systems that monitor the driver’s behavior. When the car’s reflective analyses find problems, they signal the person to change behavior or just simply correct it when possible—but the car will take over complete control when it determines that this is required.
Someday cars will no longer need drivers. Instead, people will all be passengers, able to gossip, read, or even sleep while the car chauffeurs them to their destination. Do you enjoy driving? Fine, there will be special places set aside for people to drive their cars, just as those who enjoy horseback riding today have special places set aside for that activity. When this day arrives, and I expect it to happen some time in the twenty-first century, the entity known as car+driver will be extinct. Instead, we will have cars, and we will have people, just as we used to, except now the car will be visceral, behavioral, and reflective: a truly intelligent, autonomous machine, at least for the purposes of transportation, which will include not only the navigation and driving but also taking care of the comfort and well-being of the passengers, providing the right lighting, temperature, food and drink, and entertainment.
Will passengers be able to have meaningful conversations with their cars? In the past, the human tendency to assign beliefs, emotions, and personality traits to all sorts of things has been criticized as anthropomorphism. As machines gain in their cognitive and emotional capacities, the anthropomorphism may not be so erroneous. These assignments might very well be appropriate and correct.
The Gulfs of Goals, Action, and Perception
People have many unique capabilities that cannot be replicated in machines—at least not yet. As we introduce automation and intelligence into the machines we use today, we need to be humble and to recognize the problems and the potential for failure. We also need to recognize the vast discrepancy between the workings of people and of machines.
Today, there are “intelligent systems” in many everyday things. We have intelligent washing machines, dishwashers, robot vacuum cleaners, automobiles, computers, telephones, and computer games. Are these systems really intelligent? No, they are responsive. The intelligence is all in the heads of the design team, people who carefully try to anticipate all possible conditions and program into the system the appropriate responses. In other words, the design team is mind reading, trying to assess all of the possible future states and how a person would respond in each situation. On the whole, these responsive systems are valuable and helpful—but they often fail.
Why the failure? Because these systems can seldom measure directly the object of interest: they can only measure things their sensors can detect. Human beings have an incredibly rich sensorimotor system that allows continuous assessment of the state of the world and of our own bodies. We have tens of millions of specialized nerve cells for detecting light and sound, touch and taste, feel and balance, temperature and pressure, and pain, and internal sensors for our muscles and body position. In addition, we have built up complex representations of the world and our actions upon it, as well as accurate expectations based upon a long history of interaction. Machines don’t even come close.
Machines’ sensors are not only limited, but they measure different things from those of people. Psychological perception is not the same as physical sensing. Machines can detect light frequencies, infrared and radio waves that people cannot. They can detect sound frequencies that lie outside the range of human perception. The same is true for many other variables, as well as for action systems. We humans have flexible muscles and limbs, with dexterous fingers and toes. Machines are much less flexible but also more powerful.
Finally, people’s goals are very different from those of machines. Indeed, many people would even deny that machines have goals. As machines get smarter and smarter, more and more intelligent, however, they will assess the situation and decide upon a course of action, with some distinct goals that they wish to accomplish. As for emotions, well, human emotions are central to our behavior and interpretation of the world. Machine emotions don’t exist, and even when machines do start to have rudimentary emotions, they will differ considerably from those of people.
Common Ground: The Fundamental Limitation in Human-Machine Interaction
Alan and Barbara begin with a great mass of knowledge, beliefs, and suppositions they believe they share. This is what I call their common ground. . . . [T]hey assume to be common ground what has taken place in conversations they have jointly participated in, including the current conversation so far. The more time Alan and Barbara spend together, the larger their common ground. . . . [T]hey cannot coordinate their actions without rooting them in their common ground.
—Herbert Clark, Using Language.
Communication and negotiation require what linguists call a “common ground”: a shared basis of understanding that serves as the platform for the interaction. In the quotation by the psycholinguist Herbert Clark, above, the fictitious couple, Alan and Barbara, involve their shared common ground in all joint activities, whether linguistic or not. When people from the same culture and social group interact, their shared beliefs and experiences allow for rapid and efficient interactions. Ever eavesdrop on the conversations of others? I do it often while walking through shopping malls and parks, in the name of science, of course. I am continually amazed by the lack of content, even between two people heavily engaged in discussion. A typical conversation might go like this:
Alan: “You know?”
Barbara: “Yeah.”
To Alan and Barbara this interchange might very well be deep and significant. You and I will never know because all the critical knowledge we need to understand what is being referred to is missing: their common ground is unavailable to us.
The lack of common ground is the major cause of our inability to communicate with machines. People and machines have so little in common that they lack any notion of common ground. People and people? Machine and machine? That’s different: those pairs function quite well. People can share with other people. Machines can share with other machines. But people and machines? Nope.
It might surprise you to hear that machines can share common ground with one another, but that is because their designers, usually engineers, spend a lot of time to ensure that all the background information required for efficient communication is indeed shared. When two machines start to interact, they first go through a ritual to ensure that there is mutual agreement about shared information, states, and even the syntax of the interaction. In the jargon of communication engineers, this is called “handshaking.” This is so important that the engineering world has developed a huge framework of international committees to develop worldwide standards to ensure that communicating devices share the same assumptions and background knowledge. Standards are difficult to work out, for they require complex negotiations among otherwise competing companies, with technical, legal, political issues all having to be resolved. The end results are worth it, however: they establish the common language, protocols, and background knowledge required for the establishment of a common ground and, therefore, for effective communication.
Want an example of how two machines establish common ground? Although the handshaking is usually quiet and invisible to us humans, it is involved in almost every use of electronic devices that wish to communicate with one another, whether it is your television set talking to the cable box and the cable box to the transmitting equipment, your computer connecting to a website, or your cell phone searching for a signal when you first turn it on. The most accessible example, however, comes from all those peculiar sounds that come out of a fax machine. After you have dialed the phone number (note that the dial tone and ringing sounds are also forms of handshaking), you then hear a series of warbling tones as your fax machine negotiates with the receiving machine what coding standard to use, what transmission rate, and what resolution on the page. Then, as the fax proceeds, one machine transmits the signals, and the other continually acknowledges correct receipt. It’s a more restricted and mechanized version of the interaction between two people meeting for the first time as they try to figure out whom they know in common and what skills and interests they might share.
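For readers who like to see such things spelled out, the fax negotiation can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real fax standard works: the names Capabilities, first_shared, and handshake are hypothetical, and the coding labels, rates, and resolutions are made-up values. Each machine lists what it can do in order of preference, and the handshake settles on the first option that both share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """What one machine can do, listed from most to least preferred.

    Hypothetical structure for illustration; the values are invented.
    """
    codings: tuple
    rates_bps: tuple
    resolutions_dpi: tuple


def first_shared(preferred, offered):
    """Return the caller's most preferred option that the receiver also supports."""
    for option in preferred:
        if option in offered:
            return option
    return None


def handshake(caller, receiver):
    """Agree on coding, rate, and resolution; return None if any one cannot be agreed."""
    agreed = {
        "coding": first_shared(caller.codings, receiver.codings),
        "rate_bps": first_shared(caller.rates_bps, receiver.rates_bps),
        "resolution_dpi": first_shared(caller.resolutions_dpi, receiver.resolutions_dpi),
    }
    if None in agreed.values():
        return None  # no common ground, so no transmission
    return agreed


# Made-up machines: the caller is newer and prefers faster, finer settings.
caller = Capabilities(codings=("B", "A"), rates_bps=(14400, 9600), resolutions_dpi=(200, 100))
receiver = Capabilities(codings=("A",), rates_bps=(9600, 4800), resolutions_dpi=(100,))

settings = handshake(caller, receiver)
print(settings)
```

With these two machines, the handshake falls back to the older coding, the slower rate, and the coarser resolution, because those are the only options the two have in common. If the receiver supported nothing the caller knew about, the function would return None: the machines would have no common ground at all.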
People can share common ground with other people. Machines can negotiate a common ground with other machines. But machines and people inhabit two different universes, one of logically prescribed rules that govern their interaction, the other of intricate, context-dependent actions, where the same apparent condition will give rise to different actions because “circumstances are different.” Moreover, the fundamental gulfs of goals, actions, and perception mean that machines and people will not even be able to agree upon such fundamental things as, What is happening in the world? What actions can we take? What are we trying to accomplish? The lack of common ground is a supergulf, keeping machines and humans far apart.
People learn from their pasts, modifying their behavior to account for what they have learned. This also means that the common ground between people grows over time. Moreover, people are sensitive to which activities have been shared, so that Alan may interact with Barbara quite differently than he does with Charles, even in similar circumstances, because Alan realizes that the common ground he shares with Barbara is quite different from what he shares with Charles. Alan, Charles, and Barbara have the capacity to exchange new information; they can learn from their experiences and modify their behavior accordingly.
In contrast, machines can barely learn. Yes, they can make modifications in their performance as they experience success or failure, but their ability to generalize is very weak and, except in a few laboratory systems, pretty much nonexistent. Machine capabilities are continually improving, of course; throughout the world, research laboratories are working on all of these issues. But the gulf between what people have in common with one another and what machines and people have in common is huge and unlikely to be bridged in the foreseeable future.
Consider the three opening scenarios of future capabilities that started this chapter. Are they possible? How can machines know a person’s private thoughts? How can they know what other activities are happening outside the range of their sensors? How can machines share enough knowledge about people to be so cocky in their suggestions? The answer is, they can’t.
My refrigerator won’t let me eat eggs? Maybe I’m not going to eat them; maybe I’m cooking for someone else. Yes, the refrigerator could detect that I was removing eggs, could know my weight and cholesterol levels through a medical information network that included both my home and some parts of my medical record from my physician’s office, but that still doesn’t give it the ability to read my mind and determine my intentions.
Can my automobile check my schedule and select an interesting route for me to drive? Yes, everything in that scenario is possible except, perhaps, the natural language interaction, but systems that speak are getting pretty good, so I wouldn’t rule that out. Would I agree with the choice? If the car acted as described, it wouldn’t matter: it is presenting an interesting suggestion, one I might not have thought of, but allowing me to choose. That’s a nice, friendly interaction, one I certainly approve of.
Could my house actually be jealous of other nearby homes? This is unlikely, although comparing the equipment and operation of nearby homes is a perfectly sensible way to keep up to date. In businesses, this is called “benchmarking” and following “best practices.” So, once again, the scenario is possible, although not necessarily with the same jaunty language.
Machines are very limited in learning and predicting the consequences of new interactions. Their designers have incorporated whatever limited sensors their budget and the state of technology will allow. Beyond that, the designers are forced to imagine how the world might appear to the machine. From the limited data provided by the sensors, the designers must infer what might actually be going on and what actions the machine ought to take. Many of these systems do remarkably well as long as the task is well constrained and there are no unexpected occurrences. Once the situation goes outside the simple parameters for which they were designed, their simple sensors and intelligent decision-making and problem-solving routines are simply insufficient for the task. The gulf that separates people from machines is immense.
The fundamental restriction on people’s successful interactions with machines is the lack of common ground, but systems that avoid this danger, that suggest rather than demand, that allow people to understand and choose rather than confronting them with unintelligible actions, are perfectly sensible. The lack of common ground precludes many conversationlike interactions, but if the assumptions and commonalities are made clear, perhaps through implicit behavior and natural interactions that are readily interpreted by both machines and people, why then, I’m all for it. And this is the topic of chapter 3.
FIGURE 3.1
Kettle with whistle. A simple technology that summons us to do its bidding: Hear my whistle? Come and take care of me.
Photograph © Daniel Hurst. Used under license from Acclaim Images™.
CHAPTER THREE
Natural Interaction
Whistles signal. People communicate. The difference is profound. Designers may think their designs communicate, but, in fact, they only signal, for the communication only goes in one direction. We need a way of coordinating our activities, cooperating with autonomous machines, so that we can perform tasks together smoothly, pleasurably.
Natural Interaction: Lessons to Be Learned
Almost all modern devices come with an assortment of lights and beeping signals that alert us to some approaching event or act as alarms, calling our attention to critical events. In isolation, each is useful and helpful. But most of us have multiple devices, each with multiple signaling systems. The modern home and automobile can easily have dozens or even hundreds of potential signals. In industry and health care, the number of alerts and alarms increases dramatically. If the trend continues, the home of the future will be one continual wail of alerts and alarms. So, although each single signal may be informative and useful, the cacophony of the many is distracting, irritating, and, as a result, potentially dangerous. Even in the home, where danger is less often encountered, when many signals might be active, even the beep of one is unintelligible:
“Did I hear the washing machine beep?” asks my wife.
“I thought it was the dishwasher,” I respond, scurrying from kitchen to laundry room and back again, trying to figure out which it was.
“Oh, it’s the timer on the microwave oven. I forgot that I had set it to remind me when I had to make that phone call.”
The devices of the future promise to move us into even more confusion and annoyance if they follow the same method of signaling used today. Yet, there is a better way, a system of natural interaction that can be more effective and simultaneously less annoying. We manage well in the natural world, interpreting the signs and signals of the environment and its inhabitants. Our perceptual system conveys a rich sense of space, created from the seamless combination of sights and sounds, smells and feelings that surround us. Our proprioceptive system conveys information from the semicircular canals of the inner ear and our muscles, tendons, and joints to give us a sense of body location and orientation. We identify events and objects rapidly, often from just minimal cues—a brief glimpse or sound, for instance. But more importantly for my purposes, natural signals inform without annoyance, providing a natural, nonintrusive, nonirritating, continuous awareness of the events around us.
Consider natural sounds, for example: not the beeps and buzzes of our equipment, not even speech sounds, but natural environmental sounds. Sounds convey a rich picture of the happenings around us because sounds are an automatic result whenever objects move, whenever they meet one another, scraping, colliding, pushing, or resisting. Sounds tell us where things are located in space, but they can also reveal their composition (leaves, branches, metal, wood, glass) and activity (falling, sliding, breaking, closing) as well. Even stationary objects contribute to our aural experience, for the way that sounds are reflected and shaped by environmental structures gives us a sense of space and our location within it. This is all done so automatically, so naturally, that we are often unaware of how much we depend upon sound for our spatial sense and for our knowledge of the events in the world.
There are lessons to be learned from these natural interactions with the real world. Although simple tones and flashes of white or colored light are the easiest ways for designers to add signals to our devices, they are also the least natural, least informative, and most irritating of means. A better way to design the future things of everyday life is to use richer, more informative, less intrusive signals: natural signals. Use rich, complex, natural lights and sounds so that people can tell whether a sound is in front or behind, up or down, what the material and composition is of visible objects, whether an expected event is near in time or far, critical or not. Not only are natural signals less intrusive, but they can be a lot more informative, always in the background making us, if only subconsciously, aware of the state of ongoing processes. They are easier to identify, so we no longer have to scurry about trying to find the source of the signal. Natural, yet providing continual awareness. The natural world of sound, color, and interaction is also the most satisfying. Want an example? Consider the whistling kettle.