In Pursuit of the Mechanical Man
Robot soldiers in any form may be decades away, but even that task is simple compared with producing a robot that could be mistaken for a real human. Creating a humanoid robot is the ultimate goal for many AI researchers, and the most daunting. A convincing humanoid robot would have to walk, gesture, and maneuver as easily as a human, understand spoken language, communicate, and perhaps even feel emotions and respond to feelings. These are just some of the challenges that AI researchers in labs all over the world must consider.
The scientific research in pursuit of the mechanical man is scattered. Researchers tend to specialize in only one small area of humanoid robotic operation such as speech recognition, visual perception, or mobility, each of which is a highly technical, complex discipline in itself. This is an enormously costly endeavor with an uncertain timeline. For many years, Japan has led the research in humanoid robotics because, as Hirohisa Hirukawa of Japan's National Institute of Advanced Industrial Science and Technology says, "We are confident that the long-term future of humanoid robots is bright." 17
Why Make It Look Human?
Researchers already know how to make machines that are sturdy enough to be dropped onto another planet, smart enough to run businesses, and precise enough to perform surgery. So why would scientists go to all the trouble of making a robot look like a person? This question is hotly debated. Some scientists believe there is no reason. So far, nonhumanoid robots perform better than those with humanoid designs, and it is less expensive to create machines that maneuver on several legs or on wheels or treads. Some researchers raise ethical concerns that if robots look too human there may be the potential for abuse. "Robots need to be designed so that they will be regarded as appliances rather than as people," 18 says John McCarthy of Stanford University. People may treat humanoid robots as slaves,
and that is one relationship that ethicists fear could carry over to interpersonal relationships.
The Creep Factor
Kismet's face hardly resembles a human's, but it is cute, and it attracts an observer's attention with its lifelike expressions. David Hanson, who worked for Walt Disney Imagineering as a sculptor-roboticist, is one researcher working to put a more realistic face on future robots. He created a high-tech polymer that resembles human skin, which he calls Frubber. With it he built a very humanlike face over a robotic head. So far powered only by a laptop computer, the head, called Hertz, can bat its eyes and ask questions. Hertz has twenty-eight very realistic-looking facial movements, including smiling, sneering, and furrowing its brow. But a robot that looks too real can cause problems. Tests showed that a robot with a too-realistic face gave people the creeps and actually decreased the robot's effectiveness.
The other side in this debate is strongly in favor of a robot that looks like a person, especially if it were to work with people in a home or work environment. "If you build a robot that people have a short-term interaction with, you'd better make it connect with things people are familiar with," 19 says Stanford University professor Sebastian Thrun. People are more likely to interact with a robot that is designed like themselves than a robot with an alien shape and design. There is also the belief that the essence of intelligence is a combination of mind and body. Japanese robotics expert Fumio Hara believes that a robot would not be completely effective without the embodiment of a humanlike presence. So what elements are essential in order to make a robot humanlike? It needs to have a familiar face, identifiably humanlike behaviors, appropriate social interactions, and of course a physical form that stands upright and is capable of walking on two legs.
Walk This Way
Almost all children are toddling around on two legs by the age of two. This physical attribute took millions of years to evolve in primates. Honda, one of the largest industrial companies in the world, needed millions of dollars but only ten years of top-secret research to duplicate this movement in a machine.
Walking upright is extremely difficult to duplicate and demands a great deal of AI computing power. Balance, coordination, muscle power, and flexibility are needed just to take three steps across a smooth tile floor in a straight line. But stepping out onto unfamiliar rocky terrain with many obstacles in the robot's path requires even more AI power to adjust a step, alter foot placement, and register ground resistance, all things people do without conscious thought.
Honda's robot, ASIMO (Advanced Step in Innovative Mobility), is able to walk backward and sideways, go up and down stairs, and even play soccer. ASIMO looks like a small child traipsing around in a white plastic space suit. ASIMO is only four feet tall, but it can simulate human locomotion and use its arms in a realistic fashion. Its designers made special efforts to make it cute so that it was not perceived as threatening and would be more easily accepted. In 2003 ASIMO marched into a state dinner attended by the prime ministers of Japan and the Czech Republic. ASIMO shook hands with Prime Minister Vladimir Spidla and placed a bouquet of flowers at the base of a statue honoring science fiction author Karel Capek, who coined the term robot in 1920.
Other humanoid robots have demonstrated their prowess at martial arts, soccer, dancing, and even conducting an orchestra. The key to making a robot athletic is simulating muscle, bone, and nerve in machinery. Most robots share the same basic components: a jointed metal or plastic skeletal frame, with motors, pulleys, gears, and hydraulics providing the muscle power. But advances in polymer chemistry are changing that. Researchers are experimenting with new materials such as EAP, or electroactive polymer, to produce more realistic muscle power. EAP is a rubbery plastic substance that changes shape when electricity is applied to it. It can be made into bundles of fibers that shorten or lengthen, just like real muscles, when a voltage is applied. The material also weighs less and is less likely to break down than metal. But the most important element is the brainpower needed to coordinate it all.
ASIMO carries its computer brain in a pack on its back. It has three cameras (two on its head and one at its waist) that allow it to see and chase a soccer ball. It also has sensors in each ankle to predict its next step. Gravity sensors keep track of the force of each movement, and solid-state gyroscopes monitor its body position and keep it upright. But what keeps this petite robot upright and balanced while carrying out complex movements are impressive AI algorithms programmed into its circuitry. If ASIMO stands on one leg and swings its arm out to the side, the program automatically adjusts and the robot moves its torso to keep its balance.
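ASIMO's actual control software is proprietary, but the feedback principle described above can be sketched in a few lines of Python. Everything here, the gain, the units, and the simple proportional rule, is an invented illustration, not Honda's algorithm.

```python
def torso_correction(tilt_deg, gain=0.6):
    """Toy proportional controller: counter-rotate the torso by a
    fraction of the tilt reported by the gyroscope sensors.
    The gain value is invented for illustration."""
    return -gain * tilt_deg

# Swinging an arm out to the side disturbs the robot's balance;
# each control cycle, the torso correction shrinks the remaining tilt.
tilt = 10.0  # degrees of lean caused by the arm swing (made-up figure)
history = [tilt]
for _ in range(8):
    tilt += torso_correction(tilt)
    history.append(tilt)
```

After eight cycles the simulated tilt has fallen from 10 degrees to a fraction of a degree; a real balance controller runs this kind of sense-and-correct loop many times per second, with far more sophisticated dynamics.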
Being able to move and walk on two legs is one accomplishment, but knowing how to navigate is another. Robots no longer get around using preprogrammed maps, as Shakey the robot did fifty years ago. Today genetic algorithms direct robotic navigation and control systems so that a robot can learn and adapt to any new environment. Many navigation programs are under experimentation, but the most unusual uses pain as a navigational tool.
Robots in this experiment were trained to seek out specific objects, grab them, and transport them to a specific drop-off point. The experiment's designer, Matthew Hage of the Lawrence Livermore National Laboratory, influenced the robot's choice of route by programming it to "feel" pain. When the robot bumped into a physical object, it "hurt" from the damage it suffered. If it came close to a hot spot, a place where radiation emanated, the robot also associated it with pain and kept its distance. The robot's task, then, was to follow the least painful path.
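The account above gives few implementation details, but treating pain as a routing cost maps naturally onto classic least-cost pathfinding. The sketch below is a Python illustration: the grid, the pain values, and the use of Dijkstra's algorithm are all assumptions made here, not specifics of the Livermore experiment.

```python
import heapq

def least_painful_path(pain, start, goal):
    """Dijkstra's algorithm over a grid where each cell's value is the
    'pain' of entering it: contact damage or proximity to a hot spot.
    Returns the total pain of the least painful route."""
    rows, cols = len(pain), len(pain[0])
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + pain[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return None  # goal unreachable

# 0 = painless floor, 1 = mild discomfort, 9 = obstacle or hot spot
pain_map = [
    [0, 9, 1, 0],
    [1, 9, 1, 9],
    [1, 1, 1, 1],
]
route_pain = least_painful_path(pain_map, (0, 0), (0, 3))
```

On this map the robot detours along the bottom row (total pain 6) rather than brushing past the hot spots on the direct route (total pain 10).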
Few robots are equipped to perceive stimuli as pain, but sensory perception is key to making an effective humanoid robot or any other artificial intelligence. Humans experience the world through five senses. In order for a robot to interact with humans effectively, it has to be able to experience what humans experience. Without the senses of hearing, touch, sight, taste, and smell, people would not be able to act fully within their environment. Even Alan Turing felt that perception was important in his early theoretical design. In his paper "Computing Machinery and Intelligence" he suggested, "It is best to provide the machine with the best sense organs that money can buy and then teach it to understand and speak English. This process could follow the normal teaching of a child." 20
Perception and thinking are the respective functional correlates of the sensory organs and the brain. In order to learn the most from its environment, the human brain fine-tunes how and what a person senses. Giving machines the chance to perceive the world through similar, if not better, sensory organs allows them a chance to understand the world as humans do. Otherwise they would simply be programmed machines incapable of learning.
Some of the earliest perception systems were designed to recognize language, that is, to identify the characters and words that make up text. Once the language was perceived, the machine would convert it into a coded form. Optical character recognition (OCR) systems, for example, are still widely used today to convert printed text into machine-readable code. Understanding what that information meant, however, was limited to the context of the word or symbol. What at first looked like a simple exercise in creating a machine that could see and recognize symbols became an exploration into how humans perceive and understand the world.
Many aspects of human sensory perception are difficult if not impossible to duplicate. The human eye, for example, is an incredibly complex structure that
provides frontal and peripheral vision; a pair of eyes also provides depth perception and better form recognition. The eyes can lock on to a moving object and follow it in one smooth motion, or jump from object to object in quick stop-and-start movements, all without blurring a person's vision. They register the level of light and measure the depth and distance of objects, sifting through a great deal of information before any encoded messages are sent to the brain, where 60 percent of the cerebral cortex is devoted to processing visual information. Scientists know how the eye operates; its mechanics and movements can be duplicated. But scientists know much less about how eye-brain perception works, and duplicating that connection is more difficult.
To simulate eye-brain perception, a robot needs several cameras operating at the same time. The cameras must be connected to a neural network that can sift through the data and pass the pertinent information on to higher levels of the network. Mathematical algorithms convert patterns of color intensity into descriptions of what appears before the cameras. Computer vision is capable of detecting human faces, locating the eyes in a face, tracking movement, and registering various shapes and colors. But it is not as good at telling whether a face is male or female, determining the direction of a person's gaze, recognizing the same person who appears later wearing a hat or a beard, or telling the difference between a cup and a comb. Those distinctions come from a higher level of perception that is still being explored. A robot's view of the world is best described by Rodney Brooks in his book Flesh and Machines as "a strange, disembodied, hallucinatory experience." 21
Like eyes, ears are complex organs. The human ear is capable of registering and identifying thousands of different sounds. A person can determine what made a sound and roughly how far away its source is. Microphones that operate as a robot's ears only receive the sound. The perception comes from complex AI programs that register a sound and match it against a stored catalog of recognizable noises.
Human ears receive sound in waves, which are converted into electrical impulses that are sent to the brain. This conversion is done in the cochlea of the ear, where the sound waves resonate and trigger the movement of tiny hairs called cilia, which fire the attached neural cells. An artificial ear operates in much the same way. Locating the origin of a sound is done through measuring the infinitesimal differences in time between the waves reaching two different microphones. Speech recognition is achieved through the use of sophisticated pattern recognition software.
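That time-difference trick can be shown concretely. The Python sketch below, with invented numbers, converts the delay between two microphones into a bearing; real systems must also contend with echoes and noise, which this ignores.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second, in air at room temperature

def sound_direction(delay_s, mic_spacing_m):
    """Estimate the bearing of a sound from the arrival-time difference
    between two microphones. Zero delay means the sound is dead ahead;
    the maximum possible delay means it is directly to one side."""
    path_difference = delay_s * SPEED_OF_SOUND
    # Clamp for numerical safety, then convert to an angle off center.
    ratio = max(-1.0, min(1.0, path_difference / mic_spacing_m))
    return math.degrees(math.asin(ratio))

# Mics 20 cm apart; sound reaches one mic 0.29 ms before the other,
# placing the source roughly 30 degrees off center.
bearing = sound_direction(0.00029, 0.20)
</n>```

A delay measured in fractions of a millisecond is enough to localize a voice, which is why the paragraph calls the differences infinitesimal.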
The sense of touch is also very complicated. Large industrial robotic arms are dangerous when they maneuver with great force; a humanoid robot would need to have a delicate sense of touch so that if it bumped into a person, the robotic arm would recoil automatically without harming the human.
A humanoid robot would also have to be able to use human tools and grasp and hold objects. Robotic hands are designed with special strain gauges that measure the amount of pressure needed to pick up an object and contact switches that simulate touch and grasping motions. When a switch comes in contact with something, it closes and sends a signal to the computer brain. The strain gauges record the appropriate amount of pressure needed to pick up the object, making it possible for the same robotic hand to pick up a hammer one minute and a fragile egg the next.
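A minimal sketch of that pressure feedback, in Python with invented numbers: the grip force ramps up until a simulated strain-gauge reading shows enough force to hold the object's weight (with a made-up 20 percent safety margin), so a light egg gets a light grip and a heavy hammer a firm one.

```python
def grip_force(weight_n, step_n=0.05, safety=1.2):
    """Ramp the grip in small increments until the simulated strain
    gauge reads enough force to hold the object. The step size and
    20% safety margin are invented for illustration."""
    force = 0.0
    while force < weight_n * safety:  # strain-gauge feedback check
        force += step_n               # squeeze a little harder
    return force

egg = grip_force(0.6)     # a fragile egg weighs little: gentle grip
hammer = grip_force(5.0)  # a hammer needs a much firmer grip
```

The same loop serves both objects; only the gauge feedback differs, which is the point of the paragraph above.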
The advantage of an artificial system is that it can be enhanced with extrasensory perception. For example, an artificial nose can be enhanced to far exceed the sensitivity of a flesh-and-blood nose. Called neural noses, these microchips are so sensitive that they can detect smells that humans are not even aware of. Artificial noses are used in airports to detect explosive materials and narcotics and may be used in diagnosing cancer by smelling precancerous tissues. Incorporating this type of extrasensory perception into a humanoid robot would give it an advantage over its human counterpart. Military robots, for example, might be fitted with chemical noses or infrared night vision that would tell them when a living being was nearby.
NASA has engineered one of the more dexterous robots, called Robonaut, to perform dangerous construction work on the space station. The prototype had to be designed to use finer motor skills than a space-suited astronaut would have. Each arm attached to Robonaut's Kevlar body contains more than 150 sensors that are connected to an artificial spinal cord through which sensory signals flow quickly back and forth to the computer brain. Its hands are capable of opening a can of soda, cutting with scissors, and handling tools commonly used on the space station.
Researchers in Japan are also working on a way to make robot hands as sensitive as a human hand. A rubbery pressure-sensing membrane laminated onto a flexible layer of plastic transistors creates a primitive artificial skin. When the fake skin is touched with the metal tip of a rod, it generates a weak electrical signal, which is then sent to receivers in the computer brain and registers as a touch.
A robot that can touch, see, and hear is ready to experience the world as a human does. One robot that embodies Alan Turing's idea of allowing a robot to learn like a child through sensory perception is Cog (from the Latin word cogitare, "to think"). A human infant learns through trial and error as it encounters each new aspect of its environment. Each new piece of information it learns is filed away and used as a basis for more experiences and learning opportunities. Cog, created at MIT more than ten years ago, learns the same way but is still no more knowledgeable than a six-month-old baby.
Unlike Deep Blue and other expert systems, Cog was not programmed to do much of anything. It must learn all the necessary data it needs by experiencing the world around it. It learns to move and react by watching its trainer's movements and reactions. Cog's bulky metal frame and face hide the fact that it is just a baby. Like an infant, it can track movement with its camera eyes and move its head to keep an object in view. It can recognize some faces, detect eye contact, and react to sounds.
This imitative learning takes time, but it is approaching self-directed learning because it allows interaction between human and robot. Cog can ask
questions or request that a movement be repeated over and over. Some robots have even shown frustration and boredom when the learning process gets difficult.
Down the hall from Cog is its cousin, Kismet. Inspired by Cog's infantile behavior, researcher Cynthia Breazeal created Kismet, one of the most sociable robots. Whereas no one treats the hulking Cog like a baby, people frequently use baby talk with Kismet. A cartoonish head bolted to a table, Kismet will respond to a person's approach, waggle its fuzzy, caterpillar-like eyebrows, and turn up its red licorice-whip lips in a grin. The underlying premise of Kismet is that emotions are necessary to guide learning and communication.
Because so much of the information exchanged between humans is conveyed through facial expressions, giving a robot expressive features makes robot-human interactions as informative as possible. A human infant will smile and try to attract the attention of its mother. When that is achieved, it will follow the mother's
movements and try to engage in play. Kismet can do the same. As a baby is motivated by hunger or thirst, Kismet is motivated by stimulation. It is programmed with a social, or stimulation, drive, which means it seeks out experiences that will stimulate it. But it also has a fatigue syndrome, which means it gets tired over time. The goal is for Kismet to keep these two drives in balance and learn what works and what does not in social situations. "The robot is trying to get you to interact with it in ways that can benefit its ability to learn," 22 says Breazeal.
For example, Kismet is programmed to seek out social stimulation. Its bright blue eyes are always looking around for movement, bright colors, skin tones, and face shapes. The images it takes in are processed through a neural network that recognizes faces and their expressions. Its large pink ears listen for voices. When it senses a person is near, it will try to attract the person with facial expressions and baby talk. If an expression works and a person passing by stops to talk, Kismet's internal social drive program is satisfied. If the expression does not work, the internal social drive level sinks and prompts Kismet to try something different. Kismet also knows how to react to stimulation it perceives negatively. If a person gets too close to its face, Kismet will sense an invasion of space and either mimic an expression of annoyance or turn away.
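Breazeal's actual architecture is far richer, but the homeostatic idea, a drive that sinks when Kismet is ignored and recovers when someone interacts, can be sketched in Python. All numbers and thresholds below are invented for illustration.

```python
def update_social_drive(drive, stimulation, decay=0.1):
    """One tick of a toy 'social drive': it decays on its own, is
    pushed back up by interaction, and stays clamped to [0, 1]."""
    return max(0.0, min(1.0, drive - decay + stimulation))

def choose_behavior(drive):
    """Pick an expression based on how satisfied the drive is."""
    if drive < 0.3:
        return "call out with baby talk"  # understimulated: attract attention
    if drive > 0.8:
        return "turn away"                # overstimulated: withdraw
    return "smile and engage"             # satisfied: keep playing

drive = 0.5
for _ in range(3):                        # passersby ignore Kismet
    drive = update_social_drive(drive, 0.0)
lonely = choose_behavior(drive)           # drive has sunk below threshold
for _ in range(3):                        # someone stops to chat
    drive = update_social_drive(drive, 0.25)
engaged = choose_behavior(drive)
```

When ignored, the sinking drive prompts attention-seeking baby talk; once a person engages, the drive recovers and the robot settles into playful interaction, mirroring the behavior described above.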
Breazeal programmed Kismet with a repertoire of seven basic facial movements, but she theorizes that the more Kismet interacts with humans, the more it will learn and refine those expressions and add to them. Its impressive array of facial expressions is controlled by fifteen external computers along one wall, with no one computer in control.
Putting All the Pieces Together
All of these components—bipedalism, sensory perception, and facial expressions—have yet to be put together into an effective and convincing mechanical
man. There are no robots yet that combine the mobility of ASIMO, the sociability of Kismet, and the chess-playing ability of Deep Blue. Many of the skills that humans take for granted, like running, enjoying music, or recognizing objects, are still beyond the abilities of even the most advanced robots, but AI researchers are working on them piece by piece. The potential is there. No one can predict when humanoid robots will babysit children or assist the elderly, but hopes are high and the technology is progressing.