The Human Genome Project is the scientific research effort to construct a complete map of all of the genes carried in human chromosomes. The finished blueprint of human genetic information will serve as a basic reference for research in human biology and will provide insights into the genetic basis of human disease.
"Genome" is the word used to describe the complete collection of genes found in a single set of human chromosomes. In the early 1980s, medical and technical advances first suggested to biologists that a project was possible that would locate and identify each of the 100,000 or so genes thought at the time to be carried in human DNA, and find out what each one actually does. After investigations by two United States government agencies (the Department of Energy and the National Institutes of Health), the U.S. Congress voted to support a fifteen-year project, and on October 1, 1990, the Human Genome Project officially began. It was to be coordinated with existing related programs in several other countries. The project's official goals are to identify all of the approximately 30,000 genes in human deoxyribonucleic acid (DNA) and to determine the sequences of the 3.2 billion base pairs that make up human DNA. The project will also store this information in databases, develop tools for data analysis, and address any ethical, legal, and social issues that may arise.
In order to understand how mammoth an undertaking this ambitious project is, it is necessary to know how genetic instructions are carried on the human chromosome. Humans have forty-six chromosomes, which are coiled structures in the nucleus of a cell that carry DNA. DNA is the genetic material that contains the code for all living things, and it consists of two long chains or strands joined together by chemicals called bases or nucleotides (pronounced NOO-klee-uh-tides), all of which are coiled together into a twisted-ladder shape called a double helix. The bases are considered to be the "rungs" of the twisted ladder. These rungs are made up of only four different types of nucleotides—adenine (A), thymine (T), guanine (G), and cytosine (C)—and are critical to understanding how nature stores and uses a genetic code. The four bases always form a "rung" in pairs, and they always pair up the same way. Scientists know that A always pairs with T, and G always pairs with C. Therefore, each DNA base is like a letter of the alphabet, and a sequence or chain of bases can be thought of as forming a certain message.
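The pairing rule described above can be illustrated with a short Python sketch. It is only an illustration: the function names and the sample sequence are made up for this example and do not come from the project itself.

```python
# Pairing rules as stated in the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base that sits opposite each base of `strand`, in the same order."""
    return "".join(PAIRS[base] for base in strand.upper())


def rungs(strand: str) -> list[tuple[str, str]]:
    """Return the 'rungs' of the ladder as (base, partner) pairs."""
    return list(zip(strand.upper(), complement(strand)))


# Hypothetical seven-letter "message" on one strand.
example = "GATTACA"
print(complement(example))  # CTAATGT
print(rungs(example)[:2])   # [('G', 'C'), ('A', 'T')]
```

Because each base has exactly one partner, knowing the sequence of one strand is enough to reconstruct the other.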
Chromosome: Structures that organize genetic information in the nucleus of a cell.
DNA (deoxyribonucleic acid): Large, complex molecules found in the nuclei of cells that carry genetic information for an organism's development.
Gene: A segment of a DNA (deoxyribonucleic acid) molecule contained in the nucleus of a cell that acts as a kind of code for the production of some specific protein. Genes carry instructions for the formation, functioning, and transmission of specific traits from one generation to another.
Nucleotide: The basic unit of a nucleic acid. It consists of a simple sugar, a phosphate group, and a nitrogen-containing base. (Pronounced NOO-klee-uh-tide.)
The human genome, which is the entire collection of genes found in a single set of chromosomes (or all the DNA in an organism), consists of 3.2 billion nucleotide base pairs. To get some idea of how much information is packed into a very tiny space, consider that a single large gene may consist of tens of thousands of nucleotides, and a single chromosome may contain more than one hundred million nucleotide base pairs and as many as four thousand genes. What is most important about these pairs of bases is the particular order of the As, Ts, Gs, and Cs. Their order dictates whether an organism is a human being, a bumblebee, or an apple. Another way of looking at the size of the human genome present in each of our cells is to consider the following phone book analogy. If the DNA sequence of the human genome were compiled in books, 200 volumes the size of the Manhattan telephone book (1,000 pages each) would be needed to hold it all. Reading it aloud without stopping would take 9.5 years. In computer terms, since the human genome is 3.2 billion base pairs long, it takes about 3 gigabytes of data storage to hold it all.
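The size comparisons above can be checked with simple arithmetic. The Python sketch below uses only the figures given in the text, plus two assumptions that are not in the original: that each base is stored as one byte (one text character), and that a gigabyte is one billion bytes. The function names are illustrative.

```python
GENOME_BASE_PAIRS = 3_200_000_000  # figure given in the text


def storage_gigabytes(base_pairs: int, bits_per_base: int = 8) -> float:
    """Storage needed, assuming a fixed number of bits per base and 1 GB = 10**9 bytes."""
    return base_pairs * bits_per_base / 8 / 1e9


def letters_per_page(base_pairs: int, volumes: int = 200, pages: int = 1000) -> float:
    """Letters each page must hold in the phone book analogy."""
    return base_pairs / (volumes * pages)


def letters_per_second(base_pairs: int, years: float = 9.5) -> float:
    """Reading speed implied by reading the whole sequence aloud without stopping."""
    seconds = years * 365.25 * 24 * 3600
    return base_pairs / seconds


print(storage_gigabytes(GENOME_BASE_PAIRS))             # one byte per base
print(storage_gigabytes(GENOME_BASE_PAIRS, 2))          # two bits per base
print(letters_per_page(GENOME_BASE_PAIRS))              # letters on each page
print(round(letters_per_second(GENOME_BASE_PAIRS), 1))  # letters read per second
```

Under these assumptions the figures hang together: roughly 3 gigabytes at one byte per base, 16,000 letters on each phone book page, and a reading pace of a little over ten letters per second. Since there are only four possible letters, two bits per base would also be enough, which would cut the storage to under one gigabyte.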
In light of the project's main goal, which is to map the location of all the genes on every chromosome and to determine the exact sequence of nucleotides of the entire genome, two types of maps are being made. One of these is a physical map that measures the distance between two genes in terms of nucleotides. A very detailed physical map is needed before real sequencing, which means determining the precise order of the nucleotides, can be done. The other map type is called a genetic linkage map, and it measures the distance between two genes in terms of how frequently the genes are inherited together. This is important because the closer genes are to each other on a chromosome, the more likely they are to be inherited together.
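The difference between the two kinds of distance can be sketched in Python. All positions and counts below are hypothetical, chosen only to show the arithmetic, and the function names are illustrative. The second measure is what geneticists commonly call recombination frequency: the share of offspring in which two genes were not inherited together.

```python
def physical_distance(position_a: int, position_b: int) -> int:
    """Physical map distance: number of nucleotides between two gene positions."""
    return abs(position_b - position_a)


def recombination_frequency(separated: int, total: int) -> float:
    """Linkage map measure: fraction of offspring in which the two genes
    were NOT inherited together."""
    if total <= 0 or not 0 <= separated <= total:
        raise ValueError("counts must satisfy 0 <= separated <= total and total > 0")
    return separated / total


# Hypothetical example: two genes on the same chromosome.
gene_a, gene_b = 1_200_000, 1_450_000      # made-up nucleotide positions
print(physical_distance(gene_a, gene_b))   # 250000 nucleotides apart

# Hypothetical family data: in 30 of 1,000 offspring the genes were separated.
freq = recombination_frequency(30, 1000)
print(freq)      # 0.03
print(1 - freq)  # 0.97, the fraction inherited together
```

A low recombination frequency suggests the two genes lie close together, which is the relationship the linkage map relies on.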
As an international project involving at least eighteen countries, the Human Genome Project was able to make unexpected progress in its early years, and it revised its schedule in 1993 and again in 1998. During December 1999, an international team announced it had achieved a scientific milestone by compiling the entire code of a complete human chromosome for the first time. Researchers chose chromosome 22 because of its relatively small size and its link to many major diseases. The sequence they compiled is over 23 million letters in length and is the longest continuous stretch of DNA ever deciphered and assembled. What was described as the "text" of one chapter of the 23-volume human genetic instruction book was therefore completed. Francis Collins, director of the National Human Genome Research Institute of the National Institutes of Health, said of this success, "To see the entire sequence of a human chromosome for the first time is like seeing an ocean liner emerge out of the fog, when all you've ever seen before were rowboats."
In February 2001, scientists working on the project published the first interpretations of the human genome sequence. Previously, many in the scientific community had believed that the number of human genes totaled about 100,000. But the new findings surprised everyone: both research groups said they could find only about 30,000 or so human genes. This meant that humans have remarkably few genes, not that many more than a fruit fly, which has 13,601 (scientists had decoded this sequence in March 2000). This discovery led scientists to conclude that human complexity does not come from a sheer quantity of genes. Instead, human complexity seems to arise as a result of the structure of the network of different genes, proteins, and groups of proteins and the dynamics of those parts connecting at different times and on different levels.
The Human Genome Project typically is called "big science," usually referring to a large, complex, and, above all, expensive operation that can only be undertaken by a government. That is why the emergence of private sector (non-governmental) competition in 1998 was such a surprise. During that year, J. Craig Venter, the founder of Celera Genomics, announced that his company planned to sequence the human genome on its own. He said he could achieve this because Celera Genomics was using the largest civilian supercomputer ever made to produce the needed sequences. When Celera began to show real progress, it appeared to many that a race between the public and private sectors would occur. However, once Venter met with Francis Collins, the director of the Human Genome Project, it was agreed that cooperation would achieve more than competition. Therefore, when what was called the "draft sequence" was completed and announced on June 26, 2000, Venter and Collins appeared together and gave the public its first news of this achievement. They stated that the project had compiled what might be called a rough draft of the human genome, having put together a sequence of about 90 percent.
Total project completion—meaning that all of the remaining gaps will be closed and its accuracy improved, so that a complete, high-quality reference map is available—is expected in 2003. This will coincide with the fiftieth anniversary of the description of the molecular structure of DNA, first unraveled in 1953 by American molecular biologist James Watson (1928– ) and English molecular biologist Francis Crick (1916– ). When the genome is completely mapped and fully sequenced in 2003, two years earlier than planned, biologists can for the first time stand back and look at each chromosome as well as the entire human blueprint. They will start to understand how a chromosome is organized, where the genes are located, how they express themselves, how they are duplicated and inherited, and how disease-causing mutations occur. This could lead to the development of new therapies for diseases thought to be incurable as well as to new ways of manipulating DNA.
It also could lead to testing people for "undesirable" genes. However, such testing raises all sorts of potential dangers involving ethical and legal matters. Fortunately, such issues have been considered from the beginning. Part of the project's goal is to address these difficult issues of privacy and responsibility, and to use the completely mapped and fully sequenced genome to everyone's benefit.
There is no doubt that the breakthrough in human genome research will mark a revolution in the biology of the twenty-first century. Although we presently have only a glimpse of what the potential benefits of this knowledge may be, there are several good indicators of the fields that may benefit the most. What we have already witnessed is the real beginning of an American biotechnology effort whose genomic research will result in the creation of a multibillion-dollar industry. Many of the new companies that emerge will develop DNA-based products and technologies that will improve our health and well-being.
Molecular medicine. First among these new medical applications might be advances in what is known as molecular medicine. With the knowledge gained from the completion of the Human Genome Project, doctors will be less concerned with the symptoms of a disease or how it shows itself than they will with what actually causes the disease. Detailed genome maps will allow researchers to seek and find the genes that are associated with such diseases as inherited colon and breast cancer and Alzheimer's disease. Not only will doctors be able to diagnose these conditions at a much earlier point, but they will have new types of drugs, as well as new techniques and therapies, that will allow them to cure or even prevent a disease. In the near future, it is expected that doctors will know much earlier whether a person has or is predisposed to getting certain diseases, and will then be able to use certain gene therapies or "custom drugs" to cure these diseases.
New uses for microbes. Microbes are any forms of life that are too small to be seen without a microscope. Bacteria are the most common form of microbes or microorganisms. In the not-too-distant future, we can expect the knowledge gained from the Human Genome Project to result in science being able to sequence and therefore understand the genomes of bacteria as well as of humans. This, in turn, will result in such highly useful applications as energy production, environmental cleanup, toxic waste reduction, and the creation of entirely new industrial processes or ways in which we manufacture things. Eventually, new "biotechnologies" will be developed that will create bacteria that can "digest" waste material of all sorts, produce energy the way plants do, and improve the way industry makes products from food to clothing. Understanding the genetic sequence of bacteria can also show how harmful bacteria work against the body and perhaps how to combat them as well.
Risk assessment. Biologists know already that certain individuals are more susceptible to certain toxic or poisonous agents than others. They know that the cause for these susceptibilities is found in their genetic makeup or in their genes. Understanding the human genome will lead to science being able to identify, ahead of time, those who are at risk in certain environments, and conversely, those who have a stronger, "built-in" resistance. Such knowledge will help to understand and control the cancer risks of people who might be exposed to radiation or similarly dangerous energy-related materials or processes.
Comparative genomics. Comparative genomics means being able to compare the genetic make-up of individuals, groups, and all forms of life. Understanding the human genome and then the genomes of other life forms will enable scientists to better understand human evolution and to learn how people are connected to all other living things. Once researchers discover what the actual genetic "map" is for all major groups of organisms, there will be greater insight into how all life is connected and related.
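In its simplest form, comparing two genetic sequences means counting how many positions carry the same letter. The Python sketch below shows only that idea. It assumes the two sequences are the same length and already lined up position by position, which real comparisons between species cannot take for granted, and both sequences and the function name are made up for the example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Two hypothetical ten-letter stretches that differ at two positions.
print(percent_identity("GATTACAGAT", "GATCACAGTT"))  # 80.0
```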
Forensic science. Forensic science is the use of scientific methods to investigate a crime and to prepare evidence that is presented in court. Any living thing can be easily identified by examining the DNA sequences of the species to which it belongs. Although identifying an individual by its own DNA sequence is less precise, techniques developed during the Human Genome Project have made it possible to create a DNA profile of a person with the assurance that there is an extremely small chance that another individual has the exact same "DNA fingerprint." This not only allows police to identify suspects whose DNA may match evidence left at a crime scene, but it also can be, and has been, used to prove innocent others who were wrongly accused or convicted. DNA fingerprinting can identify victims, prove whether a man is the father of a child, and better match organ donors with recipients.
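One way to see why the chance of a coincidental match becomes so small is to multiply together the chances of matching at several separate DNA markers. The Python sketch below is a simplified illustration, not a description of actual forensic practice. It assumes the markers are inherited independently of one another, and the frequencies are invented for the example.

```python
from math import prod


def random_match_probability(marker_frequencies: list[float]) -> float:
    """Chance that an unrelated person matches at every marker,
    assuming the markers are independent of one another."""
    for frequency in marker_frequencies:
        if not 0 < frequency <= 1:
            raise ValueError("each frequency must be greater than 0 and at most 1")
    return prod(marker_frequencies)


# Hypothetical share of the population carrying the same pattern at each of four markers.
frequencies = [0.1, 0.05, 0.2, 0.08]
probability = random_match_probability(frequencies)
print(f"about 1 in {round(1 / probability):,}")  # about 1 in 12,500
```

Each additional marker shrinks the probability further, which is why a profile built from many markers is so unlikely to be shared by two people.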
Better crops and animals. A deeper genetic understanding of plants and animals, as well as humans, will allow farmers to develop crops that can better resist disease, insects, and drought. Such "bioengineered" food would enable farmers to use little or no pesticide on the fruit and vegetables we eat, as well as to reduce waste. The same understanding could be applied to the animals farmers breed, producing healthier, more disease-resistant livestock.
Altogether, the Human Genome Project has already begun to have a major impact on the life sciences and the quality of our lives. Although it has proven to be a highly successful effort and will certainly achieve all of its stated goals, it really marks only the most basic of beginnings in understanding the genetic secrets of life.