Pining for Answers

Winter 2013

A TALL ORDER: Deciphering the loblolly pine’s genome is no walk in the park; it’s seven times as long as the human genome.
(Photo courtesy of Steve McKeand, NCSU, Raleigh, NC)

Steven Salzberg has lost count of the number of genomes he’s assembled. In addition to helping decode the first human genome and the first plant genome (the mustard relative Arabidopsis thaliana), Salzberg has read the complete or partial genetic blueprints of the woodland strawberry, the Bacillus anthracis bacterium that causes anthrax, the extinct Columbian mammoth, and many more. Now Salzberg—a professor of medicine and biostatistics with a joint appointment in Computer Science at the Whiting School—is tackling his biggest subject: the loblolly pine tree.

The loblolly pine’s entire set of genetic material consists of about 22 billion base pairs, the individual letters that string together to make a DNA sequence. That makes it about seven times as long as the human genome. No one knows why the pine tree has so much DNA. But when the analysis is finished sometime in 2013, the pine’s will be the largest genome ever deciphered. “Every project has its own challenges,” Salzberg says. For the pine tree, “the main challenge is that it’s so large.”

Understanding the biology of the loblolly pine isn’t just an academic pursuit. This fast-growing tree is a big cash crop across the southeastern United States. The wood is used in paper products and inexpensive construction projects. That’s why the U.S. Department of Agriculture is sponsoring the research, which is led by plant geneticist David Neale of the University of California, Davis.

The job of Salzberg’s lab is to put the loblolly pine genome back together after it’s been sequenced. Although technology has come a long way, only 100 or 150 letters of DNA can be read at one time, Salzberg says. And the DNA can’t be read in order. Instead, genetic material from multiple pine tree cells is snipped into more than a trillion random fragments. After all the snippets are read, Salzberg and his colleagues will use a computer to find fragments that overlap and stitch them back together into a complete genetic instruction book for the pine.
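
The overlap-and-stitch idea described above can be illustrated with a toy sketch in Python. This is not the software Salzberg’s lab uses, which the article does not name; the function names `overlap` and `greedy_assemble`, the `min_len` parameter, and the short example reads are all hypothetical. The sketch assumes error-free reads from a single strand with no repeated stretches, and it compares every pair of fragments, which would be far too slow for a trillion of them.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that matches a prefix of `b`.

    Returns 0 if the best match is shorter than `min_len` (hypothetical cutoff).
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, -1, -1
        # Brute-force search over all ordered pairs (fine for a toy example only).
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left that overlaps; stop with separate pieces
        # Stitch: keep all of the first fragment, append the non-overlapping tail.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three made-up 7-letter "reads", given out of order.
reads = ["TACCTGA", "ATGGCGT", "GCGTACC"]
print(greedy_assemble(reads))  # ['ATGGCGTACCTGA']
```

Each merge keeps the shared letters only once, so three 7-letter reads collapse into a single 13-letter sequence. Real genomes contain long repeats and reads contain errors, which is part of why a genome as large as the pine’s is so hard to assemble.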

“When we finish the genome sequence and assemble it, we’ll have a better idea of what’s in it,” Salzberg says. “Is it just miles and miles of junk DNA, or is something else going on?”

That’s hard to predict because there are no rules of thumb for how complex an organism’s genome should be. Larger or smarter creatures don’t necessarily have more DNA than smaller or simpler beings, Salzberg notes. And no other conifers have had their DNA decoded. Salzberg and colleagues will gain even more insight into conifers in their next project: analyzing the genome of the sugar pine—which has even more DNA than the loblolly tree.