Date: 2017-03-29

Ten years ago, a team of scientists published the first genome of Aedes aegypti, the infamous mosquito that spreads Zika, dengue fever, and yellow fever. It was a valiant effort, but also a complete mess. Rather than tidily bundled in the insect’s three pairs of chromosomes, its DNA was scattered among 36,000 small fragments, many of which were riddled with gaps and errors. But last week, a team of scientists led by Erez Lieberman Aiden at the Baylor College of Medicine announced that they had finally knitted those pieces into a coherent whole, a victory that will undoubtedly be helpful to scientists who study Aedes and the diseases it carries.

This milestone is about more than mosquitoes. The team succeeded by using a technique called Hi-C, which allows scientists to assemble an organism’s genome quickly, cheaply, and accurately. To prove that point, the team used Hi-C to piece together a human genome from scratch for just $10,000; by contrast, the original Human Genome Project took $4 billion to accomplish the same feat. “It’s very clear that this is the way that you want to be doing it,” says Olga Dudchenko, who was part of Aiden’s team. “At least in the foreseeable future, there’s no method that can compete,” adds her colleague Sanjit Singh Batra.

This technique should make it easier to map the genome of any species, especially those that have never been sequenced before.

The word “genome” has become so commonplace that it’s easy to forget how difficult it can be to sequence one, even now. When geneticists decipher an organism’s DNA, they do so in fits and starts, rather than in one continuous burst from start to finish. The result is a lot of short pieces, or “reads,” which must then be assembled. Sometimes, that’s easy: If two reads have a lot of overlap, they probably fit next to each other. But it’s much harder when genomes include long repetitive stretches. Assembling these is like solving a jigsaw puzzle filled with blue sky; it’s a royal pain to work out where each piece fits in. That’s why the Aedes genome was so fragmented. It is full of repetitive sections.

And that’s where Hi-C comes in. Aiden and his colleague Job Dekker created the technique in 2009 for a completely different purpose: to study the shape of the human genome. Each of our cells contains around two meters of DNA, which somehow packs into a compartment just six millionths of a meter wide. To fit, the long one-dimensional DNA strands fold into a tight three-dimensional ball. Aiden and Dekker developed Hi-C to study these folds: It freezes the entire genome in place, and reveals which bits of DNA are touching each other in three-dimensional space.

As it happens, this information also reveals how far apart two bits of DNA are likely to be in the one-dimensional string, which is really useful for assembling genomes. Think about that jigsaw puzzle. If you have two identical pieces of blue sky, you may not know where they go, but Hi-C can tell you that they have 15 pieces between them. Gather enough of that information, and you can put the whole sky together.
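For readers who think in code, the jigsaw idea can be shown with a toy Python sketch. It is not the team's actual method, and everything in it is an assumption made for illustration: the fragment names, the rule that contact counts fall off as 100 divided by distance, and the brute-force search, which only works for a handful of pieces. The point is simply that if pieces sitting closer together along the string touch more often, then the order that puts the most frequently touching pieces side by side is the original one.

```python
from itertools import permutations


def simulated_contacts(true_order):
    """Make up contact counts for every pair of fragments.

    Toy assumption: two fragments that sit d steps apart along the
    chromosome are seen touching 100 // d times.
    """
    contacts = {}
    for i, a in enumerate(true_order):
        for j in range(i + 1, len(true_order)):
            contacts[frozenset((a, true_order[j]))] = 100 // (j - i)
    return contacts


def adjacency_score(order, contacts):
    """Sum the contact counts between neighbours in a proposed order."""
    return sum(contacts[frozenset((a, b))] for a, b in zip(order, order[1:]))


def best_order(fragments, contacts):
    """Try every ordering and keep the one whose neighbours touch most.

    An order and its mirror image score the same, so only orderings whose
    first name sorts before their last name are considered.
    """
    candidates = (p for p in permutations(fragments) if p[0] < p[-1])
    return max(candidates, key=lambda p: adjacency_score(p, contacts))


# Hypothetical fragments, listed in their (normally unknown) true order.
true_order = ["C", "A", "D", "B", "E"]
contacts = simulated_contacts(true_order)

# The search only sees the shuffled names and the contact counts.
recovered = best_order(sorted(true_order), contacts)
print(recovered)  # ('C', 'A', 'D', 'B', 'E')
```

With five pieces there are only 120 orderings to check. A genome in tens of thousands of fragments rules out this kind of exhaustive search, so a real assembler has to rely on far cleverer shortcuts.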