Showing posts with the label DNA. Show all posts

Tuesday, March 7, 2017

New Algorithm Unlocks DNA Molecules' Nearly Full Storage Potential

In a study in Science, researchers Yaniv Erlich and Dina Zielinski describe a new coding technique for maximizing the data-storage capacity of DNA molecules. Credit: New York Genome Center

An algorithm designed for streaming video on a cellphone can unlock DNA's nearly full storage potential by squeezing more information into its four base nucleotides, say researchers. They demonstrate that this technology is also extremely reliable.



Humanity may soon generate more data than hard drives or magnetic tape can handle, a problem that has scientists turning to nature's age-old solution for information-storage -- DNA.

In a new study in Science, a pair of researchers at Columbia University and the New York Genome Center (NYGC) show that an algorithm designed for streaming video on a cellphone can unlock DNA's nearly full storage potential by squeezing more information into its four base nucleotides. They demonstrate that this technology is also extremely reliable.



DNA is an ideal storage medium because it's ultra-compact and can last hundreds of thousands of years if kept in a cool, dry place, as demonstrated by the recent recovery of DNA from the bones of a 430,000-year-old human ancestor found in a cave in Spain.

"DNA won't degrade over time like cassette tapes and CDs, and it won't become obsolete -- if it does, we have bigger problems," said study coauthor Yaniv Erlich, a computer science professor at Columbia Engineering, a member of Columbia's Data Science Institute, and a core member of the NYGC.

Erlich and his colleague Dina Zielinski, an associate scientist at NYGC, chose six files to encode, or write, into DNA: a full computer operating system, an 1895 French film, "Arrival of a train at La Ciotat," a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon.

They compressed the files into a master file, and then split the data into short strings of binary code made up of ones and zeros. Using an erasure-correcting algorithm called fountain codes, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T. The algorithm deleted letter combinations known to create errors, and added a barcode to each droplet to help reassemble the files later.
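The mapping and screening steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' DNA Fountain software: the two-bits-per-base table, the run-length limit (`max_run`) and the GC-content window are assumptions chosen for the example, and the fountain-code step that randomly combines strings into droplets is left out.

```python
# Illustrative sketch only: the mapping table and the screening thresholds
# below are assumptions, not values taken from the article.
BIT_PAIRS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def bits_to_dna(bits: str) -> str:
    """Map a binary string to nucleotides, two bits per base."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BIT_PAIRS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def longest_run(seq: str) -> int:
    """Length of the longest stretch of one repeated base."""
    best = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        best = max(best, run)
        prev = base
    return best


def passes_screen(seq: str, max_run: int = 3,
                  gc_min: float = 0.45, gc_max: float = 0.55) -> bool:
    """Reject strands with long single-base runs or skewed GC content."""
    if not seq:
        return False
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return longest_run(seq) <= max_run and gc_min <= gc <= gc_max


strand = bits_to_dna("0001101100011011")  # hypothetical 16-bit droplet payload
print(strand, passes_screen(strand))
```

A droplet that fails the screen would simply be discarded and another generated in its place, which is one way to read "deleted letter combinations known to create errors".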



In all, they generated a digital list of 72,000 DNA strands, each 200 bases long, and sent it in a text file to a San Francisco DNA-synthesis startup, Twist Bioscience, that specializes in turning digital data into biological data. Two weeks later, they received a vial holding a speck of DNA molecules.

To retrieve their files, they used modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary. They recovered their files with zero errors, the study reports. (In this short demo, Erlich opens his archived operating system on a virtual machine and plays a game of Minesweeper to celebrate.)

They also demonstrated that a virtually unlimited number of copies of the files could be created with their coding technique by multiplying their DNA sample through polymerase chain reaction (PCR), and that those copies, and even copies of their copies, and so on, could be recovered error-free.



Finally, the researchers show that their coding strategy packs 215 petabytes of data on a single gram of DNA -- 100 times more than methods published by pioneering researchers George Church at Harvard, and Nick Goldman and Ewan Birney at the European Bioinformatics Institute. "We believe this is the highest-density data-storage device ever created," said Erlich.

The capacity of DNA data-storage is theoretically limited to two binary digits for each nucleotide, but the biological constraints of DNA itself and the need to include redundant information to reassemble and read the fragments later reduce its capacity to 1.8 binary digits per nucleotide base.

The team's insight was to apply fountain codes, a technique Erlich remembered from graduate school, to make the reading and writing process more efficient. With their DNA Fountain technique, Erlich and Zielinski pack an average of 1.6 bits into each base nucleotide. That's at least 60 percent more data than previously published methods, and close to the 1.8-bit limit.
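The density figures are easier to compare side by side. The short calculation below uses only the numbers quoted in the article; the "implied previous best" is an inference from the "at least 60 percent more" claim, not a figure the researchers report.

```python
import math

theoretical_max = math.log2(4)   # four bases -> 2 bits per nucleotide
practical_limit = 1.8            # bits per nucleotide, from the article
dna_fountain = 1.6               # bits per nucleotide, from the article

share_of_practical_limit = dna_fountain / practical_limit
share_of_theoretical_max = dna_fountain / theoretical_max

# "At least 60 percent more" implies earlier methods stored at most
# dna_fountain / 1.6 bits per nucleotide (an inference, not a quoted figure).
implied_previous_best = dna_fountain / 1.6

print(f"share of practical limit:  {share_of_practical_limit:.0%}")
print(f"share of theoretical max:  {share_of_theoretical_max:.0%}")
print(f"implied previous best:     {implied_previous_best:.1f} bits per base")
```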

Cost still remains a barrier. The researchers spent $7,000 to synthesize the DNA they used to archive their 2 megabytes of data, and another $2,000 to read it. Though the price of DNA sequencing has fallen exponentially, there may not be the same demand for DNA synthesis, says Sri Kosuri, a biochemistry professor at UCLA who was not involved in the study. "Investors may not be willing to risk tons of money to bring costs down," he said.
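To put those costs in per-megabyte terms, here is a rough calculation from the article's figures. It treats the archive as exactly 2 megabytes, which is a rounding assumption.

```python
synthesis_cost_usd = 7_000    # writing the DNA, from the article
sequencing_cost_usd = 2_000   # reading it back, from the article
data_megabytes = 2            # archive size, rounded as in the article


def cost_per_megabyte(total_usd: float, megabytes: float) -> float:
    """Average cost of storing one megabyte."""
    return total_usd / megabytes


write_cost = cost_per_megabyte(synthesis_cost_usd, data_megabytes)
read_cost = cost_per_megabyte(sequencing_cost_usd, data_megabytes)
print(f"write: ${write_cost:,.0f}/MB, read: ${read_cost:,.0f}/MB, "
      f"total: ${write_cost + read_cost:,.0f}/MB")
```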



But the price of DNA synthesis can be vastly reduced if lower-quality molecules are produced, and coding strategies like DNA Fountain are used to fix molecular errors, says Erlich. "We can do more of the heavy lifting on the computer to take the burden off time-intensive molecular coding," he said.
Source: Materials provided by Columbia University School of Engineering and Applied Science.


Sunday, March 5, 2017

Biologists propose to sequence the DNA of all life on Earth

Can biologists sequence the genomes of all the plants and the animals in the world, including this greater bird of paradise in Indonesia?



When it comes to genome sequencing, visionaries like to throw around big numbers: There’s the UK Biobank, for example, which promises to decipher the genomes of 500,000 individuals, or Iceland’s effort to study the genomes of its entire human population. Yesterday, at a meeting here organized by the Smithsonian Initiative on Biodiversity Genomics and the Shenzhen, China–based sequencing powerhouse BGI, a small group of researchers upped the ante even more, announcing their intent to, eventually, sequence “all life on Earth.”

Their plan, which does not yet have funding dedicated to it specifically but could cost at least several billion dollars, has been dubbed the Earth BioGenome Project (EBP). Harris Lewin, an evolutionary geneticist at the University of California, Davis, who is part of the group that came up with this vision 2 years ago, says the EBP would take a first step toward its audacious goal by focusing on eukaryotes, the group of organisms that includes all plants, animals, and single-celled organisms such as amoebas.



That strategy, and the EBP’s overall concept, found a receptive audience at BioGenomics2017, a gathering this week of conservationists, evolutionary biologists, systematists, and other biologists interested in applying genomics to their work. “This is a grand idea,” says Oliver Ryder, a conservation biologist at the San Diego Zoo Institute for Conservation Research in California. “If we really want to understand how life evolved, genome biology is going to be part of that.”

Ryder and others drew parallels between the EBP and the Human Genome Project, which began as an ambitious, controversial, and, at the time, technically impossible proposal more than 30 years ago. That earlier effort eventually led not only to the sequencing of the first human genome, but also to entirely new DNA technologies that are at the center of many medical frontiers and the basis for a $20 billion industry. “People have learned from the human genome experience that [sequencing] is a tremendous advance in biology,” Lewin says.

Many details about the EBP are still being worked out. But as currently proposed, the first step would be to sequence in great detail the DNA of a member of each eukaryotic family (about 9000 in all) to create reference genomes on par with or better than the reference human genome. Next would come sequencing, to a lesser degree, a species from each of the 150,000 to 200,000 genera. Finally, EBP participants would get rough genomes of the 1.5 million remaining known eukaryotic species. These lower resolution genomes could be improved as needed by comparing them with the family references or by doing more sequencing, says EBP co-organizer Gene Robinson, a behavioral genomics researcher and director of the Carl R. Woese Institute for Genomic Biology at the University of Illinois in Urbana.



The entire eukaryotic effort would likely cost about the same as it did to sequence that first human genome, estimate Lewin, Robinson, and EBP co-organizer John Kress, an evolutionary biologist at the Smithsonian National Museum of Natural History here. It took about $2.7 billion to read and order the 3 billion bases composing the human genome, about $4.8 billion in today’s dollars.

With a comparable amount of support, the EBP’s eukaryotic work might be done in a decade, its organizers suggest. Such optimism arises from ever-decreasing DNA sequencing costs—one meeting presenter from Complete Genomics, based in Mountain View, California, says his company plans to be able to roughly sequence whole eukaryotic genomes for about $100 within a year—and improvements in sequencing technology that make possible higher quality genomes, at reasonable prices. “It became apparent to me that at a certain point, it would be possible to sequence all life on Earth,” Lewin says. Although some may find the multibillion-dollar price tag hard to justify for researchers not studying humans, the fundamentals of matter, or the mysteries of the universe, the EBP has a head start, thanks to the work of several research communities pursuing their own ambitious sequencing projects.

These include the Genome 10K Project, which seeks to sequence 10,000 vertebrate genomes, one from each genus; i5K, an effort to decipher 5000 arthropods; and B10K, which expects to generate genomes for all 10,500 bird species. The EBP would help coordinate, compile, and perhaps fund these efforts. “The [EBP] concept is a community of communities,” Lewin says. There are also sequencing commitments from giants in the genomics field, such as China’s BGI, and the Wellcome Trust Sanger Institute in the United Kingdom. But at a planning meeting this week, it became clear that significant challenges await the EBP, even beyond funding. Although researchers from Brazil, China, and the United Kingdom said their nations are eager to participate in some way, the 20 people in attendance emphasized the need for the effort to be more international, with developing countries, particularly those with high biodiversity, helping shape the project’s final form.



They proposed that the EBP could help develop sequencing and other technological experts and capabilities in those regions. The Global Genome Biodiversity Network, which is compiling lists and images of specimens at museums and other biorepositories around the world, could supply much of the DNA needed, but even broader participation is important, says Thomas Gilbert, an evolutionary biologist at the Natural History Museum of Denmark in Copenhagen.

The planning group also stressed the need to develop standards to ensure high-quality genome sequences and to preserve associated information for each organism sequenced such as where it was collected and what it looked like. Getting DNA samples from the wild may ultimately be the biggest challenge—and the biggest cost, several people noted. Not all museum specimens yield DNA preserved well enough for high-quality genomes. Even recently collected and frozen plant and animal specimens are not always handled correctly for preserving their DNA, says Guojie Zhang, an evolutionary biologist at BGI and the University of Copenhagen. And the lack of standards could undermine the project’s ultimate utility, notes Erich Jarvis, a neurobiologist at The Rockefeller University in New York City: “We could spend money on an effort for all species on the planet, but we could generate a lot of crap.”



But Lewin is optimistic that won’t happen. After he outlined the EBP in the closing talk at BioGenomics2017, he was surrounded by researchers eager to know what they could do to help. “It’s good to try to bring together the tribes,” says Jose Lopez, a biologist from Nova Southeastern University in Fort Lauderdale, Florida, whose “tribe” has mounted “GIGA,” a project to sequence 7000 marine invertebrates. “It’s a big endeavor. We need lots of expertise and lots of people who can contribute.”

Source: Elizabeth Pennisi


Friday, March 3, 2017

Scientists reveal new Super-Fast form of Computer that 'Grows as it Computes'

DNA double helix. Credit: public domain

Researchers from The University of Manchester have shown it is possible to build a new super-fast form of computer that "grows as it computes".

Professor Ross D King and his team have demonstrated for the first time the feasibility of engineering a nondeterministic universal Turing machine (NUTM), and their research is to be published in the prestigious Journal of the Royal Society Interface.

The theoretical properties of such a computing machine, including its exponential boost in speed over electronic and quantum computers, have been well understood for many years – but the Manchester breakthrough demonstrates that it is actually possible to physically create a NUTM using DNA molecules.

"Imagine a computer is searching a maze and comes to a choice point, one path leading left, the other right," explained Professor King, from Manchester's School of Computer Science. "Electronic computers need to choose which path to follow first.



"But our new computer doesn't need to choose, for it can replicate itself and follow both paths at the same time, thus finding the answer faster.

"This 'magical' property is possible because the computer's processors are made of DNA rather than silicon chips. All electronic computers have a fixed number of chips.
"Our computer's ability to grow as it computes makes it faster than any other form of computer, and enables the solution of many computational problems previously considered impossible.
"Quantum computers are an exciting other form of computer, and they can also follow both paths in a maze, but only if the maze has certain symmetries, which greatly limits their use.

"As DNA molecules are very small a desktop computer could potentially utilize more processors than all the electronic computers in the world combined - and therefore outperform the world's current fastest supercomputer, while consuming a tiny fraction of its energy."



The University of Manchester is famous for its connection with Alan Turing - the founder of computer science - and for creating the first stored memory electronic computer.

"This new research builds on both these pioneering foundations," added Professor King.
Alan Turing's greatest achievement was inventing the concept of a universal Turing machine (UTM) - a computer that can be programmed to compute anything any other computer can compute. Electronic computers are a form of UTM, but no quantum UTM has yet been built.

DNA computing is the performing of computations using biological molecules rather than traditional silicon chips. In DNA computing, information is represented using the four-character genetic alphabet - A [adenine], G [guanine], C [cytosine], and T [thymine] - rather than the binary alphabet, which is a series of 1s and 0s used by traditional computers.
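Professor King's maze analogy can be made concrete with a toy simulation. The sketch below is an illustration on an ordinary computer, not the Manchester team's DNA design: the maze is assumed to be a chain of left/right choice points, and the function names are hypothetical. It contrasts a machine that tries one path at a time with one that copies itself at every choice point.

```python
from itertools import product


def sequential_search(depth: int, goal: tuple) -> int:
    """Try each left/right path in turn; return how many paths were tried."""
    for tried, path in enumerate(product("LR", repeat=depth), start=1):
        if path == goal:
            return tried
    raise ValueError("goal is not a path of the given depth")


def replicating_search(depth: int, goal: tuple) -> tuple:
    """Copy the machine at every choice point so all paths advance together.

    Returns (rounds of choices taken, copies alive at the end).
    """
    machines = [()]  # one machine at the maze entrance
    for _ in range(depth):
        # every machine splits into a left-going and a right-going copy
        machines = [m + (turn,) for m in machines for turn in "LR"]
    if goal not in machines:
        raise ValueError("goal is not a path of the given depth")
    return depth, len(machines)


goal = ("R",) * 10  # hypothetical exit: turn right at all ten choice points
print(sequential_search(10, goal))   # paths tried one after another
print(replicating_search(10, goal))  # (rounds, copies)
```

The replicating version finishes in a number of rounds equal to the maze depth, but the number of copies doubles each round. Simulated in software this buys nothing, since the copies are still processed one after another; the claim in the article is that DNA can supply that exponentially growing number of processors physically.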

Provided by: University of Manchester


Thursday, January 26, 2017

Are we closer than ever to a timeline for human evolution?

Dating when our ancestors split from Neanderthals and other relatives has long been a puzzle, but DNA advances are making our evolutionary journey clearer



Anthropologists and geneticists had a problem. And the farther back in time they looked, the bigger the problem became.

For the past several years, there have been two main genetic methods to date evolutionary divergences - when our ancestors split from Neanderthals, chimpanzees, and other relatives. The problem was, the results of these methods differed by nearly two-fold.

By one estimate, modern humans split from Neanderthals roughly 300,000 years ago. By the other, the split was closer to 600,000 years ago. Likewise, modern humans and chimps may have diverged around 6.5 or 13 million years ago.

Puzzled by this wild disagreement, researchers with diverse expertise have been studying it from different angles. Their combined discoveries, recently reviewed, have shed light on how genetic differences accumulate over time and have advanced methods of genetic dating.



And if you’re in suspense, yes, they’ve also pinned down important events in our evolutionary timeline. Everyone alive today seems to share ancestors with each other just over 200,000 years ago and with Neanderthals between 765,000-550,000 years ago.

Dating with the molecular clock
Go back in time and you’ll find a population of Homo sapiens who were the ancestors of everyone living today. Go back farther and our lineage meets up with Neanderthals, then chimps, and eventually all primates, mammals, and life.
In order to date these evolutionary splits, geneticists have relied on the molecular clock - the idea that genetic mutations accumulate at a steady rate over time. Specifically, this concerns mutations that become neutral substitutions, or lasting changes to letters of the genetic code that do not affect an organism’s chances of surviving and reproducing.



If such mutations arise clocklike, then calculating the time since two organisms shared common ancestors should be as easy as dividing the number of genetic differences between them by the mutation rate - the same way that dividing distance by speed gives you travel time.
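The distance-over-speed analogy translates directly into code. The sketch below follows the article's simplified description; more careful treatments also account for changes accumulating along both lineages since the split. The input values are hypothetical, chosen only to show the arithmetic.

```python
def years_since_split(differences_per_site: float,
                      rate_per_site_per_year: float) -> float:
    """Molecular-clock estimate: genetic distance divided by mutation rate."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return differences_per_site / rate_per_site_per_year


# Hypothetical inputs for illustration, not measurements from the article.
example_differences = 0.0065  # differences per base pair between two genomes
example_rate = 1e-9           # mutations per base pair per year
print(f"{years_since_split(example_differences, example_rate):,.0f} years")
```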

But you need to know the rate.
For decades, anthropologists used fossil calibration to generate the so-called phylogenetic rate (a phylogeny is a tree showing evolutionary relationships). They took the geologic age of fossils from evolutionary branch points and calculated how fast mutations must have arisen along the resulting lineages.

For example, the earliest fossils on the human branch after our split with chimps are identified by the fact that they seem to have walked on two legs; bipedalism is the first obvious difference that distinguishes our evolutionary lineage of hominins from that of chimps. These fossils are 7-6 million years old, and therefore the chimp-human split should be around that age. Dividing the number of genetic differences between living chimps and humans by 6.5 million years provides a mutation rate.

Determined this way, the mutation rate is 0.000000001 (or 1 x 10^-9) mutations per DNA base pair per year. Applied to genomes with 6 billion base pairs, that means that, over millions of years of chimp and human evolution, there have been on average six changes to letters of the genetic code per year.
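Both steps, calibrating a rate from a fossil date and scaling it up to a whole genome, are simple divisions and multiplications. In the sketch below, the per-site difference of 0.0065 is a hypothetical input picked so that the calibration reproduces the quoted rate; the genome size and the rate itself come from the article.

```python
def calibrated_rate(differences_per_site: float, split_age_years: float) -> float:
    """Phylogenetic rate: genetic differences divided by the fossil-dated split age."""
    return differences_per_site / split_age_years


def changes_per_year(rate_per_site_per_year: float, genome_size_bp: float) -> float:
    """Expected substitutions per year across a whole genome."""
    return rate_per_site_per_year * genome_size_bp


# 0.0065 differences per base pair is a hypothetical input, picked so the
# calibration reproduces the article's rate of 1 x 10^-9.
rate = calibrated_rate(0.0065, 6_500_000)
print(f"rate: {rate:.1e} per base pair per year")
print(f"genome-wide: {changes_per_year(1e-9, 6e9):.1f} changes per year")
```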

This rate can be used to date evolutionary events that are not evident from fossils, such as the spread of modern humans out of Africa.



But genetic dating got messy in 2010, when improvements to DNA sequencing allowed researchers to determine the number of genetic differences between parents and their children. Known as pedigree analysis, this provides a more direct measurement of the current mutation rate within one generation, rather than an average over millions of years.

Pedigree analysis counts 60-some mutations every generation; that converts to a rate approximately half the phylogenetic estimate—meaning evolutionary events would be twice as old.
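Why does halving the rate double the dates? The number of genetic differences between two genomes is fixed, so the product of date and rate must stay constant. A minimal sketch, taking the pedigree rate as exactly half the phylogenetic rate (the article says "approximately half"):

```python
def rescale_date(date_years: float, old_rate: float, new_rate: float) -> float:
    """Re-date a split when the assumed mutation rate changes.

    The number of genetic differences is fixed, so date * rate stays constant.
    """
    return date_years * old_rate / new_rate


phylogenetic_rate = 1e-9               # per base pair per year, from the article
pedigree_rate = phylogenetic_rate / 2  # assumption: exactly half

for label, date in [("human-Neanderthal", 300_000), ("human-chimp", 6_500_000)]:
    older = rescale_date(date, phylogenetic_rate, pedigree_rate)
    print(f"{label}: {date:,} -> {older:,.0f} years ago")
```

This reproduces the two pairs of estimates quoted earlier in the article: roughly 300,000 versus 600,000 years for the Neanderthal split, and 6.5 versus 13 million years for the chimp split.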

The erratic molecular clock
Resolving this disagreement propelled researchers to reassess and revise their starting assumptions: How accurately were they counting the small number of differences between genomes of parents and children? Were fossils assigned to the correct branches of the evolutionary tree? And above all, how constant is the molecular clock?

It turns out that among primates, the molecular clock varies significantly by species, sex, and mutation type. A recent study found that New World monkeys (i.e. monkeys of the Americas like marmosets and squirrel monkeys) have substitution rates about 64% higher than apes (including humans). Within apes, rates are about 7% higher in gorillas and 2% higher in chimpanzees, compared to humans.



But even among humans, mutation rates differ, particularly between the sexes with age. As fathers get older, they gain about one additional mutation per year in the DNA they can pass on to children. Mothers, on the other hand, accumulate considerably fewer mutations with each passing year.

These species and sex differences make sense when you consider how mutations form. Most heritable mutations occur from mistakes when DNA copies itself in the germline, or cells leading to eggs and sperm. The number of times germline DNA has to copy itself depends on developmental and reproductive variables including age at puberty, age at reproduction, and the process of sperm production. These traits vary across primates today and certainly varied over primate evolution.

For instance, average generation times are six years for New World monkeys, 19 years for gorillas, 25 years for chimps, and 29 years for humans.

And those extra mutations as fathers get older? Sperm are produced continuously after puberty, so sperm made later in life are the result of more rounds of DNA replication and opportunities for replication errors. In contrast, a mother’s stock of eggs is formed by birth. The small increase with maternal age could be due to mutations from DNA damage, rather than replication errors.

Ways forward for dating backwards
It’s now clear that one mutation rate cannot determine the dates for all divergences relevant to human evolution. However, researchers can secure the timeline for important evolutionary events by combining new methods of genetic dating with fossils and geologic ages.

Innovative computational methods have incorporated reproductive variables into calculations. By taking into account ages of reproduction in both sexes, age of male puberty, and sperm production rates, researchers have estimated split times that accord with the fossil record.



Another new approach has analyzed mutations that are mainly independent of DNA replication. It seems that certain classes of mutations, related to DNA damage, do behave more clocklike.

And some researchers have focused on ancient DNA. Comparing human fossils from the past 50,000 years to humans today suggests a mutation rate that agrees with pedigree analysis.

At least one evolutionary split was pinned down in 2016, after ancient DNA was extracted from 430,000-year-old hominin fossils from ‘Sima de los Huesos’, Spain. The Sima hominins looked like early members of the Neanderthal lineage based on morphological similarities. This hypothesis fit the timing of the split between Neanderthals and modern humans based on pedigree analysis (765,000-550,000 years ago), but did not work with the phylogenetic estimate (383,000-275,000 years ago).
Where do the Sima hominins belong on our family tree? Were they ancestors of both Neanderthals and modern humans, just Neanderthals, or neither?

DNA answered this definitively. The Sima hominins belong to the Neanderthal branch after it split with modern humans. Moreover, the result provides a firm time point in our family tree, suggesting that the pedigree rate works for this period of human evolution.
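The logic of this test is a simple ordering constraint: a fossil that sits on the Neanderthal branch must be younger than the split that created that branch. The sketch below checks each estimate against that constraint using the dates quoted in the article; the function name is hypothetical.

```python
SIMA_FOSSIL_AGE = 430_000  # years, from the article

# Proposed ranges for the Neanderthal / modern-human split, as
# (youngest, oldest) in years ago, from the article.
estimates = {
    "pedigree": (550_000, 765_000),
    "phylogenetic": (275_000, 383_000),
}


def allows_fossil_on_branch(split_range: tuple, fossil_age: int) -> bool:
    """True if the split could be older than a fossil found on the branch."""
    _youngest, oldest = split_range
    return oldest > fossil_age


for name, split_range in estimates.items():
    verdict = "compatible" if allows_fossil_on_branch(split_range, SIMA_FOSSIL_AGE) else "ruled out"
    print(f"{name}: {verdict}")
```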

Neanderthals and modern humans likely diverged between 765,000-550,000 years ago. Other evolutionary splits may soon be clarified as well, thanks to advances brought about by the mutation rate debates. Someday soon, when you see a chimp, you may be able to salute your great, great… great grandparent, with the correct number of “greats.”
Source: Biology News


Friday, August 19, 2016

Amazing Discovery: Viruses are Dominant Drivers of Human Evolution, Researchers Say

In a new study, scientists apply big-data analysis to reveal the full extent of viruses’ impact on the evolution of humans and other mammals. Their findings suggest 30% of all protein adaptations since humans’ divergence from chimpanzees have been driven by viruses.



When an environmental change occurs, species are able to adapt in response due to mutations in their DNA. Although these mutations occur randomly, by chance some of them make the organism better suited to its new environment. These are known as adaptive mutations.

In the past decade, scientists have discovered a large number of adaptive mutations in a wide variety of locations in the genome of humans and other mammals.



The fact that adaptive mutations are so pervasive is puzzling. What kind of environmental pressure could possibly drive so much adaptation in so many parts of the genome?

Viruses are ideal suspects since they are always present, ever-changing and interact with hundreds to thousands of proteins.

“When you have a pandemic or an epidemic at some point in evolution, the population that is targeted by the virus either adapts, or goes extinct,” said lead author Dr. David Enard, of Stanford University.

“We knew that, but what really surprised us is the strength and clarity of the pattern we found.”

Previous research on the interactions between viruses and proteins has focused on individual proteins that are directly involved in the immune response.

This is the first study to take a global look at all types of proteins.

“The big advancement here is that it’s not only very specialized immune proteins that adapt against viruses,” Dr. Enard said.

“Pretty much any type of protein that comes into contact with viruses can participate in the adaptation against viruses. It turns out that there is at least as much adaptation outside of the immune response as within it.”



The team’s first step was to identify all the proteins that are known to physically interact with viruses.

After reviewing tens of thousands of scientific abstracts, they culled the list to 1,256 proteins of interest.

“We identified 1,256 proteins that physically interact with viruses out of a total of 9,861 proteins with orthologs in the genomes of the 24 mammals included in the analysis,” the scientists said.

The next step was to build big-data algorithms to scour genomic databases and compare the evolution of virus-interacting proteins to that of other proteins.

The results revealed that adaptations have occurred three times as frequently in virus-interacting proteins compared with other proteins.
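A back-of-the-envelope calculation shows how these numbers fit together. It is not the study's method: it assumes, for simplicity, that every virus-interacting protein adapts at three times one uniform baseline rate shared by all other proteins.

```python
virus_interacting = 1_256  # proteins that physically interact with viruses
total_proteins = 9_861     # proteins with orthologs across the 24 mammals
relative_rate = 3.0        # adaptations "three times as frequent" (article)

fraction_vips = virus_interacting / total_proteins

# Assumption: every non-interacting protein adapts at one baseline rate and
# every virus-interacting protein at relative_rate times that baseline.
vip_weight = fraction_vips * relative_rate
other_weight = (1 - fraction_vips) * 1.0
share_in_vips = vip_weight / (vip_weight + other_weight)

print(f"{fraction_vips:.1%} of proteins interact with viruses")
print(f"{share_in_vips:.1%} of adaptations fall in those proteins")
```

Under that assumption, about an eighth of the proteins account for a share of adaptations close to the 30% figure quoted at the top of the article.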

“We’re all interested in how it is that we and other organisms have evolved, and in the pressures that made us what we are,” said senior author Dr. Dmitri Petrov, also from Stanford University.

“The discovery that this constant battle with viruses has shaped us in every aspect -- not just the few proteins that fight infections, but everything -- is profound.”



“All organisms have been living with viruses for billions of years; this work shows that those interactions have affected every part of the cell.”

Viruses hijack nearly every function of a host organism’s cells in order to replicate and spread, so it makes sense that they would drive the evolution of the cellular machinery to a greater extent than other evolutionary pressures such as predation or environmental conditions.

The study sheds light on some longstanding biological mysteries, such as why closely-related species have evolved different machinery to perform identical cellular functions, like DNA replication or the production of membranes.

Scientists previously did not know what evolutionary force could have caused such changes.
“This paper is the first with data that is large enough and clean enough to explain a lot of these puzzles in one fell swoop,” Dr. Petrov said.

Source: eLife


 