DNA

As the name suggests, DNA sequencing is the process of determining the content and arrangement of nucleotides in a particular genome, either in part or in whole. Most DNA sequencing techniques are based on the dideoxy method, which was first described by Sanger et al. (cited in Pierce, p.553-5; Davies, 2001, p.10). The dideoxy method of DNA sequencing is based on the principle of replication. After the DNA of interest has been amplified using the polymerase chain reaction (PCR), the amplified products are divided into 4 different tubes and subjected to denaturation. Either the resultant single strands or the primers complementary to them are labelled, and the DNA is subjected to further rounds of replication (Pierce, p.554).
However, these subsequent replication rounds generate many short fragments, since each tube contains one dideoxyribonucleotide (ddNTP) in addition to the normal deoxyribonucleotides. The ddNTPs lack a free 3'-OH group, so replication terminates whenever a ddNTP is incorporated into the growing chain. Each of the 4 different ddNTPs is added to one particular tube. Separation of the DNA fragments through gel electrophoresis followed by autoradiography reveals sequentially arranged bands, which can be read to determine the sequence of the DNA of interest (Pierce, p.554).
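The chain-termination logic described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a laboratory protocol: the template is assumed to be written 3' to 5' so that the new strand reads 5' to 3' from left to right, every possible termination is assumed to occur, and the function names and the example template are made up for illustration.

```python
def chain_terminated_fragments(template):
    """Return {ddNTP base: fragment lengths} for the four reaction tubes.

    Assumption: the template is written 3'->5', so the newly synthesised
    strand is its base-by-base complement read 5'->3'. A fragment of length n
    appears in the tube whose ddNTP matches the n-th base of the new strand.
    """
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    new_strand = "".join(complement[base] for base in template)
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        tubes[base].append(length)
    return tubes


def read_gel(tubes):
    """Read the bands from the shortest fragment to the longest."""
    bands = [(length, base) for base, lengths in tubes.items() for length in lengths]
    return "".join(base for _, base in sorted(bands))


# Hypothetical 8-base template, used only to show the idea.
tubes = chain_terminated_fragments("TACGGATC")
print(tubes)
print(read_gel(tubes))  # sequence of the newly synthesised strand
```

Sorting the bands by fragment length mirrors reading the gel from bottom to top: the tube in which each successive band appears gives the next base of the new strand.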
In recent times, the dideoxy method has become highly automated, and related high-throughput techniques such as massively parallel signature sequencing (MPSS), expressed sequence tag (EST) sequencing, and serial analysis of gene expression (SAGE) have vastly improved the turnaround time, quality and throughput of DNA sequencing (Mount, p.33).
As in health care, DNA sequencing has found many applications in the food industry. This essay looks at the impact of rapid DNA sequencing on food, with particular emphasis on how the data generated is handled, how it can inform our understanding of underlying mechanisms, and what the future holds.
Impacts of Rapid DNA Sequencing on Food
DNA sequencing has had a significant impact on food. First, it has led to an improvement in the safety of food. This is primarily because the technique allows food-borne pathogens to be detected with high accuracy. Common pathogens which have been implicated in food-borne illnesses include the O157:H7 strain of E. coli, Campylobacter species, Listeria species and Salmonella (Volokhov, p. 4270-8).
The challenge to food safety is particularly serious given that these pathogens are not only pervasive in the environment but are also spread through innumerable reservoirs. While the pathogenicity and pervasiveness of these pathogens are widely documented, DNA sequencing techniques have in particular improved the sub-typing of species, making it possible to identify pathogenic organisms at the subspecies level (PhysOrg).
Another impact which rapid DNA sequencing has had on food relates to quality control. The technique has been widely employed to assess genetically modified (GM) contamination in food. It is routinely applied in many countries, including the UK, to check for GM contamination in food in order to protect the public from potentially undesired effects. This is because GM food has been implicated in horizontal gene transfer and has been noted in some instances to cross the species barrier. Foods which have been analyzed through rapid DNA sequencing to determine their purity include potatoes, rice, tea, coffee and durum wheat. However, PCR is reportedly ineffective in discriminating GM from non-GM content (USDA; PhysOrg).
Additionally, DNA sequencing has been used to improve the nutritional value of foods. Examples of foods which have been enhanced through this technique include rice, soya, and maize. Specifically, these foods have been improved through recombinant DNA technology which utilizes DNA sequencing in primer design and to confirm that transformation has indeed taken place (Barnaum, p. 69-118).
Rapid DNA sequencing techniques have also helped to improve food security. Globally, food security is under threat from climate change, desertification, a growing human population and poor agricultural practices, among other factors. For example, there has been a significant reduction in the yield of harvested wheat globally due to climate change (PhysOrg). Rapid DNA sequencing techniques can help to boost food security, especially as regards important food crops. According to PhysOrg, scientists in the UK have embarked on a major project to determine the genetic sequences of several varieties of wheat. This project is expected to provide new knowledge on how wheat genes are expressed, to uncover differences in gene networks that are vital to disease resistance, and to improve yield and hardiness by identifying the genes responsible for these traits (PhysOrg).
Yet another impact of rapid DNA sequencing on food is that it has led to significant cost savings. This is because it has reduced the incidence of food-borne pathogens, which have been associated with huge economic losses in many countries throughout the world. Human health is another area that has benefited from rapid DNA sequencing methods. Due to rapid DNA sequencing, the turnaround time required for the diagnosis of diseases caused by food-borne pathogens has been reduced. Additionally, newer and more effective targets have been identified, leading to more effective diagnostic and therapeutic strategies against food-borne pathogens. Also important is the fact that people are eating safer, less hazardous food as a result of DNA sequencing techniques (Barnaum, 69-118).
Rapid DNA sequencing has also improved our understanding of diseases caused by food-borne pathogens. Biodiversity, especially as it pertains to plant and animal food species, is another area which has been impacted by rapid DNA sequencing. Due to this technique, whole DNA sequences of plant and animal food sources can be preserved. In addition, rapid DNA sequencing methods have made it possible to preserve the genomes of food species that are in danger of extinction, so that the species data can still be used long after the species themselves have become extinct (Barnaum, 69-118).
How the Sequence Data Is Handled
Sequence data obtained through rapid DNA sequencing techniques is stored in databases. Currently, many databases exist to store the sequence data and any other information pertinent to the sequences. Bibliographic databases such as AGRICOLA, BIOSIS, and PubMed contain important bibliographic data related to food. Sequence data is also stored in taxonomic and nucleotide sequence databases. Examples of taxonomic databases include NCBI's Taxonomy Browser and the Tree of Life. Nucleotide sequence databases include the DNA Data Bank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL) database, which is hosted by EBI at Hinxton in the UK, and GenBank. Secondary databases, which are non-redundant and curated, include RefSeq, CCDS, dbEST (a database of expressed sequence tags), dbSNP (a database of single nucleotide polymorphisms) and UniGene, among others. Third party annotation (TPA) databases, genomic databases, protein databases, structural databases, gene expression libraries and microarray databases are also used to archive data obtained through rapid DNA sequencing (NCBI, 2009; Rashidi & Buehler, 2000).
Examples of genomic databases include the Proteome Analysis Database and the Online Mendelian Inheritance in Animals (OMIA). UniProtKB/Swiss-Prot is an example of a protein database with entries from many plant, microbial and animal species that are sources of food or that affect food security, availability and safety, such as Arabidopsis thaliana, Salmonella typhimurium, and E. coli. Secondary structural databases such as PROSITE, Pfam and PRINTS are also of immense value. Other types of databases which contain sequence information pertinent to food are structural databases such as the Protein Data Bank (PDB), microarray databases, and gene expression repositories such as the Gene Expression Omnibus (GEO) (NCBI, 2009).
The databases have dedicated tools that can be used to submit sequence data. Examples of such tools include SAKURA, Sequin and the Mass Submission System (MSS) at DDBJ, SPIN at UniProt, and the AEdb, ENA and IMGT/HLA links at EBI. Microarray submissions are sent through the ArrayExpress link. High-throughput genomic sequences are sent through an FTP link for all the major databases, including GenBank, EMBL and DDBJ (NCBI, 2009).
Use of the 3 primary databases is free and is coordinated by the International Nucleotide Sequence Database Collaboration (INSDC). According to the rules governing these sequence databases, anybody can submit sequence data at any time for free; the data is available to any interested party at no cost; no restrictions are placed on access to the databases; sequences can be corrected by the submitters, but all sequences which have been submitted remain accessible through their accession numbers; and the submitted sequence data is fully disclosed to the public (NCBI, 2009).
How the Knowledge Can Advance Our Understanding of Underlying Mechanisms
The knowledge stored in these databases can be used in many ways to advance our understanding of underlying mechanisms. First, the information can be used for gene expression studies. One particularly important use of DNA sequence data is in determining the manner in which genes are expressed. Gene expression can be measured through current methods such as MPSS, EST sequencing and microarrays (USDA, 2009; Stekel, 2003).
The information can also be used to predict gene function, either through ab initio methods or through homology-based search techniques. In homology-based searching, the function of a gene is inferred by comparing it with similar genes whose function has already been elucidated. Similar methods can also be used to determine the protein function and metabolic pathways associated with DNA sequences (Eisenberg, Marcotte, Xenarios & Yeates, 2000; Blanco & Guigo, 2005).
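The homology-based idea can be sketched in a few lines. This is a simplified illustration that assumes the sequences are already aligned and of equal length; real searches rely on alignment tools such as BLAST. The gene names, sequences, annotations and the 80% threshold below are all hypothetical.

```python
def percent_identity(a, b):
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def predict_function(query, annotated, threshold=80.0):
    """Transfer the annotation of the most similar known gene.

    Returns (function, identity); function is None when the best hit falls
    below the (arbitrary, illustrative) identity threshold.
    """
    best_function, best_score = None, 0.0
    for sequence, function in annotated.values():
        score = percent_identity(query, sequence)
        if score > best_score:
            best_function, best_score = function, score
    if best_score < threshold:
        return None, best_score
    return best_function, best_score


# Hypothetical annotated genes: name -> (sequence, known function).
annotated = {
    "geneA": ("ATGGCGTACGTTAG", "heat-shock protein"),
    "geneB": ("ATGCCCTTTGGGAA", "sugar transporter"),
}
print(predict_function("ATGGCGTACGATAG", annotated))
```

The threshold is a design choice: a low cut-off transfers annotations between genes that may not share a function, while a high one leaves more genes unannotated.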
Additionally, the information can be used to study evolutionary mechanisms through phylogenetics. Here, the evolutionary changes in the DNA sequences of related food-borne pathogens, or of plants and animals used for food, are compared. This yields more accurate evolutionary models, since the analysis does not depend solely on physiological features or physical taxonomy. Whole genome sequences can also be compared, and intricate evolutionary events such as horizontal gene transfer studied. The sequence data can also be used to build computational models which help to predict the long-term prospects of the system under consideration. This can be accomplished in 4 main steps: multiple sequence alignment, selection of the substitution model, construction of phylogenetic trees, and evaluation of the trees using bootstrap analysis, likelihood ratio tests, permutation tests or skewness tests (NCBI, 2009).
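As an illustration of the substitution-model step, the sketch below computes the proportion of differing sites (p-distance) between aligned sequences and corrects it with the Jukes-Cantor model, d = -(3/4) ln(1 - 4p/3). Jukes-Cantor is chosen here only because it is the simplest such model, and the three taxa and their sequences are hypothetical. The resulting distance matrix is the kind of input a tree-building method would use.

```python
import math
from itertools import combinations


def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned):
    """Pairwise corrected distances for a dict of aligned sequences."""
    return {
        (name_a, name_b): jukes_cantor(p_distance(aligned[name_a], aligned[name_b]))
        for name_a, name_b in combinations(sorted(aligned), 2)
    }


# Hypothetical aligned sequences from three related isolates.
aligned = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACGAACGTTT",
}
for pair, distance in distance_matrix(aligned).items():
    print(pair, round(distance, 4))
```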
Of particular importance has been the use of this DNA sequence information in the design of primers. The sequence data has also been used to create probes for hybridization, which has proved useful in the identification of food-borne pathogens. Sequence analysis can be used to detect mutations in microbes affecting food and in food sources such as plants and animals. Currently, a large amount of sequence data is available, and methods such as comparative genomic hybridization and single nucleotide polymorphism arrays have been developed to identify and locate point mutations (NCBI, 2009). This is especially important for food-borne pathogens and for animal and plant foods, as it helps to identify mutations responsible for drug resistance in the pathogens, or lethal mutations that can cause reduced yields and death of plant and animal species. However, the data is generally noisy, which limits its usefulness. The information obtained from rapid DNA sequencing can also be used in the high-throughput analysis of proteins, in genome annotation, and to analyse how regulation is carried out by the plants and organisms which serve as food sources or which affect the availability or quality of food in one way or another (Kolchanov & Hofestaedt, 2004).
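Locating point mutations by sequence comparison can be sketched as follows. The example assumes the sample has already been aligned to a reference of the same length and ignores insertions and deletions; both sequences are made up for illustration.

```python
def point_mutations(reference, sample):
    """List (position, reference base, sample base) for each substitution.

    Positions are 1-based. Assumes two aligned sequences of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Hypothetical reference and sample sequences.
for position, ref_base, sample_base in point_mutations("ATGCCGTA", "ATGACGTT"):
    print(f"position {position}: {ref_base} -> {sample_base}")
```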
Currently, microarrays are being utilized to simplify the analysis of contaminated food. Here, thousands of sequences for different pathogens are immobilized on a platform such as a silicon chip. This enables the simultaneous identification of the pathogens that could cause a food-borne disease, the drugs to which they are resistant, and their pathogenicity (Stekel, 2003).
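The principle of probe-based identification can be sketched very simply. In this toy model a probe "hybridises" when the sample contains its exact reverse complement, which ignores mismatch tolerance and signal intensity; the probe names and sequences are hypothetical and far shorter than real probes.

```python
def reverse_complement(sequence):
    """Reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(sequence))


def hybridising_probes(sample, probes):
    """Names of probes whose reverse complement occurs in the sample."""
    return sorted(
        name for name, probe in probes.items()
        if reverse_complement(probe) in sample
    )


# Hypothetical probes immobilised on the array: target name -> probe sequence.
probes = {
    "pathogen_X": "ATGCGT",
    "pathogen_Y": "CCCAAA",
}
print(hybridising_probes("TTACGCATGG", probes))
```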
The success of DNA sequencing methods on food has largely been a result of the availability of highly precise and discriminative oligonucleotides and the complete automation of the DNA sequencing technique. Regarding automation, machines which use robotics to provide automated pipetting, highly mechanized gel loading systems, bar-coded comb loading and cycle sequencing have helped to significantly reduce the turnaround time while enhancing the quality and quantity of the results obtained (ScientistLive, 2009).
Multi-locus sequence typing (MLST) has also been applied in the DNA sequencing of food with relative success. The MLST technique differs from techniques based on banding patterns in that it targets nucleotide sequence variations in multiple allelic genes which have low mutation frequencies. The technique can be replicated easily and has been used to distinguish outbreaks of Listeria (Zhang, Jayarao & Knabel, 2004). However, the technique is unable to discriminate between different isolates of the O157:H7 strain of E. coli, since the homology of the allelic genes is nearly 100% (Chang, Zhang & Knabel, 2005).
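The bookkeeping behind MLST can be sketched as follows: each distinct sequence at a locus receives an allele number, and the tuple of allele numbers across loci forms the isolate's profile. The locus names, isolate names and sequences are hypothetical, and allele numbers are assigned in order of first appearance rather than taken from a curated allele database.

```python
def mlst_profiles(isolates, loci):
    """Assign allele numbers per locus and return each isolate's profile.

    isolates: {isolate name: {locus name: sequence}}
    Returns {isolate name: tuple of allele numbers, in the order of `loci`}.
    """
    allele_numbers = {locus: {} for locus in loci}
    profiles = {}
    for name, sequences in isolates.items():
        profile = []
        for locus in loci:
            table = allele_numbers[locus]
            sequence = sequences[locus]
            if sequence not in table:
                table[sequence] = len(table) + 1  # next unused allele number
            profile.append(table[sequence])
        profiles[name] = tuple(profile)
    return profiles


# Hypothetical isolates typed at two made-up loci.
isolates = {
    "isolate_1": {"locusA": "ATGC", "locusB": "GGTA"},
    "isolate_2": {"locusA": "ATGC", "locusB": "GGTA"},
    "isolate_3": {"locusA": "ATGT", "locusB": "GGTA"},
}
print(mlst_profiles(isolates, ["locusA", "locusB"]))
```

Isolates with identical profiles cannot be told apart by the scheme, which is the limitation noted above for isolates whose allelic genes are almost identical.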
Future of DNA Sequencing in Food
Whereas the widespread application of DNA sequencing has vastly improved the quality and availability of food, there are some inherent challenges associated with current DNA sequencing methods. A particularly important hindrance is that the information obtained through DNA sequencing, especially high-throughput data, is associated with a lot of noise. This noise has reduced the utility of some of the available DNA sequencing techniques. Future use of DNA sequencing is thus premised on newer tools geared towards reducing this noise (Bansal, p.1-11).
For instance, studies of gene expression using currently available techniques are generally deemed to be noisy and largely subjective. Future studies are focused on developing statistical techniques which can separate the noise in gene sequence data from useful signals, especially in high-throughput gene expression studies (Bansal, p.1-11).
Tools based on the hidden Markov model and change point analysis are also being developed so that the large quantity of sequence data currently available can be used effectively to study mutations in plant and animal food species and in food-borne pathogens (Bansal, p.1-11).
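To give a flavour of how a hidden Markov model is applied to sequence data, the sketch below uses the Viterbi algorithm to label each base of a sequence with the more likely of two hidden states. It is not one of the tools referred to above: the two states (AT-rich and GC-rich regions) and all probabilities are hypothetical values chosen for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete HMM, computed in log space."""
    # best[i][s]: log-probability of the best path that ends in state s at position i
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for symbol in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            previous, score = max(
                ((r, best[-1][r] + math.log(trans_p[r][s])) for r in states),
                key=lambda item: item[1],
            )
            scores[s] = score + math.log(emit_p[s][symbol])
            pointers[s] = previous
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return list(reversed(path))


# Hypothetical model parameters (assumptions for illustration only).
states = ["AT_rich", "GC_rich"]
start_p = {"AT_rich": 0.5, "GC_rich": 0.5}
trans_p = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
emit_p = {
    "AT_rich": {"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4},
}
print(viterbi("AAAAGGGG", states, start_p, trans_p, emit_p))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied along a long sequence.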
Additionally, studies of protein-protein docking promise better prospects in future. It is expected that protein-protein interactions can be elucidated from the 3D structures alone, thus obviating the need for wet-laboratory work. If this happens, it will be possible to develop highly effective, low-cost diagnostic and therapeutic tools against food-borne pathogens quickly and with less difficulty (Bansal, p.1-11; Blundell, p.413-23).
In future, high-throughput image analysis may also become a possibility and could be used in many ways to boost food security. A commonly cited example which is applicable to food is the inference of contigs in DNA mapping. Rational drug design through lead optimization based on analysis of DNA sequences is yet another possibility. Other future trends include whole genomic expression analysis using microarray hybridization, increased commercialization and therefore use of gene chip technology, and improvement of multi-locus sequence typing methods (Blundell, p.413-23; Bansal, p.1-11).

The DNA sequencing technique has been used widely in the food industry. The technique has had a great impact on food. It has helped to enhance food safety, improved the nutritional value of food and enhanced food security. Use of rapid DNA sequencing has also aided in the quality control of food, led to cost savings, improved the understanding of food borne diseases and led to the development of better diagnostic techniques against food borne diseases.
A large amount of sequence information is now available, and it is stored in bibliographic, nucleotide, genomic, protein, and microarray databases. The information is used in gene expression studies, prediction of gene function, prediction of protein function, primer design and elucidation of evolutionary mechanisms. The success of the technique has primarily been due to automation and the development of highly specific and accurate oligonucleotides. Future prospects include reducing the noise inherent in the data through statistical and hidden Markov model (HMM) tools, as well as protein-protein docking, high-throughput image analysis, lead optimization, whole genomic expression analysis using microarray hybridization and increased use of gene chip technology.
