The science of bioinformatics sits at the junction of explosive growth in both biotechnology and information technology. Traditionally it has been synonymous with the management and analysis of DNA and protein sequences. The term is now used more broadly to embrace epidemiology (especially genetic epidemiology) and evolutionary studies. Another aspect of bioinformatics is computational biology research, that is, the development of new algorithms and statistical methods.
Bioinformatics also covers the use of these tools for the analysis and interpretation of sequence data. DNA sequencing generates large amounts of data that would be extremely tedious to analyze without computers. Many commercially available software packages carry out routine sequence analysis, and many sequence analysis programs are also available on the Internet. The sequence is submitted via a website or by e-mail, and the results are returned once the analysis is complete.
A DNA sequencing project:
A DNA sequencing project requires a computer to assist in compiling and editing the sequence data. Overlapping clones are identified and assembled into contiguous sequences (contigs). Computers are also used to produce the complementary sequence, reverse the 5′→3′ orientation, or supply amino acid translations of nucleic acid sequences. It is also possible to calculate various physical, biochemical and structural properties of both nucleic acids and proteins. For example, algorithms are available to predict the secondary structure of proteins or RNA. Specific sites or motifs within a gene (for example glycosylation sites, signal sequences, etc.), restriction sites, promoter elements or other DNA signals, as well as various other protein motifs, can be identified with the help of a computer.
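The basic sequence manipulations mentioned above (complementing a strand, reversing its orientation, translating it, and locating a restriction site) can be sketched in a few lines of Python. This is a minimal illustration rather than a production tool: the function names `reverse_complement`, `translate` and `find_sites` are hypothetical, the translation assumes the standard genetic code and reading frame 1, and only exact matches on the given strand are reported.

```python
# Illustrative sketch; all function names here are hypothetical helpers,
# not part of any particular bioinformatics package.

# Map each base to its Watson-Crick partner (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the complementary strand, written in the 5'->3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA in frame 1; '*' marks a stop codon, 'X' an unknown codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def find_sites(dna, site):
    """Return 0-based start positions of every exact occurrence of `site`."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(translate("ATGGCCTAA"))
    print(find_sites("AAGAATTCTTGAATTC", "GAATTC"))  # GAATTC is the EcoRI site
```

Translating the reverse complement as well, and shifting the start position by one or two bases, would cover all six reading frames.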
Alignment of related sequences and homology searches are common analyses performed on both nucleic acid and protein sequences. It is relatively easy to obtain sequence data, but quite difficult to predict protein structure and function. Protein structure and function can, however, be inferred from similarities to sequences that are already known. In bioinformatics such similarities are identified by aligning DNA or protein sequences, a process termed sequence alignment.
Sequence alignment provides a powerful way to compare sequences for evolutionary (phylogenetic) relatedness and for structural or functional relatedness. Sequences can be compared by either global or local alignment. A global alignment aligns the sequences over their complete lengths, while a local alignment aligns only the most similar segments within the provided sequences. The choice of method depends on whether the sequences are expected to be related over their entire lengths or to share only isolated regions of homology. Different algorithms are used for global and local alignments, and the statistics associated with their output also differ. Pairwise sequence alignments use a scoring matrix to find the best alignment between the two sequences under analysis. Proteins are more complicated than nucleic acids, and a protein scoring matrix must also take into account the similarity between residues and their relative abundance. Amino acid similarity can be defined on either a chemical basis or a structural basis. No single scoring matrix can be used universally for proteins.
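The difference between global and local alignment can be illustrated with a small dynamic-programming sketch. The function below is a hypothetical helper, `align_score`, that returns only the optimal score, not the alignment itself. It assumes a very simple scoring scheme (+1 for a match, -1 for a mismatch, -2 per gap position), which is an illustrative choice and not a recommended matrix. The global mode follows the Needleman-Wunsch recurrence and the local mode follows the Smith-Waterman recurrence; for proteins the match/mismatch values would normally be replaced by a lookup in a substitution matrix such as one of the PAM or BLOSUM series.

```python
# Illustrative sketch; `align_score` and its default scores are assumptions
# chosen for clarity, not values taken from any published scoring matrix.

def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the optimal pairwise alignment score of sequences a and b.

    local=False: global alignment (both sequences aligned end to end).
    local=True:  local alignment (best-scoring pair of subsequences).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # In a global alignment, leading gaps are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
            if local:
                # A local alignment may restart anywhere, so scores never
                # drop below zero, and the best cell may lie anywhere.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    return best if local else score[-1][-1]


if __name__ == "__main__":
    long_seq, short_seq = "TTTTACGTAAAA", "ACGT"
    print("global:", align_score(long_seq, short_seq))
    print("local: ", align_score(long_seq, short_seq, local=True))
```

With these two inputs the global score is heavily penalised by the gaps needed to span the longer sequence, while the local score reflects only the shared ACGT segment, which mirrors the choice between the two methods described above.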