What is this DNA stuff?
If you've never heard of DNA, first read http://en.wikipedia.org/wiki/Dna and maybe pick up a textbook for the basics.
The critical things you should understand before trying to do this tutorial:
- DNA is a chemical entity, but we represent its sequence in the computer
- DNA is made of 4 deoxyribonucleotides, A, T, C, and G (casually called bases), joined in a specific sequence by covalent bonds
- DNA molecules have directionality: one end is the 5' terminus, the other end is the 3' terminus
- By convention, DNA sequences are always written out in the 5' to 3' direction unless the ends are labeled explicitly with "5'-" and "3'-". Alternatively, DNAs can be represented in cartoon form as a line with a barb at one end. The barb marks the 3' end.
- DNA can be circular or linear
- DNA can be single stranded or double stranded
- The two strands of a double stranded DNA anneal to each other by Watson-Crick base pairing (A pairs with T, C pairs with G)
- The sequence of the complementary strand of a double stranded DNA is the "reverse-complement" of the other strand
- On their own, the "reverse" and "complement" operations on a DNA sequence do not result in biochemically-meaningful sequences. You must always do both (reverse-complement) to get the sequence of the complementary strand
- With some exceptions, bacterially replicating DNAs are double stranded circular molecules regardless of whether they are genomic DNAs or plasmid DNAs.
- Even when DNAs are circular double stranded molecules, we represent them as linear single-stranded sequences in software tools like ApE
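
To make the reverse-complement point above concrete, here is a minimal Python sketch. The function names and the example sequence are made up for illustration (they do not come from ApE or any particular library), and the sketch assumes the sequence contains only A, T, C, and G.

```python
# Hypothetical helpers for illustration only; assumes sequences use just A, T, C, G.
COMPLEMENT_TABLE = str.maketrans("ATCGatcg", "TAGCtagc")

def complement(seq):
    """Swap each base for its Watson-Crick partner (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT_TABLE)

def reverse(seq):
    """Read the sequence backwards."""
    return seq[::-1]

def reverse_complement(seq):
    """Sequence of the complementary strand, written 5' to 3'."""
    return reverse(complement(seq))

top = "ATGCAAGT"                  # one strand, written 5' to 3'
print(complement(top))            # TACGTTCA  (partner bases, but reads 3' to 5')
print(reverse(top))               # TGAACGTA  (not a strand of this molecule)
print(reverse_complement(top))    # ACTTGCAT  (the complementary strand, 5' to 3')
```

Only the last line gives a sequence you would actually type in to represent the other strand. Applying reverse_complement twice returns the original sequence, which is a quick sanity check.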