# Nucleic acid structure

## General

- Several double-helical structures have been observed, called A, A', B, α-B', β-B', C, C', C'', D, E, and Z
- The letters denote structural differences, α and β denote packing differences, and primes indicate small variations

- The symmetries of the various double helices are represented with two numbers [math]N_m[/math] (from crystallography nomenclature)
- N is the number of nucleotides needed to reach the exact same point along the helix axis
- m is the number of helical turns needed to reach the exact same point along the helix axis

- The axial rise is the distance along the helix axis between adjacent nucleotides
- If all bases were coplanar and the pairs perpendicular to the helix axis, the rise should equal the van der Waals distance of 3.4 Å

- The pitch of the helix is the distance along the helix axis for one complete helical turn
- The pitch equals the number of nucleotides in one turn multiplied by the axial rise

- The unit twist is 360° divided by the number of nucleotides in one turn; it is the rotation between neighboring nucleotides
- The base-pair tilt is nonzero when the normal to the base-pair plane is not exactly parallel to the helix axis
- There is a linear relationship between the tilt of an individual base and the axial rise per nucleotide
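The pitch and unit-twist definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `helix_parameters` is made up for this example, and it assumes a helix described by a single number of nucleotides per turn (the m = 1 case of [math]N_m[/math]).

```python
def helix_parameters(residues_per_turn, axial_rise):
    """Return (pitch in Å, unit twist in degrees) for a regular helix.

    Hypothetical helper for illustration; assumes one helical turn
    contains `residues_per_turn` nucleotides (m = 1 in the N_m notation).
    """
    if residues_per_turn <= 0:
        raise ValueError("residues_per_turn must be positive")
    pitch = residues_per_turn * axial_rise   # distance covered by one full turn
    twist = 360.0 / residues_per_turn        # rotation between neighboring nucleotides
    return pitch, twist


# Example: 10 nucleotides per turn with a 3.38 Å axial rise
pitch, twist = helix_parameters(10, 3.38)
print(f"pitch = {pitch:.1f} Å, twist = {twist:.1f}°")
```

Feeding in a measured rise and repeat is a quick way to check whether tabulated pitch, rise, and twist values are mutually consistent.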

**Sugar puckering** is the deviation from planarity of the 5 atoms of the sugar ring. The 5 atoms are never seen to be planar. The ring can adopt an envelope form, where 4 atoms are in a plane and the fifth is out of it by 0.5 Å, or a twist form, where two adjacent atoms are out of the plane made by the other three atoms. Out-of-plane atoms on the same side as the 5'-C are called endo and those on the opposite side are called exo.

Except for the left-handed S/Z helices, the structures are broadly classified into A and B families. The essential distinction between A-type and B-type helices is in the sugar puckering. In A-type helices, 3'-endo sugar puckering is seen, and in B-type helices, 2'-endo (or 3'-exo) is seen. This changes the distance between adjacent phosphates from 5.9 Å in A-type to 7.0 Å in B-type helices. Base-pair tilt is positive (clockwise) in A-type and negative in B-type helices.

In A-type double helices, the axial rise can vary from 2.59 to 3.29 Å, but the rotation per nucleotide varies only slightly, from 30.0° to 32.7°. In B-type helices, the axial rise only changes from 3.03 to 3.37 Å, but the rotation varies from 36° to 45°.

Typical parameters for the helices:

Structure | Pitch (Å) | Helical symmetry | Axial rise (Å) | Twist (°) | Minor groove width (Å) | Major groove width (Å) | Minor groove depth (Å) | Major groove depth (Å) |
---|---|---|---|---|---|---|---|---|
A | 28.2 | [math]11_1[/math] | 2.56 | 32.7 | 11.0 | 2.7 | 2.8 | 13.5 |
B | 33.8 | [math]10_1[/math] | 3.38 | 36.0 | 5.7 | 11.7 | 7.5 | 8.5 |
C | 31.0 | [math]9.33_1[/math] | 3.32 | 38.6 | 4.8 | 10.5 | 7.9 | 7.5 |
B' | 32.9 | [math]10_1[/math] | 3.29 | 36 | | | | |
C' | 29.5 | [math]9_1[/math] | 3.28 | 40 | | | | |
C'' | 29.1 | [math]9_1[/math] | 3.23 | 40 | | | | |
D | 24.3 | [math]8_1[/math] | 3.04 | 45 | 1.3 | 8.9 | 6.7 | 5.8 |
E | 24.35 | [math]7.5_1[/math] | 3.25 | 48 | | | | |
S | 43.4 | [math]6_5[/math] | 3.63 | -30.0 | | | | |
Z | 45 | [math]6_5[/math] | 3.7 | -30.0 | | | | |

## DNA

DNA can form a wide range of double helical structures. Random sequences are found in the A, B, and C forms. Designed repetitive sequences can form D, E, and Z forms.

### B-form DNA

- Minor groove angle: 137.5078°
- Twist angle: 34.7°
- Helical repeat: 10.4 bases/turn
- The roll and tilt angles vary by a few degrees depending on the base pairs. The dinucleotide AA (or TT) causes significant variations in the roll and tilt angles

## RNA

The extra 2'-OH usually prevents formation of the B-form helix found in DNA. Double-helical RNA is usually of the A or A' form:

- 11 bases/turn
- The basepair stacks are tilted and displaced with respect to the axis of the helix

### Pseudoknots

RNA folding algorithms normally assume that the structure contains no pseudoknots. In bracket notation, a non-pseudoknotted structure closes all brackets in nested order, e.g. `[()]`, whereas a pseudoknot has crossing pairs of the form `[(])`. In a pseudoknot, the knotted region (the "`()`" pairing) cannot exceed 9 or 10 base pairs. This constraint arises from the helical structure of RNA, which has 10 or 11 base pairs per turn: with a full turn, the two strands of the pseudoknot would form a true knot, which is physically and biologically unrealistic.
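The nested-versus-crossing distinction can be checked mechanically. The sketch below is illustrative only: the helper names `parse_pairs` and `has_pseudoknot` are invented here, and it assumes the structure is written with `()` and `[]` brackets, with any other character (such as `.`) marking an unpaired position.

```python
OPEN_TO_CLOSE = {"(": ")", "[": "]"}  # assumed bracket alphabet for this sketch


def parse_pairs(structure):
    """Return a sorted list of (i, j) paired positions from a bracket string."""
    close_to_open = {c: o for o, c in OPEN_TO_CLOSE.items()}
    stacks = {o: [] for o in OPEN_TO_CLOSE}  # one stack per bracket type
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in OPEN_TO_CLOSE:
            stacks[ch].append(pos)
        elif ch in close_to_open:
            stack = stacks[close_to_open[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at position {pos}")
            pairs.append((stack.pop(), pos))
        # any other character is treated as an unpaired nucleotide
    if any(stacks.values()):
        raise ValueError("unclosed bracket in structure")
    return sorted(pairs)


def has_pseudoknot(pairs):
    """True if two pairs (i, j) and (k, l) cross, i.e. i < k < j < l."""
    for n, (i, j) in enumerate(pairs):
        for k, l in pairs[n + 1:]:
            if i < k < j < l:
                return True
    return False


print(has_pseudoknot(parse_pairs("[()]")))  # nested pairs
print(has_pseudoknot(parse_pairs("[(])")))  # crossing pairs
```

Keeping a separate stack per bracket type is what allows crossing pairs to be written down at all; with a single bracket type, every well-formed string is necessarily nested.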

### Thermodynamics

[math]\Delta G^0 = -RT \ln K = \Delta H^0 - T\cdot\Delta S^0[/math] where [math]K=\frac{\rm [duplex]}{\rm [single-strand]^2}[/math]

At the melting temperature, [math]T_m[/math], [math]2[{\rm duplex}] = [{\rm single-strand}][/math] and from conservation of total RNA, [math]2[{\rm duplex}] + [{\rm single-strand}] = [{\rm RNA}]_{total}[/math]. From this, we can derive that:

[math]T_m = \frac{\Delta H^0}{\Delta S^0 + R\cdot \ln[{\rm RNA}]_{total}}[/math]
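The intermediate steps of this derivation, written out as a sketch using only the definitions above (logarithms are natural logarithms):

Substituting [math][{\rm single-strand}] = 2[{\rm duplex}][/math] into the conservation equation gives [math]4[{\rm duplex}] = [{\rm RNA}]_{total}[/math], so at [math]T_m[/math]:

[math][{\rm duplex}] = \frac{[{\rm RNA}]_{total}}{4}, \qquad [{\rm single-strand}] = \frac{[{\rm RNA}]_{total}}{2}[/math]

The equilibrium constant at the melting temperature is therefore:

[math]K = \frac{[{\rm RNA}]_{total}/4}{\left([{\rm RNA}]_{total}/2\right)^2} = \frac{1}{[{\rm RNA}]_{total}}[/math]

Inserting this into the free energy relation:

[math]-RT_m \ln\frac{1}{[{\rm RNA}]_{total}} = RT_m \ln[{\rm RNA}]_{total} = \Delta H^0 - T_m\cdot\Delta S^0[/math]

Collecting the terms in [math]T_m[/math] gives [math]T_m\left(\Delta S^0 + R\cdot \ln[{\rm RNA}]_{total}\right) = \Delta H^0[/math], which rearranges to the expression for [math]T_m[/math] above.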

You can experimentally measure the melting curve and extract the values of [math]\Delta H^0[/math] and [math]\Delta S^0[/math], from which you can get [math]\Delta G^0[/math]. The Freier-Turner rules give the incremental [math]\Delta G^0[/math] of stacking one base pair onto the end of another pair. The top row shows the 5' base pair, the left column shows the 3' base pair, and the values are in kcal/mol. For example, a GC base pair followed by a CG base pair contributes -3.4 kcal/mol. These values were determined for the folding of RNA at 37°C.

3' pair / 5' pair | GU | UG | AU | UA | CG | GC |
---|---|---|---|---|---|---|
GU | -0.5 | -0.6 | -0.5 | -0.7 | -1.5 | -1.3 |
UG | -0.5 | -0.5 | -0.7 | -0.5 | -1.5 | -0.9 |
AU | -0.5 | -0.7 | -0.9 | -1.1 | -1.8 | -2.3 |
UA | -0.7 | -0.5 | -0.9 | -0.9 | -1.7 | -2.1 |
CG | -1.9 | -1.3 | -2.1 | -2.3 | -2.9 | -3.4 |
GC | -1.5 | -1.5 | -1.7 | -1.8 | -2.0 | -2.9 |

To calculate the total energy of an RNA duplex, sum the stacking contribution of each successive pair and add a nucleation term for the first pair, which has been experimentally determined to be +3.4 kcal/mol. The nucleation term is positive because of the entropic loss due to the association of two strands.
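The summation described above can be sketched as follows. This is an illustration under stated assumptions, not a full nearest-neighbor implementation: the names `stacking_energy` and `duplex_energy` are invented for this example, the duplex is given as a list of base-pair labels in 5' to 3' order, and the labels are used exactly as they appear in the stacking table (how a label maps onto the two strands is left to the caller).

```python
# Stacking table from the text, in kcal/mol at 37 °C.
# Columns: 5' base pair (top row). Rows: 3' base pair (left column).
COLUMNS = ["GU", "UG", "AU", "UA", "CG", "GC"]
ROWS = {
    "GU": [-0.5, -0.6, -0.5, -0.7, -1.5, -1.3],
    "UG": [-0.5, -0.5, -0.7, -0.5, -1.5, -0.9],
    "AU": [-0.5, -0.7, -0.9, -1.1, -1.8, -2.3],
    "UA": [-0.7, -0.5, -0.9, -0.9, -1.7, -2.1],
    "CG": [-1.9, -1.3, -2.1, -2.3, -2.9, -3.4],
    "GC": [-1.5, -1.5, -1.7, -1.8, -2.0, -2.9],
}
NUCLEATION = 3.4  # kcal/mol, penalty for bringing two strands together


def stacking_energy(five_prime_pair, three_prime_pair):
    """Look up the incremental energy of one pair stacked on the next."""
    return ROWS[three_prime_pair][COLUMNS.index(five_prime_pair)]


def duplex_energy(pairs):
    """Nucleation term plus the sum over successive stacked pairs."""
    if not pairs:
        raise ValueError("a duplex needs at least one base pair")
    stacks = sum(stacking_energy(a, b) for a, b in zip(pairs, pairs[1:]))
    return NUCLEATION + stacks


# The example from the text: a GC pair followed by a CG pair
print(stacking_energy("GC", "CG"))
print(round(duplex_energy(["GC", "CG", "GC"]), 1))
```

With n base pairs there are n - 1 stacking terms, so a single isolated pair carries only the unfavorable nucleation term.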

Loops can be analyzed similarly. The Freier and Turner values for loops (in kcal/mol) are:

Length | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 12 | 14 | 16 | 18 | 20 | 25 | 30 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Bulges | 3.3 | 5.2 | 6.0 | 6.7 | 7.4 | 8.2 | 9.1 | 10.0 | 10.5 | 11.0 | 11.8 | 12.5 | 13.0 | 13.6 | 14.0 | 15.0 | 15.8 |
Terminal loops | ∞ | ∞ | 7.4 | 5.9 | 4.4 | 4.3 | 4.1 | 4.1 | 4.2 | 4.3 | 4.9 | 5.6 | 6.1 | 6.7 | 7.1 | 8.1 | 8.9 |
Internal loops | -- | 0.8 | 1.3 | 1.7 | 2.1 | 2.5 | 2.6 | 2.8 | 3.1 | 3.6 | 4.4 | 5.1 | 5.6 | 6.2 | 6.6 | 7.6 | 8.4 |

Some four-base terminal loops (tetraloops) are more stable than the loop values would predict. These include the sequences GNRA, UNCG, and CUYG (N = any nucleotide, R = purine, Y = pyrimidine).
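A loop penalty lookup based on the table above might look like the sketch below. The function name `loop_energy` and the loop-kind keys are made up for this example; it only returns tabulated values (no interpolation between listed lengths), treats the values as kcal/mol like the stacking energies, and does not model the extra stability of special tetraloops.

```python
import math

LENGTHS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 16, 18, 20, 25, 30]

# Values copied from the loop table; None marks an entry the table leaves blank.
_VALUES = {
    "bulge": [3.3, 5.2, 6.0, 6.7, 7.4, 8.2, 9.1, 10.0, 10.5, 11.0,
              11.8, 12.5, 13.0, 13.6, 14.0, 15.0, 15.8],
    "terminal": [math.inf, math.inf, 7.4, 5.9, 4.4, 4.3, 4.1, 4.1, 4.2, 4.3,
                 4.9, 5.6, 6.1, 6.7, 7.1, 8.1, 8.9],
    "internal": [None, 0.8, 1.3, 1.7, 2.1, 2.5, 2.6, 2.8, 3.1, 3.6,
                 4.4, 5.1, 5.6, 6.2, 6.6, 7.6, 8.4],
}
LOOPS = {kind: dict(zip(LENGTHS, vals)) for kind, vals in _VALUES.items()}


def loop_energy(kind, length):
    """Return the tabulated loop penalty, or raise if it is not tabulated."""
    try:
        value = LOOPS[kind][length]
    except KeyError:
        raise ValueError(f"no entry for {kind!r} loop of length {length}") from None
    if value is None:
        raise ValueError(f"{kind!r} loop of length {length} is not defined")
    return value


print(loop_energy("terminal", 4))   # a four-base terminal loop
print(loop_energy("terminal", 2))   # infinite penalty: loop too short to close
```

Returning `math.inf` for the shortest terminal loops lets any total that includes them come out as infinitely unfavorable without special-case code.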

### Triple helices

Purines have a second face (the Hoogsteen face) that can hydrogen bond with a pyrimidine (A with U and G with C). In Hoogsteen pairing, the two strands are parallel. In reverse Hoogsteen pairing, the two strands are antiparallel. When one strand of a Watson-Crick paired helix contains a *homopurine region*, it can form Hoogsteen or reverse Hoogsteen pairs with a third, homopyrimidine strand inserted into the major groove of the duplex, producing a triple helix.

### Tetraloop-receptor interactions

Tetraloops of the GNRA family can interact with specific helical structures. Different loops interact with different receptors.