The advances in genomics, and particularly in NGS, over the last few decades have led to a dramatic increase in the speed of sequencing. This increase has come not only from the discovery and development of new technologies, but also from advancements within each method. A notable example is Illumina sequencing, which has improved to the point that the run time has been reduced from several days to less than two hours.

Just as sequencing has been getting faster, so too has library preparation. Different groups will use different library preparation techniques, but the process can be broadly broken up into five or six main stages, depending on the sequencing method.

1. DNA Fragmentation

It is vital to ensure that the DNA strands are of similar size prior to sequencing, something which is achieved by DNA fragmentation. This is important because differently sized fragments will behave differently in later stages of the process, especially during PCR, which may over-amplify smaller fragments in comparison to larger ones and skew your results. There are many different methods for fragmenting DNA, but there are five that occur frequently:

Acoustic Shearing – the strands are broken apart by high energy sound waves.

Sonication – ultrasonic waves cause cavitation in the sample, which subjects the strands to hydrodynamic shearing.

Enzyme Restriction – site-specific enzymes are used to cleave the DNA into smaller pieces.

Nebulisation – compressed gas forces the DNA strands through a small hole in the nebulising unit to create a fine mist, with the pressure applied determining the fragment size.

Needle Shearing – the strands are forced through a small gauge needle to create shearing forces sufficiently strong to break them apart.
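Of the five methods, enzyme restriction is the only one whose cut positions can be predicted from the sequence itself. The sketch below is a minimal in-silico digest, not a lab protocol: it assumes EcoRI as an illustrative enzyme (recognition site GAATTC, cutting after the first G), uses a made-up toy sequence, and only considers the top strand.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Split `sequence` at every occurrence of `site` on the top strand.

    `cut_offset` is the number of bases of the recognition site that stay
    on the left-hand fragment (EcoRI cuts G^AATTC, so the offset is 1).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])  # whatever remains after the last cut
    return fragments


# Toy sequence (hypothetical) containing two EcoRI sites.
toy = "ATGCGAATTCTTAGGCGAATTCAAT"
fragments = digest(toy)
print(fragments)
print([len(f) for f in fragments])
```

Because the fragment lengths depend entirely on where the recognition sites happen to fall, an enzymatic digest gives a reproducible but uneven size distribution, unlike the random breaks produced by the physical methods.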

Most NGS techniques will produce short reads, so the DNA fragments are usually less than 800 base pairs long. However, more recent techniques, such as SMRT or Nanopore sequencing, do allow for longer reads, meaning that longer fragments may be required for certain experiments. You need to factor in your choice of sequencing technique and experimental goals at every stage of library preparation, so that the sequencer and library protocol you are using can give you the most accurate and appropriate results possible.
The exceptions to this stage are microarrays designed for SNP genotyping or genome-wide copy number detection. These do not require fragmentation and tend to use DNA ‘as is,’ although it is not uncommon to instead use DNA that has undergone restriction-enzyme digestion. It is also important to note that clinical samples that have undergone FFPE (formalin-fixed, paraffin-embedded) preservation may already be sufficiently degraded for library preparation and as such will not require further fragmentation.

2. End Repair

Because fragmentation is often an uncontrolled process, the ends of the resulting DNA strands will vary randomly between blunt ends, overhangs and recessed ends. In the same way that varied sizes will affect how the DNA behaves later in the process, having different ends present in the sample will lead to complications in future steps, and so the ends need to be repaired. There are several different treatments available to perform end repair, which work either by filling in recesses where bases are missing or by cleaving off trailing overhangs of single-stranded DNA, leaving all fragments in the sample with blunt ends.

3. dA-tailing

dA-tailing is a stage most commonly associated with Illumina library preparation. It involves adding a single adenine (A) base to the 3’ end of the repaired fragments to produce a single-nucleotide overhang, which enables the DNA strands to ligate to the correct adaptors in the next step.




4. Adaptor Ligation

Once the ends have been corrected, the unknown sequences need to be attached to shorter sequences of known composition, called adaptors. These adaptors will ligate to either end of the unknown sequence and are essential later in the process. Illumina sequencing, for example, requires the adaptors to allow for hybridisation of the DNA to the primers in the flow cell, clonal amplification of the clusters, and priming for the sequencing reaction. These adaptors also contain a known ‘index’ sequence (also known as a barcode) to allow for multiple different samples to be studied in a single flow cell (a process called multiplexing). Alternatively, SOLiD and 454 Sequencing use the adaptors to enable the binding of the DNA to beads prior to the sequencing process.
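To illustrate how index sequences make multiplexing possible, here is a minimal demultiplexing sketch. It assumes each read arrives as an (index read, insert sequence) pair with the index already extracted, and the two 6-base indexes, the sample names and the one-mismatch tolerance are all hypothetical. Real kits define their own index sets and real pipelines use dedicated tools.

```python
from collections import defaultdict

# Hypothetical index-to-sample table; real kits publish their own index sets.
SAMPLE_INDEXES = {
    "ACGTAC": "sample_A",
    "TGCATG": "sample_B",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, indexes=SAMPLE_INDEXES, max_mismatches=1):
    """Return the sample whose index uniquely matches, else None."""
    hits = [
        name
        for idx, name in indexes.items()
        if len(idx) == len(index_read) and hamming(idx, index_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def demultiplex(reads):
    """Group insert sequences by sample; unmatched reads go to 'undetermined'."""
    bins = defaultdict(list)
    for index_read, insert in reads:
        bins[assign_sample(index_read) or "undetermined"].append(insert)
    return dict(bins)


reads = [
    ("ACGTAC", "AAA"),  # exact match to sample_A
    ("ACGTAA", "CCC"),  # one mismatch, still assigned to sample_A
    ("TGCATG", "GGG"),  # exact match to sample_B
    ("GGGGGG", "TTT"),  # matches neither index
]
print(demultiplex(reads))
```

Allowing a mismatch only works when the indexes differ from one another at several positions, which is why index sets are designed to be well separated.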

5. Amplification

DNA amplification was discussed in detail in the previous chapter, so we won’t cover the different methods again here. In library building, PCR is the most common technique used during this stage, both to enrich the sample and to ensure that only fragments with an adaptor at both ends of the strand are selected for sequencing.
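The effect of PCR enrichment, and the size bias mentioned under fragmentation, can be illustrated with the idealised exponential model N = N0 × (1 + E)^n, where E is the per-cycle efficiency and n the number of cycles. The efficiencies and cycle count below are invented purely for illustration and are not measured values.

```python
def amplify(initial_copies: float, efficiency: float, cycles: int) -> float:
    """Idealised PCR yield: each cycle multiplies the copies by (1 + efficiency)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Hypothetical scenario: equal starting amounts of short and long fragments,
# with the short fragments amplifying slightly more efficiently.
cycles = 12
short_copies = amplify(1000, efficiency=0.95, cycles=cycles)
long_copies = amplify(1000, efficiency=0.85, cycles=cycles)

print(f"short:long ratio after {cycles} cycles = {short_copies / long_copies:.2f}")
```

Even a small per-cycle difference compounds with every cycle, which is one reason to keep the fragment size distribution tight and the number of cycles low.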

6. Purification and Quantification

Gel electrophoresis or bead clean-up is usually used at this stage in Illumina sequencing to purify the final product and thereby conclude the library preparation. 

Before sequencing, however, it is important to ensure that the library is rich enough (i.e. contains enough DNA fragments with adaptors attached) to proceed, which means the sample requires quantification. Quantification is also useful in experiments involving the sequencing of more than one library at a time, something which is possible with more recent techniques like Illumina sequencing.

There are several different methods of library quantification, but most techniques rely on measuring the amount of DNA present through various means such as fluorimetry and spectrophotometry:

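Fluorimetry involves binding a fluorescent dye to the DNA fragments: the more DNA there is in the sample, the more dye binds and the more brightly the sample fluoresces. Similarly, in spectrophotometry, UV light is shone through the sample and the amount absorbed (typically measured at 260 nm) indicates the concentration of nucleic acid present. Neither method measures fragment size directly, so a separate size estimate, for example from electrophoresis, is needed to convert the result into a number of molecules.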
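Readings from either method are usually expressed as a mass concentration and then converted to molarity. The sketch below uses two widely quoted approximations: an A260 of 1.0 corresponds to roughly 50 ng/µL of double-stranded DNA (1 cm path length), and a base pair has an average mass of about 660 g/mol. The function names and example values are illustrative assumptions.

```python
A260_DSDNA_FACTOR = 50.0  # approx. ng/µL of dsDNA per absorbance unit (1 cm path)
AVG_BP_MASS = 660.0       # approx. g/mol per base pair of dsDNA


def a260_to_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA mass concentration from absorbance at 260 nm."""
    return a260 * A260_DSDNA_FACTOR * dilution_factor


def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a mass concentration to molarity, given the mean fragment length."""
    return conc_ng_per_ul / (AVG_BP_MASS * mean_fragment_bp) * 1e6


# Hypothetical library: A260 of 0.2, undiluted, mean fragment length 400 bp.
mass_conc = a260_to_ng_per_ul(0.2)
molarity = ng_per_ul_to_nM(mass_conc, mean_fragment_bp=400)
print(f"{mass_conc:.1f} ng/µL is about {molarity:.1f} nM")
```

The conversion makes the dependence on fragment length explicit: the same mass of DNA contains fewer molecules when the fragments are longer.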

Alternatively, qPCR with adaptor-specific primers may be used to quantify the number of ‘sequenceable’ molecules in a sample. This can be useful for balancing multiple samples if you’re using a multiplexing technique.
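As a rough illustration of balancing samples for multiplexing, the following sketch calculates how much of each library to add to a pool so that every library contributes the same number of molecules. It relies on the identity 1 nM = 1 fmol/µL. The library names, concentrations and target amount are hypothetical.

```python
def pooling_volumes(library_nM: dict[str, float], fmol_per_library: float) -> dict[str, float]:
    """Volume (µL) of each library needed to contribute `fmol_per_library`.

    Since 1 nM equals 1 fmol/µL, volume = amount (fmol) / concentration (nM).
    """
    volumes = {}
    for name, conc in library_nM.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = fmol_per_library / conc
    return volumes


# Hypothetical qPCR-derived concentrations for three indexed libraries.
libraries = {"lib_1": 10.0, "lib_2": 20.0, "lib_3": 4.0}
for name, volume in pooling_volumes(libraries, fmol_per_library=20.0).items():
    print(f"{name}: {volume:.2f} µL")
```

More concentrated libraries contribute smaller volumes, so each sample ends up with a comparable share of the reads.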
