Concept on Plant DNA Barcodes and their Application in Identification of Plants

Author: Vasa Dileep Reddy, Suhel Mehandi*, Harmeet S. Janeja, Kanak Saxena and Satya Prakash

Journal Name: Biological Forum – An International Journal

Abstract

DNA barcoding is a reliable method that uses specific regions of DNA to identify plant species. Plant DNA barcodes such as rbcL, matK, trnH-psbA, and ITS2 have been developed and used over the last decade to answer fundamental questions in evolutionary biology and ecology, but none of these loci works across all species. In closely related species, these single-locus DNA barcodes do not show enough variation, so many investigators have proposed a multi-locus strategy that discriminates more species than single-locus strategies. Because of these constraints, the complete chloroplast genome is now used as a super-barcode to differentiate closely related plants. Here, we review single-locus and multi-locus DNA barcodes, as well as methods for preparing DNA barcodes and the future outlook of DNA barcoding in plants.

Keywords

DNA barcoding, single-locus barcode, multi-locus barcode, super-barcode, DNA barcode library, plants

Conclusion

DNA barcoding is a reliable method that uses specific regions of DNA to identify plant species. Plant DNA barcodes such as rbcL, matK, trnH-psbA, and ITS2 have been developed and used over the last decade to answer fundamental questions in evolutionary biology and ecology, but none of these loci works across all species.

References

INTRODUCTION

A critical task for any ecologist, evolutionary biologist or plant breeder is to identify plant samples accurately within a mixed collection of specimens. DNA barcodes, i.e., short standardized DNA sequences of 400 to 800 base pairs (bp) that can be easily obtained and characterized for all plant species, were created to make this work easier (Hebert et al., 2003). DNA barcoding uses specific regions of DNA and internationally agreed protocols to identify species and to build a global database of biological organisms, and it can speed up the discovery of thousands of plant species (Cowan et al., 2006). The main purpose of DNA barcoding is to create online libraries of all known species that serve as a standard against which barcodes from unidentified specimens can be matched; it can also help to solve some of the problems that arise in standard taxonomic identification based on morphological features.

In May 2004, the Consortium for the Barcode of Life (CBOL) was formed to promote DNA barcoding for all eukaryotic species. More than 120 organizations from 45 countries are involved in CBOL (Ratnasingham et al., 2007). DNA barcodes were first planned and applied for the recognition of animal species in the early years of this century (Hebert et al., 2004b). A uniform DNA barcode for plants, on the other hand, was not immediately effective and was not welcomed by the botanical community until several years later (Kress, 2011). DNA barcoding in plants gained acceptance after an extensive inventory of plastid, nuclear and mitochondrial genomic regions (Kress et al., 2005; Chase et al., 2005; Lahaye et al., 2008; Kress and Erickson, 2007; Newmaster et al., 2008).
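The library-matching idea described above can be illustrated with a minimal sketch. It assumes that the query and reference barcodes are already aligned to the same length and scores them by simple percent identity; the species names, the sequences and the function names are hypothetical, and real workflows rely on dedicated alignment and search tools rather than this toy comparison.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def best_match(query: str, library: dict) -> tuple:
    """Return (species, identity) for the closest reference barcode."""
    scored = [(name, percent_identity(query, seq)) for name, seq in library.items()]
    return max(scored, key=lambda item: item[1])


# Hypothetical toy library; real barcodes are roughly 400 to 800 bp long.
REFERENCE_LIBRARY = {
    "Species A": "ATGTCACCACAAACAGAGAC",
    "Species B": "ATGTCACCACAAACGGAGAC",
    "Species C": "ATGTCGCCTCAAACAGAAAC",
}

query = "ATGTCACCACAAACGGAGAC"  # barcode from an unidentified specimen
species, identity = best_match(query, REFERENCE_LIBRARY)
print(f"{species}: {identity:.1f}% identity")
```

Note that Species A and Species B differ at a single position in this toy library, which mirrors the difficulty single-locus barcodes have with closely related species.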
trnH-psbA, rbcL, matK and ITS are the four main gene regions used in DNA barcoding, and they are the conventional barcodes of choice in most plant applications (China Plant BOL Group, 2011; CBOL Plant Working Group, 2009; Li et al., 2015). The cytochrome c oxidase 1 (CO1) sequence has been touted as a universal barcode in animals, but because of its significantly slower mutation rate in plants it does not discriminate among most plant species (Kress et al., 2005). CBOL proposed the two-locus combination matK + rbcL as the optimal plant barcode, with a discriminating efficiency of 72 percent (CBOL Plant Working Group, 2009). DNA barcoding has become a widespread global means of identification, able to identify a plant species at any stage of its life cycle (fruits, seeds, seedlings, and mature individuals both sterile and fertile), from damaged specimens, and from the gastrointestinal and fecal contents of animals. It also helps evolutionary biologists to identify regulated species, rare species, medicinal plants, and endangered species, to compare species definitions across plant lineages using measures of genetic variability based on barcode sequence data, and to flag species that are new to science, such as cryptic species. Here, we review single-locus and multi-locus DNA barcodes, as well as methods for preparing DNA barcodes and the future outlook of DNA barcoding in plants.

SINGLE-LOCUS DNA BARCODES

Traditional barcodes have been explored extensively, but they still have significant limitations. Some commonly used single-locus barcodes are described below.

1. matK. matK has a high evolutionary rate, clear interspecific variation, a suitable length, and a low transition/transversion rate (Min and Hickey, 2007; Sharma and Kumar, 2008).
Unfortunately, matK is difficult to amplify universally with currently available primer sets, and different taxonomic groups require different primer pairs (Chase et al., 2007; Hollingsworth, 2008). According to the CBOL Plant Working Group (2009), a single primer pair amplifies angiosperm DNA with a success rate of roughly 90%, but even with multiple primer sets the success rate was lower in gymnosperms (83%) and much worse in cryptogams (10%). Lahaye et al. (2008) and Cuenoud et al. (2002) employed specific primers to amplify the matK gene in 1667 angiosperm samples, with a 100% success rate. The discriminatory power of matK varies among taxonomic families: it can differentiate more than 90% of Orchidaceae members (Kress and Erickson, 2007), but only 49% of nutmeg family members (Newmaster et al., 2008). Fazekas et al. (2008) attempted to identify 92 species from 32 genera, but had only a 56 percent success rate. These observations show that matK alone is not a viable universal barcode.

2. rbcL. With nearly 50,000 sequences available in GenBank, rbcL is commonly used in phylogenetic analyses. The key benefit of this gene is that it is simple to amplify, sequence, and align in most plants. However, rbcL sequences evolve slowly, and the locus shows the least divergence of the plastid genes among flowering plants (Kress et al., 2005); its low discriminatory power makes it unsuitable at the species level (Fazekas et al., 2008; Lahaye et al., 2008; CBOL Plant Working Group, 2009; Chen et al., 2010). The length of the gene is also a challenge, as double-stranded sequencing of the complete gene requires four primers.
Although rbcL does not have all of the desirable characteristics, it is believed that it can provide correct identification when combined with other plastid or nuclear loci (Newmaster, Fazekas and Ragupathy, 2006; Chase et al., 2007; Kress and Erickson, 2007; CBOL Plant Working Group, 2009; Hollingsworth et al., 2009). Despite these drawbacks, and although it had previously been rejected as a species identification target (Gielly and Taberlet, 1994; Renner, 1999; Salazar et al., 2003), rbcL was one of the best candidate plant barcodes because its sequence is so easily recovered.

3. trnH-psbA. The plastid barcode trnH-psbA is presently one of the most extensively used barcodes, and universal primers can be designed for it because it is flanked on both sides by highly conserved coding sequences (Shaw et al., 2005). It has the highest rate of insertions/deletions as well as the greatest sequence divergence (Kress and Erickson, 2007), and a single primer pair is likely to amplify almost all angiosperms (Shaw et al., 2017). In groups such as Dendrobium, pteridophytes, and Hydrocotyle, the trnH-psbA region could identify all species (van de Wiel et al., 2009; Yao et al., 2009; Ma et al., 2010), and it is suitable for use as a plant barcode (Kress and Erickson, 2007; Shaw et al., 2007). In some monocots and conifers, there are duplicated loci and a pseudogene, and the trnH-psbA sequence is substantially longer (>1000 bp) (Chase et al., 2007; Hollingsworth et al., 2009), whereas it is relatively short (less than 300 bp) in other groups (Kress et al., 2005) and shorter than 100 bp in bryophytes (Quandt and Stech, 2010).
The problem with using the trnH-psbA barcode is that some plant lineages have multiple inversions, which can lead to overestimation of genetic variability and incorrect phylogenetic placement (Whitlock, Hale and Groff, 2010). Another issue is mononucleotide repeats, which prematurely terminate sequencing reads, so that longer regions can be difficult to recover without internal sequencing primers (Chase et al., 2009; Ebihara, Nitta and Ito, 2010). To achieve acceptable resolution, trnH-psbA can be employed in a two-locus or three-locus barcode system (Kress et al., 2005; Chase et al., 2007).

4. ITS. The nuclear ribosomal internal transcribed spacer (ITS) is a robust phylogenetic marker with significant interspecific variation and higher discriminatory power than plastid regions at lower taxonomic levels; it has been studied extensively and suggested as a plant barcode (Alvarez and Wendel, 2003; Stoeckle, 2003; Kress et al., 2005; Sass et al., 2007). However, because of limitations such as incomplete concerted evolution, fungal contamination, and amplification and sequencing difficulties, CBOL has classified ITS as a supplementary locus (CBOL Plant Working Group, 2009; Hollingsworth et al., 2011). To avoid the difficulties of sequencing the entire ITS, the CBOL Plant Research Group suggested using ITS2 as a backup to reduce amplification and sequencing issues. ITS2 is therefore accepted as a potential universal barcode for identifying a wide range of plant taxa. A major concern is that the presence of multiple copies in the genome may lead to inaccurate and misleading results (Chen et al., 2010; Gao et al., 2010a, b; Luo et al., 2010; Pang et al., 2010, 2011; Alvarez and Wendel, 2003).

OTHER WIDELY USED PLASTID BARCODES

Other frequently used plastid barcoding markers include rpoB, rpoC1, atpF-atpH, psbK-psbI, ycf5 and trnL.
These chloroplast regions are useful for barcoding research and phylogenetic studies at higher taxonomic levels, but because of insufficient variability they are not ideal for plant DNA barcoding at lower taxonomic levels.

CANDIDATE MULTI-LOCUS DNA BARCODES

Because the variation at any single locus is insufficient, many researchers have proposed a multi-locus approach to achieve adequate species discrimination (Hebert et al., 2004; Kress and Erickson, 2007; Erickson et al., 2008; Kane and Cronk, 2008; Lahaye et al., 2008; CBOL Plant Working Group, 2009; Chase and Fay, 2009). Various combinations of plastid loci have been proposed, including rbcL + trnH-psbA (Kress and Erickson, 2007), rpoC1 + matK + trnH-psbA or rpoC1 + rpoB + matK (Chase et al., 2007), and matK + atpF-atpH + psbK-psbI or matK + atpF-atpH + trnH-psbA (Pennisi, 2007). Compared with single-locus barcodes, these combinations discriminate species better. Because the rbcL region is easily recovered and the matK sequence is highly discriminating, the CBOL plant advisory committee has approved matK + rbcL as a universal barcode combination (CBOL Plant Working Group, 2009). Despite having a somewhat higher identification efficiency than other combinations, this option fell short of the original aim of a universal DNA barcode. To begin with, the rbcL + matK combination cannot overcome matK's low PCR efficiency; its success rate is lower than that of CO1 in animals; and combined barcodes cause more analytical difficulties than single-locus markers.

SUPER-BARCODING

Because of the shortcomings of single-locus DNA barcodes, a new approach for recognizing closely related plant species is needed (Heinze, 2007). The complete chloroplast (CP) genome reportedly contains as much variation as the CO1 locus in animals and might be employed as a plant barcode (Kane and Cronk, 2008).
The chloroplast genome is 110 to 160 kb long, far longer than commonly used DNA barcodes, and provides more variation with which to distinguish closely related plants (Mehandi et al., 2013). The CP genome is a versatile tool for phylogenetics that can improve resolution at lower taxonomic levels in plant phylogenetic, population genetic, and phylogeographic studies; because it allows monophyletic lineages to be recovered, it has been proposed as a species-level DNA barcode (Parks et al., 2009). The chloroplast genome is smaller than the nuclear genome and shows greater interspecific and lower intraspecific divergence, making it suitable as a genome-based barcode (Mehandi et al., 2015). Although sequences from one or several nuclear or chloroplast genes have been useful for distinguishing species, the complete chloroplast genome has proven to be an effective tool for identifying closely related species (Parks et al., 2009; Nock et al., 2011). A method termed "JML" (Joly, 2012) was used to examine chloroplast gene sequences and to identify hybrid and geographically isolated lineages of Pachycladon in New Zealand's Southern Alps (Becker et al., 2013).

PROCESS OF DNA BARCODING

The full DNA barcoding procedure is outlined below, from specimen collection in the field through laboratory processing to manual editing and verification after sequencing.

NAMING AND LOCATING OF SPECIMENS

Prepare a list of target species and regions to visit, and use regional floras, internet databases, and local recorders to help locate the correct target species. Accurately identifying and naming DNA barcoding samples is also critical, as is using a standard reference guide for plant names or recognized monographs for taxonomic sampling.

FIELD COLLECTING OF PLANT SAMPLES

Requirements: specimen collection envelopes, self-indicating silica gel, herbarium voucher collection bags, field notebook or laptop, field press, drying paper, camera, GPS, air-tight sealable box, jewelry tags.
PREPARATION OF HERBARIUM SPECIMENS

Requirements: drying paper, corrugates, flimsies, field press, drying oven, herbarium mounting paper, gummed linen strips, herbarium labels, PVA glue, freezer, herbarium cupboards.

COLLECTING SAMPLES FOR DNA EXTRACTION FROM HERBARIUM SPECIMENS

Requirements: laptop, specimen labels, plastic zip-lock bags, forceps, 70% ethanol, A3 scanner.

Collecting DNA samples directly from herbarium specimens is a quick and easy way to obtain a large number of verified samples. The age of the specimens, and how they were preserved and stored, influences the chance of obtaining usable DNA. A loss of about 10% of DNA per decade has been reported, so it is preferable to use specimens less than 30 years old (de Vere et al., 2012). Create a catalog of herbarium specimens to sample, and prepare labels carrying duplicate collection codes; each label can be cut in half, with one half attached to the herbarium specimen to show that it has been sampled and the other half placed in the bag with the leaf sample. Then select a herbarium specimen to sample. Using forceps, remove a small piece of material of about 2-4 square centimeters and store it in an airtight zip-lock bag. Label the bag with the collection code and species name. After the herbarium specimens have been sampled, scan them with an A3 scanner to capture the collection information.

LABORATORY INFORMATION MANAGEMENT SYSTEMS (LIMS)

Keeping track of samples as they move through the laboratory workflow is a difficult undertaking, especially for plants, because each sample is amplified several times to obtain successful amplification of the two DNA barcode markers. Spreadsheets can be used to keep a record of samples, but large-scale DNA barcoding campaigns use a LIMS together with the Biocode plugin, a free utility that can be added to the Geneious Pro bioinformatics program (Parker et al., 2012).
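The bookkeeping problem described above, where each sample may pass through several PCR attempts for each of the two markers, can be sketched with a plain data structure. This is only an illustration of the record keeping involved; it is not the Biocode plugin or any LIMS interface, and the class, field names and collection code are hypothetical.

```python
from dataclasses import dataclass, field

MARKERS = ("rbcL", "matK")  # the two barcode markers named in the text


@dataclass
class Sample:
    collection_code: str  # duplicate code shared by the voucher and the leaf sample
    species: str
    # marker -> list of PCR attempt outcomes (True means the marker amplified)
    pcr_attempts: dict = field(default_factory=lambda: {m: [] for m in MARKERS})

    def record_pcr(self, marker: str, success: bool) -> None:
        """Store the outcome of one PCR attempt for a marker."""
        if marker not in self.pcr_attempts:
            raise KeyError(f"unknown marker: {marker}")
        self.pcr_attempts[marker].append(success)

    def markers_pending(self) -> list:
        """Markers that have not yet amplified successfully."""
        return [m for m, tries in self.pcr_attempts.items() if not any(tries)]


# Hypothetical sample: rbcL amplifies at once, matK fails twice.
sample = Sample("HERB-0001", "Example species")
sample.record_pcr("rbcL", True)
sample.record_pcr("matK", False)
sample.record_pcr("matK", False)
print(sample.markers_pending())
```

A spreadsheet can hold the same information for a small project; a structure of this kind mainly shows why the number of records grows quickly once every sample needs repeated attempts per marker.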
DNA EXTRACTION OF HERBARIUM SAMPLES IN 96-WELL FORMAT

Requirements: commercial extraction kit (Qiagen DNeasy 96 Plant Kit), 100% ethanol, tissue grinding mill, 3-mm tungsten carbide beads, centrifuge for 96-well plates capable of 6000 x g, multi-channel and single-channel pipettes, measuring cylinders and buffer reservoirs, burner for flaming, water bath, forceps, proteinase K, DTT, fridge and freezer.

There are several ways to extract DNA from plant material. A commercial kit (Qiagen DNeasy 96 Plant Kit) has been adopted for use with herbarium specimens. In the 96-well configuration, two 96-well plates are used per extraction.

PCR AMPLIFICATION

Requirements: Taq polymerase, forward and reverse primers, PCR additive (bovine serum albumin), DNA, molecular biology grade water, PCR tubes or 96-well PCR plates, heat-sealing PCR film, thermocycler for 96-well plates.

The amplification of the DNA barcode markers rbcL and matK is described here. The approach works with herbarium material as well as freshly collected material preserved in silica gel before extraction. Table 1 lists the most commonly used rbcL and matK primers. The rbcL primers are generally universal, working across a wide taxonomic range; for the first PCR, we use rbcLaF and rbcLr590, and if this fails, we try a different reverse primer. matK is more difficult to work with, particularly with herbarium material, and requires more primer combinations. It can also be difficult for non-seed plants, which require further primer development (Fazekas et al., 2012). The components required for PCR are listed in Table 2.

GEL ELECTROPHORESIS

Requirements: agarose gel, 1X TAE buffer, SYBR dye, size standard, loading buffer, electrophoresis tank, combs and gel support, masking tape, microwave, conical flask, power pack, UV gel imaging system, amplified DNA for running on the gel.
Gel supports and combs come in a variety of sizes and shapes, so a full 96-well plate of samples can be run at once.

DNA SEQUENCING

For DNA barcoding, samples must be Sanger sequenced in both directions, so each PCR plate yields two sequencing plates. The same primers used for PCR can be used for sequencing. Because of its accuracy and long read length, Sanger sequencing is an excellent method for building reference DNA barcode libraries.

MANUAL EDITING, ALIGNMENT AND DATA CHECKS

Numerous software programs are available for manual editing and data checks, such as CodonCode Aligner, Sequencher, and Geneious.

FUTURE PROSPECTS FOR PLANT DNA BARCODING

DNA barcodes were first proposed to the botanical community over a decade ago and have since been used in a range of investigations in both applied and fundamental plant research. One of the primary reasons that DNA barcoding has not been more widely adopted for species identification is that no single marker can completely distinguish between species in most taxonomic groups. Plant DNA barcoding will improve in two essential ways to benefit the botanical community in the future:

1. Building a worldwide plant DNA barcode library for universal use.
2. Developing and implementing novel marker technologies and the latest sequencing techniques.

BUILDING THE GLOBAL PLANT DNA BARCODE LIBRARY

One of the biggest challenges for the coming years is populating the global plant DNA barcode library. Forest monitoring plots provide a wealth of material for building a universal plant DNA barcode library. Additional routes to establishing the universal library include lineage-based and floristic efforts. Recently, large initiatives have begun to develop DNA barcodes for whole regional floras; one of the most impressive libraries built so far covers Canada's vascular plants (Braukmann et al., 2017).
Braukmann and colleagues generated barcode sequence data for 96 percent of the 5108 species known from Canada using three markers (rbcL, matK, and ITS2). The most difficult aspect of this approach is finding the funding to cover sequencing and laboratory costs. Once this funding is available, however, both fundamental and applied research will benefit considerably.

ADOPTING NEW DNA MARKERS AND NEW SEQUENCING TECHNOLOGIES

Speculation and predictions about the future of DNA barcoding began almost as soon as these markers were first applied to questions in taxonomy, evolution, and ecology. Environmental DNA (eDNA) metabarcoding (Taberlet et al., 2012) is one of the available modifications of DNA barcoding; it uses genetic markers to identify species in environmental samples such as soil, seawater, or coral reefs (Leray and Knowlton, 2015). It requires "mini-barcodes," short and unique genetic markers drawn from a sub-region of the standardized markers, to overcome the problem of degraded DNA in these samples (Hajibabaei and McKenna, 2012). Metabarcoding is evolving rapidly thanks to methodological advances in the recovery, amplification, and sequencing of short DNA fragments. In addition, new bioinformatics methods are being developed for converting the list of DNA sequences found in a sample into a list of recognizable species. Other methods, such as microfluidic PCR-based target amplification, may provide a cheaper and faster option for producing multi-locus plant DNA barcodes on a large scale (Gostel M, pers. comm.) and exemplify the present state of genomics innovation. Many of these approaches and technologies are still in their infancy, but they may yet advance our capacity to use genetic markers to achieve the goals of DNA barcoding.
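The metabarcoding step mentioned above, converting the sequences found in a sample into a list of species, can be sketched as follows. The sketch assumes that each species is represented by one short diagnostic mini-barcode and that a read is assigned only when it contains exactly one of them; the species names, sequences and function name are hypothetical, and real pipelines use error-tolerant matching rather than exact substring search.

```python
from collections import Counter

# Hypothetical mini-barcodes (short diagnostic sub-regions); not real sequences.
MINI_BARCODES = {
    "Species A": "ACAGAGAC",
    "Species B": "ACGGAGAC",
}


def assign_reads(reads, mini_barcodes):
    """Count reads per species; reads hitting zero or several barcodes stay unassigned."""
    counts = Counter()
    for read in reads:
        hits = [name for name, tag in mini_barcodes.items() if tag in read.upper()]
        counts[hits[0] if len(hits) == 1 else "unassigned"] += 1
    return counts


# Toy environmental sample: short, possibly degraded reads.
reads = [
    "TTCACAAACAGAGACTAA",  # contains the Species A mini-barcode
    "GGCACAAACGGAGACTTT",  # contains the Species B mini-barcode
    "ccacaaacagagacgg",    # Species A again, in lower case
    "TTTTTTTTTTTT",        # matches nothing
]
counts = assign_reads(reads, MINI_BARCODES)
print(dict(counts))
```

Keeping ambiguous and unmatched reads in a separate "unassigned" bin, instead of forcing a species call, is a deliberate choice here: it makes gaps in the reference library visible in the result.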

How to cite this article

Vasa Dileep Reddy, Suhel Mehandi, Harmeet S. Janeja, Kanak Saxena and Satya Prakash (2022). Concept on Plant DNA Barcodes and their Application in Identification of Plants. Biological Forum – An International Journal, 14(2a): 360-368.