* Metagenomics applied to surveillance of pathogens and antimicrobial resistance
** Week 1
*** Introduction to Metagenomics
To fight antimicrobial resistance, we need to know where the pathogens are and what their transmission routes are. Metagenomics allows us to identify the individual organisms in a complex sample from their DNA sequences. It is a sequencing-based analysis of the genomes contained in an environmental sample.

We first need to isolate the genomes of the microbes. The same applies to all microbial genomics techniques, which we compare to metagenomics here:
**** Microbial genomics
***** 16S rRNA profiling
We amplify the 16S ribosomal RNA gene to get an idea of the taxonomic composition of the sample. 16S analysis offers deep insight, such as the presence of low-abundance organisms. Metagenomics is a more generic approach: by sequencing whole genomes at random rather than a single marker gene, we can also learn about function.
***** Whole Genome Sequencing (WGS)
We isolate and cultivate specific microbes from a complex sample. We obtain taxonomic identity information and specific information such as virulence, antimicrobial resistance, etc.
***** Single Cell Sequencing
It follows the same principles as WGS and is used when the microbe cannot be cultivated.
**** Metagenomics project
A metagenomics project involves many steps, which we group into two categories:
***** Laboratory steps
1. DNA/RNA isolation: RNA is reverse-transcribed to cDNA
2. Fragmentation: most current next-generation sequencing methods cannot sequence long fragments (exception: Nanopore sequencing)
3. DNA sequencing: we obtain files with the sequence and quality information
***** Data analysis
- Reads: analyzed directly
- Binning and assembly
**** Current challenges
Metagenomics still faces some challenges, mostly due to the sequencing process:
- For complex samples, the sequencing data may capture only the tip of the iceberg
- Background DNA sequences can make up a considerable percentage of the data (e.g. from the host environment)
- Lack of reference genomes for novel organisms, although some methods exist to identify them
- Low-quality reference genomes (contaminant sequences -> false positives)
- Quantification of pathogens is difficult because genome sizes differ
- Reads from different organisms can be assembled together (chimera formation)
- Unknown function of novel proteins
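
The genome-size point in the list above can be illustrated with a minimal Python sketch. The organism names, read counts, and genome lengths are made up for illustration, and dividing read counts by genome length is one common simplification, not a method taken from these notes. Two organisms with the same number of reads end up with very different estimated abundances once genome size is taken into account:

#+BEGIN_SRC python
def relative_abundance(read_counts, genome_lengths):
    """Convert raw read counts into length-normalized relative abundances.

    A larger genome yields more reads per cell, so raw counts overestimate
    organisms with large genomes. Dividing by genome length gives a rough
    per-base coverage, which is then scaled so that all values sum to 1.
    """
    coverage = {
        name: read_counts[name] / genome_lengths[name]
        for name in read_counts
    }
    total = sum(coverage.values())
    if total == 0:
        raise ValueError("no reads assigned to any organism")
    return {name: cov / total for name, cov in coverage.items()}


# Hypothetical example: same read count, fivefold difference in genome size.
read_counts = {"organism_a": 1000, "organism_b": 1000}
genome_lengths = {"organism_a": 5_000_000, "organism_b": 1_000_000}

abundance = relative_abundance(read_counts, genome_lengths)
for name, value in abundance.items():
    print(f"{name}: {value:.3f}")
#+END_SRC

By raw read counts the two organisms look equally abundant, while after normalization organism_b (the smaller genome) accounts for about five sixths of the estimated community. In practice this also depends on having a reliable genome length for each organism, which ties back to the reference genome issues listed above.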