Metagenomics applied to surveillance of pathogens and antimicrobial resistance
Week 1
Introduction to Metagenomics
To fight antimicrobial resistance, we need to know where the pathogens are and what their transmission routes are.
Metagenomics allows us to identify the individual organisms in a complex sample using their DNA sequences. It is a sequencing-based analysis of the genomes contained in an environmental sample.
As with all microbial genomics techniques, we first need to isolate the genomes of the microbes. We will compare metagenomics to the following techniques:
Microbial genomics
16S rRNA profiling
We amplify the 16S ribosomal RNA gene, which gives us an idea of the taxonomic composition of the sample. The 16S analysis offers deep insight, such as the presence of low-abundance organisms.
Metagenomics is a more generic approach: by sequencing random fragments of whole genomes, we can also learn about function.
Whole Genome Sequencing (WGS)
We isolate and cultivate specific microbes from a complex sample. We obtain taxonomic identity information and specific information such as virulence, antimicrobial resistance, etc.
Single Cell Sequencing
It follows the same principles as the previous method and is used when the microbe cannot be cultivated.
Metagenomics project
A metagenomics project involves many steps, which we will group into 2 categories:
Laboratory steps
- DNA/RNA isolation: RNA is reverse-transcribed to cDNA
- Fragmentation: current next generation sequencing methods cannot sequence long fragments (exception: Nanopore sequencing)
- DNA sequencing: we obtain files with the sequence and quality information
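The files with sequence and quality information are commonly in FASTQ format. As a minimal sketch, assuming four lines per record and Phred+33 quality encoding (both are assumptions, since the notes do not name a format), a record can be read like this. The function name `read_fastq` and the example read are hypothetical.

```python
from io import StringIO

def read_fastq(handle):
    """Yield (read_id, sequence, quality_scores) from a FASTQ stream.

    Assumes exactly four lines per record and Phred+33 encoding.
    """
    while True:
        header = handle.readline().rstrip()
        if not header:
            break  # end of file (or a blank line) stops the reader
        sequence = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip()
        # Phred+33: quality score = ASCII code of the character minus 33
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores  # drop the leading '@'

# Hypothetical single-read example
example = StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
```

For real projects, an established parser (for example the one in Biopython) is usually a safer choice than a hand-written reader.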
Data analysis
- Reads: we analyze the reads directly
- Binning and assembly
Current challenges
Metagenomics still faces some challenges, mostly due to the sequencing process:
- Sequencing data may capture only the tip of the iceberg (for complex samples)
- Background DNA sequences can make up a considerable percentage of the data (e.g. host environment)
- Lack of reference genomes due to novel organisms, although some methods exist to allow us to identify them
- Low-quality reference genomes (contaminant sequences lead to false positives)
- Quantification of the pathogens is difficult due to the different genome sizes
- Assembly of reads from different organisms (chimera formation)
- Unknown function of novel proteins
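To illustrate the genome-size issue in the list above: with equal numbers of cells, an organism with a larger genome produces more reads. A common workaround is to divide read counts by genome length before comparing organisms. The sketch below assumes the reads have already been assigned to organisms and that the genome lengths are known; the function name, organism names, and numbers are hypothetical.

```python
def relative_abundance(read_counts, genome_lengths):
    """Return length-normalized relative abundances that sum to 1.

    read_counts: mapping of organism name to number of assigned reads.
    genome_lengths: mapping of organism name to genome length in base pairs.
    """
    # Reads per base corrects for larger genomes yielding more reads
    per_base = {name: read_counts[name] / genome_lengths[name]
                for name in read_counts}
    total = sum(per_base.values())
    return {name: value / total for name, value in per_base.items()}

# Hypothetical example: same read count, but organism_b has half the genome size
counts = {"organism_a": 1000, "organism_b": 1000}
lengths = {"organism_a": 5_000_000, "organism_b": 2_500_000}
abundance = relative_abundance(counts, lengths)
```

Here the raw read counts are identical, yet after normalization organism_b accounts for about two thirds of the community, because the same number of reads from a smaller genome implies more genome copies.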
Considerations for metagenomic/microbiome projects
A surveillance/microbiome study contains multiple steps. A project starts with a goal/hypothesis, so it is important to think about the whole structure.
Timeline
Project design
- Project approval: we need to contact the relevant authorities
- Subject recruitment: find out if we need controls
Sampling
- Sample collection: aseptic collection of samples and inclusion of replicates. We can use empty swabs and tubes as controls, to check for background DNA.
- Sample handling and storage: freeze the sample immediately, and consider making aliquots if the sample is going to be used for multiple examinations.
Laboratory steps
The different steps should be conducted in different areas, ideally in different rooms, to avoid cross-contamination.
- DNA/RNA isolation: clean environment to avoid contamination of the sample. We include blank control tubes, that is, tubes without a sample but processed in the same way. We can also include samples with spike-in organisms or mock communities.
- Library preparation: we prefer PCR-free approaches. We can consider splitting the sample and using 2 different barcodes; ideally, we should see the same composition in both halves.
- Sequencing: the technology we choose depends on the research question (size of genome, number of samples, depth). We might use multiple rounds of sequencing, even considering a pilot study to get an idea of the complexity of our sample.
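The relationship between genome size, depth, and the amount of sequencing can be approximated as coverage = (number of reads × read length) / genome size. The sketch below rearranges this to estimate how many reads are needed. It assumes that reads are distributed in proportion to each organism's share of the DNA in the sample, which is a simplification; the function name and the numbers are hypothetical.

```python
import math

def reads_needed(genome_size, target_coverage, read_length, fraction_of_sample=1.0):
    """Estimate the reads required to reach a target coverage for one organism.

    fraction_of_sample is the assumed share of the sample's DNA that
    belongs to the organism of interest (1.0 means a pure isolate).
    """
    bases_needed = genome_size * target_coverage
    # Only a fraction of all reads will come from the organism of interest
    return math.ceil(bases_needed / (read_length * fraction_of_sample))

# Hypothetical example: 5 Mbp genome, 10x coverage, 150 bp reads
pure_isolate = reads_needed(5_000_000, 10, 150)
half_of_sample = reads_needed(5_000_000, 10, 150, fraction_of_sample=0.5)
```

The required number of reads grows quickly as the organism becomes rarer, which is one reason a pilot study can help when the complexity of the sample is unknown.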
Analysis
- DNA sequence analysis: we can map the reads to reference databases (read classification) or use assembly.
- Statistical analysis: the methods vary according to the research question. We can analyze the diversity, compare the samples using multivariate analysis, or relate them to metadata using regression.
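As an illustration of the diversity and sample-comparison analyses mentioned above, the sketch below computes the Shannon index for one sample and the Bray-Curtis dissimilarity between two samples. These two measures are chosen here only as common examples; the notes do not prescribe them. The counts are hypothetical, and both samples are assumed to list the same taxa in the same order.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors (0 = identical, 1 = disjoint)."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))

# Hypothetical counts for three taxa in two samples
sample_1 = [50, 30, 20]
sample_2 = [10, 10, 80]
diversity_1 = shannon_index(sample_1)
dissimilarity = bray_curtis(sample_1, sample_2)
```

The same dissimilarity could also be used to check the split-barcode replicates described under library preparation, where a value close to 0 would be expected.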
Additional considerations
- Process all the samples in the same way: the processing can introduce changes that make comparing them difficult. Usually the microbial population is not modified, but the abundance of each species can be altered.
- Randomization of the samples across sequencing runs: to mix treatment and control groups.
- Contamination: it can happen at any level:
  - Host: host DNA can sometimes make up the majority of the sequenced genomes
  - Reference genomes
  - Environmental background: during sample handling/DNA sequencing
- Accessibility: upload the sequences and the studies to open-access journals, with all the supplementary information (code, notebooks), to facilitate reproducibility.
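The randomization of samples across sequencing runs mentioned in the list above can be sketched as follows. This is one possible scheme, not a prescribed protocol: samples are shuffled within each group and then dealt to the runs in turn, so that every run receives a mix of groups. The function name, sample identifiers, and group labels are hypothetical.

```python
import random

def assign_to_runs(samples, n_runs, seed=0):
    """Distribute (sample_id, group) pairs across sequencing runs.

    Samples are shuffled within each group and dealt round-robin,
    so that no run contains only one group.
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    by_group = {}
    for sample_id, group in samples:
        by_group.setdefault(group, []).append(sample_id)

    runs = [[] for _ in range(n_runs)]
    position = 0
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)
        for sample_id in ids:
            runs[position % n_runs].append(sample_id)
            position += 1
    return runs

# Hypothetical design: 4 treatment and 4 control samples, 2 runs
samples = [(f"T{i}", "treatment") for i in range(4)] + \
          [(f"C{i}", "control") for i in range(4)]
runs = assign_to_runs(samples, n_runs=2)
```

Recording the seed together with the run assignment makes the design reproducible, which fits the accessibility point above.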