To achieve our goal, we are addressing three fundamental challenges related to MGEs:
1) Our first aim is to evaluate and improve sequencing strategies for characterizing MGEs. Currently, these elements are usually characterized by short-read next-generation sequencing (NGS). However, MGEs are difficult to reconstruct correctly from short reads because of their highly repetitive, modular and chimeric nature. For this reason, short-read sequencing often has to be combined with long-read sequencing, so that long repetitive regions can be spanned while retaining the accuracy of short reads. Our approach will include DNA extraction and short- and long-read library preparation, followed by short- and long-read sequencing.
2) The second aim is to provide good practices for analyzing NGS data. The two major approaches to NGS data analysis rely either on mapping the reads back to a reference or on de novo assembly. An optimized workflow for the bioinformatic analysis of MGEs has not yet been defined. Our proposed approach will include quality control and validation, quality trimming, and assembly/mapping of the short and long reads, followed by gene annotation.
3) The third aim is to develop sequence-based typing schemes for plasmids. To improve MGE surveillance and data sharing, we need a common nomenclature that can be shared and understood by everyone working with MGEs, and that can be correlated with antimicrobial resistance and pathogenicity. Additionally, high-resolution plasmid typing and nomenclature will allow us to better infer plasmid transmission events and to attribute resistance and virulence to their sources.
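
As an illustration of the quality-trimming step named in the second aim, the sketch below removes low-quality bases from the 3' end of a single read. It is a minimal example and not the project's workflow: the function names, the Phred+33 quality encoding and the threshold of Q20 are all assumptions chosen for illustration, and dedicated trimming tools would be used in practice.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def trim_trailing(sequence: str, quality: str, min_q: int = 20) -> tuple[str, str]:
    """Drop bases from the 3' end while their Phred score is below min_q.

    min_q=20 is an illustrative threshold, not a recommended setting.
    """
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    scores = phred_scores(quality)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return sequence[:end], quality[:end]


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(trim_trailing("ACGTAC", "IIII##"))
```

Trimming only the trailing end keeps the example short; sliding-window or adapter-aware trimming would follow the same pattern with a different stopping rule.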