This pipeline performs mutation analysis of SARS-CoV-2. It reports and quantifies the occurrence of variants of concern (VOCs) and of the signature mutations that characterize them.

The visualizations below provide an overview of the evolution of VOCs found in the analyzed samples across the given time points and locations. The abundance values for the variants are derived by deconvolution (for details, please see the variant report). The mutation frequencies are the output of LoFreq.
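To illustrate the general idea of deconvolution, here is a minimal sketch under stated assumptions. It is not the pipeline's actual method, which is described in the variant report. The sketch assumes that each observed mutation frequency is the sum of the abundances of the variants carrying that mutation, and it recovers the abundances by ordinary least squares. The signature matrix, the variant and mutation names, and all numbers are hypothetical.

```r
# Hypothetical signature matrix: rows are mutations, columns are variants.
# An entry of 1 means the mutation is a signature of that variant.
signatures <- rbind(mut1 = c(1, 0),
                    mut2 = c(1, 0),
                    mut3 = c(0, 1),
                    mut4 = c(1, 1))   # mut4 is shared by both variants
colnames(signatures) <- c("variantA", "variantB")

# Hypothetical true abundances, used only to simulate observed frequencies.
true_abundance <- c(variantA = 0.6, variantB = 0.3)

# Simulated mutation frequencies (in the report these come from LoFreq).
observed_freq <- as.vector(signatures %*% true_abundance)

# Least-squares solution of signatures %*% x = observed_freq.
# Assumption: plain least squares stands in for the real deconvolution.
estimated <- qr.solve(signatures, observed_freq)
names(estimated) <- colnames(signatures)

print(round(estimated, 3))
```

With noise-free input such as this, the estimate matches the simulated abundances. Real data are noisy, so a production method would typically also need to constrain or regularize the solution.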

1 Variants Of Concern

These plots provide an overview of the relative frequencies of identified variants of concern (VOC) of SARS-CoV-2 at specific wastewater sampling locations over time.

1.1 Frequency of variants of concern over time for each location

These plots show the relative frequency of detected variants of concern in samples at specific wastewater sampling locations, and how the frequencies change over time.

Since not all variants of concern are listed, the relative frequencies at a given location and time will not necessarily add up to one.
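As a small worked example with made-up numbers (the variant names and values are hypothetical and are not results from this report), the share of a sample that is not attributed to any listed VOC is what remains after summing the listed frequencies:

```r
# Hypothetical relative frequencies of the listed VOCs for one location
# and one sampling date. Illustrative values only.
voc_freq <- c(Alpha = 0.55, Delta = 0.30)

# Share of the sample attributed to the listed VOCs.
listed_total <- sum(voc_freq)

# Remainder that is not attributed to any listed VOC.
unlisted <- 1 - listed_total

print(round(c(listed = listed_total, unlisted = unlisted), 2))
```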

1.2 Variants of concern per date and location

This plot visualizes the development of local outbreaks and the proportions of identified VOCs in each area. The gray marker at every sampling location is overlaid with a colored disk for every variant.

Locations of wastewater processing plants have been generated arbitrarily and do not correspond to actual locations.

Use the slider to select a specific date or hit the Play button to display all snapshots successively. Click on a variant in the legend to toggle its visibility in the map; double-click to view only the selected variant.

1.3 Raw data VOC frequencies per sample

Download Variant_frequencies.csv

2 Signature Mutations

Signature mutations characterize variants of concern (VOCs) of SARS-CoV-2. The following plots provide an overview of detected mutations at different locations, and how their relative frequencies change over time.

Note that these plots only display signature mutations that were reported for VOCs.

2.1 Mutations by location

These plots show the relative frequency of detected signature mutations in samples at specific wastewater sampling locations, and how the frequencies change over time.

2.2 Signature mutations by location

This plot visualizes the development of local outbreaks at the level of single mutations. The gray marker at every sampling location is overlaid with a colored disk for each mutation.

Locations of wastewater processing plants have been generated arbitrarily and do not correspond to actual locations.

Use the slider to select a specific date or hit the Play button to display all snapshots successively.

2.3 Raw data signature mutation frequencies per sample

Download Mutation_frequencies.csv

3 Detailed per-sample reports

For every sample, three reports are generated:

  • a QC report with general statistics and amplicon coverage,

  • a variant report including tables summarizing the mutation calling and the deconvolution results for the abundance of VOCs,

  • a taxonomic classification report including a pie chart showing the analysis of the unaligned reads.

The reports for each sample can be accessed here:

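library(data.table)
library(DT)

# Read the sample sheet; it is expected to provide the columns
# name, location_name and date.
sample_sheet <- fread(params$sample_sheet)
sample_names <- sample_sheet$name

# The per-sample reports, identified by file name suffix and link label.
reports <- list(list(suffix = ".variantreport_p_sample.html",
                     name   = "variant report"),
                list(suffix = ".qc_report_per_sample.html",
                     name   = "QC report"),
                list(suffix = ".taxonomic_classification.html",
                     name   = "taxonomic classification"))

# Keep the columns shown in the table (data.table column selection).
df <- as.data.frame(sample_sheet[, .(name, location_name, date)])

# Build one HTML string per sample with links to all of its reports.
df$reports <- vapply(sample_names, function(sample) {
    paste(vapply(reports, function(report) {
        paste0('<a href="', sample, report$suffix, '">', report$name, "</a>")
    }, character(1)), collapse = ", ")
}, character(1))

# escape = FALSE renders the links as HTML instead of plain text.
datatable(df, escape = FALSE)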

PiGx: Pipelines in Genomics

PiGx is a collection of highly reproducible genomics pipelines. The original set provides pipelines for the analysis of RNA sequencing, chromatin immunoprecipitation sequencing, bisulfite-treated DNA sequencing, and single-cell resolution RNA sequencing. All of them process raw experimental data and generate reports containing publication-ready plots and figures, with interactive report elements and standard observables. For more information please see: http://bioinformatics.mdc-berlin.de/pigx