
Run COSMIC signatures comparison using SigProfilerAssignment
Source: R/signature_fitting.R
Run COSMIC signatures comparison using SigProfilerAssignment
Usage
signature_fitting(
  mutation_data,
  project_name = "Default",
  project_genome = "GRCh38",
  env_name = "MutSeqR",
  group = "sample",
  output_path = NULL,
  python_version
)
Arguments
- mutation_data
A data frame containing mutation data.
- project_name
The name of the project. It is used when formatting the data into the .txt format required by the SigProfiler tools.
- project_genome
The reference genome to use, e.g. GRCh37, GRCh38, mm10, mm9, rn6. On first use, the function will install the genome using SigProfilerMatrixGeneratorR::install.
- env_name
The name of the virtual environment. This will be created on first use.
- group
The column in the mutation data used to aggregate groups. Signature assignment will be performed on each group separately.
- output_path
The path to the directory in which the output folder will be created to store results. Default is NULL, which stores results in the current working directory.
- python_version
The version of Python installed on the user's computer, given as a string such as "3.11".
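Because signature assignment is performed once per value of the group column, it can help to check that column before starting a long run. The following is a minimal sketch in base R: the helper name count_rows_per_group is hypothetical and is not part of MutSeqR, and the toy data frame is invented for illustration.

```r
# Hypothetical helper (not part of MutSeqR): confirm that the grouping
# column exists and report how many rows fall into each group.
count_rows_per_group <- function(mutation_data, group = "sample") {
  stopifnot(is.data.frame(mutation_data))
  if (!group %in% colnames(mutation_data)) {
    stop("Column '", group, "' not found in mutation_data.")
  }
  # Named integer vector: one entry per distinct value of the group column.
  lengths(split(seq_len(nrow(mutation_data)), mutation_data[[group]]))
}

# Toy data, invented for illustration only.
toy <- data.frame(dose = c("low", "low", "high"))
count_rows_per_group(toy, group = "dose")
```

Each name in the returned vector corresponds to one group that would be assigned signatures separately.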
Value
Creates a subfolder "SigProfiler" in the output directory containing the results of the SigProfiler tools. For a complete breakdown of the results, see the Readme file for MutSeqR. The most relevant results are stored in SigProfiler > group > matrices > output > Assignment_Solution > Activities > SampleReconstruction > WebPNGs.
These plots summarize the signature assignment results for each group. In each plot, the top left panel shows the base_96 mutation count for the group and the bottom left panel shows the reconstructed profile. Below the reconstruction are the solution statistics, which indicate how well the reconstructed profile fits the observed profile (a cosine similarity above 0.9 is recommended). The panels on the right show the SBS signatures that contribute to the reconstructed profile, each labelled with the signature name and its percent contribution. A high contribution indicates a strong association between the signature and the group's mutation spectrum.
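After a run, the summary plots can be located by searching the folder layout described above. This is a minimal sketch in base R: the helper name find_signature_plots is hypothetical and is not part of MutSeqR, and it assumes the plots are PNG files stored directly inside a folder named WebPNGs somewhere under SigProfiler.

```r
# Hypothetical helper (not part of MutSeqR): list the PNG files stored in
# any WebPNGs folder beneath <output_path>/SigProfiler.
find_signature_plots <- function(output_path = getwd()) {
  sig_dir <- file.path(output_path, "SigProfiler")
  if (!dir.exists(sig_dir)) {
    return(character(0))
  }
  pngs <- list.files(sig_dir, pattern = "\\.png$", recursive = TRUE,
                     full.names = TRUE, ignore.case = TRUE)
  # Keep only files whose parent directory is named WebPNGs.
  pngs[basename(dirname(pngs)) == "WebPNGs"]
}
```

Calling find_signature_plots() with no argument searches the current working directory, which matches the default behaviour when output_path is NULL.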
Details
Assigns COSMIC SBS signatures to mutation data using SigProfilerAssignment. Data are cleaned and formatted for input into the SigProfiler tools. Because the SigProfiler suite requires Python, this function uses reticulate to create a virtual environment in which to run it. On first use it will also install several Python dependencies into a conda virtual environment; please be aware of the implications of this. For advanced use, it is suggested to run the SigProfiler Python tools directly in Python, as described in their respective documentation. Users must have Python installed on their computer to use this function.
Mutation data will be filtered to only include SNVs. Variants flagged by the filter_mut column will be excluded.
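The filtering described above can be pictured with a small base R sketch. It is not the package's implementation: the variation_type column and its "snv" label are assumptions made for illustration, as is treating filter_mut as a logical flag. Only the filter_mut column name comes from the text above.

```r
# Toy data, invented for illustration only.
toy_mutations <- data.frame(
  sample         = c("s1", "s1", "s2", "s2"),
  variation_type = c("snv", "insertion", "snv", "snv"),  # assumed column and labels
  filter_mut     = c(FALSE, FALSE, TRUE, FALSE)          # assumed to be logical
)

# Keep SNVs that are not flagged for exclusion.
keep <- toy_mutations$variation_type == "snv" & !toy_mutations$filter_mut
snv_only <- toy_mutations[keep, , drop = FALSE]
```

In this toy example the insertion and the flagged SNV are dropped, so one SNV remains for each sample.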
Examples
if (FALSE) { # \dontrun{
example_file <- system.file("extdata", "Example_files",
                            "example_mutation_data_filtered.rds",
                            package = "MutSeqR")
example_data <- readRDS(example_file)
signature_fitting(mutation_data = example_data,
                  project_name = "Example",
                  project_genome = "mm10",
                  env_name = "MutSeqR",
                  group = "dose",
                  python_version = "3.11")
} # }