Help improve this workflow!
This workflow has been published but could be further improved with some additional meta data:- Keyword(s) in categories input, output, operation, topic
You can help improve this workflow by suggesting the addition or removal of keywords, suggest changes and report issues, or request to become a maintainer of the Workflow .
The QTL-Snakemake-Workflow consists of 3 main fuctions which helps to utilize the QTLseqr tool for Next Generation sequencing bulk segregant analysis from a VCF file.
.. _main-getting-started:
Getting started
To run this package first we need to install the required libraries for the workflow which can be installed by running
.. code-block:: python
conda env create --file envs/condaenv.yml
Here we can define any name we want instead of the QTL_Test. This command will make a conda environment to run this package.
Once we have created the environment we need to configure the config.yaml file with the required parameters so that it can take input for different rules. After this to run the workflow we can use the command
.. code-block:: python
snakemake -pr
This command will run the workflow and generate the output in the QTL_Plots folder.
.. _manual-Work_Flow_Rules:
Work FLow Rules
The QTL workflow consists of three rules:
-
VCF_Homozygous_Filtering
-
QTL_VCF_to_Table_Parser
-
QTL_Plotting
-
VCF_Homozygous_Filtering
This rule takes any vcf file and filters the parent for selecting only the homozygous SNPS in the vcf file.
.. code-block:: python
input:
expand("{vcf_file_name}.vcf.gz", vcf_file_name= config["vcf_file"]),
output:
expand("Homozygous_Filtered_VCF/Homozygous_{vcf_file_name}.vcf.gz", vcf_file_name=config["vcf_file"])
params:
filter_data= config["vcffilter"]["filters"]
shell:
"( bcftools view"
" {params.filter_data}"
" {input}"
" -O z"
" -o {output}"
")"
The input is a vcf file which needs to be defined in the config file under the vcf_file name and the output of this rule will be saved into the Homozygous_Filtered_VCF folder with the same vcf file name just a Homozygous title will be added in the start of the name so that we can distinguish between different files if the workflow is run for multiple vcf files. The filtering is done on the parents and the shell command takes input from the config file.
.. code-block:: python
-i "GT[0]='0/0' && GT[1]='1/1' || GT[0]='1/1' && GT[1]='0/0'"
In the above script for filtering the parents which comes from the config file it is taking the parents and from our reference vcf file the parents are on GT[0] and GT[1] , but it should be adjusted according to the given vcf file.
- QTL_VCF_to_Table_Parser
This rule is for converting a vcf file into a table format file which is required by the QTL tool if the VCF file is not generated from GATK . This rule runs a R scripts which parses the vcf file.
.. code-block:: python
input:
expand("Homozygous_Filtered_VCF/Homozygous_{vcf_file_name}.vcf.gz", vcf_file_name=config["vcf_file"]),
output:
"QTL_VCF_to_Table/QTL_Table.csv",
script:
"Scripts/QTL_Parser.R"
It takes an input a vcf file from the Homozygous_Filtered_VCF folder with a specified name as Homozygous_{vcf_file_name}.vcf.gz the vcf_file_name comes from the config file where the name of the vf file was given. The output is saved in QTL_VCF_to_Table folder The QTL_VCF_to_Table script takes input from the config file where the names of the High Bulk, Low Bulk are defined along with that it also takes the parameter Number_of_Chromosomes from the config file.
- QTL_Plotting
This rule runs the QTLseqr tool and generates the plots for Gprime Analysis and QTLseq Analysis . This rules runs an R script for generating the plots.
.. code-block:: python
input:
"QTL_VCF_to_Table/QTL_Table.csv"
output:
"QTL_Plots/DP_Filtering Data.pdf",
"QTL_Plots/REF Frequency for Filtering Data.pdf",
"QTL_Plots/SNP Index for Filtering Data.pdf",
"QTL_Plots/GPrime Distribution with Hampel Outlier Filter.pdf",
"QTL_Plots/GPrime Distribution with deltaSNP Outlier Filter.pdf",
"QTL_Plots/SNP Density Plot.pdf",
"QTL_Plots/Delta SNP Index Plot with Intervals.pdf",
"QTL_Plots/GPrime Value Plot.pdf"
script:
"Scripts/QTL_Plotting.R"
It takes input the csv file developed by the QTL_VCF_to_Table_Parser along with the parameters defined within the config file for filtering the SNPs for better results and the output is saved into QTL_Plots folder.
.. toctree:: :caption: Getting started :name: getting_started :hidden: :maxdepth: 1
getting_started/installation tutorial/tutorial tutorial/short
.. toctree:: :caption: Executing workflows :name: execution :hidden: :maxdepth: 1
executing/cli executing/cluster-cloud executing/caching executing/interoperability
.. toctree:: :caption: Defining workflows :name: snakefiles :hidden: :maxdepth: 1
snakefiles/writing_snakefiles
snakefiles/rules
snakefiles/configuration
snakefiles/modularization
snakefiles/remote_files
snakefiles/utils
snakefiles/deployment
snakefiles/reporting
.. toctree:: :caption: API Reference :name: api-reference :hidden: :maxdepth: 1
api_reference/snakemake
api_reference/snakemake_utils
api_reference/internal/modules
.. toctree:: :caption: Project Info :name: project-info :hidden: :maxdepth: 1
project_info/citations
project_info/more_resources
project_info/faq
project_info/contributing
project_info/authors
project_info/history
project_info/license
Code Snippets
22 23 24 25 | shell: """ ( printf "#Samples:con-all,D2,D2_F2_tt,D2_F2_TT\nCHROM\\tPOS\\tREF\\tALT\\tRO\\tAO\\tGT\\tGQ\\tSampleRO\\tSampleAO\n" > {output} bcftools view {input} {params.samples} | bcftools query {params.data} >> {output})""" |
36 37 | script: "../Scripts/plot_allele_freqs.py" |
14 15 16 17 18 19 20 | shell: "( bcftools view" " {params.filter_data}" " {input}" " -O z" " -o {output}" ")" |
26 27 | script: "../Scripts/QTL_Plotting.R" |
12 13 | script: "../Scripts/QTL_Parser.R" |
7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 | import os import matplotlib #%matplotlib inline #get_ipython().run_line_magic('matplotlib', 'inline') from pathlib import Path import pandas as pd #from IPython.display import display import numpy as np import seaborn as sns import re from collections import OrderedDict from matplotlib import pyplot as plt import yaml sns.set() pd.options.display.max_columns = None # default is 20 pd.options.display.max_rows = 60 # default is 60 # # Plotting mutant allele frequencies # # the VCF sample format fields output by freebayes are: # # GT:GQ:DP:AD:RO:QR:AO:QA:GL # # The fields used for allele frequency plotting are: # # - RO: Reference allele observation count # - AO: Alternate allele observation count # # TSV tables from filtered VCFs have to be created like this: # # ```sh # printf "#Samples:con-all,D2,D2_F2_tt,D2_F2_TT\nCHROM\\tPOS\\tREF\\tALT\\tRO\\tAO\\tGT\\tGQ\\tSampleRO\\tSampleAO\n" > filtered.tsv # bcftools view freebayes_D2.filtered.vcf -s con-all,D2,D2_F2_tt,D2_F2_TT | bcftools query -f "%CHROM\t%POS\t%REF\t%ALT\t%RO\t%AO\t[,%GT]\t[,%GQ]\t[,%RO]\t[,%AO]\n" >> freebayes_D2.filtered.tsv # # printf "#Samples:con-all,D3,D3_F2_tt,D3_F2_TT\nCHROM\\tPOS\\tREF\\tALT\\tRO\\tAO\\tGT\\tGQ\\tSampleRO\\tSampleAO\n" > filtered.tsv # bcftools view freebayes_D3.filtered.vcf -s con-all,D3,D3_F2_tt,D3_F2_TT | bcftools query -f "%CHROM\t%POS\t%REF\t%ALT\t%RO\t%AO\t[,%GT]\t[,%GQ]\t[,%RO]\t[,%AO]\n" >> freebayes_D3.filtered.tsv # # ``` # # The tables should look like this: # # CHROM POS REF ALT RO AO GT GQ SampleRO SampleAO # Chr01 344698 C T 39 43 ,0/0,1/1,0/1,0/1 ,22,19,137,155 ,6,0,14,19 ,0,2,23,18 # Chr01 2943267 T A 140 109 ,0/0,1/1,0/1,0/1 ,134,45,140,140 ,30,0,66,44 ,0,16,51,42 # Chr01 3751995 T C 27 20 ,1/1,0/0,0/0,0/0 ,21,16,77,55 ,2,2,15,8 ,6,0,10,4 # # # ### Set input # # All user input is set in this section # # --- # #### File paths # In[5]: # Added by Anza to get the name of the file from the config file config = yaml.load(open('config.yaml')) vcf_file_name=config['AllelPlots']['vcffile'] #vcf_file_name=vcf_file_name.replace('.','_') print(vcf_file_name) tsv_file= "Allel_Frequency_Plots_Computomics/"+vcf_file_name+".tsv" raw_output="Allel_Frequency_Plots_Computomics/"+vcf_file_name+".pdf" raw_output_window="Allel_Frequency_Plots_Computomics/"+vcf_file_name+"_window.pdf" # Define input table paths tsv_file = Path(tsv_file) print(tsv_file) raw_output_pdf = Path(raw_output) window_output_pdf = Path(raw_output_window) # Define input table paths #tsv_file = Path('freebayes_D2.filtered.tsv') #raw_output_pdf = Path('freebayes_allele_freq_D2.pdf') #window_output_pdf = Path('freebayes_allele_freq_window_D2.pdf') #tsv_file = Path('freebayes_D3.filtered.tsv') # raw_output_pdf = Path('freebayes_allele_freq_D3.pdf') # window_output_pdf = Path('freebayes_allele_freq_window_D3.pdf') # #### Sample names # In[ ]: # Order as: control, mutant, dwarf/mutant F2 pool, WT F2 pool samples = ['con-all', 'D2', 'D2_F2_tt', 'D2_F2_TT'] sample_F2_WT = 'D2_F2_TT' sample_F2_mutant = 'D2_F2_tt' # samples = ['con-all', 'D3', 'D3_F2_tt', 'D3_F2_TT'] # sample_F2_WT = 'D3_F2_TT' # sample_F2_mutant = 'D3_F2_tt' # #### Chromosome lengths # # These lengths are from the `Sbicolor_454_v3.0.1.fa` genome # In[6]: # All chromosomes that should be included in the plots. chromosome_lengths = OrderedDict([('Chr01', 80884392), ('Chr02', 77742459), ('Chr03', 74386277), ('Chr04', 68658214), ('Chr05', 71854669), ('Chr06', 61277060), ('Chr07', 65505356), ('Chr08', 62686529), ('Chr09', 59416394), ('Chr10', 61233695)]) # #### Plot window and step size # In[18]: # Set window six windowsize = 5000000 stepsize = 1000000 # No user input necessary below # # --- # # ### Define functions for data wrangling # # # # In[7]: # This function generates a tidy/longform table, which makes is easier to plot with seaborn. # Will have one row per sample per variant, sample name in "sample", and rows for control/mutant observations and # frequency of mutant alleles # def getTidyTable(raw_table, samples): # collects new rows row_accumulator = [] def splitListToRows(row): '''Split one row into long form, one new row per sample''' if row['CHROM'] not in chromosome_lengths: # Only keep markers on the included chromosomes return split_gt = row['GT'].split(',') if '.' in split_gt: # Remove rows where not all samples are genotyped return split_gq = row['GQ'].split(',') split_ro = row['SampleRO'].split(',') split_ao = row['SampleAO'].split(',') control_ref = split_gt[0] == '0/0' new_rows = [] for i in range(4): new_row = row.to_dict() # Remove rows unnecessary for plotting new_row.pop('REF') new_row.pop('ALT') new_row.pop('RO') new_row.pop('AO') new_row['GT'] = split_gt[i] new_row['GQ'] = int(split_gq[i]) new_row['SampleRO'] = split_ro[i] new_row['SampleAO'] = split_ao[i] new_row['sample'] = samples[i] # Get observations of WT/mutant alleles if control_ref: new_row['controlO'] = int(new_row['SampleRO']) new_row['mutantO'] = int(new_row['SampleAO']) else: new_row['controlO'] = int(new_row['SampleAO']) new_row['mutantO'] = int(new_row['SampleRO']) if (new_row['mutantO'] + new_row['controlO']) == 0: # Remove rows where there are no observations for a sample break else: # Get mutant allele frequency new_row['mutant_freq'] = new_row['mutantO'] / (new_row['mutantO'] + new_row['controlO']) new_rows.append(new_row) if len(new_rows) == 4: # only keep rows if there is one for each sample row_accumulator.extend(new_rows) raw_table.apply(splitListToRows, axis=1) table = pd.DataFrame(row_accumulator) table = table[['CHROM', 'POS', 'sample', 'GT', 'GQ', 'SampleRO', 'SampleAO', 'controlO', 'mutantO', 'mutant_freq']] # table = table[table['CHROM'].str.startswith('Chr')] return table # In[15]: # Create a table with averaged mutant frequencies for the F2 pools with a sliding window def get_window_table(table, sample_tt, sample_TT, windowsize, stepsize): # positions to include in window before/after the current pos, i.e. half the window size w = int(windowsize/2) rows = [] for chrom in chromosome_lengths: # get F2 pool genotypes for current chromosomes chrom_table = table[(table['CHROM'] == chrom) & ((table['sample'] == sample_TT) | (table['sample'] == sample_tt))].reset_index(drop=True) for i in range(1, chromosome_lengths[chrom]+1, stepsize): # Loop over windows wstart = max(1, i-w) wend = min(chromosome_lengths[chrom], i+w) # Get table for window wtable = chrom_table[(chrom_table['POS'] >= wstart) & (chrom_table['POS'] <= wend)] freqs_tt = wtable[wtable['sample'] == sample_tt]['mutant_freq'] freqs_TT = wtable[wtable['sample'] == sample_TT]['mutant_freq'] varcount = int(wtable.shape[0] / 2) # because of two samples per marker avg_TT = np.mean(freqs_TT) avg_tt = np.mean(freqs_tt) std_TT = np.std(freqs_TT) std_tt = np.std(freqs_tt) wsize = wend - wstart row_tt = {'CHROM': chrom, 'POS': i, 'sample': sample_tt, 'avg_mutant_freq': avg_tt, 'std': std_tt, 'varcount': varcount, 'window_size': wsize} rows.append(row_tt) row_TT = {'CHROM': chrom, 'POS': i, 'sample': sample_TT, 'avg_mutant_freq': avg_TT, 'std': std_TT, 'varcount': varcount, 'window_size': wsize} rows.append(row_TT) window_table = pd.DataFrame(rows) window_table = window_table[['CHROM', 'POS', 'sample', 'avg_mutant_freq', 'std', 'varcount', 'window_size']] return window_table # ### Load input # In[9]: # Load TSV table raw_table = pd.read_csv(tsv_file, sep='\t', na_values=['.'], comment='#') # Remove leading commas for col in ['GT', 'GQ', 'SampleRO', 'SampleAO']: raw_table.loc[:,col] = raw_table[col].str[1:] table = getTidyTable(raw_table, samples) # ### Plot raw frequencies # # no smoothing/averaging with a window applied, raw frequencies are plotted # In[10]: # Plot raw mutant allele frequencies of F2 pools plot_table = table[(table['sample'] == sample_F2_mutant) | (table['sample'] == sample_F2_WT)] plot = sns.relplot(data=plot_table, x='POS', y='mutant_freq', row='CHROM', style='sample',hue='sample', aspect=7.0, height=4., kind='line', markers=True, dashes=False, linewidth=0.5) # In[11]: plot.savefig(raw_output_pdf) # ### Plot sliding-window averaged frequencies # # use a sliding window and plot the averaged allele frequency per window, scaling the dot by the number of snps/indels in the window. # # Values used (determined manually to show clear signal): # # - window size: 5 Gbp # - step size: 1 Gbp # In[12]: windowsize=5000000 stepsize=1000000 # In[16]: wtable = get_window_table(table, sample_F2_mutant, sample_F2_WT, windowsize, stepsize) # Plot the per-window average allele frequencies # Create one subplot per chromosome wgrid = sns.FacetGrid(wtable, row='CHROM',aspect=7.0, height=4., hue='sample',legend_out=True) # Plot lines wgrid = wgrid.map(sns.lineplot, 'POS', 'avg_mutant_freq', linewidth=0.5).add_legend() # Overlay with scatterplot with densities per chromosome for i, chrom in enumerate(chromosome_lengths): plotdata = wtable[wtable['CHROM']==chrom] mincount = min(plotdata['varcount']) maxcount = max(plotdata['varcount']) sizes=(np.log2(mincount+1)*100,np.log2(maxcount+1)*100) sns.scatterplot(data=wtable[wtable['CHROM']==chrom], x='POS', y='avg_mutant_freq', size='varcount', hue='sample', legend=False, ax=wgrid.axes[i,0], sizes=sizes) # In[17]: wgrid.savefig(window_output_pdf) |
7
of
Scripts/plot_allele_freqs.py
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 | install.packages('vcfR', repos='http://cran.us.r-project.org') library(VariantAnnotation) #library(vcfR) library(yaml) config <- yaml.load_file("config.yaml") QTL_VCF_to_Table_Parser<-function(vcf_file_path,No_of_Chromosomes,HighBulk,LowBulk){ rice_vcf <- readVcf(vcf_file_path) scan_vcf<- scanVcf(vcf_file_path) ADREF_HighBulk <- paste0("AD_REF.",HighBulk) ADALT_HighBulk <- paste0("AD_ALT.",HighBulk) ADREF_LowBulk <- paste0("AD_REF.",LowBulk) ADALT_LowBulk <- paste0("AD_ALT.",LowBulk) GQ_HighBulk <- paste0("GQ.",HighBulk) GQ_LowBulk <- paste0("GQ.",LowBulk) table_colnames<- c("CHROM","POS","REF","ALT",ADREF_HighBulk,ADALT_HighBulk,GQ_HighBulk,ADREF_LowBulk,ADALT_LowBulk,GQ_LowBulk) vcf_len<- scan_vcf$`*:*-*`$rowRanges@elementMetadata@nrows chrom_vcf_names<- as.character.Array(droplevels(scan_vcf$`*:*-*`$rowRanges@seqnames@values[1:No_of_Chromosomes])) QTL_Rice <- as.data.frame(matrix(ncol=10,nrow = vcf_len)) QTL_Rice$V2<-as.array(rice_vcf@rowRanges@ranges@start) QTL_Rice$V1<-unlist(lapply(strsplit(as.array(rice_vcf@rowRanges@ranges@NAMES),':'),'[[',1)) QTL_Rice$V3<-scan_vcf$`*:*-*`$REF QTL_Rice$V4<-scan_vcf$`*:*-*`$ALT QTL_Rice$V7<- scan_vcf$`*:*-*`$GENO$GQ[,HighBulk] QTL_Rice$V10<- scan_vcf$`*:*-*`$GENO$GQ[,LowBulk] QTL_Rice$V5<-lapply(scan_vcf$`*:*-*`$GENO$AD[,HighBulk],'[[',1) QTL_Rice$V6<-lapply(scan_vcf$`*:*-*`$GENO$AD[,HighBulk],'[[',2) QTL_Rice$V8<-lapply(scan_vcf$`*:*-*`$GENO$AD[,LowBulk],'[[',1) QTL_Rice$V9<-lapply(scan_vcf$`*:*-*`$GENO$AD[,LowBulk],'[[',2) QTL_Rice$V2<- as.integer.Array(QTL_Rice$V2) QTL_Rice$V3<- as.character.Array(QTL_Rice$V3) QTL_Rice$V4<- as.character.Array(QTL_Rice$V4) QTL_Rice$V5<- as.integer.Array(QTL_Rice$V5) QTL_Rice$V6<- as.integer.Array(QTL_Rice$V6) QTL_Rice$V7<- as.integer.Array(QTL_Rice$V7) QTL_Rice$V8<- as.integer.Array(QTL_Rice$V8) QTL_Rice$V9<- as.integer.Array(QTL_Rice$V9) QTL_Rice$V10<- as.integer.Array(QTL_Rice$V10) colnames(QTL_Rice) <- table_colnames write.table(QTL_Rice,'QTL_VCF_to_Table/QTL_Table.csv',row.names = FALSE,col.names = TRUE,sep = '\t') } QTL_VCF_to_Table_Parser(config$QTL_Parser$VCF_File_Name_or_Path,config$QTL_Parser$Number_of_Chromosomes,config$QTL_Parser$HighBulk,config$QTL_Parser$LowBulk) |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 | library(yaml) library(data.table) library(devtools) install_github("bmansfeld/QTLseqr",upgrade_dependencies= TRUE) library("QTLseqr") library("ggplot2") config <- yaml.load_file("config.yaml") QTL_Plotting <- function(HighBulk,LowBulk, No_of_Chromosomes){ options(warn=-1) table_file <- 'QTL_VCF_to_Table/QTL_Table.csv' chrom_names<-as.list(unique(fread(table_file,sep = "\t", select = c("CHROM") ))) #chrom_vcf_names<- as.character.Array(droplevels(scan_vcf$`*:*-*`$rowRanges@seqnames@values[1:No_of_Chromosomes])) df <- importFromTable(file = table_file, highBulk = HighBulk, lowBulk = LowBulk, chromList = chrom_names$CHROM[1:No_of_Chromosomes],sep = '\t') ggplot(data = df)+geom_histogram(aes(x = DP.HIGH + DP.LOW))+xlim(0,1000) ggsave(filename="QTL_Plots/DP_Filtering Data",device= "pdf", width=20, height=10) ggplot(data = df)+geom_histogram(aes(x = REF_FRQ)) ggsave(filename="QTL_Plots/REF Frequency for Filtering Data",device = "pdf", width=20, height=10) ggplot(data = df)+geom_histogram(aes(x = SNPindex.HIGH)) ggsave(filename = "QTL_Plots/SNP Index for Filtering Data",device="pdf", width=20, height=10) df_filt <- filterSNPs( SNPset = df, refAlleleFreq = config$QTL_Parser$REF_Allel_Frequency, minTotalDepth = config$QTL_Parser$Min_Total_Depth, maxTotalDepth = config$QTL_Parser$Max_Total_Depth, depthDifference = config$QTL_Parser$Depth_Difference, minSampleDepth = config$QTL_Parser$Min_Sample_Depth, minGQ = config$QTL_Parser$Min_GQ, verbose = TRUE ) df_filt <- runQTLseqAnalysis(df_filt, windowSize = config$QTL_Parser$WindowSize, popStruc = "F2", bulkSize = c(config$QTL_Parser$High_Bulk_Size, config$QTL_Parser$Low_Bulk_Size), replications = 10000, intervals = c(95, 99)) df_filt <- runGprimeAnalysis(df_filt,windowSize = config$QTL_Parser$WindowSize,outlierFilter = "deltaSNP",filterThreshold = config$QTL_Parser$Filter_Threshold) plotGprimeDist(SNPset = df_filt, outlierFilter = "Hampel") ggsave(filename = "QTL_Plots/GPrime Distribution with Hampel Outlier Filter",device = "pdf", width=20, height=10) plotGprimeDist(SNPset =df_filt, outlierFilter = "deltaSNP", filterThreshold = config$QTL_Parser$Filter_Threshold) ggsave(filename = "QTL_Plots/GPrime Distribution with deltaSNP Outlier Filter",device = "pdf", width=20, height=10) p1 <- plotQTLStats(SNPset = df_filt, var = "nSNPs") ggsave(filename = "QTL_Plots/SNP Density Plot",plot=p1,device = "pdf", width=20, height=10) p2 <- plotQTLStats(SNPset = df_filt, var = "deltaSNP", plotIntervals = TRUE) ggsave(filename = "QTL_Plots/Delta SNP Index Plot with Intervals",plot=p2,device = "pdf", width=20, height=10) p3 <- plotQTLStats(SNPset = df_filt, var = "Gprime", plotThreshold = TRUE, q = config$QTL_Parser$FDR_q) ggsave(filename = "QTL_Plots/GPrime Value Plot",plot=p3,device = "pdf", width=20, height=10) } QTL_Plotting(config$QTL_Parser$HighBulk,config$QTL_Parser$LowBulk,config$QTL_Parser$Number_of_Chromosomes) |
Support
- Future updates
Related Workflows





