Code_for_Genetic_Diversity_Sampling
How to re-run the 'genetic variation' analysis described in Madupe et al. 2023. The details of the analysis are described in the supplementary material of the paper.
Download and Installation
First, clone this repository.
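For example, a minimal setup sketch, assuming the GitHub URL listed in the workflow metadata below and a conda/bioconda environment for the command-line tools (the package names are assumptions, not part of the original instructions):

git clone https://github.com/johnpatramanis/Code_for_Genetic_Diversity_Sampling.git
cd Code_for_Genetic_Diversity_Sampling
# The Python scripts use Biopython and matplotlib; the workflow rules call snakemake, bcftools, bgzip and Ensembl VEP.
# Package names below are assumptions - adjust to your environment.
pip install biopython matplotlib
conda install -c bioconda snakemake bcftools ensembl-vep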
Code Snippets
## Diversity-metrics / random-sampling script (run in the workflow rules below as Do_Random_Sampling.py)

import os
import os.path
from os import listdir
from os.path import isfile, join
import sys
import random
from itertools import combinations
import statistics
import matplotlib.pyplot as plt

##################################################################################################
## Set up
GENOTYPE_FILE=open(sys.argv[1],'r')
SAMPLE_NAME=sys.argv[1].split('.gntp')[0]
OUTPUT=open(F'{SAMPLE_NAME}.metric','w')
PROTEIN_COVERAGE_FILE=open('Protein_Coverage.txt','r')

########### Data structures
GENOTYPES=[]          ## List of lists of lists - paired genotypes for each locus
FLAT_GENOTYPES=[]     ## List of lists - genotypes for each locus
EXPECTED_HETEROZ=[]   ## List of floats

################# Prepare data
###### Protein data: get the total number of amino acids covered by this analysis and use it to calculate the length of the underlying sequence
Number_of_AA=[]
for line in PROTEIN_COVERAGE_FILE:
    line=line.strip().split()
    if len(line)>1:
        AA=line[1].split(',')
        for K in AA:
            Number_of_AA.append(K)

Number_of_AA=len(Number_of_AA)
Length_of_Sequence=3*Number_of_AA

###### Prep genotypes
count=1
for LINE in GENOTYPE_FILE:
    LINE=LINE.strip().split('\t')

    ##### N of dataset
    NUMBER_OF_SAMPLES=len(LINE)   ## How many samples

    ##### Get genotypes
    GENOTYPES_HERE=[ x.split('/') for x in LINE ]
    FLAT_GENOTYPES_HERE=[ l for x in GENOTYPES_HERE for l in x]
    FREQ_GENOTYPES_HERE=[int(x) for x in FLAT_GENOTYPES_HERE if x!='.']
    FREQ=statistics.mean(FREQ_GENOTYPES_HERE)
    print(F'\nVariant {count} with frequency: {FREQ}\n')
    count+=1

    GENOTYPES.append(GENOTYPES_HERE)             ## List of lists of pairs of genotypes
    FLAT_GENOTYPES.append(FLAT_GENOTYPES_HERE)   ## All genotypes in one list

    ##### Calculate pi - expected heterozygosity
    ALLELE_NUMBER=len([x for x in FLAT_GENOTYPES_HERE if x!='.'])
    Q_FREQ=FLAT_GENOTYPES_HERE.count('0')/ALLELE_NUMBER
    P_FREQ=FLAT_GENOTYPES_HERE.count('1')/ALLELE_NUMBER
    EXPECTED_HETEROZ.append(2*Q_FREQ*P_FREQ)

EXPECTED_HETEROZ=sum(EXPECTED_HETEROZ)

print(F'\nFound {len(GENOTYPES)} potentially polymorphic sites for {NUMBER_OF_SAMPLES} individuals\n')

##### Calculate pi - average (observed) heterozygosity
OBSERVED_HETEROZ=[]
for IND in range(0,NUMBER_OF_SAMPLES):
    IND_HETEROZ=[]
    for GENOTYPES_HERE in GENOTYPES:
        SAMPLE=GENOTYPES_HERE[IND]
        if (SAMPLE.count('0')==1) and (SAMPLE.count('1')==1):
            IND_HETEROZ.append(1)
        if (SAMPLE.count('0')==2) or (SAMPLE.count('1')==2):
            IND_HETEROZ.append(0)
    OBSERVED_HETEROZ.append(sum(IND_HETEROZ))

OBSERVED_HETEROZ=sum(OBSERVED_HETEROZ)/len(OBSERVED_HETEROZ)

##### Calculate Wattersons estimate
##### Check that sites are indeed polymorphic in the population
SEG_SITES=0
FIXED_SITES=0
for J in range(0,len(FLAT_GENOTYPES)):
    FLAT_GENOTYPES_HERE=[int(x) for x in FLAT_GENOTYPES[J] if x != '.']
    FLAT_GENOTYPES_HERE=list(set(FLAT_GENOTYPES_HERE))
    if len(FLAT_GENOTYPES_HERE)>1:
        SEG_SITES+=1
    if len(FLAT_GENOTYPES_HERE)==1:
        FIXED_SITES+=1

HARMONIC=sum([ 1/x for x in range(1,SEG_SITES*2)])   #### Harmonic for number of segregating sites * 2 for diploidy
if HARMONIC!=0:
    WTRSNS_E=SEG_SITES/HARMONIC
else:
    WTRSNS_E=0

print(F'\nNumber of Segregating sites: {SEG_SITES}\n')
print(F'Number of Fixed sites: {FIXED_SITES}\n')
print(F'Harmonic Number of Samples: {HARMONIC}\n')
print(F'\nThis results in a Wattersons Estimate of: {WTRSNS_E}\n')

###### Random sampling loop
### Preselect quartets of individuals for the loop
MAX_LOOP=1000

if NUMBER_OF_SAMPLES >= 4*MAX_LOOP:   ### Sample without replacement
    SAMPLINGS=random.sample(range(NUMBER_OF_SAMPLES), 4*MAX_LOOP)
    random.shuffle(SAMPLINGS)

if NUMBER_OF_SAMPLES < 4*MAX_LOOP:    ### Sample with replacement
    SAMPLINGS=[]
    for k in range(0,MAX_LOOP):
        CHOICES=random.choices(range(NUMBER_OF_SAMPLES), k=4)
        for j in CHOICES:
            SAMPLINGS.append(j)
    random.shuffle(SAMPLINGS)

##### These will be used to get the sampling metrics across the quartets
SAMPLING_SUCCESS_OR_NOT=[]
HOMOZYGOTE_SUCCESS_OR_NOT=[]
TOTAL_VARIANT_SUCCESS_OR_NOT=[]
TWO_ALTERNATIVES_SUCCESS_OR_NOT=[]

##### These will be used to get the average diversity metrics for the quartets
WTRSNS_E_QUARTET=[]
OBSERVED_HETEROZ_QUARTET=[]
EXPECTED_HETEROZ_QUARTET=[]

#### Sampling loop: sample 4 individuals and check all segregating sites - do you find variation in any of them?
for LOOP in range(0,MAX_LOOP):

    #### These will be used to get the sampling metrics for this quartet
    VARIANT_SPOTTED=0
    TOTAL_VARIANT_SPOTTED=0
    HOMOZYG_SPOTTED=0
    TWO_ALTERNATIVES_SPOTTED=0
    SAMPLING_NUMBER=SAMPLINGS[LOOP*4:LOOP*4+4]

    ### These will be used for the diversity metrics of this quartet
    SEG_SITES_LOCAL=0
    WTRSNS_E_LOCAL=0
    OBSERVED_HETEROZ_LOCAL=[]
    EXPECTED_HETEROZ_LOCAL=[]

    #### For each site
    for SNP in range(0,len(GENOTYPES)):

        #### Load site data from matrix
        GENOTYPES_HERE=GENOTYPES[SNP]

        ##### Sample diploid individuals
        SAMPLING=[ GENOTYPES_HERE[x] for x in SAMPLING_NUMBER]   #### Get 4 individuals (diploid)
        SAMPLING_DIPLOID=SAMPLING                                #### Keep the 4 individuals as genotype pairs
        SAMPLING=[l for x in SAMPLING for l in x]                ### Flatten into a list of up to 8 alleles
        SAMPLING=[x for x in SAMPLING if x!='.']                 ### Drop missing alleles
        SAMPLING_UNIQ=list(set(SAMPLING))                        #### Either ['0'], ['1'] or ['0','1']

        if len(SAMPLING_UNIQ)>1:      ##### Check if any variation exists
            VARIANT_SPOTTED=1         ##### Count successful test
            TOTAL_VARIANT_SPOTTED+=1  ##### Count how many times a variant has been spotted within this quartet
            SEG_SITES_LOCAL+=1
            if (['1','1'] in SAMPLING_DIPLOID) and (['0','0'] in SAMPLING_DIPLOID):   #### Check if homozygous individuals for the variant exist
                HOMOZYG_SPOTTED=1
            if (SAMPLING.count('1'))>=2:   ### Check if more than 1 alternative allele exists
                TWO_ALTERNATIVES_SPOTTED=1

        #### Calculate expected heterozygosity for this SNP in the quartet
        ALLELE_NUMBER_LOCAL_SNP=len(SAMPLING)   ## Number of non-missing alleles
        if ALLELE_NUMBER_LOCAL_SNP>0:
            Q_FREQ=SAMPLING.count('0')/ALLELE_NUMBER_LOCAL_SNP   ## Frequency of allele 1
            P_FREQ=SAMPLING.count('1')/ALLELE_NUMBER_LOCAL_SNP   ## Frequency of allele 2
            EXPECTED_HETEROZ_LOCAL.append(2*Q_FREQ*P_FREQ)

    SAMPLING_SUCCESS_OR_NOT.append(VARIANT_SPOTTED)
    TOTAL_VARIANT_SUCCESS_OR_NOT.append(TOTAL_VARIANT_SPOTTED)
    TWO_ALTERNATIVES_SUCCESS_OR_NOT.append(TWO_ALTERNATIVES_SPOTTED)
    HOMOZYGOTE_SUCCESS_OR_NOT.append(HOMOZYG_SPOTTED)

    ### Calculate expected heterozygosity for the quartet
    EXPECTED_HETEROZ_LOCAL=sum(EXPECTED_HETEROZ_LOCAL)

    ### Calculate Wattersons estimate for the quartet
    HARMONIC_LOCAL=sum([ 1/x for x in range(1,SEG_SITES_LOCAL*2)])   #### Harmonic for number of segregating sites * 2 for diploidy
    if HARMONIC_LOCAL==0:
        WTRSNS_E_LOCAL=0
    if HARMONIC_LOCAL!=0:
        WTRSNS_E_LOCAL=SEG_SITES_LOCAL/HARMONIC_LOCAL

    #### Calculate observed heterozygosity for the quartet
    for IND in SAMPLING_NUMBER:
        IND_HETEROZ=[]
        for GENOTYPES_HERE in GENOTYPES:
            SAMPLE=GENOTYPES_HERE[IND]
            if (SAMPLE.count('0')==1) and (SAMPLE.count('1')==1):
                IND_HETEROZ.append(1)
            if (SAMPLE.count('0')==2) or (SAMPLE.count('1')==2):
                IND_HETEROZ.append(0)
        OBSERVED_HETEROZ_LOCAL.append(sum(IND_HETEROZ))
    OBSERVED_HETEROZ_LOCAL=sum(OBSERVED_HETEROZ_LOCAL)/len(OBSERVED_HETEROZ_LOCAL)

    WTRSNS_E_QUARTET.append(WTRSNS_E_LOCAL)
    OBSERVED_HETEROZ_QUARTET.append(OBSERVED_HETEROZ_LOCAL)
    EXPECTED_HETEROZ_QUARTET.append(EXPECTED_HETEROZ_LOCAL)

###### Sampling metric averages across the loop
AT_LEAST_ONE_VARIANT=sum(SAMPLING_SUCCESS_OR_NOT)/len(SAMPLING_SUCCESS_OR_NOT)
ONE_OR_MORE_VARIANT=sum(TOTAL_VARIANT_SUCCESS_OR_NOT)/len(TOTAL_VARIANT_SUCCESS_OR_NOT)
AT_LEAST_ONE_HOMOZ=sum(HOMOZYGOTE_SUCCESS_OR_NOT)/len(HOMOZYGOTE_SUCCESS_OR_NOT)
AT_LEAST_TWO_ALTERNATIVES=sum(TWO_ALTERNATIVES_SUCCESS_OR_NOT)/len(TWO_ALTERNATIVES_SUCCESS_OR_NOT)

###### Diversity metric averages across quartets
WTRSNS_E_QUARTET=sum(WTRSNS_E_QUARTET)/len(WTRSNS_E_QUARTET)
OBSERVED_HETEROZ_QUARTET=sum(OBSERVED_HETEROZ_QUARTET)/len(OBSERVED_HETEROZ_QUARTET)
EXPECTED_HETEROZ_QUARTET=sum(EXPECTED_HETEROZ_QUARTET)/len(EXPECTED_HETEROZ_QUARTET)

print(F'Expected Heterozygosity: {EXPECTED_HETEROZ}\nObserved Heterozygosity: {OBSERVED_HETEROZ}\nWattersons Estimator {WTRSNS_E}\n\n\n')
print(F'Average Expected Heterozygosity per Quartet: {EXPECTED_HETEROZ_QUARTET}\nAverage Observed Heterozygosity per Quartet: {OBSERVED_HETEROZ_QUARTET}\nAverage Wattersons Estimator per Quartet {WTRSNS_E_QUARTET}\n\n\n')
print(F'Probability of at least one variant: {AT_LEAST_ONE_VARIANT}\n')
print(F'Average number of successes per individual test: {ONE_OR_MORE_VARIANT}\n')
print(F'Probability of at least one homozygous individual: {AT_LEAST_ONE_HOMOZ}\n')
print(F'Probability of at least 2 alternative alleles: {AT_LEAST_TWO_ALTERNATIVES}\n\n\n')
print(F'Total length of underlying sequence: {Length_of_Sequence}\n')

plt.hist(TOTAL_VARIANT_SUCCESS_OR_NOT)
plt.savefig('Variants_Histogram.pdf',format='pdf')
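As a rough illustration of the inputs this script expects (not part of the original repository): it reads a Protein_Coverage.txt file (normally produced by the coverage script further below) and a tab-separated genotype matrix with one row per variant and one '/'-separated genotype per sample, matching the bcftools query -f '[%GT\t]\n' output generated by the workflow rules below. A hypothetical toy run, with made-up file names and values:

# Hypothetical toy inputs - protein name, positions and genotypes are made up for illustration
printf 'AMELX\t10,11,12\n' > Protein_Coverage.txt
printf '0/0\t0/1\t1/1\t./.\n0/1\t0/0\t0/0\t0/0\n' > Toy.gntp
python3 Do_Random_Sampling.py Toy.gntp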
## VEP-output filtering script (run in the workflow rules below as Extract_Info_From_VEP_Output.py)

import os
import os.path
from os import listdir
from os.path import isfile, join
import sys

################################################################### Extract and filter variants from VEP output

PROTEIN_POSITIONS_COVERAGE=open('Protein_Coverage.txt','r')
VEP_OUTPUT=open(sys.argv[1],'r')
SAMPLE_NAME=sys.argv[1].split('_VEP.VEP')[0]
OUTPUT=open(F'{SAMPLE_NAME}_Processed_Variants.PV','w')

##### Get which parts of the protein are covered by our ancient samples
COVERAGE={}
for line in PROTEIN_POSITIONS_COVERAGE:
    line=line.strip().split()
    Protein_Name=line[0]
    if len(line)>1:
        Positions=line[1].split(',')
    else:
        Positions=[]
    COVERAGE[Protein_Name]=[int(x) for x in Positions]

print(COVERAGE)

### Go through the VEP output
FILTERED=[]
for LINE in VEP_OUTPUT:

    if LINE[0]=='#' and LINE[1]!='#':
        LABELS=LINE.strip().split()

    if LINE[0]!='#':
        LINE=LINE.strip().split()
        Uploaded_variation=LINE[LABELS.index('#Uploaded_variation')]
        Location=LINE[LABELS.index('Location')]
        Gene=LINE[LABELS.index('Gene')]                      ### Ensembl ID
        Feature=LINE[LABELS.index('Feature')]                ### Ensembl ID
        Feature_type=LINE[LABELS.index('Feature_type')]      ## e.g. 'Transcript'
        VARIANT_CLASS=LINE[LABELS.index('VARIANT_CLASS')]    ## e.g. SNV or insertion
        SYMBOL=LINE[LABELS.index('SYMBOL')]                  ### Which protein
        Consequence=LINE[LABELS.index('Consequence')]        ## Important! Can be multiple things, including: synonymous_variant, upstream_variant, missense_variant, splice_donor_variant(?), frameshift_variant
        IMPACT=LINE[LABELS.index('IMPACT')]                  #### HIGH, LOW, other
        CANONICAL=LINE[LABELS.index('CANONICAL')]
        Protein_position=LINE[LABELS.index('Protein_position')]

        print(Uploaded_variation,Location,Feature_type,VARIANT_CLASS,SYMBOL,Consequence,IMPACT,Protein_position,CANONICAL)

        ### Filter based on criteria here
        if (VARIANT_CLASS=='SNV') and (Consequence=='missense_variant') and (CANONICAL=='YES'):
            FILTERED.append([SYMBOL,Location,Protein_position])

print(LABELS)

for SNP in FILTERED:
    Loc=SNP[1]
    Loc=Loc.split(':')
    Loc='\t'.join(Loc)
    Prot=SNP[0]
    Prot_Pos=int(SNP[2])
    ## COVERAGE[Prot] holds all positions covered by all 4 samples
    print(Loc,Prot,Prot_Pos,COVERAGE[Prot])
    if Prot_Pos in COVERAGE[Prot]:
        OUTPUT.write(Loc+'\n')
## Protein-coverage script (run in the workflow rules below as Get_Protein_Coverage.py)

import os
import os.path
from os import listdir
from os.path import isfile, join
import sys
from Bio import SeqIO

FILES_IN_FOLDER=[f for f in os.listdir('.') if os.path.isfile(f)]
FASTA_FILE_LIST=[f for f in FILES_IN_FOLDER if '.fa' in f]

OUTPUT=open('Protein_Coverage.txt','w')

COVERAGE={}

for FILE in FASTA_FILE_LIST:
    fasta_sequences=SeqIO.parse(open(FILE),'fasta')
    for fasta in fasta_sequences:
        name, sequence = fasta.id, str(fasta.seq)
        name=name.split('/')[0]
        name=name.split('_')
        sample='_'.join(name[0:len(name)-1])
        protein=name[len(name)-1]
        if 'Paranthropus' in sample:
            if sample not in COVERAGE.keys():
                COVERAGE[sample]={}
            counter=0
            for POS in sequence:
                counter+=1   ## 1-based residue position
                if POS!='?':
                    if protein not in COVERAGE[sample].keys():
                        COVERAGE[sample][protein]=[]
                    COVERAGE[sample][protein].append(counter)
            # print(sample,protein)

TOTAL_COVERAGE={}
for SMPL in COVERAGE.keys():
    for PRTN in COVERAGE[SMPL].keys():
        if PRTN not in TOTAL_COVERAGE.keys():
            TOTAL_COVERAGE[PRTN]=[]
        for PSTN in COVERAGE[SMPL][PRTN]:
            TOTAL_COVERAGE[PRTN].append(PSTN)

####### For getting positions covered by ALL 4 P.rob samples
for PRTN in TOTAL_COVERAGE.keys():
    TOTAL_COVERAGE[PRTN]=sorted(TOTAL_COVERAGE[PRTN])
    TOTAL_COVERAGE_UNIQUE=sorted(set(TOTAL_COVERAGE[PRTN]))   ### Each site only once, so we can loop through them
    ### Only select positions that are counted 4 times
    TOTAL_COVERAGE[PRTN]=[ str(SITE) for SITE in TOTAL_COVERAGE_UNIQUE if TOTAL_COVERAGE[PRTN].count(SITE)==4 ]
    print(PRTN,TOTAL_COVERAGE[PRTN])
    POSITIONS=','.join(TOTAL_COVERAGE[PRTN])
    OUTPUT.write(F'{PRTN}\t{POSITIONS}\n')

####### For getting positions covered by at least one P.rob sample!
# for PRTN in TOTAL_COVERAGE.keys():
#     TOTAL_COVERAGE[PRTN]=sorted(list(set(TOTAL_COVERAGE[PRTN])))
#     TOTAL_COVERAGE[PRTN]=[str(x) for x in TOTAL_COVERAGE[PRTN]]
#     print(PRTN,TOTAL_COVERAGE[PRTN])
#     POSITIONS=','.join(TOTAL_COVERAGE[PRTN])
#     OUTPUT.write(F'{PRTN}\t{POSITIONS}\n')
run:
    shell(F"bcftools index {input.VCF_FILE} -f --threads {threads}")
run:
    shell(F"python3 Get_Protein_Coverage.py")
run:
    shell(F"bcftools view {input.VCF_FILE} -R {input.GENE_LOCATIONS} --threads {threads} -O v -o {output.GENE_FILTERED_VCF}")
run:
    shell(F"vep --i {input.GENE_FILTERED_VCF} --tab --species homo_sapiens --offline --dir_cache VEP_Cache/ --output_file {output.VEP_OUTPUT} --force_overwrite --everything")
run:
    shell(F"python3 Extract_Info_From_VEP_Output.py {input.VEP_OUTPUT}")
run:
    shell(F"bgzip -i -k -f --threads {threads} {input.GENE_FILTERED_VCF}")
    shell(F"bcftools index {output.GENE_FILTERED_GZVCF} -f --threads {threads}")
run:
    shell(F"bcftools view {input.GENE_FILTERED_GZVCF} -R {input.PROCESSED_VARIANTS} --threads {threads} -O v -o {output.SECOND_FILTERING_VCF}")
run:
    shell(F"bcftools query -f '[%GT\t]\n' {input.SECOND_FILTERING_VCF} > {output.GENOTYPES}")
    shell(F"bcftools query -f '%CHROM %POS %ID %REF %ALT\n' {input.SECOND_FILTERING_VCF} > {output.SNP_LOCATIONS}")
run:
    shell(F"python3 Do_Random_Sampling.py {input.GENOTYPES}")
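The rule bodies above are fragments of the repository's Snakefile and run in order from VCF indexing through to the random-sampling script. A minimal launch sketch, assuming the Snakefile sits at the repository root and that four cores are available (both assumptions, not stated in the original):

# Dry run first to list the planned jobs, then execute the workflow
snakemake -n
snakemake --cores 4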
Created: 1yr ago
Updated: 1yr ago
Maintainers: public
URL: https://github.com/johnpatramanis/Code_for_Genetic_Diversity_Sampling
Name: code_for_genetic_diversity_sampling
Version: 1
Downloaded: 0
Copyright: Public Domain
License: None