Partial de novo workflow: ustacks only

workflow-partial-ustacks-only

These workflows are part of a set designed to work for RAD-seq data on the Galaxy platform, using the tools from the Stacks program.

Stacks: http://catchenlab.life.illinois.edu/stacks/

For the full de novo workflow see https://workflowhub.eu/workflows/348

You may want to run ustacks on different batches of samples.

To combine the batches later, every sample needs a unique identifying number, so keep track of how many samples have already been run through ustacks.
In ustacks, under "Processing options", there is an option called "Start identifier at".
The default is 1, which suits the first batch of samples; these are then labelled sample 1, sample 2, and so on.
For each new batch, set this option to the next available number. For example, if batch 1 contained 10 samples, set it to 11 for batch 2.
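
The numbering rule above is simple running-total arithmetic. A minimal sketch, assuming ustacks numbers the samples in a batch consecutively from the chosen start value; the function name `start_identifiers` and the batch sizes are hypothetical and only illustrate the calculation (this is not part of Stacks or Galaxy):

```python
def start_identifiers(batch_sizes, first_id=1):
    """Return the "Start identifier at" value to use for each batch, in order.

    batch_sizes: number of samples in each batch (hypothetical input).
    first_id: identifier for the very first sample (the ustacks default is 1).
    """
    starts = []
    next_id = first_id
    for size in batch_sizes:
        if size < 1:
            raise ValueError("each batch needs at least one sample")
        starts.append(next_id)
        # The next batch begins right after the last sample of this one.
        next_id += size
    return starts


# Example: three batches of 10, 6 and 8 samples.
print(start_identifiers([10, 6, 8]))  # [1, 11, 17]
```

With 10 samples in batch 1, batch 2 starts at 11, matching the example in the text; a third batch would then start at 17.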

To combine multiple outputs from ustacks, provided they have been given appropriate starting identifiers:

Find the ustacks output in the Galaxy history. This will be a list of samples.
Click the cross (delete) button next to the list name and select "Collection only". This deletes the list but keeps its items, which will now be hidden in the Galaxy history.
In the history panel, click on "hidden" to reveal any hidden files. Unhide the samples.
Repeat this for every batch of ustacks output that is needed.
Click the tick (select) button, tick all the samples needed, then under "For all selected" choose "Build dataset list".
The result is a combined set of samples for input into cstacks.
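
Before combining batches, it can be worth checking on paper that no two batches share an identifier. A minimal sketch, assuming each batch is described only by its start identifier and sample count and that identifiers run consecutively from the start value; the function `find_id_collisions` and the batch labels are hypothetical, and nothing here reads actual ustacks output:

```python
def find_id_collisions(batches):
    """Return sample IDs claimed by more than one batch.

    batches: dict mapping a batch label to (start_identifier, sample_count).
    The result maps each clashing ID to the labels of the batches using it.
    """
    seen = {}
    for label, (start, count) in batches.items():
        # IDs are assumed to run consecutively from the start identifier.
        for sample_id in range(start, start + count):
            seen.setdefault(sample_id, []).append(label)
    return {sid: labels for sid, labels in seen.items() if len(labels) > 1}


ok = {"batch1": (1, 10), "batch2": (11, 6)}
bad = {"batch1": (1, 10), "batch2": (10, 6)}  # batch2 started one too early

print(find_id_collisions(ok))   # {}
print(find_id_collisions(bad))  # {10: ['batch1', 'batch2']}
```

An empty result means the start identifiers do not overlap; any entry names an identifier that would be duplicated in the combined list.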

Created: 1yr ago

Updated: 1yr ago

Maintainers: public

URL: https://github.com/AnnaSyme/workflow-partial-ustacks-only.git

Name: partial-de-novo-workflow-ustacks-only

Version: Version 1

Copyright: Public Domain

License: None

Keywords:

Data, Sequence clustering, Galaxy, Stacks, WorkflowHub, Population genomics

Refs:

https://workflowhub.eu/workflows/349
