Keywords and Expertise

Use keywords to characterize workflows and forum posts, and reach out to sellers with relevant expertise.

tool / biotools
Snakemake

Workflow engine and language. It aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern domain-specific language (DSL) in Python style.


tool / biotools
SAMtools

SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.


tool / biotools
Pandas

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.


tool / pypi
numpy

The fundamental package for scientific computing with Python


tool / biotools
FastQC

This tool aims to provide a QC report that can spot problems or biases originating either in the sequencer or in the starting library material. It can be run in one of two modes: as a standalone interactive application for the immediate analysis of small numbers of FastQ files, or in a non-interactive mode suitable for integration into a larger analysis pipeline for the systematic processing of large numbers of files.


tool / bioconda
gatk

The full Genome Analysis Toolkit (GATK) framework, v3


tool / biotools
BCFtools

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.


tool / cran
ggplot2

Create Elegant Data Visualisations Using the Grammar of Graphics: A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.


tool / biotools
MultiQC

MultiQC aggregates results from multiple bioinformatics analyses across many samples into a single report. It searches a given directory for analysis logs and compiles an HTML report. It is a general-use tool, well suited to summarising the output from numerous bioinformatics tools.


tool / biotools
BEDTools

BEDTools is an extensive suite of utilities for comparing genomic features in BED format.


tool / pypi
matplotlib

Matplotlib produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, Python/IPython shells, web application servers, and various graphical user interface toolkits.


tool / cran
tidyverse

Easily Install and Load the 'Tidyverse': The 'tidyverse' is a set of packages that work in harmony because they share common data representations and 'API' design. This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step. Learn more about the 'tidyverse' on its project website.


tool / biotools
BWA

Fast, accurate, memory-efficient aligner for short and long sequencing reads


tool / biotools
Picard

A set of command line tools for manipulating high-throughput sequencing (HTS) data in formats such as SAM/BAM/CRAM and VCF. Available as a standalone program or within the GATK4 program.


tool / pypi
snakemake

Snakemake is a workflow management system that aims to reduce the complexity of creating workflows by providing a fast and comfortable execution environment, together with a clean and modern specification language in Python style. Snakemake workflows are essentially Python scripts extended by declarative code to define rules. Rules describe how to create output files from input files.


input / edam
Sequence

One or more molecular sequences, possibly with associated annotation.|This concept is a placeholder of concepts for primary sequence data including raw sequences and sequence records. It should not normally be used for derivatives such as sequence alignments, motifs or profiles.


tool / bioconda
tabix

C library and command line tools for high-throughput sequencing data formats.


tool / cran
dplyr

A Grammar of Data Manipulation: A fast, consistent tool for working with data frame like objects, both in memory and out of memory.


tool / biotools
STAR

An integrated solution to management and visualization of sequencing data.


tool / biotools
Biopython

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.


format / edam
JSON

JavaScript Object Notation format; a lightweight, text-based format to represent tree-structured data using key-value pairs.
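
As a small illustration of the tree-structured, key-value nature of the format, the following Python sketch round-trips a nested record through the standard-library `json` module. The record itself (field names and values) is made up for the example and is not taken from this catalog's data model.

```python
import json

# Hypothetical record: the field names and values are illustrative only.
record = {
    "workflow": "example-qc",
    "tools": [
        {"name": "FastQC", "source": "biotools"},
        {"name": "MultiQC", "source": "biotools"},
    ],
}

# Serialize the nested structure to JSON text, then parse it back.
text = json.dumps(record, indent=2, sort_keys=True)
restored = json.loads(text)

print(text)
```

Objects map to Python dictionaries and arrays to lists, so the parsed value compares equal to the original record.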


tool / biotools
Cutadapt

Find and remove adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
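
To make the idea of 3' adapter removal concrete, here is a minimal Python sketch. It is not Cutadapt's algorithm: it uses exact string matching only, whereas the real tool tolerates mismatches. The function name `trim_3prime_adapter`, the `min_overlap` parameter, and the adapter sequence are assumptions chosen for illustration.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove an exact 3' adapter match, including a partial adapter at the read end."""
    # Case 1: the full adapter occurs in the read; cut from its first occurrence.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: only a prefix of the adapter fits before the read ends.
    # Try the longest possible overlap first, down to min_overlap bases.
    for overlap in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    # No adapter found: return the read unchanged.
    return read


adapter = "AGATCGGAAGAGC"  # example adapter sequence, for illustration only
print(trim_3prime_adapter("ACGTAGATCGGAAGAGCTT", adapter))  # full adapter inside the read
print(trim_3prime_adapter("ACGTACGTAGATCGG", adapter))      # partial adapter at the 3' end
```

The `min_overlap` threshold guards against trimming a few bases that match the adapter start purely by chance.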


tool / biotools
fastp

A tool designed to provide fast all-in-one preprocessing for FastQ files. This tool is developed in C++ with multithreading supported to afford high performance.


tool / pypi
seaborn

seaborn: statistical data visualization. Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics. Online documentation, including a tutorial, is available at seaborn.pydata.org (https://seaborn.pydata.org).


output / edam
Sequence

One or more molecular sequences, possibly with associated annotation.|This concept is a placeholder of concepts for primary sequence data including raw sequences and sequence records. It should not normally be used for derivatives such as sequence alignments, motifs or profiles.


tool / biotools
QIIME2.0

QIIME 2™ is a next-generation microbiome bioinformatics platform that is extensible, free, open source, and community developed.


tool / bioconda
snakemake-wrapper-utils

A collection of utility functions and classes for Snakemake wrappers.


tool / biotools
Bowtie 2

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.


tool / cran
data.table

Extension of 'data.frame': Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns, friendly and fast character-separated-value read/write. Offers a natural and flexible syntax, for faster development.


tool / pypi
scipy

SciPy (pronounced "Sigh Pie") is an open-source software for mathematics, science, and engineering. It includes modules for statistics, optimization, and integration, among others.


tool / biotools
Trimmomatic

A flexible read trimming tool for Illumina NGS data


tool / biotools
Bowtie

Bowtie is an ultrafast, memory-efficient short read aligner.


tool / biotools
Minimap2

Pairwise aligner for genomic and spliced nucleotide sequences


tool / cran
tidyr

Tidy Messy Data: Tools to help to create tidy data, where each column is a variable, each row is an observation, and each cell contains a single value. 'tidyr' contains tools for changing the shape (pivoting) and hierarchy (nesting and 'unnesting') of a dataset, turning deeply nested lists into rectangular data frames ('rectangling'), and extracting values out of string columns. It also includes tools for working with missing values (both implicit and explicit).


tool / biotools
Quant

MATLAB program for protein quantitation by iTRAQ.


tool / pypi
pysam

pysam is a Python module for reading, manipulating and writing genomic data sets. It is a lightweight wrapper of the htslib C-API and provides facilities to read and write SAM/BAM/VCF/BCF/BED/GFF/GTF/FASTA/FASTQ files as well as access to the command line functionality of the samtools and bcftools packages. The module supports compression and random access through indexing. It provides a low-level wrapper around the htslib C-API using Cython and a high-level API for convenient access to the data within standard genomic file formats. See: http://www.htslib.org, https://github.com/pysam-developers/pysam, http://pysam.readthedocs.org/en/stable


tool / biotools
DESeq2

R/Bioconductor package for differential gene expression analysis based on the negative binomial distribution. Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.


tool / biotools
DeepTools

User-friendly tools for the normalization and visualization of deep-sequencing data.


tool / pypi
utils

Utils: Sometimes you write a function over and over again; sometimes you look up at the ceiling and ask "why, Guido, why isn't this included in the standard library?" Well, we perhaps can't answer that question. But we can collect those functions into a centralized place!

Provided things: Utils is broken up into broad swathes of functionality, to ease the task of remembering where exactly something lives.

enum: Python doesn't have a built-in way to define an enum, so this module provides (what I think) is a pretty clean way to go about them.

    from utils import enum

    class Colors(enum.Enum):
        RED = 0
        GREEN = 1

        # Defining an Enum class allows you to specify a few
        # things about the way it's going to behave.
        class Options:
            frozen = True  # can't change attributes
            ...


tool / biotools
seqkit

FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools implement only some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user-friendly. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations.


tool / biotools
seqtk

A tool for processing sequences in the FASTA or FASTQ format. It parses both FASTA and FASTQ files which can also be optionally compressed by gzip.
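
The kind of parsing described above can be sketched in plain Python. This is not seqtk's implementation (seqtk is a separate command-line tool); it is a minimal reader under stated assumptions: each file holds a single format, and FASTQ records span exactly four lines. The helper names `open_maybe_gzip` and `read_records` are hypothetical.

```python
import gzip
import io


def open_maybe_gzip(path):
    """Open a text file, transparently decompressing gzip input."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_records(handle):
    """Yield (name, sequence, quality) tuples; quality is None for FASTA."""
    line = handle.readline()
    while line:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # FASTA: the sequence may span several lines until the next header.
            name, chunks = line[1:], []
            line = handle.readline()
            while line and not line.startswith(">"):
                chunks.append(line.strip())
                line = handle.readline()
            yield name, "".join(chunks), None
        elif line.startswith("@"):
            # FASTQ (assumed four lines per record): header, sequence, '+', quality.
            name = line[1:]
            seq = handle.readline().strip()
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().strip()
            yield name, seq, qual
            line = handle.readline()
        else:
            # Skip blank or unrecognized lines between records.
            line = handle.readline()


# In-memory demonstration; for a file on disk, pass open_maybe_gzip(path) instead.
demo = io.StringIO(">seq1 example\nACGT\nTTAA\n>seq2\nGG\n")
for name, seq, qual in read_records(demo):
    print(name, len(seq), qual)
```

Detecting gzip by its two-byte magic number, rather than by file extension, lets the same reader handle compressed and uncompressed input.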


tool / pypi
common

The MIT License (MIT). Copyright (c) 2016 Marcel Hellkamp.


tool / biotools
BLAST

A tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance.


tool / biotools
kraken2

Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm.
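
The k-mer to LCA idea can be illustrated with a toy Python sketch. It is a simplification for explanation only, not Kraken 2's implementation: the taxonomy, genome sequences, and function names are invented, and the index is an ordinary dictionary. A read is assigned to the taxon whose root-to-node path collects the most k-mer hits.

```python
from collections import Counter

# Toy taxonomy (child -> parent); the structure is hypothetical.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "E. coli": "Bacteria",
    "S. enterica": "Bacteria",
}

# Toy reference genomes keyed by taxon; the sequences are made up.
GENOMES = {
    "E. coli": "ACGTACGGA",
    "S. enterica": "ACGTTTGCA",
}


def lineage(taxon):
    """Return the path from a taxon up to the root."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa."""
    ancestors = set(lineage(a))
    for node in lineage(b):
        if node in ancestors:
            return node
    return "root"


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(genomes, k):
    """Map each k-mer to the LCA of all genomes containing it."""
    index = {}
    for taxon, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer] = lca(index[kmer], taxon) if kmer in index else taxon
    return index


def classify(read, index, k):
    """Pick the taxon whose root-to-node path accumulates the most k-mer hits."""
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if not hits:
        return None  # unclassified: no k-mer of the read is in the index

    def path_score(taxon):
        return sum(hits[node] for node in lineage(taxon))

    return max(hits, key=path_score)


index = build_index(GENOMES, k=4)
print(classify("CGTACGG", index, k=4))
```

A k-mer shared by both toy genomes (here `ACGT`) is stored at their common ancestor, so it supports either species without deciding between them.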


tool / cran
stringr

Simple, Consistent Wrappers for Common String Operations: A consistent, simple and easy to use set of wrappers around the fantastic 'stringi' package. All function and argument names (and positions) are consistent, all functions deal with "NA"'s and zero length vectors in the same way, and the output from one function is easy to feed into the input of another.


tool / biotools
FeatureCounts

featureCounts is a very efficient read quantifier. It can be used to summarize RNA-seq reads and gDNA-seq reads to a variety of genomic features such as genes, exons, promoters, gene bodies and genomic bins. It is included in the Bioconductor Rsubread package and also in the SourceForge Subread package.


tool / biotools
pLink

A high-speed search engine pLink 2 with systematic evaluation for proteome-scale identification of cross-linked peptides.


tool
Trim_Galore

Trim Galore! is a wrapper script to automate quality and adapter trimming as well as quality control, with some added functionality to remove biased methylation positions for RRBS sequence files (for directional, non-directional (or paired-end) sequencing).

