Snakemake pipeline for scATAC metacell utilities such as primed and lineage specific scores, chromVAR, in-silico ChIP
This repository contains a
snakemake
pipeline for utilities with ATAC and RNA metacells inferred using SEACells algorithm (ref-1). The following outputs are generated
-
In-silico ChIP (ref-2) TF targets
-
ChromVAR results (ref-3) using In-silico ChIP results
-
Primed and lineage specific peaks in single-cell data as described in (ref-4)
Overview:
Below is a DAG showing the rule dependencies in the
snakemake
pipeline. View the
Snakefile
to see details on each rule.
peak_scores
generates primed and lineage-specific accessibility scores.
compute_ins_chip
performs in-silico ChIP analysis
diff_acc
can be used for differential accessibility between cell types using metacells
chromvar
generatees chromVAR results
Output of the pipeline
The output of the pipeline is to update the user provided anndata objects following the style in
scanpy.
Given the extensive changes, we recommend using copies of your anndata objects to avoid corrupting your original objects.
The input single-cell anndata (
*_sc_ad
) and metacell anndata (
*_mc_ad
) objects are updated with the following information.
-
atac_mc_ad.uns['gp_corrs']
: Metacell level peak accessibility and gene expression correlations computed using SEACells -
atac_mc_ad.varm['<celltypeA>_<celltypeB>_diff_acc']
: Differential accessibiliy between cell type A and cell type B using edgeR -
atac_sc_ad.varm['OpenPeaks']
: Matrix indicating which peak is open in each cell type -
atac_sc_ad.layers[f'<Target cell type>_primed']
: ATAC peaks primed in Target cell type -
atac_sc_ad.layers[f'<Target cell type>_lineage_specific']
: ATAC lineage specific peaks in Target cell type -
rna_sc_ad.layers[f'<Target cell type>_primed']
: Primed accessiblity scores for each gene in the target cell type -
rna_sc_ad.layers[f'<Target cell type>_lineage_specific']
: Lineage specific accessiblity scores for each gene in Target cell type -
atac_sc_ad.varm['FIMO'], atac_sc_ad.uns['FIMOcolumns']
: FIMO motif enrichment for ATAC peaks -
atac_sc_ad.varm['InSilicoChip'], atac_sc_ad.uns['FIMOcolumns']
: In-silico ChIP motif enrichment for ATAC peaks -
atac_sc_ad.varm['FIMO'], atac_sc_ad.uns['FIMOcolumns']
: FIMO motif enrichment for ATAC peaks -
atac_sc_ad.varm['InSilicoChip'], atac_sc_ad.uns['InSilicoChipcolumns']
: In-silico ChIP motif enrichment for ATAC peaks -
rna_sc_ad.obsm['chromVAR_deviations']
: chromVAR results with In-silico ChIP results -
rna_sc_ad.varm['geneXTF']
: Gene X TF targets inferred using SEACells gene-peaks correlations and in-silico ChIP results.
Environment Setup
In order to run this pipeline, certain Python and R packages need to be installed. Importantly, the
environment.yaml
/
requirements.txt
files include the
snakemake
package.
Dependencies
Before creating the environment, make sure the following dependencies are installed.
-
conda
- Conda is required for managing environments, whether you chose to create the environment using conda or pip.
-
R 4.1.0
-
This is the version of R used throughout the pipeline. If you wish to use a different R version / installation, modify the
renv.lock
file and the R version inconfig.yaml
-
This is the version of R used throughout the pipeline. If you wish to use a different R version / installation, modify the
-
MEME (recommended: 5.1.1)
-
This includes the
fimo
tool used to find TF motifs. If you wish to use your own installation of the MEME Suite, update the MEME version inconfig.yaml
-
This includes the
Python Modules
There are two options, using
conda
or using
pip
.
With
conda
:
envName=atac-metacell-utils
conda env create -n "$envName" --file envs/environment.yaml
conda activate "$envName"
With
pip
:
In an activated, bare
conda
environment:
pip install -r envs/requirements.txt
R modules
To ensure the project runs smoothly, an
renv.lock
file is included under the
envs/
directory. Using the
renv
package
makes it easy to distribute the required packages and their correct versions.
If
renv
is not yet installed, run:
snakemake --cores 1 renv_install --config renv_loc=<path/to/user/R/library>
NOTE:
--cores
can tell
snakemake
how many cores are available for use. It will automatically run the jobs in parallel, if possible.
This will run the
snakemake
rule that will install the
renv
package into the library location specified above
Next, we can install all the required R packages:
snakemake --cores 1 renv_init_restore
Modifying the
config.yaml
file
For each project using this pipeline, parameters can be set in the configuration file located at:
config/config.yaml
In order for the pipeline to run, you will have to set the following parameters:
-
atac
str : Path to the SEACell-level ATAC AnnData with normalized and log-transformed counts -
rna
str : Path to the SEACell-level RNA AnnData with normalized and log-transformed counts -
sc_atac
str : Path to the single-cell-level ATAC AnnData with matching peaks toatac
AnnData. Must havenFrags
annotation in.obs
. -
sc_rna
str : Path to the single-cell-level RNA AnnData used to generate the SEACell-level RNA data.
NOTE:
The RNA and ATAC SEACell AnnDatas should have matched/common
obs_names
.
Optional Parameters:
The following values have default values specified in the included
config.yaml
but can be adjusted to suit
Peak and genome params:
NOTE:
Make sure the
in_meme
file is appropriate for the chosen
genome
-
in_meme
str : Path to the.meme
file for FIMO,default=data/cis-bp-tf-information.meme
-
The default MEME motif file was created using the script:
workflow/scripts/make_meme.py
-
Motifs from other databases can also be converted to a format compatible with MEME - for example, the MEME utility
jaspar2meme
can convert JASPAR PFM files.
-
-
width
int : Number of base pairs to resize ATAC peaks to. The summit will be the midpoint,default=150
-
genome
str : Genome to use, will be referenced in theall_seqs
,chromvar
, andgp_corr
rules,default=hg38
-
Currently supported genomes include:
-
hg38
-
hg19
-
hg18
-
mm9
-
mm10
-
-
In silico ChIP params:
-
verbose
str/flag : Whether to print info for each TF,default=""
(False), set to"--verbose"
for True -
min_chip_score
float : minimum in silico ChIP score to pass filtering,default=0.10
-
min_peak_hits
int : minimum TFs passing in silico ChIP filters,default=30
-
vmax
int : vmax for plotting, vmin = -vmax,default=3
-
embedding
str :sc_ad.obsm
field for plotting,default=umap
Gene-Peak correlation params:
-
n_jobs
int : Number of cores for thegp_rule
,default=1
-
test_set
str/flag : Whether to generate a test set,default=""
(False), set to"--test_set"
for True -
n_genes
int : number of genes to use in test set, if applicable,default=20
-
min_corr
float : minimum gene-peak correlation to pass filtering,default=0.0
-
max_pval
float : max p-value for gene-peak correlation to pass filtering,default=0.1
-
min_peaks
int : minimum number of significant peaks to pass gene filtering,default=5
Differential accessibility params:
-
group_variable
str : the name of the column in the single-cell ATAC .obs to annotate metacells by, such as celltype. -
to_compare
str : A string of comma-separated pairs of levels ofgroup_variable
, separated by forward slashes. example: "EryPre1,proB/Mono,proB/EryPre1,HSC/Mono,HSC"
Peak selection params:
-
target
list : Target lineage for identification of primed and lineage specific peaks. List of length 2: 1)Target lineage name for annotation and 2)Value ofgroup_variable
. -
start
str : Value of thegroup_variable
corresponding to the stem cell population -
reference
dict : A dictionary of lineages and their corresponding representative cell type, defined in YAML format. Note that the comparisons should be included for each reference and target combination, and each reference and start combination -
min_logFC
float : the minimum log fold change for a peak to be considered to be differentially accessible in the target lineage. `default=0 -
max_logFC
float : the max allowable log-fold change accessibility of a peak in a reference lineage relative to the starting state.default=0.25
Gene-Peak correlation params
,
Differential accessibility params
,
Peak selection params
should be updated for computing primed and lineage-specific peaks and scores.
Command-line config specification
Instead of altering the
config.yaml
file directly, you can also pass the params using the keys in the config file on the command line.
For example if we wanted to generate a test_set:
snakemake --cores 1 all --config test_set="--test_set"
Passing in a different config file
In the Snakefile, the global variable,
config
, is set with the line:
configfile: "config/config.yaml"
The keys defined in the default
config/config.yaml
can be
overwritten
by passing a different config file (still
.yaml
). For example:
snakemake --cores 1 all --configfile <new_config_file.yaml>
NOTE:
All keys defined by the
configfile:
statement, the
--configfile
and/or
--config
command line arguments are part of the final
config
dictionary.
If two methods define the same key, command line arguments
overwrite
keys from the
configfile:
statement
Snakemake will output a statement to make it clear to the user:
Config file config/config.yaml is extended by additional config specified via the command line
Optional: Download
hg38.gtf
For convenience, a rule is included to download the
hg38.gtf
file into a data directory for use in the
gp_corr
rule.
snakemake --cores 1 dl_hg38
If a different genome is being used, create a
data/
directory and place the
.gtf
file inside. Modify
config.yaml
accordingly.
Running the pipeline
After the environment has been set up and the configuration file is set, the pipeline is ready to run!
We can always conduct a "dry run" to test that all required inputs/parameters have been set before running the pipeline for real:
snakemake -nr all
-
-n
flag tellssnakemake
to make a "dry run" -
-r
flag will ouput the reason each rule is running
To generate all files, run:
snakemake --cores 1 all
NOTE: In the case that a rule fails, it is not necessary to rerun rules that successfully completed.
In fact, calling the
all
rule again will only run the rules for which their output is missing.
snakemake --cores 1 all
For more control, individual rules can also be called:
If you do this, make sure all the input dependencies have already been generated!
-n
can be appended for a dry run to determine if inputs are available.
snakemake --cores 1 name_of_rule
After modifications of inputs or parameters, you can force a rule to rerun using the
-R
flag:
snakemake --cores 1 -R name_of_rule
For even more control, the
--allowed-rules
argument can be used to pass a space-separated list of rules that are allowed to run. This is helpful when trying to avoid re-running time-intensive rules such as
fimo
and
gp_corr
when calling
all
or individual rules that get inputs from upstream rules.
For example, the following will run/re-run the
chromvar
rule and any of the allowed rules if necessary.
snakemake --cores 1 chromvar --allowed_rules peak_tf compute_ins_chip prep_chromvar
When running this pipeline on an HPC system which uses
lmod
to load software, do not load snakemake as a module - this can cause conflicts where the Python version used by
snakemake
is the one provided by the module, not the
gene-TF
conda environment.
To run using the snakemake version installed with
gene-TF
, make sure the
gene-TF
environment is active, then run the pipeline as follows:
python -m snakemake --cores 1 name-of-rule
Submitting to Slurm
To run this pipeline non-interactively on a cluster using the Slurm job manager, create a shell script calling
snakemake
. An template script is shown below.
-
$1
: name ofsnakemake
rule to run -
$2
: number of cores to use
#!/bin/bash
#SBATCH --cpus-per-task=16
#SBATCH --job-name=snakemake
#SBATCH --partition campus-new
#SBATCH --nodes=1
#SBATCH --time=2-00:00:00
#SBATCH --mem=150
snakemake --cores $2 $1
Output (stdout and stderr) captured by Slurm is written to a
.log
file, specified with the
-o
/
--output
flag.
After creating the shell script, and making it executable, run:
sbatch -o /path/to/log/file.log name_of_script.sh rule_name num_cores
Log Files
For the
fimo
rule, separate log files are created and can be accessed in the
logs/
directory.
Other log files can be accessed in the
.snakemake/log
directory.
References
-
SEACells: https://www.nature.com/articles/s41587-023-01716-9
-
In-silico ChIP: https://www.biorxiv.org/content/10.1101/2022.06.15.496239v1
-
chromVAR: https://www.nature.com/articles/nmeth.4401
-
Mellon: https://www.biorxiv.org/content/10.1101/2023.07.09.548272v2 The custom MOTIF annotations are based off of in silico ChIP-seq , introduced by Argelaguet et al in this paper.
Code Snippets
1 2 | args <-commandArgs(trailingOnly=TRUE) install.packages("renv", lib=args[1], repos='http://cran.us.r-project.org') |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 | import scanpy as sc import SEACells import pandas as pd import numpy as np def main(): # Load data print('Loading data...') atac_sc_ad = sc.read(snakemake.input["sc_atac"]) group_variable = snakemake.params['cell_type_obs'] # Open peaks per cell type # Aggregate counts per cell type print('Aggregating counts per cell type') ct_ad = SEACells.core.summarize_by_SEACell(atac_sc_ad, group_variable, summarize_layer='X') ct_ad.obs['n_counts'] = np.ravel(ct_ad.X.sum(axis=1)) # SVD for filler - only used for identifying itself as nearest neighbor sc_svd = pd.DataFrame(atac_sc_ad.obsm['X_svd'], index=atac_sc_ad.obs_names) ct_ad.obsm['X_svd'] = sc_svd.groupby( atac_sc_ad.obs[group_variable]).mean().loc[ct_ad.obs_names, :] # Open peaks # Determine open peaks in each metacell print('Open peaks in each cell type') SEACells.accessibility.determine_metacell_open_peaks( ct_ad, n_neighbors=1, n_jobs=1) # Update anndata print('Update and saving anndata') atac_sc_ad.varm['OpenPeaks'] = pd.DataFrame(ct_ad.layers['OpenPeaks'].T, columns=ct_ad.obs_names, index=atac_sc_ad.var_names) # Save atac_sc_ad.write(snakemake.input["sc_atac"]) # CReate directory to mark completion import os os.makedirs(str(snakemake.output["out_dir"]), exist_ok=True) if __name__ == "__main__": main() |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 | import scanpy as sc import pandas as pd import numpy as np from tqdm.auto import tqdm def main(): # Load data print('Loading anndata...') atac_ad = sc.read(snakemake.input["sc_atac"]) atac_meta_ad = sc.read(snakemake.input["meta_atac"]) # set target lineage and cell type target = snakemake.params["target"] target_lineage = target[0] target_celltype = target[1] # Start cell type start_celltype = snakemake.params["start"] # Reference lineages ref_lineages = snakemake.params["reference"] # set filtering parameters min_logFC = snakemake.params["min_logFC"] max_logFC = snakemake.params["max_logFC"] max_pval = snakemake.params["max_pval"] min_corr = snakemake.params["min_corr"] print("Loading gene-peak corrs...") gene_peak_corrs = dict(list(atac_meta_ad.uns['gp_corrs'].groupby('gene'))) print("Loading differential results") diff_peaks = dict() start_peaks = dict() for ref_lineage in ref_lineages.keys(): ref_celltype = ref_lineages[ref_lineage] # Differential accessibility compared to target cell type diff_peaks[ref_lineage] = atac_meta_ad.varm[f'{ref_celltype}_{target_celltype}_diff_acc'] # diff_peaks[ref_lineage]['feature'] = diff_peaks[ref_lineage].index # Differential accessibility compared to stem cell type start_peaks[ref_lineage] = atac_meta_ad.varm[f'{ref_celltype}_{start_celltype}_diff_acc'] # start_peaks[ref_lineage]['feature'] = diff_peaks[ref_lineage].index # Select list of peaks which are positively correlated with gene expression and meet p value criteria print("Removing peaks not positively correlated with gene expression...") correlated_peaks = pd.Series(dtype=object) for gene in tqdm(gene_peak_corrs.keys()): if isinstance(gene_peak_corrs[gene], int): continue res = gene_peak_corrs[gene] iter_peaks = res.index[(res['cor'] >= min_corr) & (res['pval'] < max_pval)] correlated_peaks = iter_peaks.union(correlated_peaks) filtered_peaks = correlated_peaks.unique() print(f'Retained {len(filtered_peaks)} peaks') # Peaks that are open in the target lineage print('Retain peaks that are open in the target lineage') open_peaks = atac_ad.varm['OpenPeaks'] filtered_peaks = filtered_peaks[open_peaks.loc[filtered_peaks, [target_celltype]].sum(axis=1) > 0] print(f'Retained {len(filtered_peaks)} peaks') print(f"Retain peaks with greater accessibility in {target_lineage} compared to other lineages") retain_peaks = pd.Series(False, index=filtered_peaks) for ref in diff_peaks.keys(): diff = diff_peaks[ref].loc[filtered_peaks] retain = filtered_peaks[(diff['logFC'] > min_logFC) & ~np.isnan(diff['groupA_N'])] retain_peaks[retain] = True filtered_peaks = filtered_peaks[retain_peaks] print(f'Retained {len(filtered_peaks)} peaks') print("Remove peaks with increased accessibility in reference cell types compared to stem cells") retain_peaks = pd.Series(False, index=filtered_peaks) for ref in start_peaks.keys(): diff = start_peaks[ref].loc[filtered_peaks] retain = filtered_peaks[(diff['logFC'] < max_logFC)] retain_peaks[retain] = True filtered_peaks = filtered_peaks[retain_peaks] print(f'Retained {len(filtered_peaks)} peaks') # Primed and lineage specific peaks primed_peaks = filtered_peaks[open_peaks.loc[filtered_peaks, start_celltype] > 0] lin_spec_peaks = filtered_peaks[open_peaks.loc[filtered_peaks, start_celltype] == 0] print("Peak selection completed! Adding results to anndata ...") atac_ad.var[f'{target_lineage}_primed'] = atac_ad.var_names.isin(primed_peaks) atac_ad.var[f'{target_lineage}_lineage_specific'] = atac_ad.var_names.isin(lin_spec_peaks) print('Saving') atac_ad.write(snakemake.input["sc_atac"]) # CReate directory to mark completion import os os.makedirs(str(snakemake.output["out_dir"]), exist_ok=True) print("Completed!") if __name__ == "__main__": main() |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 | import scanpy as sc import numpy as np import pandas as pd from tqdm.auto import tqdm def _dot_func(x, y): return x.dot(y) def impute_data_with_weights(ad, data): res = _dot_func(ad.obsp['ImputeWeights'], data) return pd.DataFrame(res, index=ad.obs_names, columns=ad.var_names) def main(): # Load params print('Load data...') atac_sc_ad = sc.read_h5ad(snakemake.input["sc_atac"]) atac_meta_ad = sc.read_h5ad(snakemake.input["meta_atac"]) rna_sc_ad = sc.read_h5ad(snakemake.input["sc_rna"]) target_lineage = snakemake.params["target"][0] min_corr = snakemake.params["min_corr"] max_pval = snakemake.params["max_pval"] min_peaks = snakemake.params["min_peaks"] # Reconstruct gene peak correlations gene_peak_corrs = dict(list(atac_meta_ad.uns['gp_corrs'].groupby('gene'))) # Peak accessibility imputation print('Peak accessibility imputation...') lin_peaks = atac_sc_ad.var_names[ (atac_sc_ad.var[f'{target_lineage}_primed']) | (atac_sc_ad.var[f'{target_lineage}_lineage_specific']) ] imputed_peak_matrix = impute_data_with_weights(atac_sc_ad[:, lin_peaks], pd.DataFrame(atac_sc_ad[:, lin_peaks].layers['tf_idf'].todense(), columns=lin_peaks, index=atac_sc_ad.obs_names)) # Scores from scipy.sparse import csr_matrix acc_scores = dict() for key in ['primed', 'lineage_specific']: print(f'Computing {key} scores') # Imputead accessibility lin_peaks = atac_sc_ad.var_names[atac_sc_ad.var[f'{target_lineage}_{key}']] imp_peaks = imputed_peak_matrix[lin_peaks] # Scores per gene acc_scores[key] = pd.DataFrame( 0.0, index=atac_sc_ad.obs_names, columns=rna_sc_ad.var_names) for gene in tqdm(rna_sc_ad.var_names): if gene not in gene_peak_corrs.keys(): continue # Gene peak correlations res = gene_peak_corrs[gene] if isinstance(res, int): continue gene_peaks = res.index[(res['cor'] > min_corr) & (res['pval'] < max_pval)] if len(gene_peaks) < min_peaks: continue key_peaks = gene_peaks[gene_peaks.isin(imp_peaks.columns)] if len(key_peaks) == 0: continue acc_scores[key].loc[:, gene] = imp_peaks[key_peaks].dot(res.loc[key_peaks, 'cor']) \ / np.sum(res.loc[key_peaks, 'cor']) # Save results to rna andata rna_sc_ad.layers[f'{target_lineage}_{key}'] = csr_matrix(acc_scores[key]) # Save results print('Saving anndata') rna_sc_ad.write_h5ad(snakemake.input["sc_rna"]) # CReate directory to mark completion import os os.makedirs(str(snakemake.output["out_dir"]), exist_ok=True) if __name__ == "__main__": main() |
17 18 | shell: "mkdir -p data && wget https://dp-lab-data-public.s3.amazonaws.com/SEACells-multiome/hg38.gtf -O data/hg38.gtf" |
24 25 26 27 | shell: """ Rscript workflow/scripts/install_renv.R {input.libloc} """ |
30 31 32 33 34 | shell: """ R -e 'renv::init(bare=TRUE)' R -e 'renv::restore(lockfile=\"envs/renv.lock\")' """ |
46 47 | shell: "python {input.script} --atac={input.atac} --outdir={params.out_dir}" |
58 59 60 61 | shell: """ Rscript {input.script} {input.peaks} {output.outfile} {params.span} {params.genome} """ |
71 72 73 74 75 76 | shell: """ # fimo -oc {output.fimo_dir} {input.meme} {input.seqs} 2> {log.out} # rm -f {output.fimo_dir}/cisml.xml """ |
90 91 | shell: "python {input.script} --peak_file={input.peaks} --fimo_res={input.fimo_dir} -o={output.out_dir} --sc_atac={input.sc_atac}" |
106 107 108 109 110 111 112 | shell: """ mkdir -p {output.out_dir} python {input.metacell_script} --atac={input.atac} --sc_atac={input.sc_atac} --cell_type_obs={params.cell_type_obs} -o={output.out_dir} Rscript {input.script} {output.out_dir} {params.to_compare} {params.cell_type_obs} python {input.pyscript} --atac={input.atac} --to_compare={params.to_compare} --data_dir={output.out_dir} """ |
130 131 132 133 134 135 136 | shell: """ mkdir -p {output.gp_dir} python {input.script} --atac={input.atac} --rna={input.rna} -o={output.gp_dir} --n_jobs={params.n_jobs} \\ --gtf_file={input.gtf_file} {params.test_set} --n_genes={params.n_genes} --min_corr={params.min_corr} \\ --max_pval={params.max_pval} """ |
149 150 | script: "scripts/open_peaks.py" |
174 175 | script: "scripts/peak_selection.py" |
196 197 | script: "scripts/primed_lin_scores.py" |
212 213 214 215 216 217 | shell: """ mkdir -p {output.insc_dir} python {input.script} --atac={input.atac} --rna={input.rna} --sc_atac={input.sc_atac} -o={output.insc_dir} {params.verbose} touch "{output.insc_dir}/.ins_chip" """ |
226 227 228 229 | shell: """ python {input.script} --atac={input.atac} --rna={input.rna} --datadir={input.data_dir} """ |
241 242 243 244 245 246 | shell: """ mkdir -p {output.chromvar_indir} python {input.script} --sc_atac={input.sc_atac} --ins_chip_dir={input.ins_chip_out} --min_chip={params.min_chip} \\ --min_peak_hits={params.min_peak_hits} -o={output.chromvar_indir} """ |
261 262 263 264 265 266 | shell: """ mkdir -p {output.chromvar_outdir} Rscript {input.script} {input.peak_file} {input.chromvar_indir} {output.chromvar_outdir} {params.genome} python {input.pyscript} --sc_rna {input.sc_rna} --input {output.chromvar_outdir} """ |
283 284 285 286 287 288 289 | shell: """ mkdir -p {output.gtf_dir} python {input.script} --atac={input.atac} --sc_atac={input.sc_atac} --sc_rna={input.sc_rna}\\ --min_corr={params.min_corr} --max_pval={params.max_pval} \\ --min_peaks={params.min_peaks} -o={output.gtf_dir} """ |
296 297 298 299 300 | run: if config["interactive_clean"]: shell("rm -ri {params.out_dir}") else: shell("rm -r {params.out_dir}") |
Support
- Future updates
Related Workflows





