Commit edc8f7eb authored by Alexis Mergez

Update README.md

parent 4645f786
@@ -6,7 +6,8 @@ Tools used within the workflow :
- PanGraTools : https://forgemia.inra.fr/alexis.mergez/pangratools
- Pan1c-Apps : https://forgemia.inra.fr/alexis.mergez/pan1capps
# File architecture
## Before running the workflow
```
Pan1c/
├── config.yaml
@@ -20,15 +21,54 @@ Pan1c/
├── README.md
├── runSnakemake.sh
├── scripts
│   ├── getPanachePAV.sh
│   ├── inputClustering.py
│   ├── ragtagChromInfer.sh
│   ├── statsAggregation.py
│   └── workflowStats.py
└── Snakefile
```
## After the workflow (Arabidopsis thaliana example)
The following tree is non-exhaustive for clarity. Temporary files are not listed, but key files are included.
```
Pan1c-06AT-v3
├── chrInputs
├── config.yaml
├── data
│   ├── chrGraphs
│   │   ├── chr<id>
│   │   ├── chr<id>.gfa
│   │   └── graphsList.txt
│   ├── chrInputs
│   │   └── chr<id>.fa.gz
│   ├── haplotypes
│   └── hap.ragtagged
│       ├── <sample>.hap<hid>
│       └── <sample>.hap<hid>.ragtagged.fa.gz
├── logs
│   ├── pan1c.pggb.06AT-v3.logs.tar.gz
│   └── pggb
│       ├── chr<id>.pggb.cmd.log
│       └── chr<id>.pggb.time.log
├── output
│   ├── figures
│   │   ├── chr<id>.1Dviz.png
│   │   └── chr<id>.pcov.png
│   ├── pan1c.pggb.06AT-v3.chrGraph.stats.tsv
│   ├── pan1c.pggb.06AT-v3.gfa
│   ├── pan1c.pggb.06AT-v3.workflow.stats.tsv
│   ├── panacus.reports
│   │   └── chr<id>.histgrowth.html
│   ├── pggb.usage.figs
│   └── stats
│       └── chr<id>.stats.tsv
├── Pan1c-06AT-v3.log
├── README.md
├── runSnakemake.sh
├── scripts
│   └── ...
├── Snakefile
└── workflow.svg
```
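As a rough illustration of how this layout can be navigated, the sketch below collects the per-chromosome graphs from `data/chrGraphs`. It relies only on the `chr<id>.gfa` naming shown in the tree; the function name `list_chromosome_graphs` is hypothetical and not part of the workflow's scripts.

```python
import re
from pathlib import Path

# Matches the per-chromosome graph files named `chr<id>.gfa` in the tree above.
CHR_GRAPH_RE = re.compile(r"^chr(?P<id>.+)\.gfa$")


def list_chromosome_graphs(run_dir):
    """Return {chromosome id: path} for each data/chrGraphs/chr<id>.gfa file.

    Hypothetical helper: it only inspects file names and does not parse GFA.
    """
    graph_dir = Path(run_dir) / "data" / "chrGraphs"
    graphs = {}
    if not graph_dir.is_dir():
        return graphs
    for path in sorted(graph_dir.iterdir()):
        match = CHR_GRAPH_RE.match(path.name)
        # Skip the chr<id> sub-directories and graphsList.txt.
        if match and path.is_file():
            graphs[match.group("id")] = path
    return graphs


if __name__ == "__main__":
    # "Pan1c-06AT-v3" is the example run directory from the tree above.
    for chrom_id, path in list_chromosome_graphs("Pan1c-06AT-v3").items():
        print(chrom_id, path)
```

Matching on file names keeps the helper independent of the contents of `graphsList.txt`, whose format is not described here.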
# Example DAG (Arabidopsis thaliana example)
This DAG shows the workflow for a pangenome of `Arabidopsis thaliana` using the `TAIR10.1` reference.
![Workflow DAG](example/workflow.svg)
@@ -49,4 +89,13 @@ Before running the workflow, some apptainer images need to be downloaded. Use th
Clone this repository and create a `data/haplotypes` directory where you will place all your haplotypes.
Update the reference name and the apptainer image directory in `config.yaml`.
Then, modify the variables in `runSnakemake.sh` to match your requirements (number of threads, memory, job name, email, etc.).
Navigate to the root directory of the repository and execute `sbatch runSnakemake.sh`!
\ No newline at end of file
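Before submitting the job, it can help to confirm that the repository layout matches the setup steps above. The following is a minimal sketch under that assumption: the `preflight` function is hypothetical, and it only checks that the files and directory named in this README exist, not that `config.yaml` or `runSnakemake.sh` are filled in correctly.

```python
from pathlib import Path


def preflight(repo_root):
    """Return a list of layout problems; an empty list means nothing is missing.

    Hypothetical helper: checks only the names mentioned in this README.
    """
    root = Path(repo_root)
    problems = []
    # Files expected at the repository root.
    for name in ("config.yaml", "runSnakemake.sh", "Snakefile"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    # Haplotypes must be placed in data/haplotypes before running.
    hap_dir = root / "data" / "haplotypes"
    if not hap_dir.is_dir():
        problems.append("missing directory: data/haplotypes")
    elif not any(p.is_file() for p in hap_dir.iterdir()):
        problems.append("data/haplotypes contains no files")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print(problem)
```

Run from the repository root, the script prints one line per problem and nothing when the layout looks complete.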
# Outputs
The workflow generates several key files :
- Aggregated graph combining all chromosome-scale graphs (`output/pan1c.pggb.<panname>.gfa`)
- Chromosome-scale graphs (`data/chrGraphs/chr<id>.gfa`)
- Panacus HTML reports for each chromosome graph (`output/panacus.reports/chr<id>.histgrowth.html`)
- Statistics on input sequences, graphs and resources used by the workflow (`output/pan1c.pggb.<panname>.workflow.stats.tsv`)
- PAV matrices (optional) for each chromosome graph (`output/pav.matrices/chr<id>.pav.matrix.tsv`)
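
The path patterns listed above can be expanded for a given run to check which key files are present. This is a sketch only: `expected_outputs` and `missing_outputs` are hypothetical helpers, and the pangenome name and chromosome ids are supplied by the caller rather than read from `config.yaml`.

```python
from pathlib import Path


def expected_outputs(run_dir, panname, chrom_ids, with_pav=False):
    """Build the key output paths listed above for one run (hypothetical helper)."""
    root = Path(run_dir)
    paths = [
        root / "output" / f"pan1c.pggb.{panname}.gfa",
        root / "output" / f"pan1c.pggb.{panname}.workflow.stats.tsv",
    ]
    for cid in chrom_ids:
        paths.append(root / "data" / "chrGraphs" / f"chr{cid}.gfa")
        paths.append(root / "output" / "panacus.reports" / f"chr{cid}.histgrowth.html")
        if with_pav:
            # PAV matrices are optional, so they are only expected on request.
            paths.append(root / "output" / "pav.matrices" / f"chr{cid}.pav.matrix.tsv")
    return paths


def missing_outputs(run_dir, panname, chrom_ids, with_pav=False):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_outputs(run_dir, panname, chrom_ids, with_pav)
            if not p.exists()]


if __name__ == "__main__":
    # Example values taken from the Arabidopsis run; the chromosome ids are illustrative.
    for path in missing_outputs("Pan1c-06AT-v3", "06AT-v3", ["1", "2"]):
        print("missing:", path)
```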