Update README.md

Commit edc8f7eb authored 1 year ago by Alexis Mergez
--- a/README.md
+++ b/README.md
@@ -6,7 +6,8 @@ Tools used within the workflow :
 - PanGraTools : https://forgemia.inra.fr/alexis.mergez/pangratools
 - Pan1c-Apps : https://forgemia.inra.fr/alexis.mergez/pan1capps
-The file architecture for the workflow is as follow :
+# File architecture
+## Before running the workflow
 ```
 Pan1c/
 ├── config.yaml
@@ -20,15 +21,54 @@ Pan1c/
 ├── README.md
 ├── runSnakemake.sh
 ├── scripts
-│   ├── getPanachePAV.sh
+│   └── ...
-│   ├── inputClustering.py
-│   ├── ragtagChromInfer.sh
-│   ├── statsAggregation.py
-│   └── workflowStats.py
 └── Snakefile
 ```
+## After the workflow (Arabidopsis thaliana example)
+The following tree is abridged for clarity: temporary files are omitted and only key files are shown.  
+```
+Pan1c-06AT-v3
+├── chrInputs
+│   
+├── config.yaml
+├── data
+│   ├── chrGraphs
+│   │   ├── chr<id>
+│   │   ├── chr<id>.gfa
+│   │   └── graphsList.txt
+│   ├── chrInputs
+│   │   └── chr<id>.fa.gz
+│   ├── haplotypes
+│   └── hap.ragtagged
+│       ├── <sample>.hap<hid>
+│       └── <sample>.hap<hid>.ragtagged.fa.gz
+├── logs
+│   ├── pan1c.pggb.06AT-v3.logs.tar.gz
+│   └── pggb
+│       ├── chr<id>.pggb.cmd.log
+│       └── chr<id>.pggb.time.log
+├── output
+│   ├── figures
+│   │   ├── chr<id>.1Dviz.png
+│   │   └── chr<id>.pcov.png
+│   ├── pan1c.pggb.06AT-v3.chrGraph.stats.tsv
+│   ├── pan1c.pggb.06AT-v3.gfa
+│   ├── pan1c.pggb.06AT-v3.workflow.stats.tsv
+│   ├── panacus.reports
+│   │   └── chr<id>.histgrowth.html
+│   ├── pggb.usage.figs
+│   └── stats
+│       └── chr<id>.stats.tsv
+├── Pan1c-06AT-v3.log
+├── README.md
+├── runSnakemake.sh
+├── scripts
+│   └── ...
+├── Snakefile
+└── workflow.svg
+```
-# Example DAG
+# Example DAG (Arabidopsis thaliana)
 This DAG shows the workflow for a pangenome of `Arabidopsis thaliana` using the `TAIR10.1` reference.
 ![Workflow DAG](example/workflow.svg)
@@ -49,4 +89,13 @@ Before running the workflow, some apptainer images need to be downloaded. Use th
 Clone this repository and create a `data/haplotypes` directory where you will place all your haplotypes.  
 Update the reference name and the apptainer image directory in `config.yaml`.  
 Then, modify the variables in `runSnakemake.sh` to match your requirements (number of threads, memory, job name, email, etc.).  
 Navigate to the root directory of the repository and execute `sbatch runSnakemake.sh`!
\ No newline at end of file
+# Outputs
+The workflow generates several key files :
+- Aggregated graph combining all chromosome-scale graphs (`output/pan1c.pggb.<panname>.gfa`)  
+- Chromosome-scale graphs (`data/chrGraphs/chr<id>.gfa`)  
+- Panacus HTML reports for each chromosome graph (`output/panacus.reports/chr<id>.histgrowth.html`)  
+- Statistics on input sequences, graphs and resources used by the workflow (`output/pan1c.pggb.<panname>.workflow.stats.tsv`)  
+- PAV matrices (optional) for each chromosome graph (`output/pav.matrices/chr<id>.pav.matrix.tsv`)