Directory structure

 

To demonstrate my skills with the Guerilla Analytics folder structure, a previously unstructured folder (named Daur2) has been reorganised according to the Guerilla Analytics principles. To showcase the resulting structure, both directory trees are printed with the dir_tree function from the fs package.

The Guerilla Analytics folder structure comes with a few guidelines to adhere to:

  • Create a separate folder for each analytics project

  • Do not deeply nest folders

  • Keep information about the data close to the data

  • Store each dataset in its own sub-folder

  • Do not change file names or move files
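As a minimal sketch of what these guidelines can look like in practice, the snippet below builds a small project skeleton in a temporary directory. The project and dataset names (ExampleProject, dataset_01, dataset_02) and the README text are hypothetical and only serve as an illustration; they are not part of the Daur2 folders.

```r
# Hypothetical project and dataset names, used only for illustration.
root <- file.path(tempdir(), "analytics_projects")
project <- file.path(root, "ExampleProject")

# One folder per project, one sub-folder per dataset, and shallow nesting.
dirs <- file.path(project, c("R", "RMD", "data/dataset_01", "data/dataset_02"))
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Keep information about the data next to the data itself.
writeLines(
  "Source, date received and contact person for dataset_01.",
  file.path(project, "data", "dataset_01", "README.txt")
)

# Print the result with fs::dir_tree() when the fs package is installed,
# otherwise fall back to base R.
if (requireNamespace("fs", quietly = TRUE)) {
  fs::dir_tree(project)
} else {
  print(list.dirs(project, full.names = FALSE))
}
```

The folders are created with base R so that the sketch also runs without fs installed.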

In the example below, you can see that my original directory had no clearly separated folder for each analytics project: folders were simply called “formatieve_opdracht1” (translated: formative assignment 1) or “lesson1” through “lesson6”. Furthermore, the original directories had no structure for the .R and .Rmd files; they were scattered throughout the main directory. In the organised directory, these files are kept in dedicated R and RMD subdirectories within the project they belong to.
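Scattered scripts like the ones described above can be tracked down with a few lines of base R. The sketch below is one possible approach and not the method used for the reorganisation itself; find_stray_scripts is a hypothetical helper name, and the demo folder is created in a temporary directory purely for illustration.

```r
# Return .R and .Rmd files under `root` that are not stored inside a folder
# named "R" or "RMD" (at any level). `find_stray_scripts` is a hypothetical
# helper, not a function from an existing package.
find_stray_scripts <- function(root) {
  scripts <- list.files(root, pattern = "\\.(R|Rmd)$", recursive = TRUE,
                        ignore.case = TRUE)
  # Split each file's folder path into its components.
  parts <- strsplit(dirname(scripts), "/", fixed = TRUE)
  in_script_dir <- vapply(parts, function(p) any(p %in% c("R", "RMD")),
                          logical(1))
  scripts[!in_script_dir]
}

# Illustrative demo tree: one script inside R/, one file left in the project root.
demo_root <- file.path(tempdir(), "stray_demo")
dir.create(file.path(demo_root, "ProjectA", "R"), recursive = TRUE,
           showWarnings = FALSE)
invisible(file.create(file.path(demo_root, "ProjectA", "R", "analysis.R")))
invisible(file.create(file.path(demo_root, "ProjectA", "notes.Rmd")))

stray <- find_stray_scripts(demo_root)
print(stray)
```

Because the check looks at every folder in the path, a file stored in a versioned sub-folder such as RMD/V1 still counts as organised.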

Organised directory tree

fs::dir_tree(path = here::here("Daur2_Organisation_2.1/Daur2_Organized/"), type = "directory")
## C:/Users/pedro/Documents/R_DS2/DSFB2_Portfolio/Daur2_Organisation_2.1/Daur2_Organized
## ├── Extra
## ├── FinalAssesment
## │   ├── afbeeldingen
## │   ├── counts
## │   ├── R
## │   └── RMD
## │       └── V1
## ├── FormativeAssignment_Metagenomics
## │   ├── fastqc
## │   │   ├── HU2_MOCK2_L001_R1_001_fastqc
## │   │   │   ├── Icons
## │   │   │   └── Images
## │   │   └── HU2_MOCK2_L001_R2_001_fastqc
## │   │       ├── Icons
## │   │       └── Images
## │   └── mock2
## ├── FormativeAssignment_RNAseq
## │   ├── bam_dir
## │   ├── counts_dir
## │   ├── daur2_formativeassignment_2_files
## │   │   └── figure-html
## │   ├── hg38_index
## │   ├── R
## │   └── RMD
## │       └── V1
## ├── MetaGenomics
## │   ├── fastqc_waternet
## │   │   ├── HU1_MOCK1_L001_R1_001_fastqc
## │   │   │   ├── Icons
## │   │   │   └── Images
## │   │   └── HU1_MOCK1_L001_R2_001_fastqc
## │   │       ├── Icons
## │   │       └── Images
## │   └── mock1
## └── RNAsequencing
##     ├── bash
##     ├── data
##     └── R

Original directory tree

fs::dir_tree(path = here::here("Daur2_Organisation_2.1/Daur2_Original/"), type = "directory")
## C:/Users/pedro/Documents/R_DS2/DSFB2_Portfolio/Daur2_Organisation_2.1/Daur2_Original
## ├── eindopdracht
## │   ├── afbeeldingen
## │   └── counts
## ├── formatieve_opdracht1
## │   ├── bam_dir
## │   ├── counts_dir
## │   ├── daur2_formativeassignment_2_files
## │   │   └── figure-html
## │   └── hg38_index
## ├── formatieve_opdracht2
## │   ├── fastqc
## │   │   ├── HU2_MOCK2_L001_R1_001_fastqc
## │   │   │   ├── Icons
## │   │   │   └── Images
## │   │   └── HU2_MOCK2_L001_R2_001_fastqc
## │   │       ├── Icons
## │   │       └── Images
## │   └── mock2
## ├── lesson1
## ├── lesson2
## │   └── counts
## ├── lesson3
## ├── lesson4
## └── lesson6
##     ├── fastqc_waternet
##     │   ├── HU1_MOCK1_L001_R1_001_fastqc
##     │   │   ├── Icons
##     │   │   └── Images
##     │   └── HU1_MOCK1_L001_R2_001_fastqc
##     │       ├── Icons
##     │       └── Images
##     └── mock1
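
The "do not deeply nest folders" guideline can also be checked numerically instead of by eye. The sketch below computes the nesting depth of every sub-folder with base R; folder_depth is a hypothetical helper name and the a/b/c tree is a made-up example in a temporary directory.

```r
# Depth of each sub-folder relative to `root` (direct children have depth 1).
# `folder_depth` is a hypothetical helper, not a function from an existing package.
folder_depth <- function(root) {
  dirs <- list.dirs(root, full.names = FALSE, recursive = TRUE)
  dirs <- dirs[dirs != ""]  # drop the entry for the root folder itself
  depth <- lengths(strsplit(dirs, "/", fixed = TRUE))
  stats::setNames(depth, dirs)
}

# Illustrative demo tree with three nested levels: a/b/c
depth_root <- file.path(tempdir(), "depth_demo")
dir.create(file.path(depth_root, "a", "b", "c"), recursive = TRUE,
           showWarnings = FALSE)

depths <- folder_depth(depth_root)
print(depths)
print(max(depths))
```

Running folder_depth on the organised and the original directory (for example via the same here::here() paths used above) would show which folders are nested most deeply in each.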