Reference Datasets
Pawsey hosts a number of life science reference datasets centrally to save users from repeatedly downloading the same common datasets. These are hosted on /scratch/references/. Additional references can be added if there is sufficient user interest. If there is something you would like to have added, please drop us a line at help@pawsey.org.au. Below is the current list of datasets:
Directory | Description |
---|---|
10x_GRCh38_July2024 | Reference datasets for 10X sequencing (Human GRCh38) |
10x_singlecell_gene_expression (2020) | Reference datasets for 10X single cell gene expression for human GRCh38, mouse mm10, and combined human/mouse |
10x_spatial_gene_expression | Reference datasets for 10X spatial gene expression for human GRCh38, mouse mm10, and combined human/mouse |
alphafold | Databases for the CPU-only AlphaFold2 module |
alphafold_feb2024 | Updated databases and weights for the GPU-enabled AlphaFold2 container |
ameta | Database to support https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-03083-9 |
arabidopsis_thaliana | Reference genome files for Arabidopsis thaliana TAIR10 |
blastdb_update | BLAST databases, updated at approximately every monthly maintenance |
busco_db | References to support BUSCO, specifically actinopterygii_odb10 and vertebrata_odb10 |
colabfold_jun2024 | Updated databases and weights for the GPU-enabled ColabFold container |
diamond | References for DIAMOND, a faster alternative to BLAST |
Foreign_Contamination_Screening | |
human | Human reference genomes and associated files for annotSV, broad_hg19, broad_hg38, and GRCh38 |
interproscan-5.56-89.0 | References for interproscan |
kaiju | Kaiju pre-built indexes for protein sequences from RVDB-prot v26.0. |
kraken2 | Contains nt_20230502 and pluspfp_20230605 |
metagenome_atlas_2.9 | Datasets to support this tool, including adapters.fa, checkm, Dram, EggNOG_V5, GTDB_V06, GTDB_V07, phiX174_virus.fa |
mouse | Mouse reference datasets including broad_mm10, GRCm38, mm10, RNA_M25 |
qiime | References to support qiime |
sarek | References to support the nf-core Sarek pipeline |
slorado | Test dataset for slorado |
veba_database | VEBA databases |
veba_db_v8 | VEBA databases (newer version) |
vep | VEP Human databases for 109_GRCh37, 109_GRCh38, 111_GRCh38 |
VirDB_20230913 | Support for https://github.com/eresearchqut/ontvisc |
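
Because the datasets sit under a single shared path, a workflow can check that a reference directory exists before a job starts rather than failing partway through. The following is a minimal sketch, assuming only the /scratch/references/ root given above; the helper names `list_reference_dirs` and `resolve_reference` are hypothetical and not part of any Pawsey tooling.

```python
from pathlib import Path

# Root path taken from this page; everything else here is illustrative.
REFERENCE_ROOT = Path("/scratch/references")


def list_reference_dirs(root=REFERENCE_ROOT):
    """Return the sorted names of the directories directly under root."""
    root = Path(root)
    if not root.is_dir():
        # The path only exists on systems where the references are mounted.
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def resolve_reference(name, root=REFERENCE_ROOT):
    """Return the path of a named reference directory, or raise if absent."""
    path = Path(root) / name
    if not path.is_dir():
        raise FileNotFoundError(
            f"No reference directory named {name!r} under {root}"
        )
    return path


if __name__ == "__main__":
    # Print whatever is currently available on this system.
    for name in list_reference_dirs():
        print(name)
```

For example, `resolve_reference("kraken2")` would return the kraken2 directory path if it is present, which can then be passed to the tool that needs it. The layout inside each directory varies between datasets, so inspect the contents before hard-coding any deeper paths.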