User guide

Varieties module



Figure 1. The overview of cannabis genomes from 8 varieties

01.    Variety – Eight cannabis varieties collected by CannabisGDB. Click a name to open the introduction page for that variety.

02.    Symbol – An abbreviation used to represent a variety in the database

03.    Gender – The sex of the cannabis plant

04.    Genome size – The total amount of DNA contained within one copy of a single complete genome

05.    Repeat – Repetitive sequence length as a percentage of total genome length

06.    Number of genes – The number of protein-coding genes

07.    Anchoring rate – The percentage of the total genome sequence that is anchored to chromosomes

08.    Completeness – BUSCO completeness assessment score

09.    Study – The source of the genome
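
The Repeat and Anchoring rate columns are both percentages of the total genome length. A minimal sketch of that arithmetic follows; the function name and all numbers are hypothetical and are not taken from any variety in CannabisGDB.

```python
def percent_of_genome(part_bp: int, genome_bp: int) -> float:
    """Return part_bp expressed as a percentage of genome_bp."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return 100.0 * part_bp / genome_bp


# Hypothetical values, for illustration only.
genome_bp = 1_000_000_000    # total assembled genome length
repeat_bp = 650_000_000      # total length of repetitive sequence
anchored_bp = 900_000_000    # length of sequence placed on chromosomes

repeat_pct = percent_of_genome(repeat_bp, genome_bp)
anchoring_rate = percent_of_genome(anchored_bp, genome_bp)
print(f"Repeat: {repeat_pct:.1f}%  Anchoring rate: {anchoring_rate:.1f}%")
```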


Figure 2. The variety information page (showing an example of CsJLD)

01.    Variety image – The image of mature female inflorescences

02.    Overview – Background information of the variety

03.    Statistics – Statistics of assembly information and analysis results

04.    Go to GenomeBrowser – Click to open the genome browser


Gene loci module 


Figure 3. The gene loci table (showing an example of CsFN)

01.    Variety tab – Choose a variety

02.    Chromosome/Scaffold ID – The chromosome or scaffold number. Click a chromosome ID (e.g. Chr1) to display the genes on that chromosome/scaffold.

03.    Length – The total length of the chromosome or scaffold (base pairs, bp)

04.    Scaffold count – The number of scaffolds used to assemble a chromosome

05.    Scaffold N50 – With scaffolds sorted from longest to shortest, the length of the scaffold at which the cumulative length reaches 50% of the total chromosome length

06.    N count – The number of ‘N’ characters in the chromosome sequence; ‘N’ stands for any nucleotide

07.    GC% – The percentage of nitrogenous bases in the DNA molecule that are either guanine (G) or cytosine (C)

08.    Gene number – The number of protein-coding genes on this chromosome or scaffold

09.    Minimum gene length – The length of the shortest gene on this chromosome

10.    Maximum gene length – The length of the longest gene on this chromosome

11.    Gene N50 – With genes sorted from longest to shortest, the length of the gene at which the cumulative length reaches 50% of the total length of the gene set
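
The N50, N count and GC% columns can be illustrated with a short sketch. This is not the code used by CannabisGDB. The function names are hypothetical, and excluding ‘N’ from the GC% denominator is an assumption made here.

```python
def n50(lengths):
    """Length at which the cumulative sum of lengths, sorted from
    longest to shortest, first reaches half of the total length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


def n_count(seq):
    """Number of 'N' characters (case-insensitive) in a sequence."""
    return seq.upper().count("N")


def gc_percent(seq):
    """Percentage of G or C among the called bases (A, C, G, T).
    Assumption: 'N' is excluded from the denominator."""
    s = seq.upper()
    called = sum(s.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (s.count("G") + s.count("C")) / called


# Hypothetical scaffold lengths (bp) and a toy sequence.
scaffold_lengths = [80, 70, 50, 40, 30, 20]
print(n50(scaffold_lengths))     # total is 290; 80 + 70 = 150 reaches half
print(n_count("ACGTNNacgn"))
print(gc_percent("GGCCAATTNN"))
```

The same `n50` function applies to Gene N50 when it is given gene lengths instead of scaffold lengths.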


Metabolites module


Figure 4. The metabolite content information (showing an example of the study of Meijer et al., 1995)

01.    Tissue tab – Choose tissue

02.    Study tab – Choose study

03.    Legend – The types of metabolites. Users can click on each metabolite to show or hide it.

04.    Sample bar – Each bar shows the metabolite content (percentage) of a variety.

05.    Detail info – Hover over a sample bar to display the value of its metabolite content (percentage).

06.    Variety – Varieties investigated in this study


Proteins module


Figure 5. The protein information (showing an example of the study of Mamone et al., 2019)

01.    Tissue tab – Choose tissue

02.    Study tab – Choose study

03.    Export – Export the proteins selected with the check boxes as a table in 'xlsx' format

04.    Check box – Choose one or more proteins

05.    UniProt accession – The UniProt ID. Click it to open the corresponding UniProt page. Data are derived from the original research.

06.    Protein info – Description of the protein. Data are derived from the original research.

07.    Best match gene – The protein-coding gene identified in CannabisGDB. Click it to open the gene page.




Figure 6. The search page

Step 1: Choose a variety.

Step 2: Choose a search type.

Step 3: Enter your search keyword.

Step 4: Get search results.

Keyword format – Users can search for genes by gene ID or by descriptions from other databases. Examples of valid formats are listed in the help box.

WordCloud – Displays keywords frequently searched by users; the font size reflects the search frequency. Users can click a keyword directly to see the matching items.


Figure 7. The search result table

01.    Ascend – Sort in ascending order

02.    Descend – Sort in descending order

03.    Selected columns – Choose the columns that need to be displayed

04.    Data filter – Filter results using a keyword

05.    Filter conditions – Filter results using combined criteria

06.    Edit filter conditions – Filter results using advanced criteria

07.    Export to excel – Export all results to an Excel table


Figure 8. The gene page (showing an example of the gene CsFN_06G0010770)

01.    Gene identification – The variety and ID of the gene

02.    Gene attributes – The position of the gene on the chromosome/scaffold

03.    Gene structure – The gene structure presented by JBrowse. Click ‘Go to JBrowse’ to open the genome browser.

04.    Gene functional annotation – The functional annotation of the gene, including Nr, UniProt, KEGG, GO and Pfam.

05.    Gene orthogroups – Other genes belonging to the same orthogroup.

06.    Gene expression – The transcriptional expression level of the gene in different studies, presented as bar charts.

07.    Gene sequences – Provide genomic, CDS, protein and cDNA sequences in FASTA format.
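
Sequences copied from the gene page in FASTA format can be read with a few lines of code. The sketch below is a minimal parser written for illustration. The header description and the sequence in the example are hypothetical and are not the real CsFN_06G0010770 record.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping each record ID
    (the first word after '>') to its sequence joined into one string."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical record: the ID follows the guide's example gene,
# but the description and bases are made up.
example = """>CsFN_06G0010770 hypothetical description
ATGGCT
TTGTAA
"""
sequences = parse_fasta(example)
print(sequences["CsFN_06G0010770"])
print(len(sequences["CsFN_06G0010770"]))
```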


Genome browser 


Figure 9. The genome browser (showing an example of Jamaican Lion DASH)

01.    Toolbar – Move the display area, zoom in and zoom out.

02.    Functional annotation track – Choose and display the functional annotation

03.    Gene structure track – Choose and display gene structure

04.    Iso-Seq track – Choose and display Iso-Seq data that support gene prediction

05.    Genome sequence track – Choose and display genome sequence

06.    Repeat sequence track – Choose and display repeat sequence

07.    RNA-seq track – Choose and display RNA-seq data to display gene expression level





Figure 10. The BLAST page (showing an example of the gene CsFN_06G0010770)

Step 1: Paste sequence(s) or drag file into the search box.

Step 2: Choose the BLAST database

Step 3: Set parameters (optional).

Step 4: Choose the software and run BLAST.


Figure 11. The BLAST result page (showing an example of the gene CsFN_06G0010770)

01.    Overview of all hit sequences – Similarities between the query sequence and sequences within CannabisGDB, listed in descending order of percent identity.

02.    Detailed information – The alignment for each HSP (high-scoring segment pair).
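
The ordering of the hit overview, from the highest percent identity downwards, can be reproduced on exported results with a simple sort. In this sketch the `Hit` record, its field names and the hit values are all hypothetical; they do not reflect the exact columns of the CannabisGDB BLAST output.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A simplified BLAST hit (hypothetical fields)."""
    subject_id: str
    percent_identity: float
    alignment_length: int


def sort_hits(hits):
    """Return hits ordered from highest to lowest percent identity."""
    return sorted(hits, key=lambda h: h.percent_identity, reverse=True)


# Hypothetical hits, for illustration only.
hits = [
    Hit("geneA", 92.5, 300),
    Hit("geneB", 100.0, 450),
    Hit("geneC", 87.1, 280),
]
ranked = sort_hits(hits)
for hit in ranked:
    print(hit.subject_id, hit.percent_identity)
```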




Figure 12. The heatmap tool (showing an example of Finola)

Step 1: Choose reference genome.

Step 2: Choose a project. Details of each project are given in the 'Help - Materials and methods - Transcriptome data' section.

Step 3: Input the gene IDs. Users can obtain the genes of interest from the search results table using the search function and paste them in bulk.

Step 4: Click ‘Go’ to draw a heatmap of gene expression.

Step 5: Save and download the heatmap

For more instructions on heat map tools, please refer to


Primer3 web


Figure 13. The primer design tool (showing an example of the gene CsFN_06G0010770)

Step 1: Choose primer task

Step 2: Input the source sequence

Step 3: Choose primer type

Step 4: Pick primers

For more instructions on the primer design tool, please refer to




Figure 14. The genome collinearity tool

01.    Chart configuration – Adjust the chart display effect

02.    Gene search – Locate the collinear region containing the query

03.    Collinear result – Display MCScanX parameters and collinear result

04.    Filter panel – Choose chromosomes (for CsJLD, the 10 longest scaffolds are provided)

05.    Chromosome layout – Show collinear result. An example of the genomic collinearity between CsFN and CsCBD is shown.




Figure 15. The enrichment tool

Step 1: Choose reference genome

Step 2: Input differentially expressed genes (DEGs) obtained from users' experiments

Step 3: Click 'Go' to get the enrichment result

Step 4: The result shows the top 20 enriched GO terms (an example of enriched GO terms associated with genes up-regulated in stem relative to root in the variety 'Therapy')
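
This guide does not state which statistical test the enrichment tool uses. As an illustration only, GO enrichment is often assessed with a one-sided hypergeometric test. The sketch below assumes that test; the function name, parameter names and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes carrying the GO term
    sample:     number of DEGs submitted
    hits:       DEGs carrying the GO term
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie in [0, population]")
    if hits < 0:
        raise ValueError("hits must be non-negative")
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy counts, for illustration only: 10 background genes, 5 carry the
# term, 5 DEGs are submitted, and all 5 DEGs carry the term.
p = enrichment_p_value(population=10, annotated=5, sample=5, hits=5)
print(p)
```

In practice such p-values are computed for every GO term and usually corrected for multiple testing before the top terms are reported; whether and how CannabisGDB does this is not described here.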




Figure 16. The download page

01.    Cannabis genome – The assembly sequences, annotation results and reference materials of cannabis genomes

02.    Cannabis transcriptome – Quantitative results of transcriptome level

03.    Cannabis protein – Raw data for protein module

04.    Cannabis metabolite – Raw data for metabolite module

05.    Cannabis collinearity – Result data for MCScanX



Please Cite

Cai, S., Zhang, Z., Huang, S., Bai, X., Huang, Z., Zhang, Y. J., Huang, L., Tang, W., Haughn, G., You, S. and Liu, Y. (2021) CannabisGDB: a comprehensive genomic database for Cannabis Sativa L. Plant Biotechnol J,