Tutorial

  1. Upload contigs in FASTA format
  2. Output of FunctionAnnotator
  3. Download results
  4. Retrieve results after job finished

  1. Start a new job

On the FunctionAnnotator web page, click the "Analysis" link and then "Start a new job". Then:

Step 1. Select the kingdom of the organism from which the uploaded contigs were generated.

Step 2. Select the analysis modules FunctionAnnotator should perform.

Step 3. Choose and upload the FASTA file containing contigs assembled from RNA-Seq reads. (Do not upload raw read data. The file size limit for FunctionAnnotator is 150M.)

After the FASTA file is uploaded successfully, it is submitted to the queue. The result page auto-refreshes and shows the annotation results once the analysis is done. Once the file is uploaded, users may bookmark this page or note the job id and come back later to retrieve the annotation results.
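Before uploading, it can help to check the file locally. The following is a minimal sketch, not part of FunctionAnnotator: the function name check_contig_fasta is hypothetical, "150M" is assumed to mean 150 MiB, and the nucleotide alphabet check (IUPAC codes) is an assumption about what a contig file should contain.

```python
import os

# Assumption: the "150M" limit is interpreted as 150 MiB.
MAX_UPLOAD_BYTES = 150 * 1024 * 1024
# Assumption: contigs use the IUPAC nucleotide alphabet.
NUCLEOTIDES = set("ACGTUNRYKMSWBDHV")


def check_contig_fasta(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return (ok, n_records, problems) for a FASTA file (hypothetical helper)."""
    problems = []
    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"file is {size} bytes, above the {max_bytes}-byte limit")

    n_records = 0
    with open(path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                n_records += 1  # each header line starts one contig
            elif n_records == 0:
                problems.append(f"line {line_no}: sequence data before first header")
                break
            elif set(line.upper()) - NUCLEOTIDES:
                problems.append(f"line {line_no}: non-nucleotide characters")
                break
    if n_records == 0:
        problems.append("no FASTA records found")
    return (not problems, n_records, problems)
```

A call such as check_contig_fasta("contigs.fa") returns whether the file looks uploadable, how many contigs were counted, and a list of problems found.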

 


  2. Output from FunctionAnnotator

After the job has finished, annotation results are presented in several tabs. Above these tabs is the job information, including:

  1. job id,
  2. file name of the uploaded FASTA file,
  3. file size of the uploaded FASTA file,
  4. number of contigs,
  5. time the job was submitted, and
  6. number of contigs filtered out (no predicted amino acid product longer than 66 amino acids).
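The length filter in item 6 can be illustrated with a short sketch. FunctionAnnotator's exact ORF definition is not stated here, so this example makes an assumption: it takes the longest stop-free stretch over all six reading frames under the standard genetic code. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_product_length(seq):
    """Longest stop-free translation (in amino acids) over six frames."""
    seq = seq.upper().replace("U", "T")
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    best = 0
    for strand in (seq, reverse_complement):
        for frame in range(3):
            run = 0
            for i in range(frame, len(strand) - 2, 3):
                # Codons with ambiguous bases are counted as unknown residue "X".
                aa = CODON_TABLE.get(strand[i:i + 3], "X")
                if aa == "*":
                    run = 0  # a stop codon ends the current stretch
                else:
                    run += 1
                    best = max(best, run)
    return best


def passes_length_filter(seq, min_aa=66):
    """True if some predicted product is longer than min_aa amino acids."""
    return longest_product_length(seq) > min_aa
```

Under this assumption, a contig shorter than 201 nucleotides (67 codons) can never pass the filter, whatever its sequence.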

Other annotation results are shown in the tabs including:

  • Basic statistics of the uploaded contig file

 


  • Best hit in the NCBI non-redundant database

The "hits to NCBI-nr" tab shows results from the LAST search against NCBI's non-redundant protein database. Only the 100 sequences with the lowest E-values are shown here. For the complete LAST results, users should download the table directly.

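With the complete table downloaded, the same "lowest E-value first" view can be reproduced locally. This is a minimal sketch under assumptions: the column names (contig, hit, evalue) and the presence of a header row are hypothetical, since the layout of the downloaded table is not described here.

```python
import csv
import heapq
import io


def top_hits_by_evalue(tsv_text, n=100, evalue_column="evalue"):
    """Return the n rows with the lowest E-values from tab-separated text.

    The column name is an assumption; adjust it to the real header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return heapq.nsmallest(n, reader, key=lambda row: float(row[evalue_column]))


# Made-up rows, for illustration only.
example = "contig\thit\tevalue\nc1\tP1\t1e-50\nc2\tP2\t3e-5\nc3\tP3\t0.0\n"
for row in top_hits_by_evalue(example, n=2):
    print(row["contig"], row["evalue"])
```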

  • Taxonomic distribution of the organisms from which the best hits come

In addition to the best hit for each contig, FunctionAnnotator also explores which organisms the best hits come from. Beyond the species information, FunctionAnnotator includes taxonomy information for each species and displays the taxonomic distribution of the organisms the best hits come from.


FunctionAnnotator has a built-in taxonomy database in which the tree structure of the taxonomy is pre-calculated. Users can select whichever level (such as phylum, class, or order) they want to examine. FunctionAnnotator shows a bar chart of the number (proportion) of contigs having best hits from organisms belonging to each taxonomic classification.
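The counting behind such a bar chart can be sketched as follows. This is an illustration only: the LINEAGES dictionary stands in for the built-in taxonomy database, and the function name and input layout (contig id mapped to best-hit species) are hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for a pre-calculated taxonomy lookup.
LINEAGES = {
    "Homo sapiens": {"phylum": "Chordata", "class": "Mammalia", "order": "Primates"},
    "Mus musculus": {"phylum": "Chordata", "class": "Mammalia", "order": "Rodentia"},
    "Drosophila melanogaster": {"phylum": "Arthropoda", "class": "Insecta", "order": "Diptera"},
}


def rank_distribution(best_hit_species, rank, lineages=LINEAGES):
    """Map each taxon at the given rank to (contig count, proportion)."""
    counts = Counter(
        lineages.get(species, {}).get(rank, "unclassified")
        for species in best_hit_species.values()
    )
    total = sum(counts.values())
    return {taxon: (n, n / total) for taxon, n in counts.most_common()}


# Made-up best hits, for illustration only.
best_hits = {"c1": "Homo sapiens", "c2": "Mus musculus", "c3": "Drosophila melanogaster"}
print(rank_distribution(best_hits, "phylum"))
```

Changing the rank argument from "phylum" to "order" regroups the same contigs at a finer level, which mirrors the level selection described above.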


  • Gene ontology annotation

FunctionAnnotator uses Blast2GO for GO term annotation. Users may download the GO annotation results as a text file or explore the distribution of GO terms in biological process, cellular component, or molecular function on the website. FunctionAnnotator has parsed the tree structure of the GO terms and provides a drop-down list for users to choose whichever level of GO terms they want to explore.

A table of 100 contigs together with their annotated GO terms is shown at the end of the Gene Ontology tab. For each contig, if there are annotations for biological process, cellular component, or molecular function, all the GO terms are listed in this table.

 


  • Enzyme identification

FunctionAnnotator searches all contigs against the PRIAM database to find contigs that are likely to encode enzymes. Users can also download all enzyme annotation results as a text file.

 


  • Domain identification

FunctionAnnotator annotates domains in the contigs by searching against Pfam. Only hits with a length longer than 50% are treated as domain hits. The contigs and their domain hits are listed in a table, of which only 100 entries are shown on the website. To explore the domain information for all contigs, users can download all domain annotation results as a text file.
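The 50% length cutoff can be sketched as a simple coverage test. The text does not say what the 50% is measured against, so this example assumes it is the fraction of the Pfam domain model covered by the alignment. The function name and parameters are hypothetical.

```python
def is_domain_hit(model_start, model_end, model_length, min_fraction=0.5):
    """True if the aligned part of the domain model exceeds min_fraction.

    Assumption: coverage is measured on the Pfam model, with 1-based
    inclusive alignment coordinates.
    """
    covered = model_end - model_start + 1
    return covered / model_length > min_fraction


print(is_domain_hit(1, 60, 100))   # 60% of the model is covered
print(is_domain_hit(10, 30, 100))  # 21% of the model is covered
```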

 


  • Transmembrane protein identification

Transmembrane proteins are predicted by TMHMM*. A summary table lists the number of contigs with transmembrane domains identified in their protein products. FunctionAnnotator further classifies these contigs into two categories: those with only one predicted transmembrane domain and those with multiple transmembrane domains.

In addition to these statistics, another table shows the number of predicted transmembrane domain(s), and the predicted topology is shown as a simplified plot.

*Only contigs whose predicted longest ORF is longer than 66 amino acids are used in this analysis.
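The two-category summary of transmembrane predictions can be sketched as below. This is an illustration, not FunctionAnnotator's code: the input is assumed to be a mapping from contig id to the number of predicted transmembrane domains, and the function name is hypothetical.

```python
def summarize_tm_counts(tm_counts):
    """Count contigs with exactly one vs. more than one predicted TM domain.

    Contigs with zero predicted domains are not counted in either category.
    """
    summary = {"single": 0, "multiple": 0}
    for n_domains in tm_counts.values():
        if n_domains == 1:
            summary["single"] += 1
        elif n_domains > 1:
            summary["multiple"] += 1
    return summary


# Made-up counts, for illustration only.
print(summarize_tm_counts({"c1": 1, "c2": 3, "c3": 0, "c4": 1}))
```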


  • Subcellular localization prediction

To predict the subcellular localization of the products* of contigs, FunctionAnnotator applies WoLF PSORT for eukaryotes and PSORTb for prokaryotes. The subcellular localizations include cytosol, extracellular, mitochondria, nuclear, cytoskeleton, etc. Prediction results together with prediction scores are shown in the table.

*Only contigs whose predicted longest ORF is longer than 66 amino acids are used in this analysis.


  • Lipoprotein identification

FunctionAnnotator applies LipoP to predict lipoproteins from the predicted products* of contigs. The LipoP output also includes predictions for cytoplasmic location, signal peptide, lipoprotein signal peptide, etc. The number of contigs for each prediction result is shown in the first table. The prediction results together with prediction scores are shown in the second table.

*Only contigs whose predicted longest ORF is longer than 66 amino acids are used in this analysis.


  • Signal peptide identification

FunctionAnnotator applies SignalP to predict signal peptide cleavage sites from the predicted protein products* of contigs. The prediction results together with prediction scores are shown in the table.

*Only contigs whose predicted longest ORF is longer than 66 amino acids are used in this analysis.

 


  3. Download annotation results in text files

The download page provides links to download all results archived in a compressed zip file. Individual files containing annotation results as tab-separated text are also listed.
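Once downloaded, the archive can be read programmatically. The sketch below makes assumptions: the member file names and the .txt/.tsv suffixes are hypothetical, since the actual names inside the zip file are not listed here, and the function name read_tsv_tables is illustrative.

```python
import csv
import io
import zipfile


def read_tsv_tables(zip_source, suffixes=(".txt", ".tsv")):
    """Load tab-separated members of a zip archive into lists of rows.

    zip_source may be a file path or a binary file-like object.
    The suffix filter is an assumption about how result files are named.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(suffixes):
                continue  # skip directories and non-text members
            with archive.open(name) as member:
                text = io.TextIOWrapper(member, encoding="utf-8", newline="")
                tables[name] = list(csv.reader(text, delimiter="\t"))
    return tables
```

Each table is returned as a list of rows, so a call such as read_tsv_tables("results.zip") (a hypothetical file name) gives direct access to every tab-separated result without unpacking the archive by hand.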


  4. Retrieve previously submitted results with the job id

On the FunctionAnnotator web page, click the "Analysis" link and then "Retrieve submitted job". Enter the job id provided by FunctionAnnotator. The annotation results for that submitted job will be retrieved.