Nyeo2 Week 10





The purpose of this lab is to evaluate and analyze a database of our choosing and to share an assessment of its scientific usefulness with classmates.


  • General information about the database
    1. What is the name of the database?
      • VISDB
    2. What type (or types) of database is it?
      1. According to the website, the database contains samples, diseases, publications, sequences, genes, CFSs, transcripts, CpG islands, miRNAs, virus-host ncRNA-associated interactions as well as TCGA miRNA and gene expression for viral integration sites.
      2. What type of data source does it have?
        • Secondary
        • Curated
          • Human curated
            • In-house staff curation
    3. What individual or organization maintains the database?
      • Public
      • School of Biomedical Informatics at The University of Texas Health Science Center at Houston.
    4. What is their funding source(s)?
      • National Institutes of Health, Cancer Prevention and Research Institute of Texas, and The University of Texas Health Science Center at Houston
  • Scientific quality of the database
    1. Does the content appear to completely cover its content domain?
      • The database contains 77,632 virus integration sites, 15,064 target genes, 123 fragile sites, 27 diseases, and 2,596 miRNAs.
      • The paper does not claim to cover every known entry in its domain, but it does use the words "extensive" and "comprehensive" in its data collection section.
    2. What species are covered in the database?
      • Humans
    3. Is the database content useful?
      • Knowledge of virus integration sites and site-related information can help others working on virus-related pathology, virus biology, host-pathogen interaction, sequence motif discovery/pattern recognition, molecular evolution and adaptation, and disease studies.
    4. Is the database content timely?
      • Is there a need in the scientific community for such a database at this time? Yes (Coronavirus).
      • Is the content covered by other databases already? As far as I could find, there is no other database that covers human viral integration sites, but there is one that covers retroviral insertion sites.
    5. How current is the database?
      • Online since May 15, 2019
      • Since going online, the database does not seem to have been updated, although all the VISs were rechecked on June 3, 2019.
      • The date of the last update is unknown.
  • General utility of the database to the scientific community
    1. Are there links to other databases?
      • There are links to the UCSC genome database, NCBI, Ensembl, TSGene, ONGene, HumCFS, miRTarBase, miRBase, HPVbase, COSMIC, RID, and HIRIS.
    2. Is it convenient to browse the data?
      • The data on this website is sufficiently convenient to browse
    3. Is it convenient to download the data?
      • In what file formats are the data provided?
        • What type of files, indicated by the file extension?
          • .xlsx, Microsoft Excel
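        • Are they standard or non-standard formats? (i.e., are they following an approved standard for that type of data)? The .xlsx format is Office Open XML, a standardized spreadsheet format (ECMA-376, ISO/IEC 29500), but I am not sure whether there is an approved standard specific to this type of biological data.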
    4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
      • Is the website well-organized?
        • The website is relatively well-organized. At the top of the screen, the data are grouped into sections according to where the integration sites are located (gene, RNA interaction, etc.), and within the search results, the components of each site are laid out clearly in a table.
      • Does it have a help section or tutorial?
        • There is a help section with a downloadable user manual
      • Are the search options sensible?
        • Yes
      • Run a sample query. Do the results make sense?
        • Yes
    5. Access: Is there a license agreement or any restrictions on access to the database?
      • No restrictions
  • Summary judgment
    1. Would you direct a colleague unfamiliar with the field to use it?
      • I would direct a colleague to use this database because it seems to be the only one that covers this topic. Furthermore, the layout is simple and organized, and the data come from credible sources.
    2. Is this a professional or "hobby" database?
      • Professional
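
Regarding the download format noted above: .xlsx files are Office Open XML packages, which are ZIP archives containing XML parts. The following is a minimal sketch, using only the Python standard library, of how one might check that a downloaded file really is an .xlsx package before trying to open it. The function names and the stub workbook are hypothetical and made up for illustration; nothing here comes from VISDB itself, and the check only looks for the conventional part names rather than validating the file against the standard.

```python
import io
import zipfile


def looks_like_xlsx(data: bytes) -> bool:
    """Return True if the bytes look like an Office Open XML workbook.

    Assumption: the workbook part sits at the conventional path
    "xl/workbook.xml", as it does in files written by Excel.
    """
    buffer = io.BytesIO(data)
    # An .xlsx file is a ZIP archive, so anything else fails immediately.
    if not zipfile.is_zipfile(buffer):
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        names = set(archive.namelist())
    # Every OOXML package has a content-types part at the archive root.
    return "[Content_Types].xml" in names and "xl/workbook.xml" in names


def make_stub_workbook() -> bytes:
    """Build a tiny stand-in archive with the two parts checked above.

    This is NOT a workbook Excel could open; it only exercises the check.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("xl/workbook.xml", "<workbook/>")
    return buffer.getvalue()


if __name__ == "__main__":
    print(looks_like_xlsx(make_stub_workbook()))
    print(looks_like_xlsx(b"gene\tchromosome\tposition\n"))
```

For a real download, the bytes would come from reading the saved file in binary mode. A check like this only confirms the container format; reading the actual tables would still need a spreadsheet library or Excel.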

Scientific Conclusion

In this lab, I analyzed VISDB, a database of viral integration sites in the human genome. I looked at several different databases and answered questions that helped me critically evaluate VISDB in detail.


  • I followed the protocol on the Week 10 assignment page
  • I copied and pasted the questions from the assignment page
  • Except for what is noted above, this individual journal entry was completed by me and not copied from another source

Nyeo2 (talk) 22:27, 1 April 2020 (PDT)


  • OpenWetWare. (2020). BIOL368/S20:Week 10. Retrieved April 1, 2020, from https://openwetware.org/wiki/BIOL368/S20:Week_10
  • Rigden, D.J., Fernández, X.M. (2020). The 27th annual Nucleic Acids Research database issue and molecular biology database collection. Nucleic Acids Research, Volume 48, Issue D1, Pages D1–D8. https://doi.org/10.1093/nar/gkz1161
  • Tang, D., Li, B., Xu, T., Hu, R., Tan, D., Song, X., Jia, P., Zhao, Z. (2020). VISDB: a manually curated database of viral integration sites in the human genome. Nucleic Acids Research, Volume 48, Issue D1, Pages D633–D641. https://doi.org/10.1093/nar/gkz867