Non: Week 10

The purpose of this week's lab was to explore a new database, analyzing its content, merit, and usability for scientists.

Database Evaluation Questions

  • General information about the database
    1. What is the name of the database? (link to the home page)
      • CancerGeneNet
    2. What type (or types) of database is it?
      1. What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
        • Genetic relationship (sequences) to cancer phenotypes
      2. What type of data source does it have?
        • primary versus secondary ("meta")?
          • Secondary; the database links to several other databases at the bottom of the page.
        • curated versus non-curated?
          • Curated
            • If curated, is it electronic versus human curation?
              • Unclear from the wording. It seems to be a combination of both: the contact page lists data curation staff, while the about page mentions graphing algorithms.
                • if human curation, is it in-house staff versus community curation?
                  • In-house: The about page lists three data curation staff: Luana Licata, Livia Perfetto, Marta Iannuccelli
    3. What individual or organization maintains the database?
      • No organization is named, but a list of individuals is given: Marta Iannuccelli, Elisa Micarelli, Prisca Lo Surdo, Alessandro Palma, Livia Perfetto, Ilaria Rozzo, Luisa Castagnoli, Luana Licata, Gianni Cesareni
        • public versus private
          • Public
        • large national or multinational entity or small lab group
          • small lab group
    4. What is their funding source(s)?
  • Scientific quality of the database
    1. Does the content appear to completely cover its content domain?
      • The content does not appear to cover all cancer-causing genes. According to the Materials and Methods section, the authors appear to have curated the different phenotype types for clarity.
        • How many records does the database contain?
          • The database draws on two cancer-related gene lists, which contain 710 genes/354 tumor types and 4145 genes/710 tumor types, respectively.
        • What claims do the database owners make about coverage in the corresponding paper?
          • The database owners note that their coverage, while as extensive as possible, cannot document every gene. They also mention that little information was available for 20% of the genes.
    2. What species are covered in the database? (If it is a very long list, summarize.)
      • They do not explicitly say, but the implication seems to be only Homo sapiens.
    3. Is the database content useful? I.e., what biological questions can it be used to answer?
      • The database is useful for visualizing the connections between a large variety of genes and different tumor types. It has a fairly easy-to-understand mapping algorithm for viewing the relationships between these genes. The database can be used to determine how closely cancer genes are related to each other and to the cancer phenotype itself.
    4. Is the database content timely?
      • It is unclear when the website was last updated, but multiple citations reference 2018. Also, the gene lists the database uses each refer to updates in 2019 [1], [2].
        • Is there a need in the scientific community for such a database at this time?
          • While cancer gene databases are not rare, this database is useful for visualizing the pathways and relationships between genes and tumors, which, as far as I know, other databases do not offer.
        • Is the content covered by other databases already?
          • The identities of cancer genes and tumor types are found in other cancer gene databases; pathway information with this type of graphical visualization is new.
    5. How current is the database?
      • Again, there is no "last updated" date, but the lists that the database draws from were last updated in 2019.
        • When did the database first go online?
          • Also not explicitly mentioned on the website. Judging by the accompanying article, it likely came online in 2019.
        • How often is the database updated?
          • The database does not keep a log or release notes of updates.
        • When was the last update?
          • It is unclear when the last update was.
  • General utility of the database to the scientific community
    1. Are there links to other databases? Which ones?
      • The website links to numerous other databases along the bottom of the page: Signor, COSMIC, DisGeNet, Disease Ontology, MeSH, UniProt, PubChem, Mentha, Elixir-Europe, Mint
    2. Is it convenient to browse the data?
      • It is relatively straightforward to browse for particular genes and tumor types. The charts are easy to generate.
    3. Is it convenient to download the data?
      • They allow you to download Cancer MiniPathways and a curated list of tumors.
      • In what file formats are the data provided? .txt files
        • What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)? .txt files
        • Are they standard or non-standard formats (i.e., are they following an approved standard for that type of data)? Standard format: tab-delimited text
    4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
      • I think it is relatively user-friendly, especially if you have a particular gene or tumor in mind. Finding or generating your own pathways is a little more complicated without in-depth knowledge of the subject.
        • Is the website well-organized?
          • The website has very simple organization but it does the job. All necessary tools are available on the homepage.
        • Does it have a help section or tutorial?
          • There is a tutorial option that is helpful in explaining how the tools work and how the figures are generated.
        • Are the search options sensible?
          • There are no advanced search options, but you can use gene names and UniProt IDs to find genes.
        • Run a sample query. Do the results make sense?
          • When I searched for a gene I had found in a different pathway, the results showed all the pathways connected to that gene.
    5. Access: Is there a license agreement or any restrictions on access to the database?
      • The legal section notes that one of the lists the database uses, DisGeNet, may only be used for educational and research purposes.
  • Summary judgment
    1. Would you direct a colleague unfamiliar with the field to use it?
      • I would recommend that a colleague use this website as a starting point for learning about the genes involved in a tumor pathway. While the information on the database site itself is relatively limited, it could be used to get started on further research.
    2. Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, and seems amateur.
      • This is a professional website, worked on by numerous scientists in the Department of Biology at the University of Rome.
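
Since the downloads noted above are tab-delimited .txt files, they can be read with standard tools. The following is a minimal sketch in Python, assuming a header row; the column names (gene, uniprot_id, tumor_type), the tumor labels, and the helper function names are hypothetical placeholders for illustration and are not taken from the actual CancerGeneNet files.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded tab-delimited .txt file.
# Column names and tumor labels are illustrative assumptions, not the real schema.
SAMPLE_TSV = (
    "gene\tuniprot_id\ttumor_type\n"
    "TP53\tP04637\texample tumor A\n"
    "KRAS\tP01116\texample tumor B\n"
    "TP53\tP04637\texample tumor B\n"
)


def read_tab_delimited(handle):
    """Parse a tab-delimited text stream into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def tumors_by_gene(rows, gene_column="gene", tumor_column="tumor_type"):
    """Group tumor types by gene name, keeping rows in file order."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[gene_column], []).append(row[tumor_column])
    return grouped


rows = read_tab_delimited(io.StringIO(SAMPLE_TSV))
grouped = tumors_by_gene(rows)
print(grouped["TP53"])
```

To try this on a real download, the io.StringIO sample could be swapped for open(path, newline="", encoding="utf-8"), with the column arguments adjusted to whatever headers the file actually contains.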

Scientific Conclusion

CancerGeneNet is a useful concept for a database, combining lists of previously discovered cancer genes and tumors with a graphical representation of their relationships to each other. While the database does not provide much new information, it is useful as a starting point for finding relationships between genes and cancer phenotypes.


  • I copied and modified the Week 10 Protocol to complete this lab.
  • Except for what is noted above, this individual journal entry was completed by me and not copied from another source.

Non (talk) 22:50, 1 April 2020 (PDT)


  • OpenWetWare. (2020). BIOL368/S20:Week 10. Retrieved March 26, 2020, from
  • Rigden, D. J., & Fernández, X. M. (2020). The 27th annual Nucleic Acids Research database issue and molecular biology database collection. Nucleic Acids Research, 48(D1), D1-D8. doi: 10.1093/nar/gkz1161
  • Iannuccelli, M., Micarelli, E., Lo Surdo, P., Palma, A., Perfetto, L., Rozzo, I., Castagnoli, L., Licata, L., & Cesareni, G. (2020). CancerGeneNet: linking driver genes to cancer hallmarks. Nucleic Acids Research, 48(D1), D416-D421.