Adinulos Week 10

Purpose

The purpose of this assignment is to critically analyze biological databases for their effectiveness and relevance, as well as determine if the database is reliable and appropriate for use within the scientific community.

Assignment

  • General information about the database
    1. What is the name of the database? (link to the home page)
    2. What type (or types) of database is it?
      • This database focuses specifically on pathogen-host interactions.
      1. What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
        • PHI-base includes sequence data of pathogens that infect hosts.
      2. What type of data source does it have?
        • primary versus secondary ("meta")?
          • It is a secondary source
        • curated versus non-curated?
          • It is a curated database
          • if curated, is it electronic versus human curation?
            • PHI-base specifies that the database is curated by domain experts.
            • if human curation, is it in-house staff versus community curation?
              • It seems like the curation is maintained by in-house staff.
    3. What individual or organization maintains the database?
      • Researchers at Rothamsted Research maintain the database.
      • public versus private
        • The database is open to the public
      • large national or multinational entity or small lab group
        • It seems like a small lab group
    4. What is their funding source(s)?
      • Funding comes from grants and awards from BBSRC BBR, specifically PhytoPath.
  • Scientific quality of the database
    1. Does the content appear to completely cover its content domain?
      • The content appears to cover a lot of the content domain, specific to the goals of pathogen-host interactions.
      • How many records does the database contain?
        • The database contains 6,780 genes, 13,801 interactions, 268 pathogens, 210 hosts, 502 diseases, and 3,454 references, for a total of 25,015 entries.
      • What claims do the database owners make about coverage in the corresponding paper?
        • The database owners claim that PHI-base contains curated information on genes that affect pathogens as well as references to literature that describes gene alterations.
    2. What species are covered in the database? (If it is a very long list, summarize.)
      • Unfortunately, the webpages listing the diseases and pathogens did not load correctly. However, species listed in the original article include Fusarium graminearum, Magnaporthe oryzae, Ralstonia solanacearum, and many other plant pathogens.
    3. Is the database content useful? I.e., what biological questions can it be used to answer?
      • The content would be useful in determining which pathogens and genes affect crops and other plant species globally.
      • Is there a need in the scientific community for such a database at this time?
        • The article stated that it would be useful for solving problems related to food security, human community structures, and the biodiversity of natural ecosystems.
      • Is the content covered by other databases already?
        • The database's maintainers seemed to state that it has the most up-to-date and thorough data on plant pathogens.
    4. How current is the database?
      • The database was updated with information up to March 2019.
      • When did the database first go online?
        • The database first went online on May 4, 2005.
      • How often is the database updated?
        • The database is updated twice a year.
      • When was the last update?
        • The last update was May 27, 2019.
  • General utility of the database to the scientific community
    1. Are there links to other databases? Which ones?
      • There are links to other databases such as Ensembl, PHI-Nets, PHI-Canto, and PHIPO.
    2. Is it convenient to browse the data?
      • It was very difficult for me to browse the data because the data links on the homepage didn't work for me. I used an example pathogen from the article to test the search engine; the search bar seems user-friendly but somewhat cluttered.
    3. Is it convenient to download the data?
      • I couldn't access the pages with the individual sections to download; however, there was an easily accessible button to download all of the data in the database. To download, a person has to fill out a registration form.
      • In what file formats are the data provided?
        • What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)?
          • It seems like the files are in CSV format, based on my attempts to download information.
        • Are they standard or non-standard formats? (i.e., are they following an approved standard for that type of data)?
          • CSV files can be opened in a spreadsheet program, and CSV seemed to be a fairly common format for pathogen data.
    4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
      • Is the website well-organized?
        • The website was well organized and had clear sections and headings.
      • Does it have a help section or tutorial?
        • There was a help section, but it didn't offer any information on how to use the database correctly and efficiently.
      • Are the search options sensible?
        • There was a direct button on the task bar to access the search engine.
      • Run a sample query. Do the results make sense?
        • A sample query using a known pathogen showed all the information that the database stores on it. The page looks confusing but everything has a direct label and seems clear.
    5. Access: Is there a license agreement or any restrictions on access to the database?
      • The database is accessible to everyone, but downloading the data seems to have some restrictions, since a registration form must be filled out first.
  • Summary judgment
    1. Would you direct a colleague unfamiliar with the field to use it?
      • I would not direct a colleague unfamiliar with the field to use it, since there were many links and sections that would not load and were inaccessible to me.
    2. Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, and seems amateur.
      • Since the database was so specific, had broken links, and had no help tutorials, it seems more like a hobby database, even though it has a lot of updated information and a research team that helped curate it.
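
As a quick sanity check on the record counts reported under "How many records does the database contain?", the total can be recomputed from the per-category numbers. This is only a minimal sketch: the counts are the ones listed in this entry, while the names `record_counts` and `total_entries` are my own labels for illustration and are not part of PHI-base.

```python
# Per-category record counts as reported in this journal entry.
# The dictionary and function names are illustrative, not PHI-base identifiers.
record_counts = {
    "genes": 6780,
    "interactions": 13801,
    "pathogens": 268,
    "hosts": 210,
    "diseases": 502,
    "references": 3454,
}


def total_entries(counts):
    """Return the sum of all per-category record counts."""
    return sum(counts.values())


total = total_entries(record_counts)
# The "," format specifier inserts thousands separators.
print(f"Total entries: {total:,}")  # prints "Total entries: 25,015"
```

The recomputed sum matches the total of 25,015 entries stated in the answer.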

Conclusion

  • I analyzed the content of PHI-base, a database of pathogen-host interactions, for its effectiveness and relevance to the scientific community. The analysis was also done to determine whether the database could be considered reliable and should be used within the scientific community.

Acknowledgments

  • Protocol and questions were taken from the Week 10 Assignment page.
  • Except for what is noted above, this individual journal entry was completed by me and not copied from another source.

Adinulos (talk) 11:24, 1 April 2020 (PDT)

References

OpenWetWare. (2020). BIOL368/S20:Week 10. Retrieved April 1, 2020, from https://openwetware.org/wiki/BIOL368/S20:Week_10.

Urban, M., Pant, R., Raghunath, A., Irvine, A. G., Pedro, H., & Hammond-Kosack, K. E. (2020). PHI-base: the pathogen–host interactions database. Nucleic Acids Research, 48(D1), D613–D620. https://doi.org/10.1093/nar/gkz904