Yaniv Maddahi Journal Week 8

From OpenWetWare

Purpose

  • The purpose of this lab is to gain insight into databases and to analyze what goes into creating one. We will also evaluate a database of our choice, using a rubric to assess its different aspects. In doing so, we gain insight into biological databases, which are an integral part of bioinformatics.

General information about the database

  1. What is the name of the database? (link to the home page)
  2. What type (or types) of database is it?
    • This is a specialty protein database (About us).
  3. What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
    • This database contains information on allosteric proteins and their modulators. It currently contains 1949 allosteric proteins from 327 species and 82,074 modulators in three categories: activators, inhibitors, and regulators (Proteins and Modulators).
  4. What type of data source does it have?
    1. primary versus secondary ("meta")?
      • This website contains primarily secondary data: it gives access to other databases for subsequent research and also allows third parties to submit work (Expert submission).
    2. curated versus non-curated?
      • if curated, is it electronic versus human curation?
        • It is human curated: the website states that they will revise any submitted information before posting it (work submission).
      • if human curation, is it in-house staff versus community curation?
  5. What individual or organization maintains the database?
    • This is a professional database, run by Shanghai Jiao Tong University in China. It does allow many people to participate and contribute, although it is closely monitored and run by the institution (about us).
  6. public versus private
    • It is privately run by the university, although anyone can contribute to the database and submit information to be posted or published. It also links out to other databases (submit work).
  7. large national or multinational entity or small lab group
    • The database is run by the university's lab group, although people all over the world can access it (about us).
  8. What is their funding source(s)?
    • This database is funded by Shanghai Jiao Tong University in China (about us).

Scientific quality of the database

  1. Does the content appear to completely cover its content domain?
    • The database covers a wide variety of allosteric proteins and modulators (activators, inhibitors, and regulators). However, it does not have a complete record of every allosteric protein sequence, biological process, therapeutic area, disease, physicochemical property, structure, binding affinity, and modulator in existence. This is evident because the maintainers are still requesting allosteric protein information and updating the database.
    1. How many records does the database contain?
      • The database contains 1949 allosteric proteins from 327 species and 82,074 modulators across the activator, inhibitor, and regulator categories.
    2. What claims do the database owners make about coverage in the corresponding paper?
      • The authors list the corresponding article on their literature page, along with its abstract. An update paper by the same authors is also listed, with its abstract, on the same page.
  2. What species are covered in the database? (If it is a very long list, summarize.)
    • A variety of species are covered. There is no focus on a specific species or genus; rather, the entries span all three domains of life. This can be seen on their proteins page. Humans are the most common species, with 837 protein sequences.
  3. Is the database content useful? I.e., what biological questions can it be used to answer?
    • Yes, the database content is extremely useful. Allosteric regulation is used to monitor and influence protein activity, which makes it crucial to understand. Researchers regularly study allosteric sites and their effects on protein activity. The importance of allosteric regulation can be seen on the Site page.
    1. Is the database content timely?
      • Yes, this content is very timely, and people visit the site regularly. The main site page reported an average of a little less than 500 visits per month. There are also several publications, dated as recently as 2019, listed on the literature page.
    2. Is there a need in the scientific community for such a database at this time?
      • Yes, the scientific community needs the structures, proteins, and modulators of allosteric regulators. Since 2010, more than 50 sources have cited this database in their papers, as seen on the literature page. This shows that published work has relied on the database and its tools.
  4. Is the content covered by other databases already?
    • According to Huang et al. (2011), the Allosteric Database is the first database that provides the "display, search and analysis of structure, function and related annotation for allosteric molecules" (Huang et al. 2011). Other sites may contain some of the same sequences and structures; however, they are not as early or as in-depth as the Allosteric Database. As seen on the tools page, the Allosteric Database lists other websites that provide resources for further study of allosteric regulators.
  5. How current is the database?
    • The website copyright is as recent as 2020, but there is no indication of when the site itself was last updated. Their last release, however, was on the 26th of September in 2019, which can be seen on their release page. Their literature page (http://mdl.shsmu.edu.cn/ASD/module/misc/literature.jsp) lists sources that have used content from the website from 2010 through 2019. This means that the website is relatively current, as people are still using its contents in scientifically published papers.
  6. When did the database first go online?
  7. How often is the database updated?
    • The website is updated whenever there is a new ASD release, which can be seen on their release page. The front page also shows a visitor count, which was updated this month.
  8. When was the last update?
    • The last observable update was on the 29th of September 2019, which is seen on their release page.

General utility of the database to the scientific community

  1. Are there links to other databases? Which ones?
    • Yes, the Allosteric Database's site has a link page listing several databases. Under "databases of drugs and drug targets", it lists the PDBBind database, BindingDB database, DrugBank database, GLIDA (GPCR-Ligand Database), BJP Guide to Receptors and Channels Database, and the PubChem database. It also lists the CATH, SCOP, PDBSUM, and PDB structure databases, as well as the OMIM, OMMBID, and MalaCards disease databases.
  2. Is it convenient to browse the data?
    • Yes, as soon as the database is opened, the main page presents organized tabs, so you can find what you are looking for quickly and conveniently.
  3. Is it convenient to download the data?
    • Yes, they have a Download tab, which allows the user to quickly find a sequence or structure and easily download the data they have listed.
    1. In what file formats are the data provided?
      • The data are linked to another database, and the sequence is provided there in whatever format is appropriate for that site, as shown on the protein page.
    2. What type of files, indicated by the file extension (e.g., .txt, .xml, etc.)?
      • The file extension is found on the other database, which contains the information needed to find the protein information and structure. The other databases are listed on the protein page. Where file formats are listed, they are in the .xml format (Example protein).
    3. Are they standard or non-standard formats? (i.e., are they following an approved standard for that type of data)?
      • I believe these are the approved standard formats; however, there are no FASTA extensions (Example protein).
  4. Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
    • As seen on the protein page, any user who wants to find a protein for a specific sequence or organism can quickly search for and find it. The website is very user-friendly and easy to use.
    1. Is the website well-organized?
      • Yes, even if a user doesn't know the name of the sequence they are searching for, the tabs are organized so that one can find the species and protein name quickly. They also have an advanced search option, which allows people who know what they are looking for to find it.
    2. Does it have a help section or tutorial?
      • Yes, they have a very detailed and helpful help page.
    3. Are the search options sensible?
      • Yes, they have a simple search option and an advanced search option. Together these allow both a casual user and a user with a specific sequence to find their protein.
    4. Run a sample query. Do the results make sense?
      • I searched for ATP, as that was the example used in the help section; however, the query only returned me to the search page. No matter what I searched for, it always returned me to that page.
  5. Access: Is there a license agreement or any restrictions on access to the database?
    • No, we could not find one. However, on the help page they ask users to cite the database in any work they publish.

Summary judgment

  1. Would you direct a colleague unfamiliar with the field to use it?
    • No. Personally, I think that using this database, although it is simple and comes with clear instructions, requires some experience with proteins and an understanding of the different proteins, enzymes, structures, and related information, including what processes they may be a part of and how they work (Molecule list).
  2. Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, and/or seems amateur.
    • This is a professional database, run by Shanghai Jiao Tong University in China. It does allow many people to participate and contribute, although it is closely monitored and run by the institution (about us).
  3. Finally, please share why you chose this database in the first place, i.e., why did it interest you? Did it live up to the expectations you had when you chose it?
    • We chose this database because of its complexity, its extensive access to several other databases, and its transparency about how to use it (about us).

Conclusion

  • In this week's assignment, we analyzed a database of our choosing and discussed the characteristics of the different parts that go into making a database. We also discussed the types of databases and the differences among them.

Acknowledgements

  • I acknowledge Kam D. Dahlquist, Ph.D., for her assistance in the discussion of digital citizenship.
  • I acknowledge Annika Dinulos for her assistance in our classroom discussions.
  • I acknowledge Nathan R. Beshai, whose responses I used for some of the questions regarding the database, and who collaborated with me and discussed the content outside of the classroom.
  • I acknowledge the database we used, which is hyperlinked in our answers to the questions.

References
