This journal entry is due on Thursday, October 29, at 12:01am Pacific time.
The learning objectives for this assignment are:
- to deeply explore and perform a critical review of an existing biological database.
- to gain scientific data literacy skills.
Individual Journal Assignment
- You are expected to work with your partner to complete the assignment.
- This week, both partners are expected to contribute equally to one page; you will not have individual Week 8 pages.
- You must give the details of the interaction with your partner in the Acknowledgments section of your journal assignment.
- Homework partners for this week are:
- Anna & Fatimah
- Nathan & Yaniv
- Aiden & Taylor
- JT & Nida
- Kam & Owen
- Ian & Macie
Format and Content Checklist
For this week, both partners will contribute to the same journal entry in lieu of individual journal entries.
- Store this journal entry as "database name Review" (i.e., this is the text to place between the square brackets when you link to this page).
- Both partners are expected to contribute equally to this one page; you will not have individual Week 8 pages for this week.
- Write something in the summary field each time you save an edit. You are aiming for 100%.
- Both partners should invoke their templates at the bottom of the page. The templates should contain:
- A link to your user page.
- A link to the template page itself.
- A list or table of all of the Assignment pages for the course.
- A list or table of all of your individual journal pages for the course. (You will need to edit the template to remove an individual Week 8 link and put the Database page link in instead.)
- A list or table of all the shared class journal pages for the course.
- The category "BIOL368/F20".
- Purpose: a statement of the scientific purpose of the assignment. Note that this is different from the learning objectives stated on the assignment page. What science will be discovered by completing this assignment?
- Database Evaluation (see assignment below):
- You may copy the evaluation questions and paste them on your page and modify them as to what you actually did, as long as you provide appropriate attribution.
- Scientific Conclusion: a summary statement of the main result of exercise/research. It should mirror the purpose. Length should be 2-3 sentences, up to a paragraph.
- Acknowledgments section (see Week 1 assignment for more details.)
- You must acknowledge your homework partner with whom you worked, giving details of the nature of the collaboration. You should include when and how you met and what content you each contributed to the page.
- Acknowledge anyone else you worked with who was not your assigned partner. This could be the instructor, the TA, other students in the class, or even other students or faculty outside of the class.
- If you copied wiki syntax or a particular style from another wiki page, acknowledge that here. Provide the user name of the original page, if possible, and provide a link to the page from which you copied the syntax or style.
- If you copied any part of the assignment or protocol and then modified it, acknowledge that here and also include a formal citation in the Reference section.
- You must also include this statement:
- "Except for what is noted above, this individual journal entry was completed by me and not copied from another source."
- Sign your Acknowledgments section with your wiki signature (four tildes, ~~~~).
- References section (see Week 1 assignment for more details.)
- Use the APA format.
- Cite this assignment page.
- Cite any protocols that you copied and modified (this must also be noted in the Acknowledgments section).
- Cite any other methods, software, websites, data, facts, images, or documents (including the scientific literature) that were used to generate content on your page.
- Do not include extraneous references that you do not cite or use on your page.
Readings and Resources
Each year, the journal Nucleic Acids Research (NAR) devotes the first issue in January to biological databases. In this assignment you will evaluate a biological database published in this issue. Collectively, through presentations, you will gain experience with the breadth and depth of biological databases available on the Web.
- You and your partner will choose your database from one of the following:
- Nucleic Acids Research Database Issue Table of Contents 2020
- Nucleic Acids Research Database Issue Database List
- Make sure that the database you choose has a corresponding paper in the 2020 issue.
- You may not choose a database from NCBI, EBI, or the DNA Databank of Japan. You may not choose Ensembl, UniProt, SGD, or any other major model organism database. The intent of this exercise is to pick something that is not one of the "major" databases.
- Sign up for your database by editing this page next to your names listed under the Homework Partners section. Dr. Dahlquist must approve all database choices and will affix an approved tag next to it when it is approved.
Read the article about the database from the Nucleic Acids Research journal and then go online to the database itself.
For this week, both partners will contribute to the same journal entry in lieu of individual journal entries.
You may copy these questions into your database journal entry and then answer them. In keeping with Academic Honesty and citation practices, when you answer the questions below, provide a hyperlink to the page that you got the information from, next to the answer to the question. There should be one hyperlink per answer. In your References section, you will only need to provide a formal citation to the home page of the database (as well as the article, the assignment, and any other extra resources you use).
- General information about the database
- What is the name of the database? (link to the home page)
- What type (or types) of database is it?
- What biological information (type of data) does it contain? (sequence, structure, model organism, or specialty [what?])
- What type of data source does it have?
- primary versus secondary ("meta")?
- curated versus non-curated?
- if curated, is it electronic versus human curation?
- if human curation, is it in-house staff versus community curation?
- What individual or organization maintains the database?
- public versus private
- large national or multinational entity or small lab group
- What is their funding source(s)?
- Scientific quality of the database
- Does the content appear to completely cover its content domain?
- How many records does the database contain?
- What claims do the database owners make about coverage in the corresponding paper?
- What species are covered in the database? (If it is a very long list, summarize.)
- Is the database content useful? I.e., what biological questions can it be used to answer?
- Is the database content timely?
- Is there a need in the scientific community for such a database at this time?
- Is the content covered by other databases already?
- How current is the database?
- When did the database first go online?
- How often is the database updated?
- When was the last update?
- General utility of the database to the scientific community
- Are there links to other databases? Which ones?
- Is it convenient to browse the data?
- Is it convenient to download the data?
- In what file formats are the data provided?
- What types of files are provided, as indicated by the file extension (e.g., .txt, .xml, etc.)?
- Are they standard or non-standard formats (i.e., do they follow an approved standard for that type of data)?
- Evaluate the “user-friendliness” of the database: can a naive user quickly navigate the website and gather useful information?
- Is the website well-organized?
- Does it have a help section or tutorial?
- Are the search options sensible?
- Run a sample query. Do the results make sense?
- Access: Is there a license agreement or any restrictions on access to the database?
- Summary judgment
- Would you direct a colleague unfamiliar with the field to use it?
- Is this a professional or "hobby" database? The "hobby" analogy means that it was that person's hobby to make the database. It could mean that it is limited in scope, done by one or a few persons, and/or seems amateur.
- Finally, please share why you chose this database in the first place, i.e., why did it interest you? Did it live up to the expectations you had when you chose it?
- Electronic curation occurs when someone writes a program to add information to a database record from another database.
- Manual curation occurs when a human reviews the information being added to a record to validate it as true.
- In-house is when the human works for the database organization.
- Community is when the database allows members of the scientific community that don't work for the database organization to add information to the record.
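The difference between electronic and manual curation defined above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the record IDs, the field names, and the `electronic_curation` and `manual_curation` helpers are invented for illustration and do not correspond to any real database.

```python
# Hypothetical records; IDs and field names are invented for illustration.
local_records = {
    "GENE001": {"name": "geneA", "function": None},
    "GENE002": {"name": "geneB", "function": None},
}

# Stand-in for a second database that a program can look up automatically.
external_db = {
    "GENE001": "putative kinase",
    "GENE002": "unknown",
}


def electronic_curation(records, source):
    """Copy annotations from another database with no human review."""
    curated = {}
    for record_id, record in records.items():
        updated = dict(record)  # copy so the original record is untouched
        if record_id in source:
            updated["function"] = source[record_id]
            updated["curation"] = "electronic"
        curated[record_id] = updated
    return curated


def manual_curation(records, approved_ids):
    """Mark the records that a human reviewer has validated."""
    curated = {}
    for record_id, record in records.items():
        updated = dict(record)
        if record_id in approved_ids:
            updated["curation"] = "manual"
        curated[record_id] = updated
    return curated


auto = electronic_curation(local_records, external_db)
reviewed = manual_curation(auto, approved_ids={"GENE001"})

for record_id, record in reviewed.items():
    print(record_id, record["function"], record["curation"])
```

In this sketch, a record keeps the "electronic" label until a human reviewer approves it, which is one simple way to think about the curated versus non-curated questions in the evaluation.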
Next week in class...
During class next week, you and your partner will give an informal presentation about your database to the rest of the class. You will go through the answers to the questions on your wiki page. You will also share your screen on Zoom and give a tour of the database as you talk through the features. We will allocate 5-10 minutes per group for this.
- Compose your journal entry in the shared Class Journal Week 8 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
- Create a header with your name, and then answer the questions in your own section of the page.
- You do not need to invoke your template on the class journal page.
- Any Acknowledgments and References you need to make should go in the appropriate sections on your individual journal page.
- Sign your portion of the journal with the standard wiki signature shortcut (four tildes, ~~~~).
- Add the category "BIOL368/F20" to the end of the wiki page (if someone has not already done so).
A set of core competencies for scientific data literacy is listed in the section below. Answer the following questions on the shared Class Journal Week 8 page:
- Which of these core competencies are being demonstrated by the individuals profiled in the two Wired articles you read?
- Which two of these core competencies are you most skilled with (or most familiar with)? Where and how did you gain these skills or become familiar with them?
- Name two of these core competencies that you want to know more about. Why?
Scientific Data Literacy Core Competencies
- Databases and Data Formats
- Understand how to query relational databases, and be familiar with data types and formats for the discipline.
- Discovery and Acquisition of Data
- Locate and utilize disciplinary data repositories, and identify appropriate data sources.
- Data Management and Organization
- Understand the lifecycle of data, and use data management plans to track subsets of processed data.
- Data Conversion and Interoperability
- Migrate data from one format to another, and understand the benefits of standard data formats.
- Quality Assurance
- Use metadata and screening procedures to recognize artifacts, incompletion, or corruption of data sets.
- Interpret metadata from external sources, and annotate data so it can be used by external users.
- Data Curation and Re-use
- Recognize the role of curation throughout the data lifecycle and its value in the effective reuse of data.
- Cultures of Practice
- Know the practices, values, and norms of the discipline as they relate to managing, sharing, and curating data.
- Data Preservation
- Understand the technology, resource, and organizational components of preserving data.
- Data Analysis
- Understand the basic analysis tools of the discipline, including workflow management tools.
- Data Visualization
- Use visualization tools of the discipline, and understand the advantages of the different types of visualization.
- Ethics, including citation of data
- Understand intellectual property, privacy, and the ethos of the discipline around sharing and citing data.
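As a small illustration of two of the competencies listed above (querying a relational database, and converting data from one format to another), here is a minimal Python sketch using only the standard library. The table name, column names, and data values are invented for illustration; a real biological database will have its own schema and access method.

```python
import json
import sqlite3

# In-memory database with a hypothetical table; all names and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE genes (id TEXT PRIMARY KEY, species TEXT, length INTEGER)"
)
conn.executemany(
    "INSERT INTO genes VALUES (?, ?, ?)",
    [
        ("GENE001", "S. cerevisiae", 1200),
        ("GENE002", "S. cerevisiae", 450),
        ("GENE003", "E. coli", 900),
    ],
)


def query_genes(connection, species, min_length):
    """Return genes for one species at or above a minimum length."""
    cursor = connection.execute(
        "SELECT id, species, length FROM genes "
        "WHERE species = ? AND length >= ? ORDER BY id",
        (species, min_length),
    )
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


rows = query_genes(conn, "S. cerevisiae", 500)

# Convert the query result to JSON, a widely used standard text format.
as_json = json.dumps(rows, indent=2)
print(as_json)
conn.close()
```

The query uses `?` placeholders rather than string formatting, so the search values are passed to the database separately from the SQL text.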