User:Janet B. Matsen

From OpenWetWare
Revision as of 17:49, 14 March 2014 by Janet B. Matsen (talk | contribs) (Why I love ggplot2 (and R))


Janet B. Matsen

Department of Chemical Engineering
Seattle, Washington

I am a 3rd year Chemical Engineering PhD student at the University of Washington working in Mary Lidstrom's Lab. We are engineering E. coli to make biofuel precursors from electricity and CO2 using a metabolic pathway that doesn't exist in nature. Success would enable production of biofuel from renewable electricity. Getting the pathway to work in living cells has been challenging. We are combining metabolic engineering, synthetic biology, metabolomics, enzyme engineering, and directed evolution in E. coli and a novel methylotroph to achieve this goal.

My first year was spent investigating methanotrophic metabolism in pure cultures and a model ecosystem in a team that combined transcriptomics, metabolomics, and single-cell observation.


I started the Lidstrom Lab OWW wiki and love posting what I learn! It has been a lot of fun to record what I have learned about lab techniques, and my pages are viewed by many scientists outside the lab. This wiki is also a fun place for me to share tips and tricks, as well as results of experiments that probe dogma in experimental techniques.

Research Interests

  • production of chemicals using microbes
  • metabolic engineering
  • synthetic biology
  • transcriptomics
  • chemical engineering


Education

PhD (in progress) University of Washington, Seattle
B.S. University of California, Berkeley
  • Chemical Engineering, 2010


Publications

  1. Matsen, Yang, Stein, Beck, & Kalyuzhnaya. Global molecular analyses of methane metabolism in methanotrophic alphaproteobacterium, Methylosinus trichosporium OB3b. Part I: transcriptomic study. Frontiers in Microbiology (open access), 2013
  2. Yang, Matsen, Konopka, Green-Saxena, Clubb, Sadilek, Orphan, Beck, & Kalyuzhnaya. Global molecular analyses of methane metabolism in methanotrophic Alphaproteobacterium, Methylosinus trichosporium OB3b. Part II. metabolomics and 13C-labeling study. Frontiers in Microbiology (open access), 2013

Awards & Activities

  • 2012 honorable mention for the National Science Foundation's Graduate Research Fellowship Program


  • 2011-present Outreach Coordinator for the Puget Sound chapter of the American Institute of Chemical Engineers
    • Leading a mentoring project with 8 chemical engineering mentors and 8 students from the Technology Access Foundation Academy in Kent, WA.
  • 2010-2011 Outreach Coordinator for the University of Washington chapter of the American Chemical Engineering Society
    • Organized two half-day events and one all-day event for students from MESA, the Math, Engineering, Science Achievement organization of Washington, involving 60 volunteer-hours and resulting in 660 student-hours of outreach to disadvantaged minority students.
  • Misc. outreach:
    • Gave a presentation to high school students describing statistical challenges associated with transcriptomics research.
    • Hosted a booth at Engineering Discovery Days at University of Washington, engaging and educating the public about chemical engineering.

My Personal Pages

Tools to Share

APE annotation library generator & list of primers to share with our lab

  • Ape Annotation Feature Library Creator
    • This is an R script that converts the info in my list of primers into a file that I can use to annotate DNA files in APE. It:
      • trims out sequences not intended for sequencing, such as Gibson assembly primers
      • makes a label that combines the unique primer number, the melting temperature, the letter F or R for forward or reverse, and an asterisk if you should consult the primer spreadsheet comments before using it
      • assigns colors in APE that communicate whether the primer primes in the forward direction or the reverse direction
      • saves the info in the format APE needs, with the date it was generated in the title
    • This allows me to instantly see where all of the primers I own bind to a DNA sequence for a given project I am working on. It also allows me to share these primers very easily: with the file it outputs, my lab mates can instantly see if I have any primers that can be used in their project. It has been very handy for them!
    • I am happy to help friends modify this script to be useful with their own primer libraries! No R experience is necessary.
    • Anyone can access my most current primer "Annotation Feature Library" here. You can also see the files used to generate it there.
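
To give a feel for the steps listed above, here is a minimal sketch in base R. It is not the actual script: the primer table, its column names, the label layout, and the output file name are all invented for illustration, and the real APE feature-library file format is not reproduced here (a plain tab-separated file stands in for it).

```r
# Hypothetical primer table; every column name and value is made up.
primers <- data.frame(
  number    = c(1, 2, 3),
  sequence  = c("ATGGCTAGCAAAGGAGAAG", "TTACTTGTACAGCTCGTCC",
                "GGTCTCAATGCGTACCGATTGACC"),
  tm        = c(55, 57, 62),
  direction = c("F", "R", "F"),
  purpose   = c("sequencing", "sequencing", "Gibson"),
  comments  = c("", "check annealing temperature", ""),
  stringsAsFactors = FALSE
)

# Step 1: drop primers that are not meant for sequencing
# (for example Gibson assembly primers).
keep <- primers[primers$purpose == "sequencing", ]

# Step 2: build a label from the primer number, the melting temperature,
# F or R, and an asterisk when the comments column is not empty.
keep$label <- paste0(keep$number, "_", keep$tm, "C_", keep$direction,
                     ifelse(nzchar(keep$comments), "*", ""))

# Step 3: put the generation date in the file name. A temporary
# directory is used so the sketch does not clutter the working folder.
out_file <- file.path(tempdir(),
                      paste0("primer_features_",
                             format(Sys.Date(), "%Y-%m-%d"), ".txt"))

# Step 4: write label and sequence as tab-separated text. This is a
# stand-in for the real APE layout, which is an assumption here.
write.table(keep[, c("label", "sequence")], out_file,
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
```

Adapting something like this to another primer list would mostly mean changing the column names in the first three steps.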

Use notes

  • If the primer binds in the forward direction, the primer will be light gray
  • If the primer binds in the reverse direction, it will be dark gray
  • If the primer binds in the direction opposite to the one stated in my primer table, it will appear red. (If it says F in the primer name, it is acting as a reverse primer, and vice versa.)
demo of APE primer library tool
  • Examples:
    • Primer 7 is VF2 in BioBricks. Primer 60 is its reverse complement. In a BioBrick vector, primer 7 appears light gray and primer 60 appears dark gray. pCM66 happens to have this same sequence in the region upstream from the multiple cloning site, except it is REVERSED. Both primers will appear red there, as they bind in the direction opposite to the one expected.
    • I designed some primers for a Kan cassette. The Kan cassette in pCM66 is read in the reverse direction, so all the primers built for a forward Kan cassette appear red.
      Kan primers binding in the opposite direction relative to my database appear red
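
The color rules above can be sketched in base R as follows. This is an illustration under assumptions, not the logic of the actual script or of APE itself: the function names and the sequences are invented, and a primer is treated as "forward" when its sequence appears verbatim in the template's top strand and as "reverse" when its reverse complement does.

```r
# Reverse complement of a DNA string (A, C, G, T only).
reverse_complement <- function(seq) {
  comp <- chartr("ACGT", "TGCA", toupper(seq))
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# Hypothetical helper: pick a display color for one primer on one template.
# 'stated' is the direction ("F" or "R") recorded in the primer table.
primer_color <- function(primer, stated, template) {
  template <- toupper(template)
  if (grepl(toupper(primer), template, fixed = TRUE)) {
    observed <- "F"
  } else if (grepl(reverse_complement(primer), template, fixed = TRUE)) {
    observed <- "R"
  } else {
    return(NA_character_)  # the primer does not bind this template
  }
  if (observed != stated) {
    return("red")          # binds opposite to the direction in the table
  }
  if (observed == "F") "light gray" else "dark gray"
}

# Made-up template, plus the same region read in the other direction.
template <- "ATGGCTAGCAAAGGAGAAGAACTTTTC"
flipped  <- reverse_complement(template)

color_forward <- primer_color("GCTAGCAAAGG", "F", template)
color_reverse <- primer_color("GTTCTTCTCC", "R", template)
color_flipped <- primer_color("GCTAGCAAAGG", "F", flipped)
```

With these inputs the forward primer comes out light gray, the reverse primer dark gray, and the forward primer on the flipped template red, which mirrors the pCM66 situation described in the examples.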

Skills I'm developing

  • molecular biology
  • enzyme assays
  • mass spectrometry based metabolomics
  • R & ggplot2
  • Inkscape
  • Gibson cloning

Why I love ggplot2 (and R)

R is a very easy language for people with programming experience to pick up, and it is one of the easiest for people without experience as well. For me, it is the best language for noodling with data and doing statistics.

R was created at the University of Auckland, and one of its creators, Robert Gentleman, later worked at the Hutch. It is a "big deal" worldwide. ggplot2 is a more recent package that can be used within R. To get a sense of its power, just type "ggplot2" into Google Images. The book that introduces the fundamentals is freely available online.

I like to use ggplot2 for two main reasons: (1) Layers. Imagine you have defined a plot called p in a program. If you want to add anything to the plot, you just say p + thing. You can layer in data, aesthetics, statistics, etc., and each layer can use a different type of geometry. You can also make one base plot, then make a bunch of variants of it by adding different layers of interest. It is hard to imagine going back once you have this freedom. (2) Facets. Biological data is complex! Experimental data is complex too. ggplot2 can help you plot complex data by spatially separating out variables into side-by-side panels.
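
Here is a small sketch of both ideas. The growth-curve numbers, the column names, and the strain and medium labels are all invented for illustration; only the ggplot2 functions themselves (ggplot, aes, geom_point, geom_line, scale_y_log10, facet_wrap) are real.

```r
library(ggplot2)

# Invented data: two strains in two media, measured hourly.
growth <- expand.grid(
  time_h = 0:5,
  strain = c("strain A", "strain B"),
  medium = c("medium 1", "medium 2")
)
rate <- ifelse(growth$strain == "strain A", 0.5, 0.3) *
  ifelse(growth$medium == "medium 1", 1, 0.8)
growth$od600 <- 0.05 * 2^(growth$time_h * rate)

# Base plot: data, aesthetics, and one point layer.
p <- ggplot(growth, aes(x = time_h, y = od600,
                        colour = strain, linetype = medium)) +
  geom_point()

# Layers: each variant is just "p + thing".
p_lines <- p + geom_line()          # add a line layer on top of the points
p_log   <- p_lines + scale_y_log10()  # same plot on a log y axis

# Facets: one panel per medium, so the variables are separated in space.
p_facets <- p_lines + facet_wrap(~ medium)
```

Printing p, p_lines, p_log, or p_facets at the R console draws the corresponding plot; the base plot p is never modified, which is what makes it easy to keep several variants around.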

Having scripts that I can recycle when doing similar experiments allows me to do in-depth quality checks and make summary statistics that would be impractical to do with Excel. The quality check plots are automatically generated for each data set. Then I add plots that are specific to the questions I was investigating with my experiment. This series of plots tells a story about how the experiment was performed, what variables were important, and what the key findings are. I can compare elements of different experiments by comparing similar plots that are generated (almost) automatically for each experiment with (almost) identical scripts. Here is a sample folder from an experiment I did recently to give a small sense of this.