McClean: Plotting Stacked Histograms

From OpenWetWare


Revision as of 18:17, 17 July 2013


Summary

This protocol explains the basics of plotting histograms stacked vertically. Stacking lets you see a shift in a distribution, for instance in the fluorescence of a population of cells analyzed by flow cytometry.

Example

Your data could be anything. In this example, the variable "Data" contains five rows, each of which contains 9000 fluorescence readings from a FACS experiment. Each row represents a timepoint, with induction of GFP increasing with time.
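If you do not have the example "Data.mat" at hand, a stand-in with the same shape can be generated. The sketch below is only an assumption about what such data might look like: the log-normal shape, the centers, and the spread are made up for illustration and are not taken from the original experiment. It writes "Data.mat" only if no file of that name exists yet.

```matlab
% Hypothetical stand-in for Data.mat: 5 timepoints x 9000 cells.
% Fluorescence is drawn from log-normal distributions whose centers
% increase with time (centers and spread are made-up values).
rng(1);                                          % fix the seed so the example is repeatable
nTimepoints  = 5;
nCells       = 9000;
log10Centers = linspace(1.5, 2.8, nTimepoints);  % assumed centers, in log10 units
log10Spread  = 0.2;                              % assumed spread, in log10 units

Data = zeros(nTimepoints, nCells);
for i = 1:nTimepoints
    Data(i,:) = 10.^(log10Centers(i) + log10Spread*randn(1, nCells));
end

% Save under the variable name the example expects, without overwriting real data
if ~exist('Data.mat', 'file')
    save('Data.mat', 'Data');
end
```

Run this once before the preliminaries; the rest of the example then works unchanged on the synthetic rows.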


%% Preliminaries:

close all; clear all;
load('Data.mat')

Choose bins (you probably want to use the same bins for every histogram, since you will be stacking them along the same y-axis) and then bin your data using the Matlab "hist" command. We also keep track of each distribution's mean, since we use the means to color the histograms later.

%Set up bins (we are making histograms of flow cytometry data so we chose logarithmically spaced bins):
bins=logspace(0,4,60);
x=bins;

%Bin the data using "hist" and keep track of the number of elements "n" in each bin "x" for each row in "Data".  Also keep track of the mean of each row of "Data":

HistData=[];  
Means=[];

for i=1:5
    [n,x]=hist(Data(i,:),x);
    HistData=[HistData; n./sum(n)];
    Means=[Means mean(Data(i,:))];
end
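To see what "hist" returns and what the normalization does, here is a small worked example on toy numbers (the data and bin centers are made up for illustration). Note that the value 10 lies beyond the last bin center and is still counted in the last bin, because the first and last bins of "hist" are open-ended. In the example above, this means any reading outside the range of "bins" ends up in an end bin.

```matlab
% Toy data and three bin centers (values made up for illustration)
toyData    = [1 2 2 3 3 3 10];
toyCenters = [1 2 3];

% hist returns the count in each bin (n) and the bin centers (c)
[n, c] = hist(toyData, toyCenters);

% Normalize so the bins sum to 1 (fraction of readings per bin)
frac = n ./ sum(n);

disp(n)      % counts per bin
disp(frac)   % fraction of readings per bin
```

Here n is [1 2 4], so the fractions are 1/7, 2/7 and 4/7.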

We set up a colormap so that our histograms change in color as the mean of their distribution increases:


%% Define a colormap for the histograms that will make the histograms brighter as the mean of the distribution increases

% In this case we chose to make the histograms brighter green at higher
% mean values since the flow cytometry data is of GFP.
 
%Define a color map
MMColorMap=zeros(5,3);

%Define colors so that they scale with the difference between the mean
%fluorescence at a given timepoint and the mean at time 0


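MMdiff=Means-Means(1);          %difference from the mean at time 0
MMdiff=max(MMdiff,0);           %guard: a negative difference would give an invalid color
MMdiff=MMdiff./(max(MMdiff));   %scale so the largest difference is 1


MMColorMap(1:end,2)=MMdiff;     %green channel only; red and blue stay 0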

%Set up the figure and axis properties:
h=figure; hold on;

set(gca,'XScale','log')
set(gca,'XLim',[10,2000])
set(gca,'PlotBoxAspectRatioMode','manual')
set(gca,'PlotBoxAspectRatio',[1 3 1])
set(gca,'FontSize',12)
set(gca,'XTick',[100 1000 10000 100000])
set(gca,'YTick',[0 1])
ylabel('Fraction of Cell Population','FontSize',14)
xlabel('Fluorescence [a.u.]','FontSize',14)
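The color scaling used above can be checked by hand. The sketch below applies the same arithmetic to a made-up vector of means (the numbers are hypothetical, chosen so the result is easy to verify): the first timepoint maps to black, and the timepoint with the largest mean maps to pure green.

```matlab
% Hypothetical mean fluorescence at five timepoints (made-up values)
exampleMeans = [100 150 300 500 900];

% Difference from the first timepoint, scaled so the largest difference is 1
greenLevel = exampleMeans - exampleMeans(1);
greenLevel = greenLevel ./ max(greenLevel);

% One RGB row per histogram: red and blue stay 0, green carries the scaled value
exampleMap = zeros(numel(exampleMeans), 3);
exampleMap(:, 2) = greenLevel(:);

disp(exampleMap)
```

The green channel comes out as 0, 0.0625, 0.25, 0.5 and 1. Because the scaling is relative to the first row, it assumes the first timepoint has the smallest mean.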


Plot the histograms along the y-axis. We choose the spacing variable empirically so that the plot "looks good":


spacing=.15;  %Spacing along the y-axis chosen empirically 

for i=1:5
    %Closed outline: left to right along the histogram, then right to left along its baseline
    fill([x(:); flipud(x(:))],[(HistData(i,:)+i*spacing)'; ones(length(x),1)*i*spacing],MMColorMap(i,:),'LineStyle','none')
    semilogx(x,HistData(i,:)+i*spacing,'LineWidth',3,'Color','k');
end
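The filled area under each histogram is a polygon handed to "fill". One way to build such an outline, shown here on a toy curve with made-up values, is to trace the curve from left to right and then return from right to left along the baseline, so that the outline closes without crossing itself. The variable names are illustrative only.

```matlab
% Toy curve on three x positions (values made up for illustration)
xs       = [1 10 100];
heights  = [0.20 0.50 0.30];
baseline = 0.15;

% Trace the curve left to right, then return right to left along the baseline
px = [xs, fliplr(xs)];
py = [heights + baseline, baseline * ones(size(xs))];

% Draw into a hidden figure; drop 'Visible','off' and the close call to look at it
fig = figure('Visible', 'off');
fill(px, py, [0 0.6 0], 'LineStyle', 'none');
set(gca, 'XScale', 'log');
close(fig);
```

The x-coordinates of the outline are 1, 10, 100, 100, 10, 1, and the last three y-coordinates all sit on the baseline at 0.15.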

Image:ExampleStackedHistograms.png

Save your figure in a variety of formats for later use (recall that we made h our figure handle):

saveas(h,'ExampleStackedHistograms','fig')
saveas(h,'ExampleStackedHistograms','png')
saveas(h,'ExampleStackedHistograms','ai')
saveas(h,'ExampleStackedHistograms','pdf')
 

Code

You can copy and paste the code below into a Matlab m-file to run all of the examples shown above. You will also need the "Data.mat" example data:

%% Preliminaries:

close all; clear all;
load('Data.mat')

%% Define the bins to use for our data (you will need to adjust this depending on your data):

%In this case we are using the same bins for each data set.  You probably
%want to do this when you are plotting stacked histograms.

bins=logspace(0,4,60);
x=bins;

%% Bin your data using Matlab's "hist" function.

%The variable "n" will be the number in each bin described by the variable
%"x".  HistData will become a matrix of the normalized bins (normalized to
%the total number of elements).  Means will become a vector of the mean
%value for each distribution, which we will use when coloring our
%histograms (so that colors roughly correspond to the mean of the
%distribution).

HistData=[];  
Means=[];

for i=1:5
    [n,x]=hist(Data(i,:),x);
    HistData=[HistData; n./sum(n)];
    Means=[Means mean(Data(i,:))];
end



%% Define a colormap for the histograms that will make the histograms brighter as the mean of the distribution increases

% In this case we chose to make the histograms brighter green at higher
% mean values since the flow cytometry data is of GFP.
 
%Define a color map
MMColorMap=zeros(5,3);

%Define colors so that they scale with the difference between the mean
%fluorescence at a given timepoint and the mean at time 0


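MMdiff=Means-Means(1);          %difference from the mean at time 0
MMdiff=max(MMdiff,0);           %guard: a negative difference would give an invalid color
MMdiff=MMdiff./(max(MMdiff));   %scale so the largest difference is 1


MMColorMap(1:end,2)=MMdiff;     %green channel only; red and blue stay 0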

%Set up the figure and axis properties:
h=figure; hold on;

set(gca,'XScale','log')
set(gca,'XLim',[10,2000])
set(gca,'PlotBoxAspectRatioMode','manual')
set(gca,'PlotBoxAspectRatio',[1 3 1])
set(gca,'FontSize',12)
set(gca,'XTick',[100 1000 10000 100000])
set(gca,'YTick',[0 1])
ylabel('Fraction of Cell Population','FontSize',14)
xlabel('Fluorescence [a.u.]','FontSize',14)

%% Plot the histograms along the y-axis

spacing=.15;  %Spacing along the y-axis chosen empirically 

for i=1:5
    %Closed outline: left to right along the histogram, then right to left along its baseline
    fill([x(:); flipud(x(:))],[(HistData(i,:)+i*spacing)'; ones(length(x),1)*i*spacing],MMColorMap(i,:),'LineStyle','none')
    semilogx(x,HistData(i,:)+i*spacing,'LineWidth',3,'Color','k');
end


%% Save the histogram figure
saveas(h,'ExampleStackedHistograms','fig')
saveas(h,'ExampleStackedHistograms','png')
saveas(h,'ExampleStackedHistograms','ai')
saveas(h,'ExampleStackedHistograms','pdf')

Notes

Please feel free to post comments, questions, or improvements to this protocol. Happy to have your input!

  • Megan N McClean 17:27, 11 June 2012 (EDT): There are probably more elegant ways of doing this, but this solution has worked well for me so far. Please feel free to update and add information as you figure out better ways of doing this.

