User:Matthew Whiteside/Notebook/Bayesian Networks

From OpenWetWare

Project Description

Bayesian networks (BNs) are an area I want to develop a working knowledge of. The aim of this project is to identify particular research directions that employ BNs or extend BN work.

Project Goals:

  • Review BN literature
  • Identify state-of-the-art algorithms and tools
  • Discover applications of BNs in life sciences

Literature Review

Review of Bayesian Basics

Copied from Weng-Keen Wong, 2005.

Probability Primer

  • Random variable
Refers to an event with some degree of uncertainty about its outcome
  • Probability
The relative frequency with which an outcome occurs if the event is repeated a large number of times under similar conditions
"Bayesian" definition: probability is degree of belief in an outcome
  • Conditional probabilities P(A = true | B = true)
Out of all outcomes in which B is true, the fraction that also have A equal to true. Read: "probability of A conditioned on B" or "probability of A given B"
H = "have a headache"
F = "coming down with flu"
P(H = true) = 1/10, ..... P(F = true) = 1/40, ..... P(H = true | F = true) = 1/2
"Headaches are rare, flu is rarer, but if you're coming down with flu, there's a 50-50 chance you'll have a headache"
  • Joint probability distribution P(A = true, B = true)
The probability of A=true and B=true.
P(H=true|F=true) = P(H=true,F=true)/P(F=true), i.e. the probability that both occur divided by the probability of the variable being conditioned on
Can be any number of random variables e.g. P(A=true, B=true, C=true)
For every combination of variables, need to know how probable that combination is
A B C P(A,B,C)
F F F 0.1
F F T 0.2
. . . .
The probabilities of these combinations need to sum to 1. Once you have the joint probability distribution, you can calculate any probability involving A, B and C.
E.g. P(A=true) = sum of P(A,B,C) in rows with A=true
P(A=true, B=true | C=true) = P(A=true, B=true, C=true) / P(C=true)
For k boolean random variables you need a table of size 2^k
Independence reduces the number of table entries.
  • Independence
For n coin flips with joint distribution P(C1,...,Cn): if the coin flips are not independent, you need 2^n table entries
If independent, then P(C1,...,Cn) = Π_{i=1}^{n} P(Ci)
Each P(Ci) has two table entries, for a total of 2n (that is, 2 × n) values
  • Conditional independence
A and B are conditionally independent given C if any of the following holds:
  1. P(A,B|C) = P(A|C)P(B|C)
  2. P(A|B,C) = P(A|C)
  3. P(B|A,C) = P(B|C)
Knowing C tells me everything about B. I don't gain anything from knowing A. Two possibilities: A doesn't influence B, or C provides all information that A would provide.
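
The joint-table calculations in this primer can be sketched in Python using the headache/flu numbers from the example. The four table entries are derived from P(H), P(F) and P(H | F) as given above; the variable and function names are illustrative only.

```python
from fractions import Fraction

# Numbers from the headache/flu example.
p_h = Fraction(1, 10)         # P(H = true)
p_f = Fraction(1, 40)         # P(F = true)
p_h_given_f = Fraction(1, 2)  # P(H = true | F = true)

# Joint table over (H, F), derived from the three numbers above.
p_hf = p_h_given_f * p_f      # P(H = true, F = true)
joint = {
    (True, True): p_hf,
    (True, False): p_h - p_hf,
    (False, True): p_f - p_hf,
    (False, False): 1 - p_h - p_f + p_hf,
}

def prob(table, h=None, f=None):
    """Sum the rows of the joint table that match the given values.

    A value of None means 'any value', i.e. that variable is summed out.
    """
    total = Fraction(0)
    for (row_h, row_f), p in table.items():
        if h is not None and row_h != h:
            continue
        if f is not None and row_f != f:
            continue
        total += p
    return total

# Two boolean variables need a table of 2**2 rows that sum to 1.
table_rows = len(joint)
table_sum = sum(joint.values())

# Marginal: P(H = true) is the sum of the rows with H = true.
marginal_h = prob(joint, h=True)

# Conditional: P(H = true | F = true) = P(H = true, F = true) / P(F = true).
conditional_h_given_f = prob(joint, h=True, f=True) / prob(joint, f=True)

print(marginal_h, conditional_h_given_f)  # 1/10 1/2
```

Exact fractions are used so that the recovered values match the example without floating-point rounding.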

Bayesian Networks

A Bayesian network is made up of

  1. A directed acyclic graph
  2. Conditional probability distribution tables, one for each node

These tables contain the conditional probability distribution P(Xi | Parents(Xi)) for each node Xi in the graph. This only includes the immediate parents, not more distant ancestors. If a boolean node has k boolean parents, its table has 2^(k+1) probabilities (but because the probabilities for each parent assignment sum to 1, only 2^k need to be stored).
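
As a rough illustration of the table size, a table for a boolean node with k boolean parents can be stored as a mapping from each parent assignment to P(Xi = true | parents), with the false case recovered as the complement. The node, its parents "A" and "B", the helper names and all numbers below are hypothetical.

```python
from itertools import product

def make_cpt(parents, p_true):
    """Build a table for a boolean node with boolean parents.

    parents: list of parent names.
    p_true:  P(node = true | assignment), one value per parent assignment,
             in the order produced by itertools.product([True, False], ...).
    """
    assignments = list(product([True, False], repeat=len(parents)))
    if len(p_true) != len(assignments):
        raise ValueError("need one probability per parent assignment")
    return dict(zip(assignments, p_true))

def lookup(cpt, value, parent_values):
    """Return P(node = value | parents = parent_values)."""
    p = cpt[tuple(parent_values)]
    return p if value else 1 - p

# Hypothetical node with k = 2 parents, A and B.
cpt = make_cpt(["A", "B"], [0.9, 0.6, 0.3, 0.1])

stored_entries = len(cpt)      # 2**k: only the 'true' column is stored
full_entries = 2 * len(cpt)    # 2**(k+1): 'true' and 'false' columns

print(stored_entries, full_entries)      # 4 8
print(lookup(cpt, True, (True, False)))  # 0.6
```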


A Bayesian network has two useful properties:

  1. It encodes conditional independence between variables in the graph structure
  2. It is a compact representation of the joint probability distribution over the variables

Conditional independence (or Markov condition): given its parents, a node X is conditionally independent of its non-descendants. Using this Markov condition we can compute the joint probability distribution over all variables in the BN using:

P(X1=x1,...,Xn=xn) = Π_{i=1}^{n} P(Xi=xi | Parents(Xi))
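
This factorization can be sketched for a small hypothetical network Flu -> Headache, Flu -> Fever. The Flu and Headache numbers follow the earlier example, with P(H = true | F = false) = 7/78 derived from P(H = true) = 1/10. The Fever node and its probabilities are assumptions added purely for illustration, as are the data layout and function name.

```python
from fractions import Fraction
from itertools import product

# Each node maps to (parents, table), where the table maps a tuple of
# parent values to P(node = true | parents).
network = {
    "Flu": ([], {(): Fraction(1, 40)}),
    "Headache": (["Flu"], {(True,): Fraction(1, 2),
                           (False,): Fraction(7, 78)}),
    # Assumed values, not from the source example.
    "Fever": (["Flu"], {(True,): Fraction(9, 10),
                        (False,): Fraction(1, 20)}),
}

def joint_probability(net, assignment):
    """P(X1=x1, ..., Xn=xn) as the product of P(Xi=xi | Parents(Xi))."""
    result = Fraction(1)
    for node, (parents, table) in net.items():
        p_true = table[tuple(assignment[p] for p in parents)]
        result *= p_true if assignment[node] else 1 - p_true
    return result

p_all_true = joint_probability(
    network, {"Flu": True, "Headache": True, "Fever": True}
)

# Sanity check: the joint probabilities of all 2**3 assignments sum to 1.
names = list(network)
total = sum(
    joint_probability(network, dict(zip(names, values)))
    for values in product([True, False], repeat=len(names))
)

print(p_all_true, total)  # 9/800 1
```

The three tables store 1 + 2 + 2 = 5 numbers, compared with the 2^3 - 1 = 7 needed for an unrestricted joint table over three boolean variables.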

Inference: Using a BN to compute probabilities is called inference. The usual form of a query is P(X | E), where X = query variables and E = evidence variables. Exact inference is feasible in small to medium-sized networks; large networks require approximate techniques. A query can also leave many variables unobserved, i.e. appearing in neither X nor E.
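
For a small network, exact inference can be sketched as inference by enumeration: sum the joint probability over every unobserved variable, then normalize. This is one simple exact method among several, and its cost grows exponentially with the number of unobserved variables. The hypothetical network is Flu -> Headache, Flu -> Fever; the Flu and Headache numbers come from the earlier example, the Fever numbers are assumed, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each node maps to (parents, table of P(node = true | parent values)).
network = {
    "Flu": ([], {(): Fraction(1, 40)}),
    "Headache": (["Flu"], {(True,): Fraction(1, 2),
                           (False,): Fraction(7, 78)}),
    # Assumed values, not from the source example.
    "Fever": (["Flu"], {(True,): Fraction(9, 10),
                        (False,): Fraction(1, 20)}),
}

def joint_probability(net, assignment):
    """Product of P(Xi=xi | Parents(Xi)) over all nodes."""
    result = Fraction(1)
    for node, (parents, table) in net.items():
        p_true = table[tuple(assignment[p] for p in parents)]
        result *= p_true if assignment[node] else 1 - p_true
    return result

def query(net, target, evidence):
    """P(target = true | evidence) by enumerating unobserved variables."""
    hidden = [n for n in net if n != target and n not in evidence]
    weights = {}
    for target_value in (True, False):
        total = Fraction(0)
        # Sum out every combination of the unobserved variables.
        for values in product([True, False], repeat=len(hidden)):
            assignment = dict(evidence)
            assignment[target] = target_value
            assignment.update(zip(hidden, values))
            total += joint_probability(net, assignment)
        weights[target_value] = total
    # Normalize so the two target values sum to 1.
    return weights[True] / (weights[True] + weights[False])

# X = Flu, E = {Headache = true}; Fever is unobserved and summed out.
p_flu_given_headache = query(network, "Flu", {"Headache": True})
print(p_flu_given_headache)  # 1/8
```

With no evidence, query(network, "Headache", {}) recovers the marginal P(H = true) = 1/10 from the example.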

Design: Either a domain expert designs the BN structure, or you can try to learn it from data.