Harvard:Biophysics 101/2007/Notebook:Michael Wang/2007-2-8

I thought it might be worth trying to implement functions... I approached this problem by doing the counts each time a trial is completed. I'm not sure whether it would have been faster to run all the trials first, store them, and only then count. Alternatively, I guess you could keep track of consecutive tosses as they happen. Maybe one of the CS concentrators has an answer.
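As a rough sketch of that last idea (tracking runs as they happen), the function below walks a trial once and keeps the length of the current run. The name `count_runs_single_pass` and the `max_len` parameter are made up for illustration and are not part of the script on this page. It counts overlapping occurrences, so a run of three H's contributes two "HH" and one "HHH". I have not timed it against the `find`-based approach, so this is only one possible way to do it, not a claim that it is faster.

```python
def count_runs_single_pass(trial, search_char, max_len=10):
    """Return a list where entry k-1 is the number of (overlapping)
    occurrences of search_char repeated k times in trial."""
    counts = [0] * max_len
    run = 0  # length of the run of search_char ending at the current toss
    for ch in trial:
        if ch == search_char:
            run += 1
            # every run length k <= run has one occurrence ending here
            for k in range(1, min(run, max_len) + 1):
                counts[k - 1] += 1
        else:
            run = 0  # the run is broken
    return counts


h_counts = count_runs_single_pass("HHTHHHTTHT", "H")
t_counts = count_runs_single_pass("HHTHHHTTHT", "T")
```

To feed a 2d count matrix laid out like the one used on this page, each entry would be used as a column index, e.g. `matrix[i][h_counts[i]] += 1` for `i` from 0 to 9.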

Code:


#!/usr/bin/env python

import random
import string

# countRepeats takes a single trial passed as a string, counts the occurrences
# of each consecutive string of search_char (lengths 1 to 10), and adds them
# to the appropriate count matrix
def countRepeats(trial, search_char, matrix):
    for i in range(10):
        substr = ''.join([search_char for n in range(i+1)])  # create the substring to search for, of appropriate length
        count = 0
        pos = trial.find(substr, 0)
        while not pos == -1:
            count = count + 1
            pos = trial.find(substr, pos + 1)  # pos + 1 lets overlapping matches count
        matrix[i][count] += 1  # add one to the appropriate element in the 2d matrix

# displayCounts displays the data stored in the matrix and calculates total occurrences
def displayCounts(character, matrix):
    print
    print
    Title = "Number of trials with n occurrences of the given consecutive string of " + character
    print Title.rjust(80)
    print "n".rjust(45)
    print "String".rjust(len(matrix)),
    for m in range(len(matrix[0])):
        print repr(m).rjust(6),
    print "Total*".rjust(8)
    totals = [0,0,0,0,0,0,0,0,0,0]
    for i in range(len(matrix)):
        substr = ''.join([character for n in range(i+1)])
        print substr.rjust(len(matrix)),
        for j in range(len(matrix[i])):
            print repr(matrix[i][j]).rjust(6),
        total = 0
        for k in range(len(matrix[i])):
            total += matrix[i][k]*k  # k occurrences in each of matrix[i][k] trials
        print repr(total).rjust(7)
        totals[i] = total
    print "*Total refers to total occurrences of a particular string"
    return totals

# Initialize a list of 10 zeros
# there must be a better way to do this....
# ARGH! I originally intended to use zero_list below, but apparently it passes a pointer instead of a value...
zero_list = range(10)
for i in range(10):
    zero_list[i] = 0
print zero_list

# Initialize 2d matrix to store data
count_of_H_counts = []
count_of_T_counts = []
for j in range(10):
    count_of_H_counts.append([0,0,0,0,0,0,0,0,0,0,0])  # wanted to use zero_list here
    count_of_T_counts.append([0,0,0,0,0,0,0,0,0,0,0])  # and here

# Run Trials
random.seed()  # seed once, before the loop
for k in range(10000):  # This for loop controls the number of trials performed,
                        # where each trial is ten tosses
    # populating the trial
    temp = ""
    for l in range(10):
        if random.random() < 0.5:
            temp = temp + "H"
        else:
            temp = temp + "T"

    # call the countRepeats function to process the new temp trial and add to H/T matrix
    countRepeats(temp, 'H', count_of_H_counts)
    countRepeats(temp, 'T', count_of_T_counts)

# Call displayCounts to display the tables and calculate totals
Htotals = displayCounts('H', count_of_H_counts)
Ttotals = displayCounts('T', count_of_T_counts)

# Display the totals
print
print
print "Total Counts"
for i in range(len(Htotals)):
    print repr(i+1).rjust(5), "|", Htotals[i]+Ttotals[i]
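On the "there must be a better way" and "passes a pointer" comments in the code: a short sketch of what is probably going on. Appending the same list object ten times gives ten names for one list, so incrementing one "row" changes them all. The variable names below are illustrative only and do not appear in the script.

```python
# Eleven zeros without a loop
template = [0] * 11

# Aliased: every row is the SAME list object as template
aliased = [template for _ in range(10)]
aliased[0][3] += 1
rows_are_same_object = aliased[0] is aliased[1]  # True
leak_into_other_row = aliased[1][3]               # 1, the change shows up in every row

# Independent: the inner [0] * 11 is evaluated afresh for each row
independent = [[0] * 11 for _ in range(10)]
independent[0][3] += 1
no_leak = independent[1][3]                       # 0

# Copying an existing list with list(...) also gives independent rows
fresh = [0] * 11
copied = [list(fresh) for _ in range(10)]
copied[0][3] += 1
fresh_unchanged = fresh[3]                        # 0
```

So something like `[[0] * 11 for _ in range(10)]` could stand in for the two hand-written rows of zeros, and `list(zero_list)` would have let the original plan work, assuming `zero_list` had eleven entries to match the matrix width.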

Output:

       Number of trials with n occurrences of the given consecutive string of H
                                            n
    String      0      1      2      3      4      5      6      7      8      9     10   Total*
         H      5    102    447   1188   2065   2475   2040   1123    465     79     11   49837
        HH   1429   2356   2359   1775   1097    613    251     95     14     11      0   22234
       HHH   5025   2314   1366    739    339    148     44     14     11      0      0    9809
      HHHH   7578   1309    655    274    115     44     14     11      0      0      0    4282
     HHHHH   8922    638    256    115     44     14     11      0      0      0      0    1807
    HHHHHH   9560    256    115     44     14     11      0      0      0      0      0     729
   HHHHHHH   9816    115     44     14     11      0      0      0      0      0      0     289
  HHHHHHHH   9931     44     14     11      0      0      0      0      0      0      0     105
 HHHHHHHHH   9975     14     11      0      0      0      0      0      0      0      0      36
HHHHHHHHHH   9989     11      0      0      0      0      0      0      0      0      0      11
*Total refers to total occurrences of a particular string

       Number of trials with n occurrences of the given consecutive string of T
                                            n
    String      0      1      2      3      4      5      6      7      8      9     10   Total*
         T     11     79    465   1123   2040   2475   2065   1188    447    102      5   50163
        TT   1371   2312   2380   1806   1156    595    239    110     26      5      0   22546
       TTT   4929   2355   1444    707    345    141     48     26      5      0      0    9959
      TTTT   7556   1348    639    277    101     48     26      5      0      0      0    4292
     TTTTT   8941    627    252    101     48     26      5      0      0      0      0    1786
    TTTTTT   9568    252    101     48     26      5      0      0      0      0      0     727
   TTTTTTT   9820    101     48     26      5      0      0      0      0      0      0     295
  TTTTTTTT   9921     48     26      5      0      0      0      0      0      0      0     115
 TTTTTTTTT   9969     26      5      0      0      0      0      0      0      0      0      36
TTTTTTTTTT   9995      5      0      0      0      0      0      0      0      0      0       5
*Total refers to total occurrences of a particular string

Total Counts
    1 | 100000
    2 | 44780
    3 | 19768
    4 | 8574
    5 | 3593
    6 | 1456
    7 | 584
    8 | 220
    9 | 72
   10 | 16
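As a sanity check on the totals, here is a small sketch comparing them with what a fair coin would give on average. The reasoning (mine, not something stated above): a trial of 10 tosses has 10 - k + 1 windows of length k, and each window is all H or all T with probability 2 * (1/2)^k, so over 10000 trials the expected total is 10000 * (10 - k + 1) * 2 / 2^k, counting overlapping occurrences as the script does. The function name `expected_total` is made up for illustration; the observed numbers are copied from the Total Counts above.

```python
def expected_total(k, tosses=10, trials=10000):
    """Expected number of length-k runs (all H or all T, overlaps counted)
    summed over all trials, assuming a fair coin."""
    windows = tosses - k + 1          # starting positions for a length-k window
    return trials * windows * 2.0 / 2 ** k


# Total Counts reported above, for k = 1 .. 10
observed = [100000, 44780, 19768, 8574, 3593, 1456, 584, 220, 72, 16]

for k, obs in enumerate(observed, start=1):
    print("%2d | observed %6d | expected %9.1f" % (k, obs, expected_total(k)))
```

The k = 1 row has to match exactly, since every toss is either an H or a T. For k of 2 and up, the observed totals all come out somewhat below the fair-coin expectation (for example 44780 against 45000 for k = 2). The counts for different k are not independent of each other, because one long run contributes to every shorter length as well.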