Holcombe:Statistics: Difference between revisions

Latest revision as of 18:08, 26 January 2011

SUPA: Sydney University Perception and Action Lab

Recent members

• Alex Holcombe
• Ryo Nakayama

The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible -- Tukey, 1974

The great fun of information visualization is that it gives you answers to questions you didn’t know you had -- Ben Shneiderman

Jody Culham error bars lecture (http://psychology.uwo.ca/JodyCulham/Courses/ErrorBars_Lecture.ppt):
"Rule of thumb for 95% CIs:
If the overlap is about half of one one-sided error bar, the difference is significant at ~ p < .05
If the error bars just abut, the difference is significant at ~ p < .01
works if n >= 10 and error bars don't differ by more than a factor of 2"
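To see where those two p-values come from, here is a rough sketch rather than anything taken from the lecture itself. It assumes two independent groups whose 95% CIs have the same half-width, and it uses a large-sample normal approximation instead of the t distribution. The function name `p_from_overlap` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def p_from_overlap(overlap_fraction, confidence=0.95):
    """Approximate two-sided p-value for the difference between two
    independent means whose CIs have the same half-width.

    overlap_fraction is the overlap as a fraction of one one-sided
    error bar (0 = bars just abut, 0.5 = half a bar of overlap).
    Assumption: normal approximation, equal standard errors.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # In units of the standard error of one mean, each one-sided bar has
    # length z_crit, so the means are (2 - overlap_fraction) bars apart.
    difference = (2 - overlap_fraction) * z_crit
    # The standard error of a difference of two independent means with
    # equal standard errors is sqrt(2) times the standard error of one mean.
    z = difference / sqrt(2)
    return 2 * (1 - std_normal.cdf(z))

p_half_overlap = p_from_overlap(0.5)
p_just_abut = p_from_overlap(0.0)
print(f"half-bar overlap: p = {p_half_overlap:.3f}")
print(f"bars just abut:   p = {p_just_abut:.4f}")
```

Under these assumptions the first value comes out a little under .05 and the second a little under .01, which is the sense in which the rule of thumb works. With small n the t distribution has heavier tails, so the p-values would be somewhat larger.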

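"If events are dependent (whether causal or not), the aggregate is not going to be Gaussian." Why?

The sum of two independent random variables is distributed according to the convolution of their individual distributions. Convolving many independent distributions together is what pushes the aggregate toward a Gaussian shape (the central limit theorem). When the events are dependent, the distribution of the sum is no longer the convolution of the individual distributions, so that argument need not hold.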
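The convolution statement can be checked by hand with dice. The sketch below is only an illustration: the `convolve` helper and the dice example are not from the original page. It convolves the distribution of one fair die with itself to get the distribution of the sum of two independent dice, then again for four.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions given as
    lists, where index i holds the probability of the value i."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            # Independence: P(X = i and Y = j) = P(X = i) * P(Y = j),
            # and that outcome contributes to the sum i + j.
            out[i + j] += p_i * q_j
    return out

# One fair die: values 1..6, stored so that index == value (index 0 unused).
die = [0.0] + [1 / 6] * 6

two_dice = convolve(die, die)             # sum of two independent dice
four_dice = convolve(two_dice, two_dice)  # sum of four independent dice

for total, prob in enumerate(two_dice):
    if prob > 0:
        print(total, round(prob, 4))
```

One die is flat, two dice give a triangle peaked at 7, and four dice already look bell-shaped. The multiplication inside the loop is exactly where independence is used; with dependent variables the joint probability is not the product, so this recipe does not apply.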

Fitting curves to data

• R is often used in the lab
• Python, alone or with SciPy, can be used easily (example here)
• MATLAB is sometimes used
• MacCurveFit for OS 9 is rarely used
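As a minimal sketch of what "Python alone" can do, here is an ordinary least-squares straight-line fit written with no libraries. This is not the lab's linked example; the function name `fit_line` and the data values are hypothetical, made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: some stimulus variable and a measured lag in ms.
speeds = [1.0, 2.0, 3.0, 4.0, 5.0]
lags = [52.0, 61.0, 68.0, 81.0, 88.0]

slope, intercept = fit_line(speeds, lags)
print(f"lag = {slope:.2f} * speed + {intercept:.2f}")
```

For curves that are not linear in their parameters, such as psychometric functions, SciPy's `scipy.optimize.curve_fit` is the usual next step.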

Bootstrapping

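How I fit psychometric functions and bootstrap: see Holcombe:fit psychometric functions

Howell's pages (http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html)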
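For anyone new to the idea, here is a minimal sketch of a percentile bootstrap for the mean, in the spirit of the bootstrapped-means approach. It is not the code on the linked pages; the function name `bootstrap_ci`, its default parameters, and the data are assumptions made up for this illustration.

```python
import random
from statistics import mean

def bootstrap_ci(data, statistic=mean, n_resamples=2000,
                 confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    # Resample the data with replacement, recompute the statistic each
    # time, and sort the resulting estimates.
    estimates = sorted(
        statistic([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    )
    # Cut off the requested fraction of estimates in each tail.
    tail = (1 - confidence) / 2
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical lags in ms from ten observers.
lags = [38, 45, 51, 42, 60, 47, 55, 49, 44, 53]
lower, upper = bootstrap_ci(lags)
print(f"mean = {mean(lags):.1f} ms, 95% CI = [{lower:.1f}, {upper:.1f}]")
```

The same function works for other statistics: pass `statistic=median`, or a function that fits a psychometric function to the resampled data and returns its threshold.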

Circular Statistics

Effect size rant

In psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d. An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:

If you look up "effect size" in a psychology context, you'll find stuff like Cohen's d. Cohen's d was invented for areas of psychology where the actual raw measure of the size of the effect (in our case, the mean number of degrees per cycle of the error, or the ms of the lag) doesn't mean anything, so they had to invent a measure that scales their relatively meaningless raw measure by its variability within conditions. This can make sense in paradigms like an implicit association test, where the raw measure is the number of milliseconds difference between two conditions. Here no one actually knows what it means if the brain is 5 ms slower in doing something or 10 ms slower in doing something. So they divide the difference by some measure of the random fluctuations, to say how much bigger it is than those. So if you get something like a Cohen's d of 10, you can say wow, that's ten times bigger than the random fluctuations you get.

Fortunately, in vision science our numbers are more meaningful, in that having a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world. The same goes for the errors. These numbers have a more definite meaning than the difference in response time between two arbitrary tasks. We have the luxury of comparing them across experiments directly.

Researchers studying other things aren't that lucky. If they did three experiments, varying the task slightly from one experiment to the next, the baseline of their dependent measure might change a lot. For example, if the number of choices in a task increased, the baseline response time could increase a lot. Then the response time differences from the different experiments could not be directly compared, because in elevating the baseline, the difference you get might change (for instance, if the relationship between the underlying mental quantity and RT was not linear). In that case it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation would account for the consequences of the baseline change.
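To make the "divide by the fluctuations" idea concrete, here is a small sketch of Cohen's d for two independent groups, using the common pooled-standard-deviation form. The function name `cohens_d` and all the numbers are made up for illustration. Both comparisons below have the same raw difference of 3 ms, but the second has twice the within-condition spread.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled
    within-group standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)

# Hypothetical lags in ms. Same 3 ms raw difference in both comparisons.
quiet_a, quiet_b = [10, 12, 14], [13, 15, 17]   # within-group SD of 2 ms
noisy_a, noisy_b = [8, 12, 16], [11, 15, 19]    # within-group SD of 4 ms

d_quiet = cohens_d(quiet_a, quiet_b)
d_noisy = cohens_d(noisy_a, noisy_b)
print(d_quiet, d_noisy)  # 1.5 0.75
```

The raw effect is identical, yet d halves when the noise doubles. That is useful when the raw units mean little, and misleading when, as with our lags, the raw number in ms is the thing you actually care about.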


@@ Line 5: / Line 5: @@
 [http://psychology.uwo.ca/JodyCulham/Courses/ErrorBars_Lecture.ppt Jody Culham error bars lecture]
+"Rule of thumb for 95% CIs:
+If the overlap is about half of one one-sided error bar, the difference is significant at ~ p < .05
+If the error bars just abut, the difference is significant at ~ p< .01
+works if n >= 10 and error bars don’t differ by more than a factor of 2
+"
 "If events are dependent (whether causal or not), the aggregate is not going to be Gaussian. "- why?
+the sum of two independent random variables is distributed according to the convolution of their individual distributions
 ==Fitting curves to data==
@@ Line 14: / Line 21: @@
 *Using MacCurveFit for OS9; rarely used
 ==Bootstrapping==
+how I [[Holcombe:fit psychometric functions]] and bootstrap
 [http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html Howell's] pages
 == ==
 [[Holcombe:CircularStatistics|Circular Statistics]]
+==Effect size rant==
+in psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d.  An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:
+If you look up "effect size" in a psychology context, you'll find stuff like Cohen's d.  Cohen's d was invented for areas of psychology where the actual raw measure of the size of the effect—in our case the mean number of degrees per cycle of the error, or the ms of the lag— doesn't mean anything, so they had to invent a measure where they scale their relatively meaningless raw measure by its variability within conditions.   This can make sense in paradigms like an implicit association test ,  where the raw measure is  number of milliseconds difference between two conditions.  Here noone actually knows what it means if the brain is 5 ms slower in doing something or 10 ms slower in doing something.  So they want to divide it by some measure of how much bigger it is than random fluctuations.  So if you get something like a Cohen's d of 10, you can say wow , that's ten times bigger than the random fluctuations you get.
+Fortunately in vision science our numbers are more meaningful in that having a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world.  The errors are similar. It has a more definite meaning than the differences in response time  of two arbitrary tasks. We have the luxury of comparing these numbers across experiments directly.
+Researchers studying other things aren't that lucky- if they did three experiments, in varying the task slightly from one experiment to the next , the baseline of their dependent measure might change a lot, like if the number of choices in a task increased, the baseline response time could  increase a lot.  Then, the response time difference in the two experiments could not be directly compared because in elevating the baseline,  the difference you get might change (for instance, if the relationship between the underlying mental quantity and RT was not linear), so then it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation would  account for the consequences of the baseline change.