First, correcting a few misconceptions:
A "p <= .05" should not be read as "less than a 5% chance that the results are due to chance."
A "p <= .05" also doesn't mean "a 5% chance of a false positive."
What it does mean is this - if there actually is no difference between the control and test groups (if the "null hypothesis" is true), how RARE is the data set that I have from my tests?
For example, if my study finds that adding a shot of whisky to each gallon of bee feed kills 99% of varroa in 3 days, "p <= .05" tells us that the results of my study would be very rare in a hypothetical world where whisky has no effect. So, data like mine would be highly unlikely to turn up through random chance alone. Put another way, there is a less than 5% chance of seeing THESE RESULTS OR BETTER (such as a 99.7% kill rate rather than 99.0%) if whisky actually does nothing.
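For anyone who wants to see the "these results or better" idea as arithmetic, here is a rough sketch in Python. The numbers are made up for illustration and come from no real study: 20 treated colonies, 15 of which show fewer mites afterwards, and a null hypothesis under which each colony is a 50/50 coin flip. The function name binomial_tail is my own.

```python
from math import comb

def binomial_tail(n, k, p0):
    """P(X >= k) for X ~ Binomial(n, p0).

    This is the chance of seeing k or more "successes" out of n
    when the null hypothesis (success probability p0) is true.
    """
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k, n + 1))

# Hypothetical numbers, not from any real study:
# 20 treated colonies, 15 of them show a lower mite count afterwards.
# Null hypothesis: whisky does nothing, so each colony is a coin flip (p0 = 0.5).
p_value = binomial_tail(20, 15, 0.5)
print(f"p = {p_value:.4f}")  # prints p = 0.0207
```

That 0.0207 is the fraction of "whisky does nothing" worlds that would still hand me 15 or more improved colonies out of 20. It is a statement about the data given the null, not the probability that the null is true.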
This is just talking about the data - it says nothing about how well designed the experiment is, or how well controlled, or if the data has been cherry-picked from a larger pool of results that were silently ignored by someone playing fast and loose, and tossing out outlier points. Assuming that no major errors were made, this makes most all results with "p <= .08" highly reliable results. The selection of ".05" as a threshold is arbitrary and seems capricious for the sort of things being measured in "bee science" and much of the rest of animal husbandry.
What happens in the rest of science? "p <= .05" is not good enough by a long shot. In genetics, you find statements like "p < 0.00000005" http://doi.org/10.1038/ng.3869 (I don't think I've ever seen a harder flex on p-values anywhere.)
In physics, the threshold for particle work is two separate data sets (run the whole experiment twice) that meet a "five-sigma" criterion (about a one in three and a half million chance of getting such data from "noise" alone), so "p < 0.0000003".
The "p <= .05" rating could be described as a "two sigma" result. Physicists with "three sigma" datasets ("p < 0.003") call that "evidence", but reserve the term "discovery" for five sigma datasets. It's a little easier in physics: if what we are looking at is real, it will also be consistent. Living things are far more variable in their reaction to everything, and harder to measure.
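The sigma-to-p conversions can be checked with nothing more than the normal distribution's tail. Below is a small sketch using Python's standard library; the function names are mine. It assumes a plain Gaussian, and it shows both the one-sided and two-sided tails, since the "one in three and a half million" figure for five sigma matches the one-sided tail while "p < 0.003" for three sigma and ".05" for two sigma match the two-sided one.

```python
from math import erfc, sqrt

def one_sided_p(sigma):
    """Chance of landing at least `sigma` standard deviations ABOVE the mean."""
    return 0.5 * erfc(sigma / sqrt(2))

def two_sided_p(sigma):
    """Chance of landing at least `sigma` standard deviations away in EITHER direction."""
    return erfc(sigma / sqrt(2))

for s in (2, 3, 5):
    print(f"{s} sigma: one-sided p = {one_sided_p(s):.3g}, "
          f"two-sided p = {two_sided_p(s):.3g}")

# How many "noise only" experiments per one five-sigma fluke (one-sided)?
print(f"five sigma is about 1 in {1 / one_sided_p(5):,.0f}")
```

Run it and two sigma comes out near 0.046 two-sided, three sigma near 0.0027 two-sided, and five sigma near 0.00000029 one-sided.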
For example, a muon's "g-factor" is the ratio of a muon's magnetic moment to its spin. Our current understanding of particles says that this should be "2.00233183620(86)". (The parens show the uncertainty in the last digits.) Brookhaven Lab on Long Island measured it to be slightly higher in 2001. But this was higher enough to mean that most everything we think we know about particles has to be slightly wrong at a very basic level, the kind of wrong that would "rip up textbooks". Technology allowed for a more precise measurement, and Fermilab could sustain a stronger beam, so the 50-foot diameter superconducting magnet was trucked from Long Island to Chicago in 2013, and after 4 years of "some assembly required", the same process was run, keeping the magnet at -450 F from 2017 until 2021.
Slightly lower values were measured, but still "higher enough" to fundamentally change "how stuff works" at very basic levels. The average of the Brookhaven and Fermilab measurements is "2.00233184122(82)". So everything up through the 7th digit to the right of the decimal point is still in agreement with theory, but the difference beyond that is still an indication that the "Standard Model" has severe problems. The difference from theory has a significance of 4.2 sigma, just under the 5 sigma ("5 standard deviations") that one is required to show to claim "a discovery", but this is still very compelling evidence that we've got something new going on, and that the standard model.... isn't.
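The 4.2 sigma figure can be reproduced from the two numbers quoted above. This sketch assumes the two uncertainties are independent and can be combined in quadrature (my assumption, the usual textbook shortcut, not necessarily how the collaboration did it). To dodge floating-point fuzz, everything is kept as integers in units of the last quoted digit (1e-11).

```python
from math import sqrt

# Values in units of 1e-11, i.e. 2.00233183620 -> 200233183620.
theory, theory_unc = 200233183620, 86        # 2.00233183620(86)
measured, measured_unc = 200233184122, 82    # 2.00233184122(82)

# Gap between experiment and theory, in the same units.
difference = measured - theory               # 502

# ASSUMPTION: independent uncertainties, added in quadrature.
combined_unc = sqrt(theory_unc**2 + measured_unc**2)

significance = difference / combined_unc
print(f"difference = {difference}e-11, combined uncertainty = {combined_unc:.1f}e-11")
print(f"significance = {significance:.1f} sigma")  # prints significance = 4.2 sigma
```

So the gap is tiny in absolute terms, but it is a bit more than four times the size of the combined uncertainty, and that ratio is what the "sigma" number reports.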
So, they are still running the experiment, and collecting more data. But the theory boys have gone from beard-stroking, tweed-jacketed pontification to tearing-out of hair and rending of garments, amusing enough to prompt us experimentalist types to put down our torque wrenches and soldering irons for a moment and enjoy some well-deserved schadenfreude. 'Cause 2.00233183620(86) ain't even anywhere CLOSE to 2.00233184122(82) by the rulers we use.