Great post! Thank you (:
This has been really helpful to me, thank you!
Yes, it is easier to interpret now. Thank you.
When we set the parameter lower.tail=TRUE in phyper (or alternative='less' in fisher.test), a p-value <= threshold suggests significant depletion. Here the p-value is the probability of observing equal or greater depletion when the null hypothesis is true. The same principle applies to the enrichment test. Either way the null hypothesis remains the same: the samples are randomly chosen from the population (thus neither depletion nor enrichment). Does this make sense?
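To make the two tests concrete, here is a minimal sketch in plain Python (standard library only) of the two tail probabilities described above. The function names and the numbers (52 cards, 13 diamonds, 5 cards drawn, 1 diamond observed) are illustrative assumptions, not taken from the post; the docstrings note which phyper call each function is meant to mirror.

```python
from math import comb

def hypergeom_pmf(x, hit_in_pop, fail_in_pop, sample_size):
    """P[X = x] hits when sample_size items are drawn without replacement."""
    if x < 0 or x > sample_size:
        return 0.0
    total = comb(hit_in_pop + fail_in_pop, sample_size)
    return comb(hit_in_pop, x) * comb(fail_in_pop, sample_size - x) / total

def p_depletion(x, hit_in_pop, fail_in_pop, sample_size):
    """P[X <= x], as in phyper(x, m, n, k, lower.tail=TRUE)."""
    return sum(hypergeom_pmf(i, hit_in_pop, fail_in_pop, sample_size)
               for i in range(0, x + 1))

def p_enrichment(x, hit_in_pop, fail_in_pop, sample_size):
    """P[X >= x], as in phyper(x - 1, m, n, k, lower.tail=FALSE)."""
    return sum(hypergeom_pmf(i, hit_in_pop, fail_in_pop, sample_size)
               for i in range(x, sample_size + 1))

# Hypothetical example: 1 diamond among 5 cards drawn from a 52-card deck.
hit_in_sample, hit_in_pop, sample_size = 1, 13, 5
fail_in_pop = 52 - hit_in_pop
threshold = 0.05  # assumed significance threshold

p_less = p_depletion(hit_in_sample, hit_in_pop, fail_in_pop, sample_size)
p_greater = p_enrichment(hit_in_sample, hit_in_pop, fail_in_pop, sample_size)

# Both tests share the same null hypothesis: the sample is a random draw.
print(f"depletion  p = {p_less:.4f}, reject null: {p_less <= threshold}")
print(f"enrichment p = {p_greater:.4f}, reject null: {p_greater <= threshold}")
```

With these numbers neither p-value falls below the threshold, so the random-draw null is not rejected in either direction.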
Thank you for the reply.
My question is: had the p-value been < threshold in your example, would we call the sampling a) depleted and b) enriched, respectively?
Or,
what is the null hypothesis in both cases in the above example?
Traditionally we interpret a p-value > threshold as INCONCLUSIVE, since the null hypothesis can only be rejected, never proved. I think the trend is to abandon p-values once and for all: http://www.sciencemag.org/news/2017/07/it-will-be-much-harder-call-new-findings-significant-if-team-gets-its-way and http://www.stat.columbia.edu/~gelman/research/unpublished/abandon.pdf
Useful post.
I have a question regarding the interpretation of p-values.
In both cases (i.e. depletion and enrichment) the p-value is > 0.05 (with 0.05 as my threshold). How should I interpret that? I understand that the sample is neither enriched nor depleted in the above example. Is that correct? In other words, if the p-value had been less than 0.05, would I say that this sampling is enriched or depleted? Thanks
Great post, but the colors make it a little hard to compare fisher to phyper. A slight reordering of the colors would make it clearer.
Good catch. I fixed it. Thank you!
Hi - thanks for posting, but please update
failInPop = 54-hitInPop
to
failInPop = 52-hitInPop
Superb summary, Meng, thanks for that.
Your post was actually top-ranked in my search, so people are surely reading it! I got into reading your other posts, and the one explaining PCA is fabulous. Thank you very much for putting this much time and effort into explaining these very important--but often misunderstood--concepts in bioinformatics. Please keep posting, I'll make sure to come back!
Thank you, Pablo, for the great suggestions! I would have made it better had I known people actually read this. Again, thanks!
Thanks for the note. Since you begin the post talking about GO term enrichment in a set of genes (the reason I landed here from a Google search), I think the example should reflect that analysis instead of how many diamonds are drawn from a set of cards (those of us who don't play cards might not even know how many diamonds are in a deck to start with ;)). Also, using the same color for parameters with different meanings in the two functions can be a bit misleading! All best, Pablo
Thanks. You're right, flies! This was a slip of the keyboard. Apparently the p-value doesn't match if you copy and run the code. I corrected it.
"We subtract x by 1, when P[X ≥ x] is needed." But you subtract one for both depletion and enrichment tests...
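The off-by-one in the quoted sentence can be checked numerically. In R, phyper(q, m, n, k, lower.tail=FALSE) returns P[X > q] and lower.tail=TRUE returns P[X ≤ q], so only the upper-tail (enrichment) call needs x - 1; the lower-tail (depletion) call already includes x. Below is a rough sketch in plain Python (standard library only) that mimics those two tails. The helper names and the card numbers are assumptions chosen for illustration.

```python
from math import comb

def pmf(x, m, n, k):
    """P[X = x]: x hits in k draws from m hits and n non-hits, no replacement."""
    if x < 0 or x > k:
        return 0.0
    return comb(m, x) * comb(n, k - x) / comb(m + n, k)

def lower_tail(q, m, n, k):
    """P[X <= q], mirroring phyper(q, m, n, k, lower.tail=TRUE)."""
    return sum(pmf(i, m, n, k) for i in range(0, q + 1))

def upper_tail(q, m, n, k):
    """P[X > q], mirroring phyper(q, m, n, k, lower.tail=FALSE)."""
    return sum(pmf(i, m, n, k) for i in range(q + 1, k + 1))

# Hypothetical example: x = 1 diamond in k = 5 cards, m = 13 diamonds, n = 39 others.
x, m, n, k = 1, 13, 39, 5

p_at_least_x = upper_tail(x - 1, m, n, k)  # enrichment: P[X >= x], needs x - 1
p_at_most_x = lower_tail(x, m, n, k)       # depletion: P[X <= x], no subtraction

# Forgetting the subtraction in the upper tail drops exactly P[X = x].
missed = p_at_least_x - upper_tail(x, m, n, k)

print(f"P[X >= {x}] = {p_at_least_x:.4f}")
print(f"P[X <= {x}] = {p_at_most_x:.4f}")
print(f"mass lost without the -1: {missed:.4f}")
```

Subtracting 1 in the depletion call as well would compute P[X ≤ x - 1], which leaves out the observed count itself.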
Nice note!