A few months ago I was rereading the SHAP NeurIPS paper and noticed an error in Theorem 1. Roughly, L&L Theorem 1 says that the SHAP solution is unique, even if we allow prior or external information to influence the model explanation. That's actually not true...
Why divide by n-1? Understanding the sample variance with pairwise differences
Why divide by n-1 for the sample variance? Because the variance tells us how much people tend to be different from each other. And, for each person in our sample, there are only n-1 people we can compare them to. This is not just a heuristic...
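A quick numerical check of the pairwise-difference view, as a minimal sketch: it relies on the standard identity that the sample variance equals half the average squared difference over all pairs of distinct observations. The function name `pairwise_variance` and the toy data are illustrative assumptions, not taken from the post.

```python
from itertools import combinations
from statistics import variance


def pairwise_variance(xs):
    """Sample variance computed only from pairwise differences.

    Hypothetical helper for illustration. There are n*(n-1)/2 unordered
    pairs, so half the mean squared difference is
    sum((a - b)**2) / (n * (n - 1)).
    """
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum((a - b) ** 2 for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


# Toy data (made up for this example).
sample = [2.0, 4.0, 4.0, 5.0, 7.0]

from_pairs = pairwise_variance(sample)
from_stdlib = variance(sample)  # the usual n-1 formula
```

Here `from_pairs` and `from_stdlib` agree up to floating-point error, and the n-1 appears in the pairwise version simply because each observation has n-1 others to be compared with.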
Why statistics professors should grade for craftsmanship
The statistical community values skillful explanations in almost every setting except the classroom. Students are typically graded for correctness, completeness, and (sometimes) participation, but not for craftsmanship. That's a missed opportunity.
Mark Twain was a stats fan, anything else is a Damn Lie.
Many of us have heard the quote popularized by Mark Twain, that there are "lies, damned lies, and statistics." It turns out, though, that Twain's comment was not meant as an attack on overly complicated statistical wizardry that obfuscates the truth. If you read the quote in context, his point is actually quite the opposite...
Many statistical "gold standards" aren't perfect, but that's why they're perfectly named
Statisticians and computer scientists often use the term "gold standard" for the best possible benchmark you could have when trying to estimate something. I often hear statisticians questioning the benchmark, though, saying that "it's not really a gold standard" because it isn't perfect. My response is that the term "gold standard" should still apply. Even more so, in fact, because the actual gold standard of matching currency to gold reserves isn't perfect either...
4 minutes to run code: a live demo inside a JSM speed talk
On Monday morning, at this year's JSM, I'll be presenting at the speed talks session on Epidemiology and Imaging. My plan is to attempt something a little unconventional: to live demo a new method for a computationally intensive procedure (bootstrap PCA), within a presentation that's limited to just 4 minutes...
ggBrain - An R package for beautiful brain figures
Check out this lovely brain image figure! I wrote an R package called ggBrain that lets you generate figures like these with just a couple lines of code!
Mission Statement for the Blog
In this blog, I'll talk about issues in statistics from a graduate student's perspective. Some specific topics include: surviving a PhD, enjoying a PhD, designing intuitive graphics, and dealing with high-dimensional data. There will also be musings from time to time about food/cooking (e.g., JHSPH Biostat Chili Cookoff strategy), and culture in general...