It happens almost every election season: the return of the Poll Analysis Concern Trolls. Well…they’re baaaaaaaak!!!
This season we have HA’s newest amateur right-wing propagandist, “Bob”, who is vewy, vewy concerned about the methods and polls I use for the election analyses. And we have the return of our most esteemed amateur right-wing propagandist (to put it kindly), currently under the name “Smilin'” (before that, ironically self-named “GetFactsFirst”). If you are interested, you can follow some of their election analysis concern trolling here, here and here.
I don’t want to totally belittle our Concern Trolls. They do play some useful roles here, like contributing to the raucous back-and-forth in the comment threads. And, for me, they provide new opportunities to pontificate about polls, probabilities, statistics, simulations, bias, etc. (topics that I enjoy in my professional life as well as in my hobby of collecting and analyzing electoral polls).
I also want to acknowledge them for inspiring a new occasional feature for this election season: The Electoral Pundit Contest. It is sort of like Lee’s Birds Eye View contest, but dealing with polls and stuff. The challenge is given below, but first allow me to pontificate….
This first contest was inspired by Bob and Smilin’s discussion of “outliers” in polls. It really bothers them that I don’t assess whether polls are “outliers.” And their latest “target” is a new Pennsylvania poll from Franklin and Marshall College (also known as The Keystone Poll). It shows Obama leading Romney 48% to 36% with 17% selecting neither.
What triggers their “concern” is the partisan make-up of the poll: “Respondents 50% D, 37% R, 10% I.”
Smilin’ puts it:
Why would Darryl include a poll that uses 50% Dems? Seems like there are several “outlier” polls like this that have zero credibility because of their underlying assumptions.
Is this poll an outlier? We could approach this from a probabilistic point of view by asking the question: if the sample of 412 registered voters was truly a random sample of PA voters, what is the probability of drawing a result as “extreme” as 50% Ds and 37% Rs and 10% I?
To make this easier, let’s ignore the “I” category, so the question becomes: if the sample of [207 Ds + 154 Rs =] 361 registered “partisan” voters was truly a random sample of PA voters, what is the probability of drawing a result as “extreme” as [50%/(50% + 37%) =] 57.5% Ds and [37%/(50% + 37%) =] 42.5% Rs?
A proper test would require us to know the “truth” about the probability of drawing a D versus an R in the population. Suppose the “true” probability is 54% for drawing a Democrat and 46% for drawing a Republican (ignoring folks who are Independent). We could then ask: for a sample of 361 partisans and a true probability of 54%, how probable is it to draw at least 207 Ds?
There is an exact answer to this question that can be found from the Binomial Distribution. The answer is about 11%.
In other words, if we did a bunch of polls with truly random samples of 361 registered partisan voters each (assuming truthful answers, etc.) and with a true proportion of Democrats of 54%, we would, just by chance, draw a Democratic share of 57.5% or greater in about one out of every nine such polls. Hence, under our assumptions, this is not very strong evidence that the poll is an outlier.
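For anyone who wants to check the arithmetic, here is a minimal sketch of that Binomial calculation in Python. The function name `binomial_upper_tail` is just something made up for this illustration, and the 54% figure is the same supposed "truth" used above, not an estimate of anything.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_partisans = 361          # 207 Ds + 154 Rs from the poll
n_democrats = 207
assumed_true_share = 0.54  # assumption for illustration only

tail = binomial_upper_tail(n_democrats, n_partisans, assumed_true_share)
print(f"P(at least {n_democrats} Ds out of {n_partisans}) = {tail:.3f}")
```

Change `assumed_true_share` and the answer moves quite a bit, which is the point: the "outlier" verdict depends on what you believe the true partisan split to be.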
Whether we look at partisan make-up or at the percentage “voting” for each candidate, there usually isn’t strong evidence for outliers. For example, let’s look at all polls for PA in the 2012 Obama-Romney race:
The vertical lines show the plausible range of “true” proportions, given the poll proportion and the sample size.
Two points. First, the plausible range of the most recent Franklin and Marshall poll largely overlaps all recent polls. The best evidence of an outlier comes from the previous Franklin and Marshall poll that just barely overlaps a Susquehanna poll (yellow). But both polls plausibly overlap their neighbors. So…which one should go? Or are they both perfectly valid, but happened to legitimately draw samples at each end of the spectrum? The rule for my analysis is to assume the difference is sampling variability, and include both polls. Since the election analyses typically have 60 or more polls, this sampling variability will, more or less, cancel out.
The second point is that the most variable polls are the smallest polls. The most current Franklin and Marshall poll is tiny. (In fact, you can get a rough idea of the sample sizes of polls from the plausible range—the Quinnipiac polls (cyan) all have samples over 1,100.) Because of the mechanics of the simulation analyses, larger polls (with smaller sampling error) have greater influence on the analysis.
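To see why small polls bounce around more, here is a rough sketch of how a "plausible range" shrinks as the sample grows. It assumes a simple 95% normal-approximation interval for a proportion, which may not be exactly how the ranges in the graph were computed. The function name `plausible_range` is made up for this illustration, and the 1,100 sample size is just a stand-in for a Quinnipiac-sized poll.

```python
from math import sqrt


def plausible_range(share, n, z=1.96):
    """Approximate 95% interval for a proportion (normal approximation)."""
    half_width = z * sqrt(share * (1 - share) / n)
    return share - half_width, share + half_width


obama_share = 0.48  # Obama's share in the latest Franklin and Marshall poll

for label, n in [("n = 412 (Franklin and Marshall)", 412),
                 ("n = 1100 (illustrative large poll)", 1100)]:
    low, high = plausible_range(obama_share, n)
    print(f"{label}: {low:.1%} to {high:.1%}")
```

Under these assumptions the small poll's range is roughly plus or minus 4.8 points, against about 3.0 points for the larger one.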
Contest: There are three parts.
(1) In the above discussion, I had used 54% as an example for the “true” proportion of Ds versus Rs in Pennsylvania. Your task is to provide your best estimate of the true proportion of Democratic, Republican and “independent” (or other) voters in Pennsylvania. Use any resource and estimation technique you wish. Since partisan composition could change daily, let’s pin it down to June 4th (the last day of the Franklin and Marshall poll) as our target day.
(2) Assess the difference between your best estimate (part 1) and the partisan composition of the Franklin and Marshall poll (this is simple subtraction). The difference may be surprising.
(3) What is the cause for the “surprising” difference?