The headline on the Seattle Times front page asks “137,689 names later, gay community asks: How did they do it?” regarding Referendum 71, while over on Slog, Dominic Holden looks at the apparently low invalidation rate on the first batch of signatures and declares “This (Probably) Means War!”…
In case you haven’t heard, a preliminary check of signatures for anti-gay Referendum 71 shows the measure may qualify for the ballot. Some quick math: Elections officials scanned 5,646 petition signatures and found that 4,991 were valid as of last Friday, says secretary of state’s office spokesman David Ammons. That’s an 11.34 percent inaccuracy rate (unusually low compared to the standard inaccuracy rate of about 18 percent for Washington petitions). Referendum backer Protect Marriage Washington submitted 137,689 total signatures, which would give them a 14 percent cushion. But they’re beating that cushion by nearly three points. If they keep it up through the rest of the signature count, the religious bigots will succeed at putting domestic partner rights of gay couples up to a public vote in November.
Geez… doesn’t anybody read HA on the weekends?
First of all, even if the invalidation rate were as low as 11.34%, they still wouldn’t be “beating that cushion by nearly three points,” because the media (aided by a lack of clarity on the part of the SOS) is comparing the invalidation rate to the wrong number. R-71’s sponsors submitted 137,689 signatures, 17,112 (or 14.19%) more than the 120,577 minimum required. But since the invalidation rate is calculated against the signatures submitted and counted, so too must the so-called cushion be, yielding a 12.43% (17,112/137,689) threshold for invalid signatures beyond which the measure fails to qualify for the ballot.
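For clarity, here’s that arithmetic as a short Python sketch, using only the SOS figures cited above:

```python
submitted = 137_689   # signatures turned in by Protect Marriage Washington
required  = 120_577   # minimum valid signatures needed to qualify
cushion   = submitted - required  # 17,112 signatures to spare

# Wrong yardstick: cushion measured against the minimum required
print(f"{cushion / required:.2%}")   # 14.19%

# Right yardstick: cushion measured against signatures submitted,
# since that is the base the invalidation rate is computed on
print(f"{cushion / submitted:.2%}")  # 12.43%
```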
So based on the raw data from the first batch of signatures processed, R-71 is squeaking by, but by little more than a point.
But, as I explained on Saturday, the reported 11.34% invalidation rate on the first batch of 5,646 signatures is deceptively low because such a small sample cannot reflect the true percentage of duplicate pairs within the total universe of 137,689 submitted signatures. The reason, if you think about it, is obvious, but rather than trying to explain this again myself, I’ll just let the Secretary of State’s Office do so in its own words, from a 2006 FAQ regarding the rejection of I-917:
Duplicates play an important role in the state’s formula that determines the rejection rate on a random check.
In the normal course of events, finding duplicates in a random sample bears directly upon the size of the sample being done.
For example, a random check of 100 names out of 266,006 would not be expected to find any duplicates, but a random check of 200,000 names would be expected to find duplicates. Thus, the size of the pool increases exponentially the likelihood of duplicates.
Finding duplicates in a small 4% sample suggests that the number of duplicates that exists in the entire pool is exponentially larger.
The mathematical algorithm adopted by the state contains calculations designed to account for this dynamic.
Thus, the state is not able to finally determine the rejection rate on a particular initiative simply by looking at the signatures approved and rejected. The formula also calculates the acceptable number of duplicates for the sample size.
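The FAQ’s point about sample size can be illustrated with a bit of expected-value arithmetic. This is my own sketch, not the state’s actual formula, and the 2,000 duplicate pairs are a purely hypothetical figure: a duplicate is only caught when *both* copies happen to land in the sample, so the per-signature duplicate rate grows roughly linearly with the size of the sample.

```python
def expected_dupes_found(pool, dup_pairs, n):
    """Expected number of duplicate pairs caught in a random sample
    of n signatures drawn from a pool containing dup_pairs pairs.
    A pair is caught only if BOTH copies land in the sample."""
    p_both = (n / pool) * ((n - 1) / (pool - 1))
    return dup_pairs * p_both

POOL = 137_689   # R-71 signatures submitted
DUPS = 2_000     # hypothetical duplicate pairs hidden in the pool

for n in (5_646, 50_000, 137_689):
    found = expected_dupes_found(POOL, DUPS, n)
    print(f"sample {n:>7,}: ~{found:6.1f} dupes caught, "
          f"rate {found / n:.2%}")
```

A 5,646-signature sample would catch only a fraction of a percent of dupes even though the full pool’s duplication rate is 2,000/137,689 ≈ 1.45%, which is exactly why the raw first-batch dupe count understates the truth.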
The SOS doesn’t specifically share its algorithm for projecting duplicate signature rates, but from the data provided in the I-917 FAQ, one can make a pretty good guess. The SOS reported 24 dupes found amongst 10,819 signatures sampled out of 266,006 submitted, yet projected a 5.45% duplication rate… exponentially larger than the 0.22% rate within the sample itself.
So how did the SOS come up with that larger number? They appear to be dividing the number of dupes by the sample ratio (sample size over total submitted), and then dividing the quotient by the sample size, as in:
( 24 / ( 10,819 / 266,006 ) ) / 10,819 = 5.45%
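In code, that reconstruction looks like this. It is my reading of the published I-917 numbers, not the state’s actual algorithm:

```python
def projected_dup_rate(dupes_found, sample_size, total_submitted):
    """Divide dupes found by the sampling ratio, then divide the
    quotient by the sample size -- the arithmetic that reproduces
    the SOS's published 5.45% projection for I-917."""
    sampling_ratio = sample_size / total_submitted
    return (dupes_found / sampling_ratio) / sample_size

rate = projected_dup_rate(24, 10_819, 266_006)
print(f"{rate:.2%}")  # 5.45%
```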
Run the data from the first batch of R-71 signatures through the same equation and, rather than the 0.12% duplication rate observed in the sample so far, you get:
( 7 / ( 5,646 / 137,689 ) ) / 5,646 = 3.02%
Now, separate the 7 dupes from the other 633 signatures rejected in the first batch, and you get a projected total invalidation rate of 14.23%… not at all bad by historical standards, but nearly two points worse than what is needed to qualify.
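Putting the first-batch numbers together with the same back-of-the-envelope formula (so treat the result as an estimate, not a certainty):

```python
sample    = 5_646    # first batch of R-71 signatures checked
submitted = 137_689  # total signatures turned in
dupes     = 7        # duplicates found in the first batch
other_bad = 633      # other signatures rejected in the first batch

projected_dupes = (dupes / (sample / submitted)) / sample
other_rate      = other_bad / sample
total_invalid   = projected_dupes + other_rate

print(f"projected duplicate rate: {projected_dupes:.2%}")  # 3.02%
print(f"other rejections:         {other_rate:.2%}")       # 11.21%
print(f"projected invalidation:   {total_invalid:.1%}")    # 14.2%
```

That projected 14.2% invalidation rate sits above the 12.43% threshold computed earlier, which is the whole ballgame.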
So… how reliable are these projections? It’s hard to say. The sample size is pretty small, and we have no reason to believe the first batch was particularly random. Furthermore, while I’m no statistician, the formula above does strike me as rather unsophisticated. (That said, Darryl ran his own simulations on the same data and came up with a slightly higher projected duplication rate of 3.25%.)
What I can say with absolute certainty is that the duplication rate is dramatically underreported in the first batch, and that it will steadily rise as the aggregate sample size gets larger, increasing the total invalidation rate with it. Thus, while the press may hope for the contentious R-71 to qualify for the ballot and continue to generate headlines, in answer to the Times’ question, “How did they do it?”, the most likely answer will be: “They didn’t.”