Yesterday I took a break from my all-too-frequent analyses of the R-71 signature counts. I didn’t even look at the numbers until this morning. When I did look, a Spock-esque twitch afflicted my left eyebrow. “Curious,” I thought. “But maybe it’s just a one-time fluke….”
The analysis of yesterday’s data showed the probability of NOT making the ballot increased from a nearly impossible 0.04% to an almost-interesting 0.91%. In fact, this slow increase in the probability of not qualifying has continued a trend begun after 13 August.
Well, if you like that result, hold onto your sou’wester, because today’s result will blow you away. I’ll present the results in three parts. First, the basic results for today; then we’ll explore the trends in the daily data dumps. Finally (and below the fold), we’ll look at the micro-level volume data to divine what this trend suggests.
Today’s R-71 data release has the signature count up to 79,195 (about 57.5% of the total). There have been 9,208 invalid signatures found, for a cumulative crude (non-duplicate-corrected) rejection rate of 11.63%.
The invalid signatures include 7,805 that were not found in the voting rolls, 703 duplicate signatures, and 700 signatures that mismatched the signature on file. There are also 38 signatures “pending”; I’ve ignored them in the analyses. The 703 duplicate signatures suggest a final duplication rate of about 1.90% for the petition. This continues the trend we’ve seen this week of the projected duplicate rate growing faster than the mathematical predictions under the assumption of random sampling.
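To see why the projected duplicate rate can outrun a naive linear extrapolation, note that under a random checking order the expected number of duplicate pairs found grows roughly with the *square* of the fraction checked. Here is a minimal sketch of that scaling rule; it is an illustration only, not the exact estimator behind the 1.90% figure (which is defined in earlier posts), and it assumes the reported total of 137,689 signatures submitted.

```python
# Quadratic-scaling sketch for projecting duplicates. Assumptions:
# 137,689 total signatures submitted (reported total) and a random
# checking order. This is NOT the estimator behind the post's 1.90%
# figure; it only illustrates why duplicates scale super-linearly.
checked = 79_195          # signatures checked so far
total = 137_689           # total submitted
dups_found = 703

frac = checked / total                     # fraction checked (~57.5%)
projected_dups = dups_found / frac**2      # duplicate pairs scale ~ frac^2
projected_rate = projected_dups / total
print(f"projected duplicates: {projected_dups:.0f}")
print(f"projected duplicate rate: {projected_rate:.2%}")
```

This simple rule projects a final rate around 1.5%; that the running projection has instead climbed toward 1.90% is precisely the faster-than-random growth noted above.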
Using the V2 estimator, the number of valid signatures is now expected to be 120,777, leaving a thin surplus of only 200 signatures over the 120,577 needed to qualify for the ballot. From the cumulative data to date, the overall rejection rate is projected to be 12.28%.
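As a back-of-envelope check on these numbers: applying the projected overall rejection rate to the full petition reproduces the projection to within a few signatures. (The V2 estimator itself is defined in earlier posts; this is just a consistency check, and the 137,689 total is the reported count of signatures submitted.)

```python
# Back-of-envelope consistency check, not the V2 estimator itself.
# Assumption: 137,689 total signatures submitted, with the projected
# overall rejection rate applied uniformly.
total = 137_689
needed = 120_577
p_reject = 0.1228          # projected overall rejection rate

projected_valid = total * (1 - p_reject)
surplus = projected_valid - needed
print(f"projected valid: {projected_valid:.0f}, surplus: {surplus:.0f}")
```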
A Monte Carlo analysis consisting of 100,000 simulated petition samples suggests that the measure has an 80.48% probability of qualifying for the ballot, assuming the only “error” is statistical sampling error.
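For readers who want the flavor of such a simulation, here is a deliberately simplified sketch. It treats each signature as independently valid with probability 1 − 0.1228, which captures only binomial sampling error; the actual analysis also propagates uncertainty in the rejection and duplicate projections, so it produces a wider distribution (and hence the lower 80.48% figure) than this sketch will.

```python
import numpy as np

# Simplified Monte Carlo sketch (assumptions: 137,689 total signatures,
# each independently valid with probability 1 - 0.1228). The real
# analysis also propagates estimator uncertainty, so this version
# overstates the probability of qualifying.
rng = np.random.default_rng(42)

total = 137_689
needed = 120_577
p_reject = 0.1228

sims = 100_000
valid = rng.binomial(total, 1.0 - p_reject, size=sims)
p_qualify = (valid >= needed).mean()
print(f"P(qualify), binomial-only error: {p_qualify:.2%}")
```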
Here is the distribution of valid signatures relative to the number required to qualify.
The red bars on the left show the times R-71 failed to qualify among the 100,000 simulations; green bars show the counts of signatures in which the measure qualified. Compare this to the results from just two days ago. Quite a difference!
Let’s examine the history since the SoS office started releasing accurate data a week and a half ago:
The red line shows the number of signatures needed to qualify, and the blue symbols show the daily projections of valid signatures, surrounded by 95% confidence intervals.
Clearly, since the 13th of August, the projected number of signatures has declined, and, as of today, has declined by more than we would expect from chance alone. Something is going on.
Tomorrow will be interesting… if the trend continues, the measure’s probability of success may dip below 50%.
The analyses I’ve done here rest on two assumptions: (1) that the signatures evaluated so far are just like the signatures that remain to be evaluated, and (2) that the signature validation process is “stable” (the people validating signatures are not changing their standards over time). Today we see some pretty good evidence that one or both of these assumptions are violated.
The supporters of R-71 will, no doubt, focus on the second assumption. If the measure fails, Secretary of State Sam Reed will likely take much abuse from fringe homophobes for “personally pushing a homosexual agenda.” To me, the simplest explanation is that the volumes being examined in serial order are chronologically correlated with the signature collection order. (I don’t know if this is true, but I cannot rule it out either.)
My thinking is that later-collected signatures (and therefore later volumes) should have a higher duplication rate, simply because, as time passes, there is a growing chance that early signers forgot whether or not they had already signed. Additionally, with the final push to gather as many signatures as possible before the deadline, it seems plausible that errors would increase: more out-of-state signatures, underage signers, and signatures from people not active on the voter rolls.
Below the fold, I examine the fine-level data to see just what types of errors are increasing as the process proceeds. If you are still interested, click through…
Let’s first look at the trend in the projected number of signatures not found on the voter rolls (i.e., “missing” signatures). In these graphs, I look at volumes 201 through 361, and show the median and 95% confidence interval based on 10,000 Monte Carlo simulations. Volume 200 was completed on the afternoon of 12 August, and volume 361 was completed Wednesday afternoon:
Clearly there is a trend toward finding more missing signatures as the process continues. This could, conceivably, reflect reduced effort by signature checkers to find the signers in voter rolls. Alternatively, this might simply reflect the increasing sloppiness of signature collection over time.
Either way, the trend shows no signs of slowing down.
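The per-volume medians and 95% intervals in graphs like these are just percentiles of the simulated projections. A sketch, with placeholder draws standing in for the 10,000 simulations (the numbers below are invented for illustration, not R-71 data):

```python
import numpy as np

# Percentile-based median and 95% interval from simulated projections.
# The draws here are placeholders (an arbitrary normal distribution),
# not the actual per-volume R-71 simulation output.
rng = np.random.default_rng(0)
simulated = rng.normal(loc=13_000, scale=400, size=10_000)

lo, med, hi = np.percentile(simulated, [2.5, 50, 97.5])
print(f"median: {med:.0f}, 95% CI: ({lo:.0f}, {hi:.0f})")
```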
Here is the projected final number of duplicate signatures:
We see a positive trend from volume 245, but the trend stops increasing and shows signs of reversing itself at the end. It is difficult to know what to make of this trend.
Here is the projected number of mismatched signatures:
There is a whisper of a trend in mismatched signatures. But really, there is not much there.
This is a fine-scale version of the second figure in this post. It strongly suggests a linear declining trend since about volume 300 (and a more slowly declining trend before volume 300).
As I said, tomorrow’s data release will be very interesting. From today’s perspective, the results should be considered discouraging for supporters of the “all but marriage” law, because of trends that defy the underlying assumption of random sampling.