Updated twice.
The SOS processed another 5,815 R-71 signatures yesterday, and as expected, the percentage of duplicate signatures increased again: 7 dupes in the first batch, 16 in the second, 22 in the third. It's almost exactly what my spreadsheet predicted.
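For readers wondering why the duplicate share should climb as more signatures are checked, here is a minimal sketch of one common way to extrapolate duplicates from a partial count. It is not necessarily what the spreadsheet does. It assumes each duplicate is a single extra copy of a signature and that the checked signatures behave like a uniform random sample, so both copies of a pair land in the sample with probability of roughly the sample fraction squared. The function name `project_duplicates` is made up for illustration, and the total number of signatures submitted is only implied here by the rounded 12.5 percent figure.

```python
def project_duplicates(dupes_found, sample_fraction):
    """Scale duplicates found in a partial count up to the full petition.

    Assumption: duplicates come in pairs, and a pair is only detected when
    both copies are in the checked sample, which happens with probability
    of about sample_fraction ** 2.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    return dupes_found / sample_fraction ** 2


checked = 17_317          # signatures processed to date (from the post)
dupes_found = 7 + 16 + 22 # duplicates found in the three batches
sample_fraction = 0.125   # "12.5 percent of the total signatures submitted"

projected = project_duplicates(dupes_found, sample_fraction)
total_submitted = checked / sample_fraction  # implied total, an approximation

print(f"observed duplicate rate so far: {dupes_found / checked:.2%}")
print(f"projected duplicates in the full petition: {projected:.0f}")
print(f"projected duplicate rate: {projected / total_submitted:.2%}")
```

Under these assumptions the observed duplicate rate understates the final one by a factor of about one over the sample fraction, which is why it keeps rising with each batch.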
The raw invalidation rate in yesterday’s batch was also the highest thus far, coming in at 14.4 percent, nearly two full points above the 12.43 percent threshold. That brings the raw invalidation rate on the 17,317 signatures processed to date to 12.99 percent. Adjusting for duplicates, and removing from the count the 49 signatures not on file, the invalidation rate on the total sample is now running at approximately 14.55 percent, up only slightly from the total for the first two batches.
While it should be noted that these numbers do not technically represent a random sample, at 12.5 percent of the total signatures submitted, the sample is already large enough to predict R-71's failure with a high degree of certainty.
Update [Darryl]
This figure shows the number of signatures required and, for each data dump, a statistical estimate of the expected number of valid signatures.
The estimate of total signatures adjusts for both duplicates and invalid signatures, and to be conservative, I have assumed that all of the “missing signature card” signatures will be found and counted as valid.
There are error bars showing standard sampling error for each day; they are tiny for yesterday's dump. Clearly, if sampling error is the only error involved, there is no way R-71 will pass. Even after the first data dump day, there was slightly under an 8% chance the final count would put R-71 on the ballot.
The graph does suggest substantial error other than sampling error (e.g. the big swing from day 2 to day 3, which is way outside sampling error), but there is now a huge amount of ground to make up. Still, with only 12.6% of the signatures counted as of yesterday, there could be some surprises.
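To give a feel for how small the sampling error is at this sample size, here is a rough sketch, not the program behind the figure. It treats the adjusted invalidation rate quoted in the post (about 14.55 percent of 17,317 signatures) as a simple binomial proportion and compares it with the 12.43 percent threshold using a normal approximation. The function name `sampling_z_and_tail` is hypothetical, and the calculation ignores non-sampling error, the uncertainty in the duplicates adjustment, and any finite-population correction.

```python
import math


def sampling_z_and_tail(rate, n, threshold):
    """Return (standard error, z-score, one-sided normal tail probability).

    Assumption: `rate` is a plain binomial proportion from n independent
    draws, approximated by a normal distribution.
    """
    se = math.sqrt(rate * (1.0 - rate) / n)
    z = (rate - threshold) / se
    # Probability of a normal deviate falling at least z standard errors
    # below the observed rate, i.e. at or under the threshold.
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return se, z, tail


se, z, tail = sampling_z_and_tail(rate=0.1455, n=17_317, threshold=0.1243)
print(f"standard error: {se:.4%}")
print(f"distance from threshold: {z:.1f} standard errors")
print(f"one-sided tail probability: {tail:.2e}")
```

With these inputs the standard error is roughly a quarter of a percentage point, so the observed rate sits nearly eight standard errors above the threshold. That is the sense in which sampling error alone cannot rescue the measure; any surprise would have to come from somewhere else.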
Update 2 [Darryl]
Oops… When I looked back at the program I used to estimate the number of valid signatures, I found I had entered 150 instead of 45 as the number of duplicates (150 is actually the total number of no-matches found so far). So here is the corrected figure:
Correcting the number of duplicates makes a huge difference to the qualitative interpretation. Now it looks like there is very little non-sampling error (and very little sampling error). If so, this pretty much spells doom for R-71.