The headline on the Seattle Times front page asks “137,689 names later, gay community asks: How did they do it?” in regard to Referendum 71, while over on Slog, Dominic Holden looks at the apparently low invalidation rate on the first batch of signatures and declares “This (Probably) Means War!”…

In case you haven’t heard, a preliminary check of signatures for anti-gay Referendum 71 shows the measure may qualify for the ballot. Some quick math: Elections officials scanned 5,646 petition signatures and found that 4,991 were valid as of last Friday, says secretary of state’s office spokesman David Ammons. That’s a 11.34 inaccuracy rate (which is unusually low compared to a standard inaccuracy rate for Washington petitions of only about 18 percent). Referendum backer Protect Marriage Washington submitted 137,689 total signatures, which would give them a 14 percent cushion. But they’re beating that cushion by nearly three points. If they keep it up through the rest of the signature count, the religious bigots will succeed at putting domestic partner rights of gay couples up to a public vote in November.

Geez… doesn’t anybody read HA on the weekends?

First of all, even if the invalidation rate were as low as 11.34%, they are still not “beating that cushion by nearly three points,” for the media (aided by a lack of clarity on the part of the SOS) is comparing the invalidation rate to the wrong number. R-71’s sponsors submitted 137,689 signatures, 17,112 (or 14.19%) more than the 120,577 minimum required. But since the invalidation rate is calculated against the signatures submitted and counted, so too must the so-called cushion, coming to a 12.43% (17,112/137,689) threshold for invalid signatures beyond which the measure fails to qualify for the ballot.
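For anyone who wants to check the two versions of the cushion, here is a minimal Python sketch of the arithmetic above. The figures come from the post; the variable names are just illustrative.

```python
# Figures from the post; variable names are illustrative.
submitted = 137_689   # signatures turned in by R-71's sponsors
required = 120_577    # minimum valid signatures needed to qualify

surplus = submitted - required   # extra signatures beyond the minimum

# Cushion as usually reported: surplus relative to the minimum required.
cushion_vs_required = surplus / required

# Cushion on the same basis as the invalidation rate: surplus relative to
# the signatures actually submitted. This is the break-even invalid rate.
max_invalid_rate = surplus / submitted

print(f"surplus:       {surplus:,}")
print(f"vs. required:  {cushion_vs_required:.2%}")
print(f"vs. submitted: {max_invalid_rate:.2%}")
```

Only the second percentage can fairly be compared with an invalidation rate, since both are then fractions of the same pile of submitted signatures.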

So based on the raw data from the first batch of signatures processed, R-71 is squeaking by, but by little more than a point.

*But*, as I explained on Saturday, the reported 11.34% invalidation rate on the first batch of 5,646 signatures is deceptively low because such a small sample cannot reflect the true percentage of duplicate pairs within the total universe of 137,689 submitted signatures. The reason, if you think about it, is obvious, but rather than trying to explain this again myself, I’ll just let the Secretary of State’s Office do so in its own words, from a 2006 FAQ regarding the rejection of I-917:

Duplicates play an important role in the state’s formula that determines the rejection rate on a random check.

In the normal course of events, finding duplicates in a random sample bears directly upon the size of the sample being done.

For example, a random check of 100 names out of 266,006 would not be expected to find any duplicates, but a random check of 200,000 names would be expected to find duplicates. Thus, the size of the pool increases exponentially the likelihood of duplicates.

Finding duplicates in a small 4% sample suggests that the number of duplicates that exists in the entire pool is exponentially larger.

The mathematical algorithm adopted by the state contains calculations designed to account for this dynamic.

Thus, the state is not able to finally determine the rejection rate on a particular initiative simply by looking at the signatures approved and rejected. The formula also calculates the acceptable number of duplicates for the sample size.
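One way to see why a small sample understates duplicates: a duplicate is only caught when both copies of a signature land in the sample. The sketch below assumes (my simplification, not the state's published method) that each duplicated signer appears exactly twice and that the sample is drawn at random without replacement. Under those assumptions the chance of catching any given pair is n(n-1)/(N(N-1)), which grows with the square of the sampling fraction rather than literally exponentially. The number of duplicate pairs in the pool is hypothetical.

```python
def expected_pairs_found(total_pairs, sample_size, population):
    """Expected duplicate pairs with BOTH copies in a random sample."""
    n, N = sample_size, population
    # Probability that two specific signatures are both drawn (no replacement).
    p_both = (n * (n - 1)) / (N * (N - 1))
    return total_pairs * p_both

N = 266_006      # signatures submitted for I-917 (from the FAQ)
pairs = 10_000   # hypothetical number of duplicate pairs in the whole pool

for n in (100, 10_819, 200_000):
    found = expected_pairs_found(pairs, n, N)
    print(f"sample of {n:>7,}: about {found:,.1f} duplicate pairs expected")
```

With these made-up inputs, a 100-name check would be expected to find essentially no duplicates, while a 200,000-name check would be expected to find more than half of them, which matches the FAQ's example.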

The SOS doesn’t specifically share its algorithm for projecting duplicate signature rates, but from the data provided in the I-917 FAQ, one can make a pretty good guess. The SOS reported 24 dupes found amongst 10,819 signatures sampled out of 266,006 submitted, yet projected a 5.45% duplication rate… exponentially larger than the 0.22% rate within the sample itself.

So how did the SOS come up with that larger number? They appear to be dividing the number of dupes by the sample ratio (sample size over total submitted), and then dividing the quotient by the sample size, as in:

( 24 / ( 10,819 / 266,006 ) ) / 10,819 = 5.45%

Run the data from the first batch of R-71 signatures through the same equation and rather than the current 0.12% duplication rate, you get:

( 7 / ( 5,646 / 137,689 ) ) / 5,646 = 3.02%

Now, separate the 7 dupes from the other 633 signatures rejected in the first batch, and you get a projected total invalidation rate of 14.23%… not at all bad by historical standards, but *nearly two points worse* than what is needed to qualify.
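The same arithmetic as a short Python sketch. To be clear, the formula is my guess at what the SOS does, reverse-engineered from the I-917 FAQ, not a confirmed algorithm, and the function name is my own.

```python
def projected_dup_rate(dupes_in_sample, sample_size, total_submitted):
    """Guessed formula from the post: (dupes / sample ratio) / sample size."""
    sample_ratio = sample_size / total_submitted
    return (dupes_in_sample / sample_ratio) / sample_size

# I-917 figures from the 2006 FAQ.
i917 = projected_dup_rate(24, 10_819, 266_006)

# First batch of R-71 signatures.
sample, total = 5_646, 137_689
r71_dupes = projected_dup_rate(7, sample, total)
other_invalid = 633 / sample            # rejected for reasons other than duplication
projected_total = other_invalid + r71_dupes
threshold = (total - 120_577) / total   # most invalid signatures R-71 can absorb

print(f"I-917 projected duplicates: {i917:.2%}")
print(f"R-71 projected duplicates:  {r71_dupes:.2%}")
print(f"R-71 projected invalid:     {projected_total:.2%}")
print(f"R-71 break-even threshold:  {threshold:.2%}")
```

The last digit of the projected total can differ by a hundredth of a point depending on whether you round the two components before or after adding them; either way it lands well above the break-even threshold.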

So… how reliable are these projections? It’s hard to say. The sample size is pretty small, and we have no reason to believe the first batch was particularly random. Furthermore, while I’m no statistician, the formula above does strike me as rather unsophisticated. (That said, Darryl ran his own simulations on the same data and came up with a slightly higher projected duplication rate of 3.25%.)
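If you'd rather test the idea than trust the formula, a quick simulation is easy to sketch. This is not Darryl's simulation, just a minimal illustration under my own assumptions: every duplicated signer appears exactly twice, the batch is a truly random draw, and the true duplication rate is the 3.02% projected above.

```python
import random

def simulate_dupes_found(population, pairs, sample_size, trials, seed=0):
    """Average number of duplicate pairs fully captured by a random sample."""
    rng = random.Random(seed)
    total_found = 0
    for _ in range(trials):
        sample = set(rng.sample(range(population), sample_size))
        # Signatures 2k and 2k+1 (for k < pairs) are treated as one duplicate pair.
        total_found += sum(1 for k in range(pairs)
                           if 2 * k in sample and 2 * k + 1 in sample)
    return total_found / trials

N, n = 137_689, 5_646
assumed_pairs = round(0.0302 * N)   # hypothetical: a 3.02% true duplication rate

average = simulate_dupes_found(N, assumed_pairs, n, trials=200)
print(f"{assumed_pairs:,} pairs in the pool, about {average:.1f} found per sample")
```

In other words, a pool seeded with thousands of duplicates would be expected to give up only a handful of them in a batch this size, which is consistent with the 7 actually found.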

What I can say with absolute certainty is that the duplication rate is dramatically underreported in the first batch, and that it will steadily rise as the aggregate sample size gets larger, increasing the total invalidation rate with it. Thus, while the press may hope for the contentious R-71 to qualify for the ballot and continue to generate headlines, in answer to the Times’ question, “How did they do it?”, the most likely answer will be: *“They didn’t.”*

ArtFart spews:

Goldy,

I’m more than a little puzzled at your continued musings over this, as if until the fat lady sings you’re clutching to some furtive hope that R-71 won’t make it on the ballot or that it’ll somehow make a difference if it does so by a narrow margin. By all rights, instead of finding any excuse to relax, everyone on the left ought to be preparing for one hell of a battle on this one.

Goldy spews:

ArtFart @1,

My focus on this has little to do with R-71, and everything to do with the media repeatedly getting this kinda stuff wrong. It’s math. Perhaps not simple math, but math nonetheless, and by misrepresenting the numbers they are creating a misleading impression that R-71 has more support than it really does.

I mean, if we can’t trust them to get the math right, what can we trust them on?

Jeff spews:

Thanks, Goldy, for correcting everyone’s math, so I don’t have to in the comments. There was a post on washblog that got it wrong as well.

jeff spews:

If a signature is ruled not to be a match does a person have the opportunity to sign an affidavit and get counted? I believe they can do that with their ballot in an election.

jeff spews:

As a math professor I applaud your efforts to get journalists to do a better job at reporting mathematical concepts.

Jeff spews:

@4 @5 Weird to see another jeff post right after me. Thanks for the lower-case j to distinguish us. :-)

rhp6033 spews:

Goldy;

I hate to admit this, but you lost me at “algorithm”, at which point my eyes glazed over, and I figured I would just wait to see the results of the validation process.

“Will this be on the test? Nobody told me there would be math involved!”

Mr. Cynical spews:

Goldy–

You may be right on the math Goldy.

But I would think you would want Ref 71 on the ballot to once & for all prove how “Progressive” Washington is.

Pardon me, but you seem a bit less than confident in your fellow Progressives and the Progressiveness of Washington State.

What are you afraid of Goldy???

Are you actually afraid R-71 will somehow pass in Progressive Washington??

And what are the other Gay Activists afraid of??

They quote all these studies & polls to support their case.

Could it be their facts are phoney??

Only way to tell is a Public Vote, right?

ArtFart spews:

@8 Cyn may have a point, and it could end up in a positive way. As I interpret Geov’s post about the Seattle mayoral race, an election-weary public might just decide not to bother if the choice is between Nickels and some other schlub who doesn’t have anything in particular to offer. On the other hand, if there’s a big hoo-rah with a bunch of dominionist gay-bashers dissing on our friends, relatives and neighbors, more of us in Seattle and King County may well be more motivated to fill out our ballots to give a hearty middle finger to Hutcherson, Joe Fuiten and their sanctimonious ilk–and in the process also vote against the other “Hutch” and try our best to pick the lesser of evils for Seattle’s mayor.

Goldy spews:

Cynical @8,

Actually, I’m of two minds on this, and the activists I’ve spoken to are mixed as to whether it might be a net plus to get this on the ballot.

mr. smitty spews:

There’s a bigger potential issue out there now. In the likely event that this doesn’t make it on the ballot, we will hear wingnuttery about how the SoS office must have cheated to keep this off the ballot. And they’ll use this early press as “proof” that something foul is amiss.

So if Goldy or whoever can get the MSM to clearly explain the status, that will save us all some time on another pointless “FRAUD!!!!” debate in a few weeks.

Piper Scott spews:

@9…AF…

Joe Fuiten publicly opposes – and did not sign – Referendum 71, according to The Seattle Times.

Check your facts before you offer a middle finger to someone.

The Piper

Right Stuff spews:

“But, as I explained on Saturday, the reported 11.34% invalidation rate on the first batch of 5,646 signatures is deceptively low because such a small sample cannot reflect the true percentage of duplicate pairs within the total universe of 137,689 submitted signatures.”

LMAO!!!!!!!!!!!!!!!!!

Goldy channels Steffan Sharkansky over R-71…

Batches, Samples, duplicates, error rates, equations for validating signatures….

LMFAO!!!!!!!!!!!!!!!

Ladies and gentlemen, I give you Goldy Sharkansky…..

ArtFart spews:

@12 Right you are, Piper…I read that article myself when it first came out. It appears there are other soldiers leading the charge this time.

ArtFart spews:

@11 So far, it seems that Sam Reed’s been pretty good at turning a deaf ear to such things. It also makes the Republicans look even sillier to cry foul at one of their own.

Darryl spews:

Right Stuff @ 13,

“Ladies and gentlemen, I give you Goldy Sharkansky…..”

That is funny! But it kind of betrays a misunderstanding on your part.

The “duplicate adjustment” isn’t a figment of Goldy’s imagination…this is the adjustment that the State makes to estimate the number of duplicate signatures.

Goldy spews:

Darryl @16,

And it betrays an ignorance of the history of my adversarial relationship with Stefan, which started with dueling analyses of error rates in vote counting.

Lurleen spews:

The SoS said that since so few signatures were submitted, they’re going to look at every signature until they hit the golden mark of 120,577 valid, or have found enough invalids that it is impossible to hit that mark. So as far as I know they won’t be using formulae to determine when/if the petition qualifies. Am I right or have I missed a key piece of information somewhere?

I’m glad you’re walking through these formulae, because I didn’t know before this how all this was calculated on larger data sets where subsampling is employed. I’m also waiting like a greedy nerd to be able to graph the duplicates trend. I’m a g’nerd, a total g’nerd.

Rob spews:

It’s hard to get all the details, but I think the Secretary of State’s office will now go through ALL the signatures looking for invalid ones and dups. The RCW (or WAC, whatever) says that first a random sample of at least 3% is drawn. They did, by drawing 4.1%. If the projection from that sample shows that the number of valid signatures exceeds the minimum required (allowing for reasonable sampling “error”), then the referendum goes to the voters. But if the sample projection falls short of the minimum (which it did), and the population could still be above the minimum (which it can), then all signatures are reviewed. This is why it could take weeks.

Jim Anderson spews:

How about a solution for media innumeracy and declining revenue: tax breaks for newspapers, conditional on all staffers passing the math WASL.

Danno spews:

Kind of says it all, doesn’t it Pyritey?

Danno spews:

Geez… doesn’t anybody read HA… messed that up

Rob spews:

If I am doing the math right (a big IF), this referendum is toast. Having 7 duplicates in the random(?) sample means there is about a 90% probability that at least two percent of all the signatures are duplicates. But there is only a 1% probability that they have a cushion of two percent more valid signatures from registered voters (including duplicates) than they need. I figure that there is less than a 1% chance that they have enough non-duplicated signatures from valid voters.

Darryl spews:

Lurleen,

“So as far as I know they won’t be using formulae to determine when/if the petition qualifies. Am I right or have I missed a key piece of information somewhere?”

Yep…the point is, when they do do sampling, the duplicates in the total population of signatures have to be estimated based on (1) the size of the population and (2) the size of the sample they have examined.

This is because the number of duplicates grows exponentially as the sample size increases.

Goldy is pointing out that the media’s estimates of the total number of invalid signatures early in the count process are not accounting for the exponentially increasing numbers of duplicates that should be found as the count grows.

In short, the 0.125% duplicate rate found in the first batch of 5,646 signatures should translate into something like a 3.25% duplication rate after all 137,689 signatures are examined.

By contrast, the rate of invalid signatures (excluding duplicates) in the small sample is a fair estimate of the invalid signature rate (although uncertainty is also increased).

Since they found an invalid rate of 11.34% in the first 5,646 signatures examined, a fair estimate of the total bad signature rate (invalid + duplicate) is 11.34% + 3.25% = 14.59%.

That should be enough to keep it from the ballot.

Alki Postings spews:

I am ALWAYS shocked and exasperated by the KKK nut wing’s endless attempts to stop the gays, Jews and blacks from ever having equal voting, marriage, legal rights….whether those racist homophobes are represented by the Dixiecrats of old or the current day Republicans. They just can’t IMAGINE gays being given paid time off to visit their spouse/partner/whatever of 20 years in the hospital just like heterosexual couples. If that sort of thing, or letting Jews into your country club, or interracial marriage, is what angers you, then you’re a sick person. You’d think being on the wrong side of history EVERY single time would eventually get them down, but god bless ’em, they keep being wrong. No, not JUST wrong, but PROUD of being wrong and stupid. It’s flabbergasting when you find a group of people who think ignorance is a badge of honor.

Mr. Baker spews:

algorithm is a dance with numbers, set to music that only nerds can hear

or not

Rob spews:

I thought “algorithm” was what Al Gore showed when he danced the Macarena at the Democratic Convention.

Lurleen spews:

Darryl @ 25 thanks, that helps clarify it for me.

Btw, I was just over at From Our Corner, and I guess I don’t need to request the daily numerical details, lol! Something like 5 polite requests already.

Brenda Starr spews:

