By the Numbers: Let’s avoid these statistical sins of the past

The founders of marketing research invented a number of extremely powerful and valuable tools, methods, questions and concepts that we all use and benefit from every single day. We are indebted to their originality, inventiveness and pioneering genius, which founded and shaped our industry and its culture.

Much of this founding work took place during the 1920s through the 1960s, and some of the research inventions occurred during the 1970s through the 1990s. But no one is perfect, and our industry fathers and mothers committed sins that blight our industry to this day.

The first great sin is the top two-box percentage

Somewhere along the way, a founder developed the top two-box concept for questions with multiple positive responses.


A good example is the five-point purchase intent scale: definitely buy, probably buy, might or might not buy, probably not buy, definitely not buy. If only the “definitely buy” answers are counted, the founders reasoned, information is lost.

What about the “probably buy” answers – shouldn’t they be counted, too? Hence, the top two-box solution came into being, and the custom is to present the “definitely buy” percentage, followed by the top two-box percentage (“definitely buy” plus “probably buy”). Sounds perfectly reasonable, so where is the sin and shame?

The top two-box percentage counts a “definitely buy” the same (i.e., gives it the same weight) as a “probably buy,” when it’s blatantly obvious to everyone that a “probably buy” is not nearly as good as a “definitely buy” answer. For the five-point purchase scale above, the sin of counting a “definitely” and “probably” as equals is, no doubt, a cardinal sin.


If we were working with a nine-point, 10-point or 11-point scale, the top two-box percentage might only be a minor transgression. That is, on a longer scale, the difference in meaning between a top box and the second box is relatively small, so no great harm in adding the two together. On shorter scales, however, the distortion (and the sin) is usually much greater.

Back to the five-point purchase intent scale. A better solution is to count all of the “definitely buys” and then discount the “probably buys” by 40 percent, or 50 percent, or 60 percent, and add the “definitely buys” to the discounted (or down-weighted) “probably buys,” creating a weighted average that provides a more accurate measure of the results.

For example, if the “definitely buy” answers equal 32 percent of respondents and the “probably buy” answers equal 20 percent of respondents, a best practice is to count all of the top box (the 32 percent who said “definitely buy”) and let’s say 50 percent of the second box (the 20 percent who said “probably buy”).

That yields a purchase intent score of 42 (32 percent plus half of 20 percent). The result is called a score (not a percent) since we have created a hybrid number.
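The arithmetic can be sketched in a few lines of Python. This is only an illustration of the example above: the function names and the `probably_weight` parameter are made up for this sketch, and the 50 percent weight is the "let's say" value from the text, not a fixed rule.

```python
def top_two_box(definitely: float, probably: float) -> float:
    """Conventional top two-box percentage: both boxes count equally."""
    return definitely + probably


def weighted_intent_score(definitely: float, probably: float,
                          probably_weight: float = 0.5) -> float:
    """Count every "definitely buy" but only a fraction of the "probably buys".

    probably_weight is the share of the second box that is kept
    (0.5 means the "probably buys" are discounted by 50 percent).
    """
    return definitely + probably_weight * probably


definitely_buy = 32.0  # percent of respondents answering "definitely buy"
probably_buy = 20.0    # percent of respondents answering "probably buy"

print(top_two_box(definitely_buy, probably_buy))            # 52.0
print(weighted_intent_score(definitely_buy, probably_buy))  # 42.0
```

Passing a `probably_weight` of 0.6 or 0.4 corresponds to discounting the second box by 40 or 60 percent, respectively.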

The second great sin of our industry founders involves significance testing. There is little doubt that significance testing of critical decision statistics is valuable. For example, determining whether product blue is better than product red is a good application of significance testing.

In the beginning, significance tests had to be calculated by hand, so only the most important results were subjected to significance testing. But the growing power of computers and the expanding availability of statistical software led to the automation of significance testing in crosstabulation tables. Thus, with a few programming scripts, thousands of significance tests could be automatically run on a set of crosstabs.

You could easily test rows of percentages, or Column A against Column B. You could even determine if the differences between statistics in rows and/or columns were significant at the 90%, 95% or 99% level – with some type of code letters, symbols or colors. The resulting significance assertions could then be incorporated easily into charts, graphs and written reports.

Some might hail the exhaustive use of significance testing as a great advance in our craft. However, I would argue that willy-nilly significance testing is a great waste of time and effort. Overuse of significance testing adds costs to the preparation of written reports, adds extra time in quality-assurance verification and actually increases the risks of errors in interpreting the survey results.

If every number in a set of tables or a report is significance-tested, the analyst might avoid looking at the non-statistically significant results and thus overlook important findings and patterns in the data. If the analyst is overly focused on statistical significance, he or she often overlooks other types of significance or other signals in the data.

Mass use of significance testing adds a hodgepodge of confusing symbols and potential bias into survey results. Also, many of the “significantly different” indications will be false, based purely on chance variation.
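To get a rough feel for how many of those flags could be chance alone, here is a back-of-the-envelope sketch in Python. It rests on two simplifying assumptions that real crosstabs will not fully meet: every tested difference is truly zero, and the tests are independent of one another. The function names and the figure of 1,000 tests are illustrative only.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of chance-only "significant" flags.

    Assumes no real differences exist, so each test has probability
    alpha of being flagged anyway.
    """
    return n_tests * alpha


def chance_of_any_false_positive(n_tests: int, alpha: float) -> float:
    """Probability that at least one of n independent tests is flagged by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 1000  # illustrative count for a large set of crosstabs
for label, alpha in (("90%", 0.10), ("95%", 0.05), ("99%", 0.01)):
    flags = expected_false_positives(n_tests, alpha)
    print(f"{label} level: about {flags:.0f} chance-only flags in {n_tests} tests")

# Even a modest batch of 20 tests at the 95% level is more likely than not
# to contain at least one chance-only flag.
print(f"{chance_of_any_false_positive(20, 0.05):.2f}")
```

Under these assumptions, 1,000 automated tests at the 95% level would be expected to produce about 50 "significant" differences that are nothing more than noise.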

I have personally watched analysts overlook almost everything of importance in survey results because they were so focused on statistical significance that they were blind to everything else. A best practice is to use significance testing only on the one or two most important questions in the survey data.

The third great sin comes from type I and type II error in hypothesis testing. You can easily argue that the founders of the research industry stole type I error and type II error from the statistics or the academic world (and should, therefore, be blameless) but why on earth would our industry founders steal something as confusing as type I error and type II error? Couldn’t they have stolen something more useful?

Can anyone remember which is which (false positive versus false negative) and exactly what the heck type I and type II mean? Maybe I’m just old and over-the-hill but I have to do a Google search and study type I and type II error before ever attempting to actually use these concepts. And, why are we only focusing on errors and not on truths?

If there are two types of error (false positive and false negative), then there must be two types of truth (true positive and true negative). Or are there more than two types of error and more than two types of truth? My head hurts.
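For anyone who shares the headache, the four outcomes can be written down as a small lookup. This is a minimal sketch using the standard statistical definitions (type I is a false positive, type II is a false negative); the function and parameter names are invented for illustration.

```python
def classify_outcome(difference_is_real: bool, test_says_significant: bool) -> str:
    """Name the outcome of one significance test.

    difference_is_real: whether a true difference exists in the population.
    test_says_significant: whether the test flagged the difference.
    """
    if test_says_significant and difference_is_real:
        return "true positive"
    if test_says_significant and not difference_is_real:
        return "false positive (type I error)"
    if not test_says_significant and difference_is_real:
        return "false negative (type II error)"
    return "true negative"


# Walk through all four combinations: two kinds of truth, two kinds of error.
for real in (True, False):
    for flagged in (True, False):
        print(f"real={real}, flagged={flagged}: {classify_outcome(real, flagged)}")
```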

The fourth great sin is the so-called semantic differential scale. It was no doubt stolen from the psychology or sociology world, but again, why did our founders not have better judgment?

Now, I’m not against stealing if you can do it in the dark of night and if it’s profitable, but I am firmly opposed to dumb stealing. Semantic differentials are usually some type of numeric scale (five points, seven points, nine points, 10 or 11 points) with the endpoints anchored by two words with opposite meanings, such as love/hate, fast/slow, modern/old-fashioned and so on.
