It is easy to identify successful companies, but hard to pin down the characteristics that make them successful. Why do some companies grow and prosper while others languish and fail? Why are some companies great while others are merely good, mediocre, or bad? These questions are asked and answered over and over again by business executives, management consultants, financial analysts, and investors, but their answers are usually wrong.
For example, in his best-selling 2001 book, “Good to Great: Why Some Companies Make the Leap and Others Don’t,” Jim Collins boasted: “We believe that almost any organization can substantially improve its stature and performance, perhaps even become great, if it conscientiously applies the framework of ideas we’ve uncovered.”
Bold claims, if only they were true. A research paper I co-authored, “Great Companies: Looking for Success Secrets in All the Wrong Places,” published in the Fall 2015 Journal of Investing, shows that the problem with “Good to Great” is that it relies on a backward-looking study undermined by data mining.
Collins and his research team spent five years looking at the 40-year stock market history of 1,435 companies and identified 11 stocks that outperformed the overall market and were still improving 15 years after they made the leap from good to great: Abbott Laboratories ABT; Kimberly-Clark KMB; Pitney Bowes PBI; Circuit City; Kroger KR; Walgreens (now Walgreens Boots Alliance) WBA; Fannie Mae; Nucor NUE; Wells Fargo WFC; Gillette (since acquired by Procter & Gamble PG); and Philip Morris PM.
Collins scrutinized these 11 great companies and identified five common themes. He gave them catchy labels, such as “Level 5 Leadership” (leaders who are personally humble, but professionally driven to make a company great), and concluded that these themes were a road map to greatness.
Collins wrote: “We developed all of the concepts in this book by making empirical deductions directly from the data. We did not begin this project with a theory to test or prove. We sought to build a theory from the ground up, derived directly from the evidence.”
Collins apparently believed that this proclamation made his study sound unbiased and professional. He didn’t just make this stuff up. He went wherever the data took him. The reality is that Collins was admitting that he had no idea why some companies are more successful than others, and he was revealing that he was blissfully unaware of the perils of data mining — deriving theories from data.
When we look back in time at any group of companies, the best or the worst, we can always find common characteristics. Finding such traits only confirms that we looked, and tells us nothing about whether these characteristics were responsible for past successes or are reliable predictors of future success. For instance, each of the 11 companies selected by Collins has either an I or an R in its name, and several have both an I and an R. Is the key for going from good to great to make sure that your company’s name has an I or R in it?
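The I-or-R coincidence is easy to verify mechanically. Here is a minimal Python sketch using the 11 company names listed above; the point is only that such a pattern can always be found after the fact:

```python
# The 11 "good to great" companies named by Collins.
companies = [
    "Abbott Laboratories", "Kimberly-Clark", "Pitney Bowes",
    "Circuit City", "Kroger", "Walgreens", "Fannie Mae",
    "Nucor", "Wells Fargo", "Gillette", "Philip Morris",
]

def has_i_or_r(name):
    """True if the name contains the letter i or r (case-insensitive)."""
    return any(ch in "ir" for ch in name.lower())

# Every one of the 11 names happens to contain an I or an R.
assert all(has_i_or_r(name) for name in companies)
print("All 11 names contain an I or an R, yet that predicts nothing.")
```

The check passes, which is exactly the trap: a shared trait discovered by searching a pre-selected group is evidence of the search, not of the trait's predictive power.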
Of course not. This random I-or-R pattern is an obvious example of data mining. Collins’ data mining is less obvious, because the appealing labels he thought up make his unearthed patterns sound plausible. It is nonetheless data mining because, as he freely admits, Collins made up his theory after looking at the data.
Collins does not provide any evidence that the five characteristics he discovered were responsible for these companies’ success. To do that, he would have had to eschew data mining and, instead, follow the scientific method that has been the foundation for the triumph of science over superstition: (a) select the characteristics beforehand and provide a logical reason for why these characteristics predict success; (b) select companies beforehand that do and do not have these characteristics; and (c) monitor their success over the next several years using a metric established beforehand. Collins did none of this.
To buttress the statistical legitimacy of his theory, Collins cited a professor at the University of Colorado: “What is the probability of finding by chance a group of 11 companies, all of whose members display the primary traits you discovered while the direct comparisons do not possess those traits?” The professor calculated this probability to be less than 1 in 17 million.
In statistics, this kind of reasoning is known as the Feynman Trap, a reference to the Nobel laureate Richard Feynman. Feynman asked his Caltech students to calculate the probability that, if he walked outside the classroom, the first car in the parking lot would have a specific license plate, say 8NSR26. Caltech students are very smart, and they quickly calculated the probability by assuming that each number and letter was determined independently. Their answer: less than 1 in 17 million. When they finished, Feynman revealed that the correct probability was 1, because he had seen this license plate on his way to class. Something extremely unlikely is not unlikely at all if it has already happened.
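The students' arithmetic is easy to reconstruct. Assuming a plate of the form digit-letter-letter-letter-digit-digit (as in 8NSR26), with each character chosen uniformly and independently, a short Python sketch gives:

```python
# Count the equally likely plates of the form D-L-L-L-D-D (e.g. 8NSR26),
# assuming each character is chosen uniformly and independently.
digits, letters = 10, 26
outcomes = digits**3 * letters**3   # three digit slots, three letter slots
probability = 1 / outcomes

print(f"{outcomes:,} possible plates")     # 17,576,000 possible plates
print(f"probability = 1 in {outcomes:,}")  # i.e. less than 1 in 17 million
```

That 1-in-17,576,000 figure matches the “less than 1 in 17 million” the students reported; the flaw is not the arithmetic but the after-the-fact conditioning, since the plate had already been observed.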
The Colorado professor fell into the Feynman Trap, coincidentally with the same 1-in-17-million probability as in Feynman’s license-plate calculation. The calculations made by the Colorado professor and the Caltech students assume that the five traits and the license plate number were specified before looking at which companies were successful and which cars were in the parking lot. They were not, so the calculations are irrelevant. Finding common characteristics after the companies or cars have been selected is neither surprising nor interesting.
The interesting question is whether these 11 companies’ common characteristics are of any use in predicting which companies will succeed in the future. For these 11 companies, the answer is no. Fannie Mae stock went from more than $80 a share in 2001 to less than $1 a share in 2008, and the stock was delisted in 2010. Circuit City went bankrupt in 2009. The performance of the other nine stocks since the publication of “Good to Great” has been distinctly mediocre. Overall, five of the 11 stocks did better than the S&P 500 SPX and six did worse. On average, they did slightly worse than the S&P 500.
The Feynman Trap plagues every book espousing formulas, secrets, or recipes for a successful business, a lasting marriage, living to be 100 years old, and so on, when those recipes are based on backward-looking studies. To avoid the Feynman Trap, we need to specify the secrets ahead of time (and explain why they make sense), and then test them with fresh data.
Gary Smith is the Fletcher Jones Professor of Economics at Pomona College and author of “Money Machine: The Surprisingly Simple Power of Value Investing.”