Interface: The Tech World Isn't a Democracy of Data, and Neither Is the NBA (Wired Opinion)

New York Knicks' Jeremy Lin celebrates in the final moments of an NBA basketball game against the Dallas Mavericks in New York, Sunday, Feb. 19, 2012. The Knicks defeated the Mavericks 104-97. (AP Photo/Seth Wenig)

I love Alexis Ohanian's technology-inspired take on the Jeremy Lin phenomenon not least because I think it's totally wrong.

This is what's great about both sports and technology, at least if you enjoy talking about them. As long as everyone stays civil, you can argue forever.

Ohanian points out that Jeremy Lin's college statistics should have tipped off basketball scouts that he could succeed as a pro. Ohanian also says that this by-the-numbers approach is one thing that endears Lin to geeks everywhere, especially in the tech industry:

Geeks strive to build a world where decisions are driven by data. At this point in time, the internet is the closest thing we have to a marketplace of free ideas. If you’ve got the talent and access to the data, you have a fair shot at making something great (keeping that data free was a major driving force behind our opposition to SOPA & PIPA). Moneyball inspired millions, but it especially warmed the hearts of geeks, even those of us who got picked last during gym class. These are not new insights. In 2009, Michael Lewis wrote an article in The New York Times about Houston Rockets GM Daryl Morey using secret statistics “to find new and better ways to value players and strategies.”

Ohanian cites a now-famous pre-draft analysis by Ed Weiland that made a strong case for Lin as a sleeper pick. Weiland pointed to two publicly available statistics, field-goal percentage (shots made divided by shots attempted) and "RSB40" – that is, rebounds, steals and blocks per 40 minutes. Lin's numbers here were comparable with NBA stars like Allen Iverson, Jason Kidd and Penny Hardaway.
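For readers who want to see the arithmetic, here is a minimal sketch of the two statistics in Python. The function names and the season totals are made up for illustration (they are not Lin's real numbers), and treating RSB40 as the simple sum of rebounds, steals and blocks scaled to 40 minutes is an assumption based on the description above, not Weiland's published method.

```python
def field_goal_pct(made: int, attempted: int) -> float:
    """Shots made divided by shots attempted."""
    if attempted == 0:
        return 0.0  # no attempts, so nothing to measure
    return made / attempted


def rsb40(rebounds: int, steals: int, blocks: int, minutes: float) -> float:
    """Rebounds, steals and blocks combined, scaled to 40 minutes of play.

    Assumption: the three counts are simply added together.
    """
    if minutes <= 0:
        return 0.0
    return (rebounds + steals + blocks) * 40 / minutes


# Hypothetical season totals, for illustration only.
fg = field_goal_pct(made=150, attempted=300)
rsb = rsb40(rebounds=120, steals=60, blocks=20, minutes=800)
print(f"FG%: {fg:.3f}, RSB40: {rsb:.1f}")
```

Scaling to 40 minutes puts a bench player and a starter on the same footing, which is the whole reason a statistic like this can flag a sleeper.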

So far so good. Weiland, though, adds an important caveat:

Lin put up his numbers in the Ivy League, while most of the players on the list played in major conferences. This is a big deal. For players from a small conference the jump to the NBA is a lot tougher. They don’t get the exposure, unless their team makes the tournament. They need to be that much better statistically to stand out.

A Harvard degree might help you get a job at Facebook, but not in the NBA. There, a big school like Michigan State (my alma mater) has a lot more credibility.

When Weiland then compares Lin's numbers to those of successful players from smaller college conferences, it's a much more modest group: Derek Fisher, Jose Barea, Dee Brown. And this is with cherry-picking on the supply side: plenty of small-school stars have put up Lin-like college numbers and then crashed and burned in the NBA, but Weiland doesn't look at them at all.

Meanwhile, the Puerto Rico-born Barea was a sensation in last year's playoffs, helping the Dallas Mavericks win the championship – but as far as I know, nobody's now arguing that Jeremy Lin is the next J.J. Barea. Lin's star has already become much bigger than Barea's (and not just because Barea, listed at 6 feet tall, is probably closer to 5'9").

Weiland also wrote that Lin – who played shooting guard rather than point guard at Harvard – needs to learn how to pass and handle the ball better. "He appears to have the skills to become at least a usable combo guard," says Weiland. "If he can get the passing thing down and handle the point, Jeremy Lin is a good enough player to start in the NBA and possibly star."

And that's basically what's happened: Lin moved from undrafted prospect to bench player to the development league and back, and finally got the chance to start exactly when his skills improved. And they're still improving.

Yes, scouts should have paid attention when Lin outperformed against top college teams like Connecticut, Georgetown and Boston College. But they didn't. It wasn't that they didn't have enough statistics available, or that they ignored what they saw. They simply never saw him play.

That is, they never saw Lin play a whole basketball game. Instead, NBA teams ran him and other prospective rookies through drills designed to test their raw athletic ability. As Lin would be the first to tell you, he is no track-and-field star. Then they had him play three-on-three basketball, which doesn't show off Lin's ability to see the whole court on offense. Three-on-three drills also magnify his shortcomings on defense.

All of these workouts generated plenty of data. None of it made Lin stand out. And NBA teams move on the data they have, not the data they wish they had. As my colleague Jonah Lehrer says, "not only have [sports] teams failed to find relevant variables for predicting future player performance, but they typically pretend otherwise."

Companies fixate on easily measurable variables, and make huge bets on them even though their predictive value is poor. Hold on, wait a minute – that sounds like almost every investor, analyst and executive in the technology industry.

After Golden State signed Lin as an undrafted rookie, they saw a player from a small school, who would have to learn how to play point guard because he was too short and unathletic to play NBA shooting guard, and decided – rightly, I think – that he needed a year or two in the development league to retool his game.

In VC-speak, Jeremy Lin needed to raise another round of capital and diversify his business model before going public. And that time in "stealth mode" served Lin extraordinarily well.

Even if you look at Lin's performance this season, his effect on the Knicks still isn't captured in his numbers. Consider an advanced statistic called "win shares per 48 minutes." The idea of "win shares" comes from sabermetrics Time Lord and Moneyball hero Bill James, who applied it to baseball. There are a lot of ingredients in the win shares soup, but it's engineered to capture a player's overall excellence. Indexing it by minutes played helps level things out regardless of whether he's played a lot of games or, like Lin, relatively few.
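The per-48-minutes part is plain rate arithmetic, and a small Python sketch shows why it levels things out. It assumes the win-shares total has already been computed some other way (the soup's recipe isn't reproduced here), and the function name and both players' totals are hypothetical.

```python
def per_48(total: float, minutes_played: float) -> float:
    """Scale a cumulative total to a rate per 48 minutes (one regulation game)."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 48 / minutes_played


# Hypothetical players: a season-long starter and a newcomer with few minutes.
veteran = per_48(total=6.0, minutes_played=1920)
newcomer = per_48(total=1.5, minutes_played=480)
print(f"veteran: {veteran:.3f}, newcomer: {newcomer:.3f}")
```

The veteran has four times the raw total, but both players come out at the same rate, so a late arrival can be compared with someone who has played all season. The flip side is that a rate built on few minutes is a small sample.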

For the most part, the statistic works. LeBron James' win shares per 48 minutes tops the league at .333. Stars like Chris Paul, Kevin Durant, Derrick Rose and Dwight Howard round out the top 20. It's not perfect; Kobe Bryant's win shares per 48 minutes are remarkably low (just .156). Even if you could argue that Kobe's overrated, he's probably not that overrated.

Lin's figure is .188 – identical to that of star forward Dirk Nowitzki (who's having something of an off year), but below that of Philadelphia's Louis Williams. Who's Louis Williams?

Williams, like Lin, is a small, young, high-energy guard who can score a lot of points in a hurry. He was drafted in the second round out of high school as a long-term prospect, and has matured into a solid pro. But virtually nobody knows who he is. And very few of those who do think he's amazingly great.

Ultimately, Lin's appeal isn't completely based on his numbers. It's also not completely based on his ethnicity or Ivy League background, even though I don't think you can deny that they add to his legend. It's based on the fact that he's fun to watch.

Lin dunks, throws no-look passes, takes defenders off the dribble, and hits clutch three-pointers. Sure, he helps the Knicks win, but so does center Tyson Chandler. Chandler's putting up amazing numbers this season – his .741 true shooting percentage is, like Jem and the Holograms, truly outrageous – while doing the dirty work and getting almost no interview requests. Except to talk about Jeremy Lin.

For the Knicks, Chandler is really the Moneyball success story, which is all about recognizing players who aren't flashy but help teams win. Chandler's talents were definitely recognized early. In fact, as the second overall draft pick in 2001, he'd been seen as something of a bust, bouncing around from team to team.

Now, Chandler is fourth in the league in win shares per 48 minutes, with .244, ahead of household names like Bryant, Durant and Rose, ahead of all-stars at his position like Kevin Love, Dwight Howard, Pau Gasol or Andrew Bynum, and way ahead of Jeremy Lin.

Chandler wasn't a cheap free agent for the Knicks, but he was still a bargain compared to a superstar center like Howard. And he arguably fits around Knicks scorers Carmelo Anthony, Amare Stoudemire and Lin much better than Howard would.

Lin isn't like that; even on off nights, he fills up a highlight reel. Even if his numbers taper off over time – and with Carmelo back, they probably will, as Lin won't have to carry the team and will switch to a different kind of playing style – he's a star now, in a way that can't be measured. Except possibly in tickets, jerseys and sneakers sold.

Nevertheless, data matters. So I want to come back to Ohanian's point about statistics in basketball and the tech industry. Geeking out about sports aside, I really do think there's an important connection here. It's just the opposite of what Ohanian sees.

As Ohanian points out, what Oakland GM Billy Beane was to statistical analysis in baseball, Houston Rockets GM Daryl Morey has been to basketball. Morey, though, has taken a different approach. He doesn't use publicly available statistics; he keeps all the information he gathers himself. And with that information, he develops his own, sophisticated, proprietary evaluation criteria for players.

ESPN writer/NBA superfan Bill Simmons has a brilliant summary of both Morey's approach and the fundamental difference between basketball and baseball:

Baseball isn't basketball. It's an individual sport; teammates don't matter unless they can help get PEDs. (Sorry, I had to.) Every conceivable diamond talent can be measured objectively. I thought Derek Jeter was a great shortstop until the defensive stats told me otherwise. I thought Wade Boggs was wrong for a leadoff hitter; turns out, an OBP machine who drags pitch counts along is just what the top spot calls for...

The statistical intelligence in NBA front offices is superior for one simple reason: They spend millions of dollars to figure this stuff out. Daryl has many minions crunching numbers. At the conference, Hollinger joked that Daryl was lucky the league hasn't imposed a salary cap on stat guys. Daryl laughed nervously. Because it's true.

Like every other forward-thinking GM, he considers numbers not a sacred evaluation tool but rather part of a bigger process: How can we calculate the best way to win? And there's no easy answer. Ongoing success in basketball hinges on talent, leadership and role play.

Morey does Bill James one better by measuring what we've thought of as intangibles, capturing the data that nobody had thought had any informational content. Sometimes you don't need a stat as complicated as win shares; you need to know whether a guy can hit open threes in the corner and avoid turning the ball over, or whether he can throw the other team's best player off his game, because that's all you'll be asking him to do.

"Does it not bother anyone else that certain teams meticulously keep track of and hoard those moments?" Simmons asks:

It's valuable data that would give us all a better understanding of what we're watching. Meanwhile, the rest of the statistical community is more obsessed with comparing players and chasing impossible-to-prove-objectively stats like "adjusted plus-minus." Hey, geeks on the APBR board, I'm talking to you. You could be feeding us gourmet cheeseburgers, except you're more interested in cloning cows.

Hey, you know what? Simmons sounds like Google complaining about Facebook. Or like me whenever I complain that we don't really know how many Kindles have actually been sold.

In sports as well as technology, the real power of data isn't that it's open to everyone. That's a fantasy. The point is to use data of any and all kinds to win. That's why big companies like Google, Facebook and Amazon opposed SOPA/PIPA; the free flow of data across the web helps them to win. That is, up until the exact moment it arrives on their servers, at which point the vast majority of it disappears from public view forever.

It's lovely to think that we live in a Bill James industry, where a geeky outsider crunching numbers out in plain sight can beat the big names with more experience who go on gut instinct.

Instead, we live in a Daryl Morey industry, where data is power, and like all powerful resources, needs to be kept under control. It's also an uncertain power, subject to all manner of changing circumstances that are beyond the control of either you or your spreadsheets. (Ask Netflix's Reed Hastings about that one.)

Oh, and by the way, Morey and the Rockets? They signed Jeremy Lin after he left Golden State. And then they cut him – which is how he wound up in New York.

In New York, Houston or Silicon Valley, it turns out that even people who know everything don't know everything.

Image credit: AP/Seth Wenig