A doctoral grad builds a sprawling guide to baseball statistics, trivia, and history.
When Sean Forman was a doctoral student at The University of Iowa, his field of study was applied mathematical and computational sciences. But what he really cared about was baseball.
“I grew up in Manning, Iowa, as a Red Sox fan,” he says. “There weren’t many Red Sox fans in Iowa. But I’d read the league batting leaders every week in the newspaper, and it always had lots of Red Sox—Wade Boggs, Dwight Evans, Jim Rice—so I liked the Red Sox.”
It wasn’t until later that he realized the abundance of Boston Red Sox among the batting leaders was a statistical quirk. Fenway Park, where the team plays its home games, is an easy place to hit, thanks to its Green Monster and oddly dimensioned outfield.
It also helped spark a fascination with statistics that led him to The University of Iowa, and after that, a career that has made him one of baseball’s most vital online presences.
Forman is the founder, developer, and CEO of baseball-reference.com, considered the definitive web site for baseball statistics, records, and other minutiae. If anyone wonders how many wins Ken Holtzman had during his rookie season with the Chicago Cubs, how many seasons Ted Williams missed while flying fighter planes in World War II and Korea, or how many home runs Babe Ruth hit during his sad final year with the Boston Braves—and baseball fans do wonder those things—baseball-reference.com is the place to go.
(By the way, it’s 11, three full and parts of two, and six.)
The site has its roots at The University of Iowa. Forman was a doctoral student in the math program in the mid-1990s when he managed the Iowa Farm Report, his online source of statistics and information for minor league teams.
He soon discovered, though, that there were no other web sites doing what his did, even for the major leagues.
“You couldn’t find any reliable statistics source online,” he says. “You couldn’t find Ty Cobb’s statistics online.”
So he started putting together what would become baseball-reference.com in his spare time while also completing his thesis (“Torsion Angle Selection and Emergent Non-Local Secondary Structure in Protein Structure Prediction”). His advisor, Alberto Segre, was patient.
“He was a great advisor and I enjoyed working with him, but he wasn’t a fan,” Forman says. “Mostly, he put up with me doing my baseball stuff.”
The site went live in 2000, filled mostly with statistics of retired players from the database of Baseball Archive. At first, it included 20,000 pages and needed only 300 megabytes of server space that Forman rented for $20 a month.
It wasn’t much, he admits, but considering there was nothing else like it, baseball-reference.com was a treasure trove for baseball fans. Then one day, the site came to the attention of a Sports Illustrated writer, who wrote a blurb about it.
“The server crashed,” Forman says, which made him think there might be a viable commercial market for this sort of thing. He kept building the site, even as he finished his doctoral work at Iowa and joined the faculty at St. Joseph’s University in Philadelphia.
He added box scores from every major league game played going back to 1957. He added statistics from the Federal League and other failed major leagues of the late 19th and early 20th centuries. He added minor league statistics going back to the 1890s.
Forman says his work in the UI doctoral program prepared him to oversee a project that will never stop growing.
“The math wasn’t that relevant, but my thesis was a large programming project that really helped me manage large projects like that,” he says.
Which is good because the site has grown to include more than 500,000 pages of data from 20 sources. It’s updated daily during the season with the latest Major League standings and an In Memoriam section remembering every former major leaguer who died in recent days.
It became so big that Forman eventually left his faculty job to run his company, Sports-Reference.com, full-time. He merged with several other companies that maintain sites with comprehensive NFL, NHL, and NBA statistics, as well as the Olympics. He has three full-time and a handful of part-time employees at his Philadelphia headquarters, and earns enough revenue selling banner advertising and page sponsorships to make a comfortable living.
Forman is planning a major upgrade of the site this year. He hopes to add long-term statistics from the Negro Leagues, and a list of every bench-clearing brawl in baseball history.
The site’s customers are more than just baseball fans, egghead statistics mavens, and Rotisserie League general managers agonizing over which flawed player they should draft as a fourth outfielder—the one-dimensional Pat Burrell or the aging Bobby Abreu.
Forman says the site is also used regularly in press boxes, front offices, and law firms by sports reporters, team management, and player agents.
Story by Tom Snee; Photo courtesy of the Philadelphia Inquirer
Oct. 12, 2009