Sports Writing, Sort Of

Living in North Carolina, I see a lot of Tar Heel blue, I hear a lot of UNC talk, and I am confronted with a lot of articles about Chapel Hill’s basketball team in the local papers. This season preview, though, is a bit different.

The first game of the 2010-2011 season for North Carolina basketball will be in Chapel Hill on November 12 against Lipscomb. Expectations are high that this year’s Tar Heels team is an improvement on last year’s. They’ll be bringing back a group that played 43% of last season’s minutes and adding the efforts of 3 Top 100 recruits, including #1 Harrison Barnes. North Carolina has the largest deficiency in rebounding where they lost 64.3% of their output. Equally as concerning is three point shooting, where they also lost a big 63% of last year’s output.

The AP gives the Tar Heels a #8 ranking in their preseason AP Top 25 poll. They weren’t ranked in last year’s final poll. North Carolina closed out the last season with an overall record of 20-17, placing 9th in the ACC with their 5-11 conference record. The Tar Heels lost to Georgia Tech 62-58 in the ACC tournament. They then went to the National Invitational Tournament (NIT) as a #4 seed, losing in the Championship game to Dayton, 79-68.

It’s pretty standard preview information, just without a whole lot of extra flair. In fact, you could safely say that the writer of this piece has no personality, because the entire article was “written” by an algorithm. This was computer-generated content from start to finish, with the only human involvement being in the coding of the formula that created the story.

TechCrunch has a write-up on the company today, explaining the basic idea behind the concept. The company is a data collector that has decided to turn its warehouse of numbers into content, using formulas to mine the interesting information about a specific college basketball team and write season previews or game recaps, among other things.
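To make the idea concrete, here is a rough sketch of how a formula like that might turn numbers into sentences. It is only a guess at the general approach (rank the stats, then fill in sentence templates), not the company's actual code. The function names `biggest_losses` and `write_preview` are made up for illustration, and the only data used are the percentages quoted in the preview above.

```python
def biggest_losses(lost_output, top_n=2):
    """Return the top_n stat categories with the largest share of lost output."""
    ranked = sorted(lost_output.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


def write_preview(team, returning_minutes_pct, lost_output):
    """Fill sentence templates with the most notable numbers (hypothetical helper)."""
    sentences = [
        f"{team} brings back a group that played "
        f"{returning_minutes_pct:g}% of last season's minutes."
    ]
    for rank, (category, pct) in enumerate(biggest_losses(lost_output)):
        # The first (largest) loss gets the headline phrasing; the rest get a softer one.
        opener = "The largest deficiency is" if rank == 0 else "Also concerning is"
        sentences.append(
            f"{opener} {category}, where {pct:g}% of last year's output is gone."
        )
    return " ".join(sentences)


# Figures taken from the generated preview quoted above.
lost = {"three point shooting": 63, "rebounding": 64.3}
preview = write_preview("North Carolina", 43, lost)
print(preview)
```

Even a toy version like this shows where the flat tone comes from: the "writing" is a ranking step plus a handful of fixed phrases.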

While the piece isn’t overly interesting, I expect that this will become something of a trend in the future. Using play-by-play data and a decent algorithm, you could come up with a pretty solid recap of any sporting event. It won’t have any player quotes, of course, but as the world moves away from paper and toward digital content, it is not hard to imagine a shift away from reporters getting quotes and transcribing them into a story, with providers instead just embedding full audio or video of a press conference next to the automated recap.

If I were a beat writer, this would scare me to death. I don’t know if this is the future of journalism, as good writing will always find an audience, but this is almost certainly part of the future of sports coverage.




Dave is the Managing Editor of FanGraphs.


11 Responses to “Sports Writing, Sort Of”

  1. Chris R says:

    I was guessing the piece was from a student newspaper, written by someone with access to a bank of stats but without much experience in valuing information, or synthesizing it with personal observations or historical context. Unless they really are hacks, I don’t think beat writers need to be fearful, at least of version 1.0.

  2. Brad Johnson says:

    In this info-centric world, I don’t think these kinds of articles preempt human created content in the least. Even in terms of game recaps, I’m confident I or a real beat writer can tell a more compelling story using just game statistics than a robot. The human mind doesn’t need to know all the variables involved in making a good story to do so, it’s simply intuitive. The algorithm does (or at least most of them). And let’s not forget, a good story isn’t told the same way every day. That’s the job of the box score.

    I think a better use of this is to supply the beat writers with briefs so they don’t have to mine for data. If beat writers don’t have to work as hard to find quality data, it should result in higher quality copy.

  3. Nate says:

    The biggest problem with the article is that it’s about the wrong blue team from NC. The National Champions were a darker shade this year, and have a great chance at repeating. That sounds great to me, even if it comes from a computer.

  4. baycommuter says:

    Let’s compare that to Giants beat writer Henry “Hammerin’ Hank” Schulman in the Nov. 2 S.F. Chronicle:

    There stood pitcher Matt Cain, at 26 the longest-tenured player on the 2010 Giants, raising the circle-of-flags trophy above his head on the field so hundreds of San Francisco fans who refused to leave the Rangers’ ballpark could see it.

    “Wow, this is sick,” Cain said. “We’re the World Series champions of 2010.”

    How long the faithful have waited to hear those words – not years, but generations. The Giants moved to San Francisco in 1958 and had not touched that trophy until Monday night, when they beat the Texas Rangers 3-1.

    ***
    Your move, computer.

  5. B N says:

    The whole point of reading an article is the insight and/or opinion. Otherwise, I could just go look at the stats myself- such as the box score or the play by play. I do this a lot anyways, but if the recap came in that form, I’d skip it every time.

    Don’t get me wrong, I love automation. But until we have some serious AI going on, this stuff is going to be drier than the Sahara.

  6. Robbie says:

    Thanks for the feedback guys. Keep in mind we are at the infant stage of this trend. We are just getting started.

    Most articles are written by one person, which means you are relying on whatever that one person can research, think of, conceive, etc. Algorithmic content is collective in nature. We can bake in significantly more analysis and insights than any one person could do on their own. Also, game recap is just one type of article we are working on. There are many new types of articles we are developing that you just don’t see today.

    Now I’m not saying what we are doing is a replacement for anything. Definitely not. There will never be one voice for everyone’s content needs. But I do think we’ll have a unique and different voice. And who’s to say we won’t have a bunch of different “voices” :-)

    Lastly, yes we are starting to look at play-by-play and a variety of other things to tell interesting stories. One model we are looking at is providing stories to writers which they could then customize.

    Right now, the big advantage to what we are doing is filling the “long-gap” of sports content. It’s not a long tail because content isn’t written for a lot of smaller teams (in college) and small markets. Even some pro teams don’t get significant coverage in every instance.

    Stay tuned. This space will evolve significantly in a short period of time.

    robbie@statsheet.com

  7. asdf says:

    Dave Cameron writes a NotGraphs article. All hope is lost.
