Statistics proves once and for all that authors are a product of their times


You only have to pick up an old novel to realize that people write very differently than they did 100 years ago, or even a couple of generations ago. But is it possible to quantify those differences? Could you pick apart centuries' worth of literature and gauge each work's age based only on the words on the page?



A group of researchers have done just that. They trawled through nearly 7,750 works by more than 500 authors from the Project Gutenberg Digital Library — but only paid attention to a comparatively small handful of words. They used just 307 "content-free" words — including prepositions, articles, and conjunctions — that were commonly used throughout the centuries.
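To make the idea concrete, here is a minimal sketch of comparing two texts by their use of function words alone. It is not the researchers' actual method: the ten-word list is a hypothetical stand-in for the study's 307 words, and cosine similarity is an assumed measure chosen for simplicity. The names `FUNCTION_WORDS`, `profile`, and `cosine_similarity` are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the study's 307 "content-free" words.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "but", "with", "upon"]

def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]

def cosine_similarity(p, q):
    """Cosine of the angle between two frequency profiles (assumed measure)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0
```

Calling `cosine_similarity(profile(text_a), profile(text_b))` on two full novels would give a rough score of how alike their function-word habits are. A real analysis would need the full word list and far more text than a few sentences.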


It might not surprise you to find that authors clustered stylistically with their contemporaries: over 85% of authors had an associated temporal disparity of less than 37 years. As the years between writers increase, the similarities between their styles drop.

What's especially interesting is what has happened since the 1900s: alongside an explosion in the number of authors, this effect has become more and more marked. With every passing decade, contemporary authors become stylistically more similar, but this bubble of similarity covers a smaller and smaller time frame. The authors are more closely linked, but for shorter periods.

Which is to say that modern authors are less influenced by their predecessors, and much more by their immediate peers. I guess a classical education won't take you too far anymore.



Corpore Metal

Yes, I'll agree with this even without the statistics.

I find a lot more people putting punctuation outside of quote marks these days—my theory is that this is due to software developers. Quotes are often delimiters and so often when someone learns programming one has to unlearn the rule about putting any other punctuation inside quotes unless you want it treated as a literal string character.

The other thing I've noticed that has changed in the last 20 years, especially the last 15, is this construction appearing in dialog now:


This wasn't at all common in the 1980s or earlier. Instead authors actually wrote out, "She hesitated," or, "He said nothing as a look of consternation crossed his face," and so on.

Personally I hate the "..." construction because it shows us nothing. Is the other character mumbling or giving us verbal hesitation like "Um, er, ah, well—?" It tells us nothing about what the other character's face or body is doing when they hesitate. This:


is lame and lazy writing and I wish writers would stop it. Hell, even Stephenson has started doing it (he did it in the Baroque Cycle), and I really respect him!

Actually, it's even worse. I see this appearing even in comics nowadays. Its utter lameness is doubly compounded, cluttering a panel with a talk bubble devoid of anything but "...."