My NaNoWriMo Stats
12/21/2013
by Gabe Koss
On a total whim I decided to participate in National Novel Writing Month, a month-long writing marathon in which participants attempt to write a 50,000-word novel during November. I cheated a little bit and started on October 26.
Total Words | 54173 |
October Words | 2917 |
November Words | 51256 |
Avg Words/Day (Nov) | 1709 |
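The total and the daily average follow directly from the two monthly counts. As a quick sanity check, assuming the average is simply the November count divided by the 30 days of the month and rounded to the nearest word:

```shell
# Figures copied from the table above.
october=2917
november=51256
days_in_november=30

total=$(( october + november ))
# Shell arithmetic is integer-only, so add half the divisor before dividing
# to round to the nearest whole word instead of truncating.
avg=$(( (november + days_in_november / 2) / days_in_november ))

echo "total=$total avg=$avg"   # prints: total=54173 avg=1709
```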
Progress Over Time
The vertical axis represents the word count of the story as it grew. Each bar shows the total word count reached by the end of that day. Hovering your mouse over a bar will show the exact number of words reached on that date. The light line traces the word count recorded each time I made a substantial save.
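Since the story was tracked with Git, counts like these can be recovered from the commit history. The following is only a sketch of one way to do it, not necessarily how the chart data was produced. It assumes the manuscript is a single tracked file (story.md here) and that every save was a commit; the function name word_history and its output format are made up for illustration.

```shell
# Print "<commit date> <word count>" for every commit that touched a file,
# oldest first. Hypothetical helper; run it from the directory holding the file.
word_history() {
    file=${1:-story.md}
    git log --reverse --date=short --format='%H %ad' -- "$file" |
    while read -r hash date; do
        # "git show <commit>:./<path>" prints the file as it was at that commit,
        # with the path taken relative to the current directory.
        # tr strips the padding some wc implementations add.
        count=$(git show "$hash:./$file" | wc -w | tr -d ' ')
        echo "$date $count"
    done
}
```

Calling `word_history story.md` inside the repository prints one line per commit. Keeping only the last line for each date would give a per-day series like the bars, while the full output corresponds to a per-save line.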
Common Words
After excluding common English stop words such as "that" or "is", the 10 most common words in my story were as follows:
sage | 631 instances |
out | 315 instances |
rama | 249 instances |
back | 184 instances |
one | 165 instances |
down | 144 instances |
looked | 139 instances |
here | 138 instances |
more | 125 instances |
know | 124 instances |
Common Bigrams
Bigrams are two-word units such as "depraved heathen" or "kind soul". The most common two-word groupings were as follows:
of the | 390 instances |
in the | 222 instances |
to the | 188 instances |
on the | 163 instances |
into the | 151 instances |
she had | 107 instances |
was a | 104 instances |
from the | 92 instances |
out of | 91 instances |
she was | 90 instances |
Code Snippets
I wrote the story with Vim and tracked my progress with Git. I did the analysis on this data using a combination of Ruby, D3.js and the Linux command line. Much of my data analysis was inspired by the classic Unix for Poets.
Here are some of the tools I used to do this analysis.
Extract top 10 words:
tr -sc '[A-Z][a-z]' '[\012*]' < story.md | tr '[A-Z]' '[a-z]' | sort | grep -E -v '^.{0,2}$' | grep -E -v -f ../stop_words.grep | uniq -c | sort -n | tail -n 10
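The contents of stop_words.grep are not shown here. Because the file is handed to grep with -f, each of its lines is treated as a pattern, so a plausible layout is one anchored word per line. The three-word list below is made up purely to illustrate the mechanism:

```shell
# Hypothetical stop list written to a temporary file; the real list is not shown.
stop_file=$(mktemp)
printf '%s\n' '^that$' '^is$' '^the$' > "$stop_file"

# -f reads one pattern per line; -v keeps only lines that match none of them.
# The ^...$ anchors stop "that" from also removing longer words like "thatch".
kept=$(printf '%s\n' the sage is that thatch | grep -E -v -f "$stop_file")
echo "$kept"   # prints "sage" and "thatch", one per line

rm -f "$stop_file"
```

Without the anchors, a pattern such as `is` would also throw away every word that merely contains those letters.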
Extract top 10 bigrams:
tr -sc '[A-Z][a-z]' '[\012*]' < story.md > nano.words
tail -n +2 nano.words > nano.next
paste nano.words nano.next | sort | uniq -c | sort -n > nano.bigrams
tail -n 10 nano.bigrams
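To see why pasting the word list against a copy of itself shifted by one line yields bigrams, here is the same idea on a five-word sample. The sample sentence and the function name bigrams are made up for illustration, and unlike the commands above this sketch lowercases the words and drops the trailing word that has no successor.

```shell
# Read text on stdin and print each adjacent word pair with its count,
# least frequent first. Hypothetical helper for illustration.
bigrams() {
    words=$(mktemp)
    next=$(mktemp)
    # One lowercase word per line; "grep ." drops the empty line tr leaves
    # when the text starts with a non-letter.
    tr -sc 'A-Za-z' '\n' | tr 'A-Z' 'a-z' | grep . > "$words"
    # The same list starting from line 2, so line N holds the word after word N.
    tail -n +2 "$words" > "$next"
    # paste joins line N of both files with a tab; the last word has no
    # successor, so its pair has an empty second field and is filtered out.
    paste "$words" "$next" | awk -F'\t' '$2 != ""' | sort | uniq -c | sort -n
    rm -f "$words" "$next"
}

printf 'the cat saw the cat\n' | bigrams
```

On this sample there are three distinct pairs, and the last line of output reports "the cat" with a count of 2.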