At first glance, the chart in the upper right-hand corner of the page in Voyant doesn’t really seem to mean anything. After all, Voyant is just counting words, so the fact that the number of times the chosen words appear throughout a novel increases and decreases over time isn’t all that exceptional. Except this one graph basically shows you the reason why the Bishop of Wakefield, Walsham How, burnt his copy of Thomas Hardy’s novel Jude the Obscure. Some scholars hypothesize that because the criticism against Jude the Obscure was so strong, Hardy never wrote another novel. So, what does this graph show that’s so bad about Jude the Obscure?
Check out which words are being graphed: ‘wife’, ‘church’, and ‘love’. By section 5 on the graph (an even ten-point division created by Voyant), ‘wife’ and ‘love’ are almost perfectly in sync. The more Hardy uses ‘wife’, the more he uses ‘love’. Notice that ‘church’ does the exact opposite. This demonstrates the major theme of the novel which is that love and marriage aren’t always congruent with the church. (Note: I chose the word ‘wife’ and not ‘marriage’ because ‘wife’ appears slightly more frequently than the word ‘marriage’. Similarly, I chose ‘church’, not ‘religion’ or ‘Christianity’.)
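If you’re curious what a graph like this boils down to, here’s a minimal Python sketch. It is not Voyant’s actual code: the function name `segment_frequencies`, the simple regex tokenizer, and the toy sample string are all my own assumptions for illustration. It slices a text into equal segments and counts the chosen words in each one.

```python
import re
from collections import Counter

def segment_frequencies(text, words, segments=10):
    """Count each chosen word in each of `segments` roughly equal slices.

    Hypothetical helper for illustration; Voyant's own tokenizing and
    segmenting may differ.
    """
    # Lowercase and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Ceiling division so every token lands in some segment.
    size = max(1, -(-len(tokens) // segments))
    counts = {w: [] for w in words}
    for i in range(segments):
        chunk = Counter(tokens[i * size:(i + 1) * size])
        for w in words:
            counts[w].append(chunk[w])
    return counts

# Toy text, NOT the novel: 15 tokens in each half.
sample = "love wife church " * 5 + "church church love " * 5
print(segment_frequencies(sample, ["wife", "church", "love"], segments=2))
# {'wife': [5, 0], 'church': [5, 10], 'love': [5, 5]}
```

To try it on the real thing, you could read a plain-text copy of the novel into `text` and leave `segments` at 10 to match the ten-point division on the graph.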
So you might’ve noticed some inconsistencies, particularly at the beginning and the very last point where ‘wife’ and ‘church’ reconvene. In order to understand the significance of these points, you’re going to need some background on the plot. (I’d like to point out here that we already have a fairly solid grasp as to why Hardy’s book was burnt and we haven’t even gotten into the names of the main characters, the setting, or the plot, so that’s kind of cool).
Here’s my “Sparknote-y” version of the plot:
Jude is a pretty regular guy living out in the English countryside. He meets a local girl, Arabella, and they hook up. Though Jude wants to leave to go to Christminster to study, Arabella tells him she’s pregnant and so he decides to marry her.
Surprise! She lied.
They really don’t get along and Jude considers killing himself. Arabella decides to move to Australia and they go their separate ways. Jude goes to the city and runs into his cousin, Sue (whom his aunt specifically told him not to visit). Sue is intelligent and interesting and Jude is super into her. That’s right, she’s his cousin, and he’s super attracted to her. This book was written a while ago (1895) but not so far back that cousin incest is okay, so keep that in mind. Sue also feels an attraction of sorts, though she doesn’t admit it because she knows it would be wrong. So she marries Richard, an old schoolteacher, whom she really doesn’t love.
Yada, yada, yada. Sue and Jude eventually end up together after crossing paths a number of times. They don’t get married, but they live together like a married couple, have a couple of kids, and move around England to avoid suspicion regarding their circumstances.
And let me point out here that Hardy really makes you root for these two. Like, I didn’t think I was going to be dying for these two to get together, but he makes a compelling case. They’re happy. Who cares?
God cares, apparently, because while Jude and Sue are out and about at one point, one of Jude’s kids hangs the other two kids and himself. Sue, wracked with grief and guilt, takes this as a sign from God that they never should have gotten together/left their original spouses. She tells Jude she’s leaving him and going back to live with Richard and that Jude should get back with Arabella. Jude super doesn’t want to do this, but Sue wants it, so he does it. He is very unhappy with Arabella, and goes to visit Sue, who tells him she can’t be with him and though it makes her sad, she remarries Richard. Jude goes back to live with Arabella and dies of a lung disease (aka sadness). And you, the reader, cry in the airport after reading this thing in one night.
So okay, how does this inform the graph? In the beginning you can see how ‘church’ and ‘love’ are mildly in sync, though not highly correlated. Hardy uses the word ‘wife’ very frequently when Jude and Arabella get married but rarely uses the word ‘love’. It’s only once Jude and Sue admit their love for one another (though Sue flip-flops on this constantly – note the ups and downs) by the middle of the book that ‘love’ and ‘wife’ sync up, weaving exactly opposite the word ‘church’. By the second half, the more Hardy uses the words ‘love’ and ‘wife’, the less he uses the word ‘church’, and vice versa. At the end of the book, ‘wife’ and ‘church’ reconvene because Sue and Jude give up the idea of love and submit to society’s conception of marriage.
Just a little playing around shows that ‘wish’ is also highly correlated with ‘love’ while ‘church’ and ‘arabella’ are mostly in sync. The things that make Jude and Sue unhappy–the rules of the church and Jude’s previous marriage–occur frequently at the same time in the novel and rarely with the things that do make Jude and Sue happy–their love and wishes. Basically, Hardy makes an argument that love and marriage shouldn’t be controlled by societal forces like the church because forcing people to live lives that are contradictory to their feelings will end in tragedy.
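When I say two words are “in sync” or “highly correlated,” one common way to put a number on that is a Pearson correlation between their per-section counts. Here’s a rough sketch, assuming that’s the measure you want (I’m not claiming it’s what Voyant computes). The function name `pearson` is my own, and the counts below are invented toy numbers, not real counts from Jude the Obscure.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of counts."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat line has no defined correlation; treat as 0
    return cov / (spread_x * spread_y)

# Invented per-section counts, for illustration only.
love = [3, 5, 2, 6, 8]
wife = [2, 4, 1, 5, 7]      # rises and falls with 'love'
church = [8, 6, 9, 5, 3]    # moves opposite to 'love'

print(round(pearson(love, wife), 3))    # close to +1: in sync
print(round(pearson(love, church), 3))  # close to -1: opposite
```

A value near +1 means the two words rise and fall together, near -1 means one goes up when the other goes down, and near 0 means no clear pattern. With only ten sections per word, any number you get is a hint, not proof.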
But I know what you’re going to say: this is just showing the number of times the words occurred in each section. It doesn’t show you the context they were used in. The interpretation of the graph could be wrong if you haven’t read the book. Even if you have read the book or the summary above, it doesn’t mean the number of times Hardy used ‘love’ and ‘wife’ in the sixth section of the book corresponds to the actual message of the text. It’s just correlation, not interpretation!
This doesn’t matter for two reasons.
- I’m not arguing that by looking at this graph, you have a complete understanding of Thomas Hardy’s opinions on love and marriage and the business of the church. I’m not arguing that you shouldn’t read Jude the Obscure. Making a graph in 5 seconds doesn’t stand in for reading the book. Besides, how would you know to choose these specific words in the first place if you hadn’t read the book? This graph is a jumping off point. It’s a way to help you formulate a thesis and really dig into the language of the text in a hands-on way. It’s a tool, not the end all be all of analysis.
- Correlation matters! If Hardy is discussing love and wives without discussing the church, if love and religion are at odds with one another, that wouldn’t look great in the eyes of Bishop Walsham How. The raw frequency of these words in each section of the book isn’t a perfect stand in for how the words are used, but it is enough to show a major theme in the novel, a theme that got Hardy into serious trouble.
What are the takeaways? In order to truly understand the relationship between these words and concepts, further text encoding would need to take place. For example, additional analysis could be done using the words ‘father’ and ‘right’ but the word ‘father’ could mean biological father or father of the church. ‘Right’ could mean opposite of left or it could mean just or proper. By taking the time to further encode the text, you could potentially reveal additional levels of insight into Hardy’s word choice. It may even be enough to hypothesize about Hardy’s subconscious while writing the novel. However, the exercise of just throwing a corpus into Voyant is still certainly worthwhile, especially if you need to come up with a topic for a paper or want to better understand an author’s vocabulary.
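Short of fully encoding the text, a quick first step toward sorting out which ‘right’ or ‘father’ you’re looking at is to pull each occurrence with a few words of context on either side (a keyword-in-context list). Here’s a small sketch of that idea. The function name `keyword_in_context` and the sample sentences are made up by me for illustration; they are not quotes from the novel.

```python
import re

def keyword_in_context(text, keyword, window=4):
    """Return (left context, matched word, right context) for each hit.

    Hypothetical helper: a bare-bones concordance, not a real sense tagger.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, tok, right))
    return hits

# Invented sample text, not from Jude the Obscure.
sample = "He turned right at the gate. It was not right to leave her."
for left, word, right in keyword_in_context(sample, "right", window=3):
    print(f"{left:>12} [{word}] {right}")
```

Reading down a list like this, you can tag each hit by hand (direction vs. proper, dad vs. priest) and then count the senses separately instead of lumping them together.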
What matters most about this endeavor is that it’s just cool. It’s so interesting to think themes can be teased out of a corpus simply by counting words, and without a lot of effort at that. Just thinking about the use of individual words in this text and their meanings has given me a dozen different possibilities for further research into this text and others. So, I’ll end with a few questions:
How would all of Thomas Hardy’s works compare in terms of word choice? Would they follow the same theme? What about works of other authors who were highly criticised for their writing during the 19th century, like Oscar Wilde? What about authors who were praised? Can text encoding reveal the panoptic nature of the text? How would this graph change if Sue were the main character, not Jude? To what extent can we extrapolate whether some words were chosen for their double meaning (right as a direction vs. right as proper)? Can we make any guesses about Hardy’s intentions versus his subconscious while he was writing this novel?