Valentine's Day 2020 has recently passed: the day of the year when couples celebrate their relationship and social media is awash with posts, pictures and praise for 'other halves'. It is often an unenjoyable time of year for anyone not in a relationship, who has to deal with the portrayals of perfection we so frequently see on our timelines. Despite being in a relationship, I too am fed up with seeing unrealistic depictions of couples online, so this year I decided to fight back with some cold hard data. To do this I sacrificed my privacy and turned to the medium which I believe to be the most real - my WhatsApp chat with my girlfriend.
I recently discovered that it is possible to download your entire chat history from WhatsApp into a single text file. It occurred to me that this could be a very interesting source of data which might give a far more realistic, unique portrayal of what a relationship is actually like. So in this article I'll put mine out in the open in the hope that you might find the analysis interesting, or that it might inspire you to a) think about how data can counteract the false reality we see on social media, b) dig a bit deeper into the data you yourself produce or c) give your partner a very nerdy, data science-themed Valentine's Day gift.
(Bonus points if you can identify all the section headings…)
Here comes the sun
As I mentioned, WhatsApp lets you export the entire history of any of your chats very easily as a single text file, which I then transformed and analysed using the programming language R. The chat in question was started in October 2016, when I last changed phone numbers (my girlfriend and I have been together since March 2016).
It is our main form of virtual communication (we rarely text or use Facebook, for example) and the result was a dataset of 52,163 individual messages - an average of 43 messages per day. That already surprises me, and we haven't even scratched the surface yet…
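A minimal sketch of the "text file to dataset" step, written in Python rather than the R used for the analysis, and not the article's actual code. The line layout in `LINE_RE` is an assumption (the export format differs between phones and locales), and `parse_chat` and the sample names are hypothetical.

```python
import re
from datetime import datetime

# Assumed layout of one exported line: "18/10/2016, 21:03 - Alice: Hello"
# Real exports vary by phone and locale, so adjust the pattern to your file.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}), (\d{2}:\d{2}) - ([^:]+): (.*)$")


def parse_chat(lines):
    """Turn raw export lines into a list of message dicts."""
    messages = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match:
            date_part, time_part, sender, text = match.groups()
            sent_at = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M")
            messages.append({"sent_at": sent_at, "sender": sender, "text": text})
        elif messages:
            # A line without a timestamp is treated as the continuation
            # of the previous (multi-line) message.
            messages[-1]["text"] += "\n" + line
    return messages


sample = [
    "18/10/2016, 21:03 - Alice: Hello",
    "18/10/2016, 21:04 - Bob: Hi there",
    "how was your day?",
]
parsed = parse_chat(sample)
```

This is a simplification: system notices and media placeholders are not handled separately, so a real export would need extra filtering before counting messages.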
The Power of Data
The aim is to summarise our relationship, so right off the bat we're going to aggregate and look at the high-level figures. Here is a brief overview:
- 1,212 days - between 18th Oct 2016 and 12th Feb 2020.
- 980 (81%) of those were 'active' days (i.e. messages were sent).
- 52,163 messages. I sent 1,495 more messages than my girlfriend - again, not what I was expecting to find!
- 11,670 unique words were used (many of which are not real words). I used 8,844 unique words where my girlfriend used 7,043.
- 2 participants - my girlfriend and me - living in the UK and in our early twenties.
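The headline figures above can be cross-checked with a few lines of arithmetic. This Python sketch uses only numbers quoted in the overview. The per-person message counts are derived here from the total and the 1,495 difference and are not stated in the article.

```python
from datetime import date

# Figures quoted in the overview above.
first_day = date(2016, 10, 18)
last_day = date(2020, 2, 12)
total_messages = 52_163
active_days = 980
message_gap = 1_495  # how many more messages I sent than my girlfriend

total_days = (last_day - first_day).days
messages_per_day = total_messages / total_days
active_share = active_days / total_days

# Derived, not quoted: split the total using the stated difference.
my_messages = (total_messages + message_gap) // 2
her_messages = total_messages - my_messages

print(total_days, round(messages_per_day), f"{active_share:.0%}")
print(my_messages, her_messages)
```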
I don't want to type a thing
The number of messages we send each day has been falling, on average, over the time we've been together.
This is mostly down to the fact that at the beginning of this time period we were both at university and living apart, and therefore messaging more often. You can then see the point in mid-2017 when we had finished university and were living together but were not yet in full-time jobs. Because of this, we were spending most of our time together each day and, as a result, did not message each other. In September 2017 we started jobs and message frequency picked up again (oops).
If you look closely you'll also see a pattern of a spike around Christmas time - which we spend apart with our respective families - and a subsequent dip around the new year - which we tend to spend together.
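A messages-per-day series like the one described here could be built along these lines. This is a Python sketch rather than the R code behind the article. `daily_counts` is a hypothetical helper, and filling silent days with zero is my assumption about how gaps should be treated.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(message_dates):
    """Count messages per calendar day, including days with no messages."""
    counts = Counter(message_dates)
    if not counts:
        return {}
    day, last = min(counts), max(counts)
    filled = {}
    while day <= last:
        filled[day] = counts.get(day, 0)  # silent days count as zero
        day += timedelta(days=1)
    return filled


sample_dates = [date(2019, 12, 24), date(2019, 12, 24), date(2019, 12, 26)]
per_day = daily_counts(sample_dates)
active = sum(1 for n in per_day.values() if n > 0)
```

Keeping the zero days in the series matters for any trend line or rolling average, since dropping them would overstate how often messages are sent.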
Friday, I'm in love
The next graph could easily be a graph of motivation levels throughout the week. We send more messages on weekdays than weekends.
Our messaging patterns tend to reflect my general feelings towards each day of the week, reaching a crescendo on Friday when we are most likely to be busy and socialising. This often involves planning and passing on relevant information (so more messages). Sunday is the day of the week we are most likely to spend together, so we send significantly fewer messages.
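Grouping messages by day of the week is a small aggregation. The sketch below is in Python, not the article's R, and `messages_by_weekday` and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def messages_by_weekday(timestamps):
    """Count messages for each day of the week, Monday first."""
    counts = Counter(ts.weekday() for ts in timestamps)  # Monday is 0
    return {DAY_NAMES[i]: counts.get(i, 0) for i in range(7)}


sample_times = [
    datetime(2020, 2, 14, 9, 0),    # a Friday
    datetime(2020, 2, 14, 18, 30),  # the same Friday
    datetime(2020, 2, 16, 11, 0),   # a Sunday
]
by_day = messages_by_weekday(sample_times)
```

Raw totals are enough when the chat covers a similar number of each weekday. Over a short period it would be fairer to divide each total by how many times that weekday occurs.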
Time after Time
Following on from the theme of more messaging during the work week, let's see whether we really do procrastinate that much or if I can save a bit of face…
You can see that our messages gradually increase throughout the morning, reaching a peak around lunchtime (unsurprising). They then dip slightly after lunch as we go back to working and pick up again after 4pm, around the time we usually finish work. The lower number of messages in the evenings is simply another reflection of the fact that we don't send messages when we are together.
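An hour-of-day profile can be sketched in the same spirit (Python, not the article's code). Restricting the count to Monday to Friday is my assumption, chosen to match the work-week framing, and `weekday_hour_counts` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime


def weekday_hour_counts(timestamps):
    """Count messages per hour of day (0-23), using Monday to Friday only."""
    counts = Counter(ts.hour for ts in timestamps if ts.weekday() < 5)
    return [counts.get(hour, 0) for hour in range(24)]


sample_times = [
    datetime(2020, 2, 12, 12, 5),   # Wednesday lunchtime
    datetime(2020, 2, 12, 12, 40),  # Wednesday lunchtime
    datetime(2020, 2, 12, 17, 15),  # Wednesday after work
    datetime(2020, 2, 15, 12, 0),   # Saturday, ignored
]
hourly = weekday_hour_counts(sample_times)
peak_hour = hourly.index(max(hourly))  # first hour with the highest count
```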
More than words
Now for the real damning evidence…time to look at what our most commonly used words were. For this, I removed all stopwords ('a', 'the', 'and' etc.) and the top 10 most used words by each of us can be seen below:
Evidently we agree with each other…a lot. I'm fairly sure my English-teaching father would be disappointed with my frequent use of 'gonna'. It's also clear that 'ah' is our favourite filler word - or whatever the equivalent is for text conversations.
I find it interesting that 'time' is such a common word for both of us, probably reflecting situations when we are trying to organise or plan something ('what time?', 'if we have time', etc.). It's also fairly obvious who says 'good night' and 'good morning' the most often.
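For anyone wanting to try the word count themselves, here is a minimal Python sketch of removing stopwords and ranking what is left. It is not the R code used for the article. The stopword set is a tiny illustrative list rather than the one actually used, and `top_words` is a hypothetical helper.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "the", "and", "be", "for", "what"}


def top_words(texts, n=10):
    """Return the n most common non-stopword words across all messages."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(n)


sample_texts = [
    "Yeah gonna be late",
    "yeah what time",
    "Time for the train, yeah",
]
top_two = top_words(sample_texts, n=2)
```

Running this once per sender gives a top 10 for each person. The same loop with a `set` instead of a `Counter` gives the unique-word totals.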
Total Eclipse of the Heart (Emoji)
Emojis are an extremely popular form of communication these days and are often used for comedic effect or to portray emotions without having to type the words. To round off this fleeting analysis of the WhatsApp chat between my girlfriend and me, let's take a look at what our favourite emojis are:
Neither of us is surprised by what our most commonly used emojis are. I am, however, very surprised at just how much I use the eye-roll emoji…especially when you consider that I use it more than twice as often as my girlfriend uses any emoji other than her top two.
I also find it interesting to see how the chart reflects differences in the way we use emojis. I tend to use a wider variety and - although I clearly have some favourites - the spread of the number of times I use each emoji is fairly even. My girlfriend, on the other hand, has two emojis which are clearly her 'go to' emojis - the see-no-evil monkey and the laughing crying face.
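Counting emojis needs some way of telling them apart from ordinary text. The Python sketch below is not the approach used in the article. It relies on an approximation: treating every character in the Unicode "Symbol, other" category as an emoji. That also picks up symbols such as © and splits emojis built from several code points, so read it as a rough first pass. `count_emojis` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def count_emojis(texts):
    """Approximate emoji counts: every 'Symbol, other' character is counted."""
    counts = Counter()
    for text in texts:
        for char in text:
            if unicodedata.category(char) == "So":
                counts[char] += 1
    return counts


EYE_ROLL = "\U0001F644"  # face with rolling eyes
LAUGH_CRY = "\U0001F602"  # face with tears of joy

sample_texts = [f"ok {EYE_ROLL}{EYE_ROLL}", f"haha {LAUGH_CRY}", "no emoji here"]
emoji_counts = count_emojis(sample_texts)
```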
EDIT: The code used in this article is now available on my GitHub here.
EDIT (Sep 2020): I have recently published a follow-up to this article, 4.5 years of a relationship, in Facebook activity.
Your Data is a Wonderland
Well, I hope you've enjoyed this brief insight into my relationship - I'm sure you've found it thrilling. Overall it seems 3.5 years of a relationship on WhatsApp can be summed up with one eye-rolling emoji.
In all seriousness, if this type of analysis and presentation of data interests you, give me a follow and give my publication (Data Slice) a follow too to stay up to date with my articles! I'm also considering creating an app to allow anyone to quickly see a visualisation of their WhatsApp chat using similar graphs/charts to what you see here - let me know in the comments or by direct message if that's something you would be interested in.