Researchers from the University of Pennsylvania have analyzed 700 million words and phrases from the Facebook messages of 75,000 volunteers. The resulting word clouds show the extent to which our use of language is influenced by our personality, age, and sex.

Social media channels like Facebook and Twitter have revolutionized the way psychologists study human behaviour and tendencies. To date, social media has been used to track the way people's moods change depending on the season, to predict the stock market, and even to estimate happiness over time. Google has also gotten in on the act by detecting influenza epidemics weeks before the CDC has been able to confirm them.

Now, in a new study that appears in PLoS One, social media is being used to show how our use of language differs according to various demographic and psychological factors.


For the new study, which the researchers say is the largest ever conducted on language and personality, a dataset of over 15.4 million Facebook messages was culled from 75,000 volunteers. In these messages, lead researcher H. Andrew Schwartz and his team identified 700 million instances of words and phrases. After having the volunteers answer a standard personality test, the researchers correlated these terms with gender, age, and personality.

The results are fascinating — if disturbingly predictable.

Men tend to use the possessive 'my' when mentioning their wife or girlfriend more than women use 'my' with 'husband' or 'boyfriend.' Males also use more profanity and object references (e.g. "xbox").

Women use more emotion words, like 'excited,' and more first-person singulars, as well as more psychological and social terms, like "love you" and the heart emoticon: "<3".

The extrovert’s word cloud reflects positive emotions, like the sarcasm emoticon, and terms like “excited,” “party,” “girls,” and “can’t wait.”

The introvert’s word cloud reflects isolation and a focus on solitary, often computer-related activities, such as using the internet and reading.

Social media users who are agreeable can be identified by traits such as being warm, kind, cooperative, unselfish, trustful, and generous. Their word cloud reflects religious words (e.g., prayer, church, god bless), well-being (e.g., excited, wonderful, amazing, blessed), and positive social relationships (e.g., love you all, thank you, friends and families).

Words used by less agreeable people reflect aggressiveness, substance abuse, and a generally hostile approach to the world, such as “kill,” “punch,” “knife,” “drunk,” “i hate,” “racist,” and “idiots.”

Neurotic word clouds are distinguished by depression, loneliness, worry, and psychosomatic symptom words like “depressed,” “lonely,” “scared,” and “headache.”

Less neurotic folks use words that reflect positive social relationships, such as “team,” “game,” and “success,” and activities that could build life balance, like “blessed,” “beach,” and sport-related words such as “lakers,” “basketball,” and “soccer.”

Word use also varies across age. The younger you are, the more apt you are to say things like “hate” and “bored,” and the older you are, the more apt you are to say “proud” and “grateful.”


By using this open-vocabulary technique, the researchers have shown how word and topic use varies according to geographic location, psychological predisposition, and of course, gender and age. Consequently, the research could eventually be used to test developmental hypotheses — i.e., to measure how our language use changes over time depending on certain demographic factors or developmental stages. It could also be used to differentiate groups or people.
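
To make the idea concrete, here is a minimal sketch of how word use might be correlated with a trait score. It is a simplification and not the researchers' actual pipeline: it looks only at single words, uses a plain Pearson correlation, and runs on a handful of invented messages and scores. The function names and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def relative_frequencies(messages):
    """Return each word's share of all words in one user's messages."""
    words = [w for m in messages for w in m.lower().split()]
    total = len(words)
    if total == 0:
        return {}
    return {w: c / total for w, c in Counter(words).items()}


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def word_trait_correlations(users):
    """users: list of (messages, trait_score) pairs, one per person.

    No word list is fixed in advance: the vocabulary is whatever
    words actually appear in the messages.
    """
    freqs = [relative_frequencies(msgs) for msgs, _ in users]
    scores = [score for _, score in users]
    vocab = set().union(*freqs)
    return {
        w: pearson([f.get(w, 0.0) for f in freqs], scores)
        for w in vocab
    }


# Invented toy data: (messages, hypothetical extraversion score).
users = [
    (["party tonight", "can't wait for the party"], 4.5),
    (["reading a book", "internet all day"], 1.5),
    (["party with friends"], 4.0),
    (["reading alone"], 2.0),
]

corr = word_trait_correlations(users)

# Print the words most strongly tied to the trait, in either direction.
for word, r in sorted(corr.items(), key=lambda kv: -abs(kv[1]))[:5]:
    print(f"{word:10s} {r:+.2f}")
```

In this toy data, "party" correlates positively with the made-up score and "reading" negatively. A word cloud could then size each word by the strength of its correlation.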

It's important to remember that all these words and phrases were expressed over social media, and may not be indicative of the terms we use in conversation with our friends and families. It's simply how we choose to express ourselves over the internet.

Read the entire study at PLoS One: "Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach."

All images: World Well-Being Project/
