Gender Genie: The Waving Cat is male

“On the Internet, nobody knows you’re a dog” by P. Steiner

The Gender Genie uses an algorithm to predict the gender of an author by analyzing their texts. Sounds like a big load of crap? Well, it isn’t. Moshe Koppel of Bar-Ilan University in Israel and Shlomo Argamon of the Illinois Institute of Technology put together a nifty piece of analysis software here.

The Guardian has a pretty good explanation of how the analysis works:

On the internet, there is a new website that claims to be able to tell you, with 80% accuracy, whether a piece of writing has been done by a man or by a woman. It uses a computer programme developed by a team of Israeli scientists after an exhaustive study of the differences between male and female use of language. One of their findings is that women are far more likely than men to use personal pronouns (“I”, “you”, “she”, etc), whereas men prefer words that identify or determine nouns (“a”, “the”, “that”) or that quantify them (“one”, “two”, “more”). According to Moshe Koppel, one of the authors of the project, this is because women are more comfortable thinking about people and relationships, whereas men prefer thinking about things. But the self-styled “stylometricians”, in creating their gender-identifying algorithm, have been at pains to avoid the obvious.
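To make that idea a bit more concrete, here is a minimal sketch of keyword-based scoring in Python. It is not the Gender Genie's actual algorithm: the word lists contain only the examples quoted above, the weight of 1 per word is a placeholder I made up, and the function names `score` and `guess` are hypothetical. The real tool presumably uses many more words with carefully tuned weights.

```python
import re
from collections import Counter

# Placeholder weights for illustration only. The words are the examples from
# the Guardian quote; the value 1 is an assumption, not the real weighting.
FEMALE_WEIGHTS = {"i": 1, "you": 1, "she": 1}
MALE_WEIGHTS = {"a": 1, "the": 1, "that": 1, "one": 1, "two": 1, "more": 1}


def score(text):
    """Return (female_score, male_score) for a piece of text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Each score is the weighted count of that category's indicator words.
    female = sum(weight * counts[word] for word, weight in FEMALE_WEIGHTS.items())
    male = sum(weight * counts[word] for word, weight in MALE_WEIGHTS.items())
    return female, male


def guess(text):
    """Guess 'female', 'male', or 'undecided' from the two scores."""
    female, male = score(text)
    if female == male:
        return "undecided"
    return "female" if female > male else "male"


print(guess("I think you know she agrees."))
print(guess("The report lists one more reason that a rule failed."))
```

With only nine indicator words this toy version is easy to fool, which is probably why very short texts give shaky results: there are simply too few indicator words to count.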

Well, let’s put it to the test. First of all, I noticed that most of my blog posts are pretty short, so I went back to the last really long post I wrote, titled “Rant: Why Second Life won’t make it (the way it is)”, and ran it through Gender Genie. (After taking out image captions and longer quotes there were 1139 words left.) Female score: 1363, male score: 2029.

“The Gender Genie thinks the author of this passage is: male!”

I did a few double-checks, and it turned out that a few shorter posts were rated as female, others as decidedly male. So clearly, there’s no 100 per cent success rate, but these results are pretty cool, and they show again that there is a measurable difference in how males and females communicate, even if it only reflects stereotypes. In other words: there’s metadata in our language. Or in David Weinberger’s words:

The consistency suggests that the Gender Genie is measuring something real, some implicit metadata invisible to me but characteristic of my writing. And if it in fact gets authors’ sex right, it is even stronger evidence — as if we needed it — that so much of our meaning lives in what we don’t see.

“On the Internet, nobody knows you’re a dog”, a famous cartoon says. Maybe not, but it looks like anybody can find out if you’re male or female.

(There’s a great brief blog post by Susan Abbott about personality types and how they affect your communication style.)
