Saturday, July 30, 2005
Personality of a Blogger
What interested most people at the conference was that I had gathered personality information on bloggers. There are various reasons why I cannot say anything definite from my data: I only had 71 subjects, and I have no comparison group. There has been work on the gender and age of bloggers, but I'm not sure anyone has data like mine.
However, before I publicly discuss my findings, I'd like to know people's intuitions. What sort of a person do you think a blogger is? To help you, the data I gathered gives a score on five measures: Neuroticism, Extraversion, Openness, Agreeableness and Conscientiousness.
Do you think bloggers are natural Extraverts? Are they highly conscientious? Which of these traits, if any, do you think reflects the personality of a blogger? And why?
Well, I'm back from the conference and subsequent Nowson European Vacation in the Swiss Alps, and it was thoroughly enjoyable. Just what one needs to recharge the batteries before the final big push on the thesis.
For me, the conference is as much about getting myself out there as it is about my work. I've not always been the most confident person, and I tend not to talk to people at events like this. This time, however, I knew I had to. And so I did.
I spent most of the two hours of my poster session discussing my work with all the people that read my poster (which I shall endeavour to make available when I'm next in the office). After the disappointment that was our departmental poster session, I was surprised by just how many people were interested in my blogging work. Turns out there were a fair few bloggers among the Cognitive Science crowd - one even took a picture of my poster to stick on their blog apparently.
What most interested people was not what the poster was actually about, although I did have some very informed discussions about formality in blogging. No, what piqued much interest was that I had gathered personality data on bloggers. Everyone had their own theory on what sort of person a blogger is, most relating to Extraversion. Actually, I might make a small survey post about that...see what people think.
Anyway, I managed to confidently talk away to lots of people, and I think it was a very successful session. I had A4 copies of my poster hanging up for people to help themselves to, and despite the paper being on a CD everyone received that morning, a good few were taken. Easier to digest than a whole paper I guess.
Of course, I didn't get to speak to everyone I would have liked. Looking through the schedule, there were a couple of people whose work I had cited in the paper, and I was particularly keen to talk to them. But they weren't there. Likewise one of my first tutors at University, who helped me a lot before he left, was second author on 3 posters. But again, nothing but a blank space, taken over by someone who blatantly didn't follow the poster size guidelines.
So, in summary, it was a good conference, both in terms of boosting my work and myself.
Monday, July 18, 2005
Contextuality of Blogs, 3 - Individual Differences
In the last post I talked about how a measure of formality/contextuality could be used to tell the difference between different genres. It does this by looking at one aspect of the language they use, namely the parts-of-speech. Here I will talk about the work I've done on looking at individual differences: looking at gender and personality.
There has been a lot of work on gender and language. Perhaps the most well-known application is the much-linked Gender Genie, based on the work of Koppel and Argamon. I used the F-measure to see if there was any difference in formality between men and women. Not only was there author gender information for our blogs and emails, but the BNC also held this information for some of its genres. You can see the resulting F-scores below.
In four of the five genres women score significantly lower than men. As we might expect, men take a more formal approach to writing, while women are far more contextual. The exception is academic writing. Here, the almost identical levels of formality suggest that when required, women can adopt a style as formal as that projected by men.
The next investigation concerns the personality of the bloggers in my study. As part of my data gathering, bloggers completed a personality questionnaire that gives scores on 5 factors: Neuroticism, Extraversion, Openness, Agreeableness and Conscientiousness. For an explanation of these traits scroll down this page.
To investigate how formality/contextuality differs by personality, we correlated each trait with the authors' F-scores. The previous work on implicitness in emails leads us to expect that high neurotics and extraverts will prefer contextual language. The results of the correlations can be seen in the next table.
The negative scores for Extraversion and Neuroticism are as we expected: higher scores on the personality scale relate to lower F-scores and more contextual language. Low neurotics and introverts write in a more formal style, but only just. The correlations are actually very low, so the results are far from conclusive.
Conscientiousness has a very low score, meaning there is essentially no correlation. Openness and Agreeableness, however, score much higher correlations, the latter being a statistically significant result. Openness has previously been considered the factor of intellect, and it was theorised that higher scores on the Openness trait would reflect higher levels of formality. Our results show some support for that theory.
More interesting is the novel Agreeableness result, for which we have our own theory. One aspect of Agreeableness is the cooperative and accommodating nature of high scoring individuals. This suggests to us that highly Agreeable bloggers are more aware that there may be little context shared between them and their readers. This results in a much less contextual (more formal) style of writing than low Agreeable bloggers.
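For anyone who wants to try the same kind of analysis, the correlation step described above can be sketched roughly as follows. This is only an illustration: it assumes a plain Pearson correlation (the post does not say which coefficient was used), the function name `pearson_r` is made up for the sketch, and the numbers in the example are invented toy values, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Invented toy values: one trait score and one F-score per (hypothetical) author
toy_trait_scores = [12, 30, 25, 41, 18]
toy_f_scores = [55.0, 49.5, 52.0, 47.0, 53.5]

# A negative r means higher trait scores go with lower (more contextual) F-scores
print(round(pearson_r(toy_trait_scores, toy_f_scores), 3))
```

In practice one value of r would be computed per trait, each against the same list of authors' F-scores, and a significance test would still be needed on top of this.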
This is by no means the end of my work, but it sums up everything that I'm taking to the conference. I've introduced you to the F-measure, shown you that it can show differences in genres, and it can highlight differences between individuals. In the future, I will return to formality/contextuality, as I have discovered a few more interesting things, but I'll leave it here for now.
If you would like more details on this work and the background to it, you can always check out the original paper. If there is anything else you would like to know, or if you have any questions or want to leave any feedback, then you can always leave a comment here or alternatively drop me a line.
Thursday, July 14, 2005
Contextuality of Blogs, 2 - Blogs as Genre
In the linguistic research community there is a lot made of genre. There are many studies that try to tell the differences between genres. Most studies that examine language, like this one, normally take their language from one or more genres. I chose blogs as my area of study, and so it is interesting to see if blogs can be considered as an individual and unique genre. It would certainly seem intuitive to say that they are at least a cross between diaries and personal homepages.
In a number of interesting papers Herring et al. have investigated, amongst other things, quite what defines a genre. They make a very convincing argument that weblogs can in fact be classed as a genre, placing them on a continuum between static HTML homepages and ever changing newsgroups.
The first analysis I conducted with the F-measure described in the last post was to compare my sample of blogs to samples taken from other genres. I chose a selection from the BNC (a collection of 100 million words of spoken and written English) that included both spoken and written texts, scientific and fiction writing.
I calculated the F-score for every file. The averages for each genre are in the table below. Remember that low scores mean the language is more contextual, while the high scores use more formal language.
As you can see, there are a number of very plausible differences and similarities: on the whole, spoken genres are more contextual than written; professional letters are more formal than personal; university essays are more formal than school essays though they are similar to academic publications. These quite understandable results suggest that our technique, the F-measure, clearly detects an aspect of differing language between genres.
So we turn to our texts, the emails and blogs. Emails are understandably similar to both personal letters and text from a mailing list. Interestingly, blogs are only as formal as school essays. They are however (significantly) more formal, or less contextual, than emails. This is understandable for a number of reasons:
- Audience: the emails were written to close friends, but blogs can be read by anybody, which means the author does not know everyone who will read it. They cannot expect the reader to know everything about them.
- Space: since the blog author may not know the reader, they know that they may not share knowledge of the same things. They may have to give more descriptions of people, places and activities, than someone writing to a friend.
This part of my study was to show that on at least one aspect of language, blogs differ from other genres, and the F-measure is a useful tool. In the next post, I will talk about how the F-measure can show difference in formality/contextuality between individual authors.
Wednesday, July 13, 2005
Contextuality of Blogs, 1 - Some Background
Weblogs are a growing amorphous entity on the internet and are coming increasingly to the attention of academics. The work I am most interested in concerns language, and the web is increasingly being considered as a resource for linguistic study. There has been previous work on computer-mediated communication (CMC) formats, but this work most closely follows a study of email.
Previous work on implicitness of language has led Heylighen and Dewaele to develop a linguistic measure of contextuality. It is felt that certain kinds of words require more context than others in order for them to be unambiguously understood.
Consider, for example, pronouns. If I were to say, "he loves it", you would not know who "he" is or what "it" is. You need more context in order to fully understand: "The President of the United States just bought a new bike. He loves it."
Using this classification, a formula was created that sums the relative frequencies of parts-of-speech and results in a single score, the F-measure of a text. A low score means that a text is very contextual, while a high scoring text is termed formal.
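As a rough illustration, the formula can be sketched in Python. It follows the form usually given for Heylighen and Dewaele's F-measure: the percentage frequencies of nouns, adjectives, prepositions and articles are added, those of pronouns, verbs, adverbs and interjections are subtracted, then 100 is added and the total is halved. The function name `f_measure`, the coarse tag labels and the hand-tagged example sentences are all assumptions made for this sketch; it is not the code used in the study, and a real analysis would take its tags from a part-of-speech tagger.

```python
from collections import Counter

# Word classes counted as formal (they need little context to be understood)
FORMAL = ("noun", "adjective", "preposition", "article")
# Word classes counted as contextual (they lean on shared context)
CONTEXTUAL = ("pronoun", "verb", "adverb", "interjection")


def f_measure(pos_tags):
    """Return the F-score (0 to 100) for a list of coarse part-of-speech labels."""
    total = len(pos_tags)
    if total == 0:
        raise ValueError("cannot score an empty text")
    counts = Counter(pos_tags)
    formal = sum(counts[tag] for tag in FORMAL)
    contextual = sum(counts[tag] for tag in CONTEXTUAL)
    # Equivalent to summing the formal percentages and subtracting the
    # contextual ones; other tags (e.g. conjunctions) only add to the total.
    return (100.0 * (formal - contextual) / total + 100.0) / 2.0


# Hand-tagged toy sentences (the tags are this sketch's own assumption)
HE_LOVES_IT = ["pronoun", "verb", "pronoun"]
PRESIDENT = [
    "article", "noun", "preposition", "article", "noun", "noun",  # The President of the United States
    "adverb", "verb", "article", "adjective", "noun",             # just bought a new bike
]

print(f_measure(HE_LOVES_IT))          # 0.0
print(round(f_measure(PRESIDENT), 1))  # 81.8
```

The two example sentences land at opposite ends of the scale: "He loves it" is as contextual as a text can get, while the sentence that spells out who bought what scores as far more formal.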
I have collected a corpus of blogs (the personal diary kind), along with data about the authors. I have used the F-measure for two investigations.
- First, I want to know if the F-measure is any good. So I took a selection of genres from the British National Corpus (BNC) and scored them alongside my blog corpus, and the previous email corpus. This let me see if the genres were plausibly placed along a contextual/formal scale.
- Secondly, I am interested in individual difference within the blog corpus. I wanted to know how personality and gender might result in different F-scores.
This is the introduction to the work I am about to present and have published. The results of my work will follow here shortly. For those interested in the full details of the work, you can find a PDF of the paper here.
Tuesday, July 12, 2005
As I may have mentioned before, I'm soon to depart for the Cognitive Science Conference in Stresa, Italy, at which I shall be presenting a poster based on the paper I wrote that will be included in the conference proceedings. It shall be a proud day, since this is one of my first proper publications from the blog project. Of course, unlike the good old days when you would get huge books full of papers, everything is now given to you on CD. Which means no excitedly flicking through to find my name in print, which is sad. It also means no carrying around 1000 pages of other people's work as I go about my travels, which is nice.
So, to my point: since my work is finally being published, I feel I can finally publish it here (I've previously discussed the problem of academics posting results on blogs, since many journals only accept work that has not previously been published). So I am going to post a link to the paper, and break it down into a few succinct, interesting and ultimately sound-bite worthy posts. Watch this space...
The title of this post, as I'm sure you are aware, is also the title of this here blog of mine. As far as I know, blogademia is a word I made up because I was doing an academic study of blogs, which few people were doing in 2003. I've also decided to call academics who are interested in blogging, blogademics, because that makes sense. You can see some in the links section of this blog and on my homepage.
So imagine my surprise when I used Google to search for Blogademia and not all the responses were mine. The cheek of it. Some people were even spelling it blogOdemia.
Wednesday, July 06, 2005
We Can Fix It
So I was looking at the Blogger website, when I came across some interesting news. It seems like the spacing problems weren't entirely my fault. A lot of people have been suffering alignment issues since picture blogging was introduced. There was a suggested solution but it made things, well not necessarily better, more like odd in a different way. After fiddling around for a bit and making things worse, I gave up and took out the changes. Everything seems to have returned to normal though, which is nice. Here's hoping it stays that way with this post.
I'm working from home again today. I had an extended weekend, working here on Monday in order to avoid the troubles. Currently we are watching hundreds (soon to be thousands) of people in rain gear flow past our windows on the way to the Final Push concert at Murrayfield Stadium.
Friday, July 01, 2005
So it's still the same: the date is at the top of the page, while the text has dropped. I don't know quite enough about CSS to see obviously what has gone wrong. I might just need to fiddle a little. I also want to work it so I can more easily lay out a post. Like have a POSTTITLE class or something. Any suggestions would be most appreciated.
Spot the difference
So, my blog here looks different. This is because, in preparation for the conference and with great help from my wife, I have redesigned my webpages. My old ones looked dated, and they rambled a bit too much. The new design is crisper, and the text is more succinct. As I had always meant to, I decided to align my template here to be the same. I've barely touched my template before, it's pretty old school, so this was a little fiddly. It was especially so working on a dial-up connection, but I managed to get it working. Pretty much. What do you think? You like? The text has moved though, so I need to see where this post appears...