Wednesday, June 23, 2004

Another Break?!

I know, if I'm not just being lax and letting my blog slide, I'm actually taking time off. And you know I don't normally talk about personal stuff, but this is important.

In less than a week's time I'm getting married.

So from tomorrow I'll be away sorting out the final things and looking after relatives, and afterwards we are hoping to get away for a few days.

I know I've not posted much, but work has been going well these past months, and when I return I very much intend to keep that up.

Be seeing you.

Thursday, June 17, 2004

Once a Programmer...

So with the help of Java tutorials, and Java API things (you can tell I'm very expert) I'm happily coding away.

Yes, I thought I'd finished too, but it turns out the error/correction rates aren't great with the automatic system. It introduces too many new errors, and the higher you set the confidence threshold (how confident you want the spellchecker to be in its suggestion), the more correct corrections you lose.
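To make that trade-off concrete, here is a minimal sketch of threshold-based correction. It is not the actual dictionary software: the `Suggestion` class, the `correct` method and the confidence values are all hypothetical, made up purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ThresholdCorrector {

    /** Hypothetical top suggestion for a flagged word, with a confidence in [0, 1]. */
    static final class Suggestion {
        final String word;
        final double confidence;

        Suggestion(String word, double confidence) {
            this.word = word;
            this.confidence = confidence;
        }
    }

    /**
     * Applies the suggestion only if one exists and its confidence reaches the
     * threshold; otherwise the token is returned unchanged, to be checked by hand.
     */
    static String correct(String token, Map<String, Suggestion> suggestions, double threshold) {
        Suggestion s = suggestions.get(token);
        if (s != null && s.confidence >= threshold) {
            return s.word;
        }
        return token;
    }

    public static void main(String[] args) {
        // Invented suggestions and confidences (assumptions, not real output).
        Map<String, Suggestion> suggestions = new HashMap<>();
        suggestions.put("teh", new Suggestion("the", 0.95));
        suggestions.put("recieve", new Suggestion("receive", 0.70));
        suggestions.put("blog", new Suggestion("bog", 0.40)); // "blog" is fine, just not in the dictionary

        for (double threshold : new double[] {0.8, 0.3}) {
            System.out.println("threshold " + threshold + ":");
            for (String token : new String[] {"teh", "recieve", "blog"}) {
                System.out.println("  " + token + " -> " + correct(token, suggestions, threshold));
            }
        }
    }
}
```

With the invented numbers above, a threshold of 0.8 fixes "teh" but leaves "recieve" uncorrected (a correct correction lost), while a threshold of 0.3 fixes both and also turns "blog" into "bog" (a new error introduced).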

However, all is not lost. The system does a good job of some automatic corrections, such as capitalisations, and actually leaves very few spelling errors. So we decided that I should run the automatic part, and then spell check the rest (remember, fewer errors than we were expecting) by hand, if you will.

So I am currently using the javax.swing libraries to write some interface code to...interface...with the dictionary software. That's right, I'm writing my own spellchecker, like so many that already exist.

The advantages are I can get it to do exactly what I want, which should include maintaining different dictionaries of different kinds of words. And it will give me experience of Java interface coding. Cool.

Disadvantages...still learning the Java; it looks rubbish; and it might not work.

I'm just about on top of the interface side, the actual look and working of the buttons. But then I have to connect it to the dictionary code, and I have NO idea how to do that.

Wednesday, June 09, 2004

Best Practice

Here's a thought...when writing something on a computer, like a letter, an essay or, I don't know, A BLOG, it's a good idea to use a spell checker.

Seriously people. I'm currently having to spell check my corpus, and it's a lot of words. So rather than spend approximately 3 weeks running all my text through something like Word, where I have to sit and go, change this, change that, I was looking for an automated tool.

Not very many of these, it turns out. However, I have access to a set of Java commands and a big dictionary, so no problem to knock up my own automated Java-based spell checker.

Except I don't actually know Java. Still, I spent a week fiddling, and I'm there. Good to know the programmer in me is still alive and well. Anyway, the downside with an automated system is that often the top answer is not the right answer. So it's a trade-off between accuracy and time saved. At the moment I am running my code on a random 10% sample of my texts, and measuring performance.

I'm looking at things like:
  • how many reported errors are actually errors
  • how many errors are actually correctly corrected
  • how many non-errors are incorrectly corrected.
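As a rough sketch of how those three counts could be tallied, assuming each token in the sample has been hand-checked so the right form is known: the `Token` and `Counts` classes and the example words below are hypothetical, not my actual code or data.

```java
import java.util.Arrays;
import java.util.List;

public class SpellEval {

    /** One hand-checked token: what was written, what it should be, and what the tool did. */
    static final class Token {
        final String original; // text as written
        final String gold;     // correct form, decided by hand
        final boolean flagged; // did the checker report it as an error?
        final String output;   // text after automatic correction

        Token(String original, String gold, boolean flagged, String output) {
            this.original = original;
            this.gold = gold;
            this.flagged = flagged;
            this.output = output;
        }
    }

    /** Raw counts behind the three measures. */
    static final class Counts {
        int reported;           // tokens the checker flagged
        int reportedTrueErrors; // flagged tokens that really were errors
        int errors;             // tokens that really were errors
        int correctlyCorrected; // real errors whose output matches the gold form
        int nonErrors;          // tokens that were fine as written
        int wronglyChanged;     // fine tokens that the tool altered anyway
    }

    static Counts evaluate(List<Token> tokens) {
        Counts c = new Counts();
        for (Token t : tokens) {
            boolean isError = !t.original.equals(t.gold);
            if (t.flagged) {
                c.reported++;
                if (isError) c.reportedTrueErrors++;
            }
            if (isError) {
                c.errors++;
                if (t.output.equals(t.gold)) c.correctlyCorrected++;
            } else {
                c.nonErrors++;
                if (!t.output.equals(t.original)) c.wronglyChanged++;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        // Invented sample, for illustration only.
        List<Token> sample = Arrays.asList(
            new Token("teh", "the", true, "the"),
            new Token("recieve", "receive", true, "relieve"),
            new Token("Blogger", "Blogger", true, "Logger"),
            new Token("blog", "blog", false, "blog"));

        Counts c = evaluate(sample);
        System.out.println("reported errors that are real errors: " + c.reportedTrueErrors + "/" + c.reported);
        System.out.println("errors correctly corrected: " + c.correctlyCorrected + "/" + c.errors);
        System.out.println("non-errors incorrectly corrected: " + c.wronglyChanged + "/" + c.nonErrors);
    }
}
```

Keeping the raw counts rather than percentages makes it easy to compare runs at different confidence thresholds on the same sample.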

Of course, if everyone would just check the spelling of everything they write, it would be a great help. I admit I didn't use to when I blogged, but I have done for some time now, and not only does it make it easier to read, but also much easier to analyse :D

Of course there is still informal language such as slang and reeeeeaaallly interesting emphasis terms...but that's another story.

Tuesday, June 01, 2004

Begin Again?

So I've been busy working away, not updating my blog, as seems to have become the norm. My work has been covering various threads:

  • continuing with statistics which leads to...
  • beginning actual analysis.
  • reading about the history of personality models.
  • reading about blogs.

It's this last point that, perhaps most obviously, I wish to discuss here.

So a colleague gave me a thesis on using weblogs in collaborative education work, and it references a paper I got ages ago, but never read. So I thought I'd better read it.

It's called Blogging Thoughts: Personal Publication as an Online Research Tool. It was written by two blogging academics, Torill Mortensen and Jill Walker. It's very interesting.

It dates from 2002, and it's really, as far as I can tell, one of the first pieces of academic work to embrace blogging. The field is growing, with many people using blogs as tools in study, or looking at the social impact of blog networks.

There are lots of points from the paper I want to talk about, but mainly it's the idea of the blog as a research tool. I really don't take my blog seriously enough. Mainly because I'm just not a diary writer, but I'm going to try. I feel inspired to keep up with blogging, but also to make more of it.

Really I just seem to detail the things that I do, and really they are quite mundane (Mortensen and Walker discuss academic publishing online, something I should like to come back to). So I'd like to use this blog to do more. This post is the start. I read an interesting article about blogs, and I'm sharing it. I'm going to look more into the world of academic blogging. I'm going to investigate the possible relevance to me of BlogTalk 2.0, a conference in Vienna for "bloggers, developers, researchers and others who share, enjoy and analyse the benefits of blogging."

Maybe I'll change my template too, so it looks more academic, and less blog beginner...I have been at this for a year now.
