Wednesday, June 09, 2004
Here's a thought...when writing something on a computer, like a letter, an essay or, I don't know, A BLOG, it's a good idea to use a spell checker.
Seriously, people. I'm currently having to spell check my corpus, and it's a lot of words. So rather than spend approximately 3 weeks running all my text through something like Word, where I have to sit and go, change this, change that, I was looking for an automated tool.
There aren't very many of these, it turns out. However, I have access to a set of Java commands and a big dictionary, so it's no problem to knock up my own automated Java-based spell checker.
Except I don't actually know Java. Still, I spent a week fiddling, and I'm there. Good to know the programmer in me is still alive and well. Anyway, the downside with an automated system is that often the top answer is not the right answer. So it's a trade-off between accuracy and time saved. At the moment I am running my code on a random 10% sample of my texts, and measuring performance.
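To show what "the top answer is not the right answer" looks like in practice, here is a minimal sketch of an automated corrector that just takes the closest dictionary word. This is not the actual code I'm running: the class name `TopSuggestionSketch`, the methods `editDistance` and `correct`, and the use of plain Levenshtein distance as the ranking are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the checker described in the post.
class TopSuggestionSketch {

    // Plain Levenshtein distance: the number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the word unchanged if it is in the dictionary; otherwise
    // returns the closest dictionary word (the first one wins on a tie).
    // The dictionary is assumed to be lower case.
    static String correct(String word, List<String> dictionary) {
        String lower = word.toLowerCase();
        if (dictionary.contains(lower)) {
            return word;
        }
        String best = word;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : dictionary) {
            int d = editDistance(lower, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("the", "ten");
        System.out.println(correct("teh", dictionary));
    }
}
```

With this toy two-word dictionary, "teh" comes back as "ten", because "ten" is one substitution away while "the" is two. Nobody reading the text would pick that, which is exactly the accuracy you give up in exchange for not sitting there clicking through every word.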
I'm looking at things like:
- how many reported errors are actually errors
- how many errors are actually correctly corrected
- how many non-errors are incorrectly corrected.
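The three measures above could be tallied roughly like this, assuming every word in the sample has been hand-labelled. The `Judgement` and `Result` classes and their field names are made up for the sketch, and it treats any change to a correctly spelled word as an incorrect correction.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the evaluation; all names here are illustrative.
class SpellCheckEvalSketch {

    // One hand-labelled word from the sample.
    static final class Judgement {
        final boolean flagged;        // the checker reported this word as an error
        final boolean actualError;    // a human agrees it really is an error
        final boolean fixedCorrectly; // the checker's top answer was the intended word

        Judgement(boolean flagged, boolean actualError, boolean fixedCorrectly) {
            this.flagged = flagged;
            this.actualError = actualError;
            this.fixedCorrectly = fixedCorrectly;
        }
    }

    static final class Result {
        int reported;          // words the checker flagged
        int reportedAndReal;   // flagged words that really were errors
        int correctlyFixed;    // real errors corrected to the right word
        int nonErrorsChanged;  // correct words the checker changed anyway

        // Share of reported errors that are actually errors.
        double reportPrecision() {
            return reported == 0 ? 0.0 : (double) reportedAndReal / reported;
        }

        // Share of real, reported errors that were corrected properly.
        double correctionRate() {
            return reportedAndReal == 0 ? 0.0 : (double) correctlyFixed / reportedAndReal;
        }
    }

    static Result evaluate(List<Judgement> sample) {
        Result r = new Result();
        for (Judgement j : sample) {
            if (!j.flagged) {
                continue; // untouched words do not count towards any of the three measures
            }
            r.reported++;
            if (j.actualError) {
                r.reportedAndReal++;
                if (j.fixedCorrectly) {
                    r.correctlyFixed++;
                }
            } else {
                r.nonErrorsChanged++;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        List<Judgement> sample = Arrays.asList(
            new Judgement(true, true, true),
            new Judgement(true, true, false),
            new Judgement(true, false, false),
            new Judgement(false, false, false));
        Result r = evaluate(sample);
        System.out.println(r.reportPrecision() + " " + r.correctionRate() + " " + r.nonErrorsChanged);
    }
}
```

Note that this only looks at words the checker flagged, so it says nothing about real errors the checker missed altogether. That would need a separate count.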
Of course, if everyone would just check the spelling of everything they write, it would be a great help. I admit I didn't use to when I blogged, but I have done for some time now, and not only does it make it easier to read, but also much easier to analyse :D
Of course there is still informal language such as slang and reeeeeaaallly interesting emphasis terms...but that's another story.