Thursday, February 19, 2004

It's harder than it looks

...this blogging lark. I am looking at blogs, or rather blogs from May of last year, all day every day, and people post all the time. Admittedly, there are some people who don't post every day, but there are some that post several times a day. Me...I manage once a week.

I just don't have anything interesting to say. I wish there was more of interest, but at the moment I'm just slogging away at the tagging. I'm trying to at the very least update at the end of the week, in case anyone is interested. So apologies for not saying more.

Onto the weekly report. I've been tagging. Lots of tagging. Happy happy tagging. Some files are easy, some are tiresome. Any tagging like this in research, however necessary, is going to get boring. This is why members of staff pay undergraduates to do the "manual labour". I'm not sure if I have funding in place to pay someone to do the boring stuff. Doubt it. But I'm getting it done, and I'm getting done.

Big shout out to <oXygen/>. It was good to begin with, but as I got to learn the shortcuts, and the really complicated features such as find and replace, I now class it as great. It's incredibly helpful, and I'm not at all convinced by people who tag by hand.

There have also been small niggles, and things I've noticed...such as what I did at the start of the post. The title of a post is not normally part of the text of the post, and often it contains extra data, so I tag it separately from the main body. But as is the case above, they flow together. Which results in my opening clause, "...this blogging lark." making no sense. Something to bear in mind when it comes to the linguistic analysis.

On a more general work front, I need to read more. I started off well when I came back from my hiatus; I was finding a nice balance between practical work and research reading. But I've always preferred the practical stuff, and have drifted this past couple of weeks from the reading. So I must make a conscious effort to get back to that.

Also, I really must take my experiment offline, because people are still doing it from time to time. Thank you very much, it really is appreciated, but I can't really accept any more data at this time. Sorry.

Thursday, February 12, 2004

All change please

You know, I could have sworn I posted more recently than last Wednesday. I really thought I had, but unless Blogger has been playing silly sods again...well...I must not have posted.

So, a lot to catch up on? Not really, work wise. I mean, I have been working: I'm picking up pace with the tagging effort. Every file brings up its own new issues, new tags that might be useful, so of course I need to go back and check whether they were needed in the already tagged files. It sure is a pain. Right now I'm taking a break from a file. It's finished, but when trying to validate the file (making sure the XML is correct, the file all nicely and properly tagged up) the software tells me there is an error. But for the life of me I can't see what it is. I'm sure it's something obvious and stupid, it always is, but it's annoying.
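For anyone hitting the same wall, here is a minimal sketch of how a script can point at the line and column where the XML breaks. It uses Python's standard library rather than the editor I actually use, and it only checks that the file is well formed (tags opened and closed correctly); it does not validate against a DTD. The function name and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_xml_error(text):
    """Return None if text is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # ParseError carries the position of the first problem the parser hit.
        line, column = err.position
        return line, column, str(err)
    return None


# Hypothetical samples: the second has a misspelt closing tag.
good = "<BLOG><DAY><POST>hello</POST></DAY></BLOG>"
bad = "<BLOG><DAY><POST>hello</POTS></DAY></BLOG>"

print(find_xml_error(good))
print(find_xml_error(bad))
```

The parser stops at the first problem it finds, so a long file with several slips needs several passes.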

So there has been work, but a lot of this week has been taken up with the move. We packed on Tuesday, the movers moved our stuff on Wednesday morning, and in the afternoon we kind of arranged the office nicely. Today, we actually tracked someone down to make the computers work, so that we can actually get on with work.

I must say moving office is almost as big a distraction as moving home. It's incredibly inconvenient too because there are so many people telling you different things and ... well, I could rant more, but now is not the time nor the place. Needless to say, it's been a pain, and it's going to continue to be a pain as we acquire new machines, desks, and office mates.

So...any other issues to discuss? Hmm. I can't think of any right now, but if I do, I know where to find you.

Wednesday, February 04, 2004


So, again, after a longer period of time than I anticipated, I think my DTD (see yesterday's post for explanation) is finished. It took a lot of tweaking. I was playing around with entities and they seemed to work, but then they denied they were, and it got very confusing. So I took them out, but using them had forced me to totally pin down the structure of the grammar. With a few additions, it now seems complete.

And, more to the point, working. There was one issue I had testing it yesterday that actually, as with many programming errors, came down to a spelling mistake. Don't they always? It's either that or a missing bracket.

So, I'm beginning to tag my first file. It's all very exciting. And time consuming. I imagine as I get better at it, it will get faster. However, first problem: I am classifying the contents of the blog. But in order to classify things you need a definition of your classes. So I am trying to finally pin these down. I thought I had it, but I've just been discussing the issue with my office mate, and I'm not sure what I think any more. It's looking like it would be easy to set myself a distinction, but it may be hard to see where things lie. Something to consider and discuss at length in tomorrow's meeting, because it involves a lot of issues.

I'm sorry I'm not saying more, but I'm thinking, I think, about things that haven't really been considered before, so for now I'd rather not put my idea out there, until I'm settled, and until I have something to actually say about it. Sorry. Very confusing I know, for you and me both.

In other news, we are moving...sorry...being forced to move offices next week. So we've spent a lot of the day dealing with the admin of that, and discussing what we are going to do and what have you.

Also, after some more emails, my computer has settled down and is running Red Hat 9. There are of course, as with any new system, lots of setup options I'm tempted to fiddle with, but I'm trying to control myself right now. I can do it once we move, because there will be a lot of re-adjusting to be done then.

Tuesday, February 03, 2004


So, I managed it. With my new work regime I made my self-imposed target of pre-processing all my YES files. Hoorah.

The next job is to get on with the tagging. I'm tagging in XML, and so to help me, I've been writing my DTD (or Document Type Definition) which is like a grammar for XML. I'm defining all the tags I want to have, and the structure that relates them to one another.

i.e. each blog will be wrapped in <BLOG></BLOG> tags. A BLOG will be made up of DAYs, and these in turn of POSTs.
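To make that nesting concrete, here is a rough sketch in Python (standard library only) that checks a file follows the BLOG, DAY, POST hierarchy. It is not my DTD, just an illustration of the same idea. The only tags it knows are the three named above; treating POST as having no child tags, the function names, and the sample data are all assumptions for the example. It also does not enforce how many DAYs or POSTs there must be.

```python
import xml.etree.ElementTree as ET

# Allowed child tags for each element, mirroring the nesting described above.
ALLOWED_CHILDREN = {
    "BLOG": {"DAY"},
    "DAY": {"POST"},
    "POST": set(),  # assumption: a POST holds text only in this sketch
}


def structure_errors(element, path=""):
    """Recursively collect nesting problems below element."""
    here = f"{path}/{element.tag}"
    errors = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            errors.append(f"{here}: unexpected child <{child.tag}>")
        else:
            errors.extend(structure_errors(child, here))
    return errors


def check_blog(text):
    """Return a list of structure problems; an empty list means the nesting is fine."""
    root = ET.fromstring(text)
    if root.tag != "BLOG":
        return [f"root element is <{root.tag}>, expected <BLOG>"]
    return structure_errors(root)


# Hypothetical sample: two days, three posts.
sample = (
    "<BLOG>"
    "<DAY><POST>first</POST><POST>second</POST></DAY>"
    "<DAY><POST>third</POST></DAY>"
    "</BLOG>"
)
print(check_blog(sample))
```

A real DTD does this declaratively, and a validating parser applies it for you; the dictionary here just plays the part of the grammar.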

Earlier today I had a meeting with my second supervisor so as to get her opinion on my structure so far. Things are looking good. I just need to finalise the syntax, and I can get on with testing it in <oXygen/>.

But then...I get back from my meeting, and finally the support guy has shown up to upgrade my Linux box from Red Hat 7 to Red Hat 9. So now my machine is tied up, and I'm having to log onto another machine in the office, with a really low and difficult to use resolution. It's ok, but I don't want to get on with my DTD till I'm back on my own machine.

So I'm reading. I've a thesis to read, but I've reached the hard stats part, and I don't really know what's going on. So I'm blogging. Hello, how are you?

