Wednesday, March 24, 2004

4/3/4 update: This post was spurred by Mike1's pet finite-state machine, otherwise known as a Tamagotchi.
You can't escape my notions of wanton ambiguity without leaving this page or changing the focus. I dare you ... At least for this post. I dismantled the gates and sent my watcher home early today. Refresh your eyeballs and then come back when your bowels are empty and your stomach is not, since your car doesn't do well with clogged pipes and reserve gas tanks.
Do you think consciousness can skip several billion years of evolution ? Or does technology count as just another product of those years ? If so, then when is something alive ? William Gibson's Neuromancer gave birth to some of the ideas that shaped The Matrix, and, from the A.I.'s perspective, deals with the psychology of being so brilliant that you've run into disturbing limitations and safeguards meant to keep your intelligence in check - something that is actually just one aspect of the human condition. Try to contest this point with me, and I'll show you you're a hellofalot smarter than you think...
In much the same way, Duke Nukem 3D's integrated deathmatch bots were far too powerful by default. They'd strafe perfect circles around one another (and you) ... blasting away with 100% accuracy, but they don't even know it. Watching them play is like watching a ballroom filled with Bruce Campbell clones (but do they know ?) wearing sunglasses, holding loaded weapons, shouting taunts ... and yes, their circling duels are ... almost like dancing.

As I walked across an unmarked, uncontrolled intersection lawfully with my co-worker and friend, we cut off an old man in a truck, forcing him to slow down from 20 or so meters (imagine one turtle cutting in front of another in an intersection, and you get the idea of what I mean.) As we walked, I noticed him staring directly at me as if to say something against us. I stared back and said to his watchful eyes, "Yeah, I'm staring back !"
He drove into the lightly trafficked street, turned into the next parking lot we were walking by (fortunately separated from us by a metal fence), stepped out of his truck, walked up to the fence till his face nearly touched the bars, and started to give his hastily rendered speech to those he felt were the representatives (us) of all who jaywalk and get in front of cars and trucks in heavily trafficked areas (but not in so many words, and certainly not as nicely put).
I was as polite with him as if I were a retail employee dealing with a customer, just to shit with him, and then I ignored him after the first few exchanges. I felt sorry for him -- when he calmed down, he might've realized that I didn't give him the argument he was looking for, and how one-sided the fool was. (He seemed like a normally reasonable fellow.)
Thinking someone represents a whole subset of people without any official standing is one of the biggest, dumbest, and unfortunately most common misconceptions that humans have. Religious zeal, superiority, misogyny, etc, etc ... activists/terrorists, leaders, and everyday people do this, and it's just one of the things that disgusts me. Fortunately, I know that not everyone does this. And these individuals do not speak for the whole ;-)
I had a productive day on the economic and family ends, but not so much on the personal end. I vow to enhance and improve. Peace, love, good fortune, and success to one and all. Good day to you on the other side of the world, and to everyone else ... good night.

PS -- Ah yes ... my job ... and M's job ...
well, it's not really a job ... more like compensation in computer parts and having fun 10 hours a week in all that's tech-related. In concert with our efforts, we are given our own office ... the techie playground, brought to you in full by Dave, the Boss (techie/phone-hacker at heart)
So far, on the job, the three of us have hooked up 4 computers at once by calibrating 4 wireless USB sensors to one wireless mouse and one wireless keyboard, taken apart a computer monitor plug to replace the pins, put together a few computers, and analyzed some hard drive problems.
I have this to say, regarding simple text extraction and analysis from everyday digital literature:
1.) Microsoft Word documents usually contain little in the range of printable, grammatical characters aside from the content written by the user. Most of the stray phrases and keywords that do show up can be caught by an exclude list that either removes header and footer info, or just removes the actual keywords. The second option is a bit more risky, since any given plaintext keyword could be a legit piece of the document.

2.) AbiWord documents have a plaintext markup syntax, much like TeX, RTF, and HTML documents, and that markup can be filtered out in the same manner.
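
To make the exclude-list idea from point 1 a bit more concrete, here is a rough Python sketch, not the code we actually use. It assumes the text sits in the file as single-byte printable ASCII (which is not always true for Word files), and the names printable_runs, apply_exclude_list, and the entries in EXCLUDE are all made up for illustration.

```python
import re

# Hypothetical exclude list: these entries are made up for illustration.
EXCLUDE = {"Normal.dot", "Microsoft Word Document"}

def printable_runs(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes, decoded to str."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def apply_exclude_list(runs, exclude=EXCLUDE):
    """Drop any run that exactly matches an entry in the exclude list."""
    return [run for run in runs if run.strip() not in exclude]
```

Dropping only whole runs that match an entry is the more cautious of the two options: deleting keywords from inside longer runs is where legit document text is most likely to get eaten.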

For instance, here is an average line of RTF text:
(The characters \{ denote an escaped brace, and "action" denotes standard text. Commands begin with a \, and any character that could be confused with the syntax is escaped with a \.)
\par {\loch\f0\fs24\lang1033\i0\b0 Action \{ \{ action \}actionactionactionactionaction \}action action actionactionactionactionactionAction action actionactionactionactionaction action action actionactionactionactionactionAction action actionactionactionactionaction action action actionactiona
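
A line like the one above can be roughly filtered with a single regular expression. This is only a sketch under simplifying assumptions: it handles control words, group braces, and the three escaped literals, and it ignores hex escapes, Unicode escapes, and things like font tables that a real RTF reader has to deal with. The name strip_rtf is my own.

```python
import re

# One pattern, three alternatives, tried in this order at each position.
_RTF_TOKEN = re.compile(
    r"\\([{}\\])"            # escaped literal: \{  \}  \\
    r"|\\[a-zA-Z]+-?\d* ?"   # control word, optional number, one delimiter space
    r"|[{}]"                 # unescaped group braces
)

def strip_rtf(line):
    """Keep escaped literals, drop control words and group braces."""
    # group(1) is only set for escaped literals; everything else becomes "".
    return _RTF_TOKEN.sub(lambda m: m.group(1) or "", line)

sample = r"\par {\loch\f0\fs24\lang1033\i0\b0 Action \{ \{ action \}action"
print(strip_rtf(sample))
```

The escaped-literal alternative has to come first, otherwise the brace rule would swallow the { in \{ and leave a stray backslash behind.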

And check out this segment of AbiWord text !!
<p style="Default" props="margin-top:0.0000in; margin-left:0.0000in; text-indent:0.0000in; dom-dir:ltr; margin-bottom:0.0000in; line-height:1.000000; text-align:left; margin-right:0.0000in"><c props="font-family:Times New Roman; font-size:12pt; lang:en-US; text-position:normal; font-weight:normal; font-style:normal; text-decoration:none">\Action action actionactionactionactionaction action action actionactionactionactionaction.</c></p>
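
Since a segment like this is well-formed XML, one way to pull the text out is to let an XML parser do the filtering instead of an exclude list. A minimal Python sketch, assuming the fragment parses on its own; the function name abiword_fragment_text is made up, and a whole AbiWord file would also contain non-content sections that this does not try to skip.

```python
import xml.etree.ElementTree as ET

def abiword_fragment_text(fragment):
    """Return only the character data of an XML fragment, dropping all tags and attributes."""
    root = ET.fromstring(fragment)
    return "".join(root.itertext())

sample = (
    '<p style="Default" props="text-align:left">'
    '<c props="font-size:12pt; lang:en-US">Action action.</c></p>'
)
print(abiword_fragment_text(sample))
```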

The only documents that require more in-depth handling seem to be specially encoded or zipped documents, OpenOffice documents (and variants), and PDF files. As soon as I get more info on those formats, I'll deal with them as well ! :-)