
My Computer Is Smarter Than Your Computer

Jeff Maher on Friday, 24 February 2006. Posted in Coursework

Artificial Intelligence (AI) can loosely be defined as the study of how to get computers to excel at tasks that humans currently perform better. This includes everything from playing chess to predicting market trends, from whipping your sorry butt in the latest video game to driving a car, from determining the best color for your wall to predicting mechanical problems with an elevator. We rely on computers for so many things, but we're still smarter: we come up with good heuristics faster and make efficient decisions more easily. In the same way that parents watch their child learn and do new things, I get a certain sense of pride when I see my computer do something "intelligent" (it's my baby :-) ). That's why I've enjoyed my recent AI project so much.

The project was based on the concept of creating a program that accurately guesses the author of a passage or work. In order to do this, the program has to be able to train itself on works by several known authors. Assuming that you've trained the program on a good variety of passages for each author, you can give the program a text file (without telling it the author) and see if it guesses the correct author. While this may sound like the computer simply has to read and remember, it doesn't, since it should be able to make guesses about passages it hasn't seen. It does this by collecting statistics related to each author's writing style (e.g. use of commas, average syllables per word, typical paragraph length, etc.) and calculating a probability for each author. In the end, the author with the highest probability is the one that is guessed. For example, if I train the program on Huck Finn (Twain) and Pride and Prejudice (Austen), it should be able to figure out that Twain wrote Tom Sawyer, having never seen it before.
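To make the train-then-guess idea concrete, here is a minimal Java sketch. It is not the project's actual code: the class name StyleGuesser, the three features (commas per word, average word length, words per sentence), and the choice to score each author with independent bell curves per feature (a Gaussian naive Bayes style score) are all assumptions made for illustration.

```java
import java.util.*;

// Hypothetical sketch, not the original project code.
class StyleGuesser {
    // One feature row per training passage, grouped by author.
    private final Map<String, List<double[]>> samples = new LinkedHashMap<>();

    // Assumed style features: commas per word, average word length, words per sentence.
    static double[] features(String text) {
        String[] words = text.trim().split("\\s+");
        long commas = text.chars().filter(c -> c == ',').count();
        long ends = text.chars().filter(c -> c == '.' || c == '!' || c == '?').count();
        long sentences = Math.max(1, ends);
        double letters = 0;
        for (String w : words) letters += w.replaceAll("[^A-Za-z]", "").length();
        return new double[] {
            (double) commas / words.length,
            letters / words.length,
            (double) words.length / sentences
        };
    }

    void train(String author, String passage) {
        samples.computeIfAbsent(author, k -> new ArrayList<>()).add(features(passage));
    }

    // Log-likelihood of x, treating each feature as an independent Gaussian
    // fitted to the author's training passages.
    double score(String author, double[] x) {
        List<double[]> rows = samples.get(author);
        double logLik = 0;
        for (int j = 0; j < x.length; j++) {
            double mean = 0;
            for (double[] r : rows) mean += r[j];
            mean /= rows.size();
            double var = 0;
            for (double[] r : rows) var += (r[j] - mean) * (r[j] - mean);
            var = var / rows.size() + 0.05; // assumed floor, avoids zero variance
            double d = x[j] - mean;
            logLik += -0.5 * Math.log(2 * Math.PI * var) - d * d / (2 * var);
        }
        return logLik;
    }

    // The author with the highest score is the guess.
    String guess(String passage) {
        double[] x = features(passage);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String author : samples.keySet()) {
            double s = score(author, x);
            if (s > bestScore) { bestScore = s; best = author; }
        }
        return best;
    }

    public static void main(String[] args) {
        StyleGuesser g = new StyleGuesser();
        g.train("Terse", "I ran. He ran. We ran home.");
        g.train("Wordy", "However, notwithstanding considerable reservations, the gentleman, exceedingly courteous, consented.");
        System.out.println(g.guess("She sat. He fell. We ate."));
    }
}
```

Nothing in the sketch is tied to two authors: train can be called with any number of author names, and guess simply picks whichever one scores highest. A real version would train on whole books and use far more features than these three.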

As the above example suggests, my program only had to be tested on Mark Twain and Jane Austen, although graduate students had to get their programs to work with a larger variety of writers. I'm happy to report that my program has a 96% success rate - correctly guessing every work by Jane Austen and only having trouble with Twain's A Tramp Abroad. Yay - go me! While the program was only tested with Twain and Austen, I designed most of my Java methods so that the program can be adapted to work with any sample of authors. So - should I have time over break, I might play around with it to see if I can get it to work with any number of random authors. I'm such a geek...


If anyone is interested in the project's write-up, my professor will likely have the requirements posted here for a few more weeks.