• Student: Anthony L. Borchers, now at Lexmark International
  • Purpose: Apply data extraction techniques to World Wide Web documents
  • Method: A general-purpose WWW robot engine with an extension mechanism for attaching data-extraction procedures at runtime. These procedures may then implement domain- or format-specific functionality.
  • What the student learned
    1. Managing a large software project over a long period of time
    2. Multithreaded programming in Java, including synchronization methods (this part took considerable cleverness)
    3. Details of the HyperText Transfer Protocol (HTTP) spoken by Web servers and clients
    4. Technical writing skills in preparing the write-up and packaging the resulting tools
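
The extension mechanism named under Method could look roughly like the Java sketch below. This is an illustration only, not the project's actual code: the names Extractor, TitleExtractor, RobotEngine, attach, and process are all hypothetical, and page fetching is left out so that the example stays self-contained. It shows a procedure being attached to a running engine, either as an object or by class name, and then applied to a fetched document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical extension point: one data-extraction procedure.
interface Extractor {
    // Returns the records found in one fetched document.
    List<String> extract(String url, String body);
}

// Hypothetical format-specific procedure: pulls the HTML <title> text.
class TitleExtractor implements Extractor {
    private static final Pattern TITLE = Pattern.compile(
            "<title>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    @Override
    public List<String> extract(String url, String body) {
        List<String> out = new ArrayList<>();
        Matcher m = TITLE.matcher(body);
        if (m.find()) {
            out.add("title: " + m.group(1).trim());
        }
        return out;
    }
}

// Hypothetical engine core: holds the attached procedures and applies them.
class RobotEngine {
    // CopyOnWriteArrayList lets worker threads iterate safely while
    // another thread attaches a new extractor.
    private final List<Extractor> extractors = new CopyOnWriteArrayList<>();

    // Attach an already constructed procedure.
    void attach(Extractor extractor) {
        extractors.add(extractor);
    }

    // Attach a procedure by class name, loading it at runtime.
    void attach(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        attach((Extractor) cls.getDeclaredConstructor().newInstance());
    }

    // Run every attached procedure over one fetched document.
    List<String> process(String url, String body) {
        List<String> records = new ArrayList<>();
        for (Extractor extractor : extractors) {
            records.addAll(extractor.extract(url, body));
        }
        return records;
    }
}

class Demo {
    public static void main(String[] args) throws Exception {
        RobotEngine engine = new RobotEngine();
        engine.attach(TitleExtractor.class.getName()); // loaded by name
        String page = "<html><title>Hi</title></html>";
        System.out.println(engine.process("http://example.org/", page));
    }
}
```

Keeping the engine ignorant of any particular domain or format is the point of the design: the engine only knows the Extractor interface, so new procedures can be written and attached without changing or restarting the robot itself.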