Projects:
- PlanetMath — A math-centric knowledge community. Features a collaborative mathematics "encyclopedia" and rich forum facilities. This project is currently unfunded but very active, and support is being sought.
- Noosphere — The underlying software of PlanetMath. Also see the planning wiki for PlanetMath and related projects (this is the place to go if you're curious or want to help out).
- Quality Metrics — Challenging implicit assumptions about search engines in a digital library context by performing user studies, and building a generalizable DL search component based on what we learn. Funded by IMLS. Conducted in cooperation with Virginia Tech.
- MetaCombine — A Mellon-funded digital library project. The goal is to more meaningfully combine digital library resources and services. This includes classification, semantic clustering, visualization, and federated services.
- OCKHAM — An NSF-funded project to build a federated framework for libraries and digital libraries. The impetus is to get the NSDL's resources into traditional libraries, but OCKHAM should also be useful for DL-to-DL federation.
- CITIDEL — A computer science education digital library. Aiming to be huge. (This project supported me through my master's.)
- ircquotes.org — A fun site for user-submitted memorable quotes from internet chat (such as IRC). Meant to illustrate "intelligent" online community design, in terms of personalization and organization. Logan Hanks did most of the implementation.
Software:
- The MetaCombine Software — A clustering system, a visualization system, a scheme editor, a focused crawling system (FCS), semantic clustering (OCKHAM) web services, an OAI repository insta-copier (OAICopy), and much more. All outgrowths of the MetaCombine project (and to some extent, OCKHAM).
- Noosphere — The software behind PlanetMath. A collaborative encyclopedic digital library framework, supporting LaTeX.
- Vinstall — A collection of scripts that gives the file-level illusion of a Linux system within a Linux system, allowing projects to be "sandboxed" from each other. This eliminates library conflicts and package-system dependency conflicts, and yields other benefits.
- ESSEX — Efficient Scalable Search Engine for XML. Developed to be a light, featureful search engine for CITIDEL. Coded in C++, with some use of STL. Follow the link to read more about its capabilities. Note: This is in working condition currently, but check the README in the archive for caveats. This should be considered "alpha" software; there is a TODO list in the README that we could use help with.
- LaTeX::TOM — Where TOM = "TeX Object Model." This is a native Perl LaTeX document parser and handler, inspired by XML::DOM. Created for both the CITIDEL and PlanetMath projects, it is potentially useful to many people in the digital libraries field, due to its ability to extract and distinguish true "plain text" portions of LaTeX documents.
- XML Logging for Digital Libraries — Includes a Java logging server, plus Perl and Java clients. I helped design this system and wrote the Perl client.
Papers:
- The FUD-Based Encyclopedia — A defense of Wikipedia and commons-based knowledge resources from a rising onslaught of fear, uncertainty, and doubt. I propose some basic principles of commons-based peer production (CBPP) here, to illuminate why (and when) it works, and to demystify the phenomenon. I also discuss how integrating respect for expertise is not antithetical to CBPP; PlanetMath includes it implicitly with its ownership system.
- Combined Searching of Web and OAI Digital Library Resources (and extended version; JCDL 2004) — About building a combined search service over both native digital library resources and web resources by exposing the DL resources through the web.
- The Effectiveness of Automatically Structured Queries in Digital Libraries (JCDL 2004) — About increasing search precision for metadata-rich documents by inferring metadata field clauses for queries.
- An Architecture for Math and Science Digital Libraries (master's thesis) — My thesis, about PlanetMath and Noosphere.
- Authority Models for Collaborative Authoring (HICSS 2004) — About authority models (owner-centric and free-form, as well as a mix of the two) for collaborative content creation. This is an empirical study, aiming to determine if one of the models is "better". This is now a chapter in my thesis.
- Building A Digital Library The Commons-Based Peer Production Way (D-Lib Magazine, Oct 2003) — Discusses commons-based peer production of digital libraries, using Noosphere/PlanetMath as a case study. Good intro to my thesis.
- An Architecture for Multischeming in Digital Libraries (extended version of ICADL 2003 paper) — Discusses the issues that multiple classification schemes raise for digital libraries in a federated scenario, and how to solve them. CITIDEL uses methods discussed here.
- The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment (JCDL 2003) — About the XML log standard developed in the DLRL. I had a part in designing the deployment (basically the socket architecture and protocol, as well as the client-side implementation in Perl, for CITIDEL).
- Building Digital Libraries from Simple Building Blocks (Online Information Review, Vol 27 No. 5, 2003) — This article is basically about Open Digital Libraries (ODL). I contributed some sections, talking mostly about the use of actual ODL components in CITIDEL, and things in CITIDEL which could potentially be componentized.