Factor/GSoC/2010/Improve XML library

Mentor

Daniel Ehrenberg

Skills required

  • Some prior knowledge of XML
  • Experience with parsing is a plus

Technical outline

Conformance

Factor's XML parser does not yet pass the standard conformance tests because its support for parsing DTDs is inadequate. Minor modifications should let it pass the XML 1.0 and XML 1.1 conformance test suites, as well as the Namespaces in XML 1.0 and 1.1 conformance tests.

Validation

The Factor XML parser is non-validating: it does not check a document against a DTD or schema. It might be useful to implement either DTD validation or validation based on another XML schema language, such as XML Schema or RELAX NG.
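To illustrate the gap that validation would close, here is a minimal sketch in Python (not Factor, and not the Factor XML library's API). It relies on the fact that Python's standard xml.etree.ElementTree parser is also non-validating. The CONTENT_MODEL table and the validation_errors helper are hypothetical names invented for this example, and the table is written by hand rather than derived from the DTD, which a real validator would have to do.

```python
import xml.etree.ElementTree as ET

# Well-formed, but invalid against its own DTD: <note> must contain
# <to> followed by <body>, and <to> is missing.
DOC = """<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><body>hi</body></note>"""

# Hypothetical, hand-written stand-in for the DTD's content models:
# each element name maps to the exact sequence of child elements it requires.
CONTENT_MODEL = {"note": ["to", "body"], "to": [], "body": []}


def validation_errors(element, model):
    """Return a list of messages for elements whose children break the model."""
    errors = []
    expected = model.get(element.tag)
    actual = [child.tag for child in element]
    if expected is None:
        errors.append(f"undeclared element <{element.tag}>")
    elif actual != expected:
        errors.append(
            f"<{element.tag}> has children {actual}, expected {expected}"
        )
    for child in element:
        errors.extend(validation_errors(child, model))
    return errors


# A non-validating parser accepts the document without complaint.
root = ET.fromstring(DOC)

# Only the separate validation pass notices the missing <to> element.
errors = validation_errors(root, CONTENT_MODEL)
```

This sketch only handles fixed child sequences. Real DTD content models also allow choice, repetition and mixed content, so a full implementation would need to match children against something closer to a regular expression over element names.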

Performance

The XML parser is currently not very fast. A new lexer based on Factor's regexp library could improve performance substantially. Other kinds of parsing abstractions might also be useful.
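The idea of a regexp-based lexer can be sketched as follows. This is an illustration in Python using its standard re module, not Factor code and not the API of Factor's regexp library. The token names and the lex function are assumptions made up for this example, and the patterns are deliberately simplified: they ignore CDATA sections, processing instructions, DOCTYPE declarations, and ">" characters inside attribute values.

```python
import re

# One compiled pattern with a named alternative per token kind.
# Simplified on purpose; see the caveats in the text above.
TOKEN_RE = re.compile(
    r"""
      (?P<comment>   <!--.*?-->           )
    | (?P<end_tag>   </\s*[^\s>]+\s*>     )
    | (?P<start_tag> <[^!?/][^>]*>        )
    | (?P<text>      [^<]+                )
    """,
    re.VERBOSE | re.DOTALL,
)


def lex(source):
    """Split source into (kind, lexeme) pairs, scanning left to right."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = lex("<a x='1'>hi<!-- c --><b/></a>")
```

The point of the design is that the regular expression engine does the character-level scanning in one pass per token, so the parser proper only has to deal with a stream of coarse tokens. Whether this is faster than the existing parser in Factor would have to be measured.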

Benefit to the student

The student gains deep knowledge of XML and has the opportunity to learn about the theory of parsing.

Benefit to the project

XML is everywhere, and Factor applications will have to deal with it. Improvements to the XML library will make Factor useful for a wider range of data mining and manipulation applications.

This revision created on Sat, 27 Feb 2010 16:32:20 by littledan