Dave Beckett's blog

Raptor Web Library 1.4.17

2008-04-05 18:24

Last week I released version 1.4.17 of my Raptor C library (release notes) but in the 38 releases over the 7 years or so since I started building it, there's more to see than just triples.

I/O stream API
Abstracts away the specifics of I/O so that you can read from or write to a string/memory buffer, a file, a C FILE* or any custom user data structure. It is similar to the C++ idea of a stream. This allows lazy evaluation of I/O and the use of language-specific I/O routines such as PerlIO (potentially, not yet!). The read abstraction is new in 1.4.17.
Sequence API
For small lists of data items that can grow at either end. This allows small lists, stacks and queues to be made when needed, and iterated over. Do not use for large lists! (Something else is coming along to handle this soon.)
Stringbuffer API
Provides a growing string class, similar to Java's StringBuffer, which can be added to many times and then the result obtained (with no further changes allowed). This is handy for constructing formatted output and queries built from parameters and values, which is hard in C without entering the world of char buf[big enough]. The stringbuffer API tries hard to minimise copying.
Unicode API
For the small part of Unicode that is used near XML: UTF-8 encoding and decoding, and Normal Form C validation. Alongside these it provides XML 1.0 and 1.1 name-checking/validation functions.
URI API
Construction of URIs, resolving of relative URIs and turning absolute URIs into relative ones. It handles the RFC 3986 URI spec updates.
Web fetch API
Retrieves content from a URI in chunks of bytes. It works by calling curl, libxml or BSD libfetch underneath. It also handles redirections, adjusting any base URIs for 30x responses; returns content type headers early so that content negotiation can be done; and allows customisation of requests to send appropriate User-Agent and Cache-Control headers (the latter is new in 1.4.17).
XML with Namespaces Streaming (SAX2) Reader API (newly public in 1.4.17).
Streaming reading of large XML documents over either libxml2 or expat (build time choice with configure). Expat out of the box does not support namespaces and XML QNames, so Raptor adds those and hides various library differences and some bugs. The XML API also provides full XML Base (xml:base) support.
XML Writer API
Write XML elements as Canonical XML output.
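
To make the stringbuffer idea above a bit more concrete, here is a minimal sketch in plain C of a string that is appended to many times and joined once at the end. It is not Raptor's stringbuffer API: the names strbuf, strbuf_append, strbuf_finish and friends are hypothetical, and the segment-list design is only an assumption about one way to keep copying down.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT Raptor's stringbuffer API. */
typedef struct sb_node {
    struct sb_node *next;
    size_t len;
    char *data;
} sb_node;

typedef struct {
    sb_node *head;
    sb_node *tail;
    size_t total;   /* sum of all segment lengths */
    char *joined;   /* final string, built once by strbuf_finish() */
} strbuf;

void strbuf_init(strbuf *sb)
{
    sb->head = NULL;
    sb->tail = NULL;
    sb->total = 0;
    sb->joined = NULL;
}

/* Append a copy of s as a new segment. Returns 0 on success, -1 on
 * allocation failure or if the result has already been obtained. */
int strbuf_append(strbuf *sb, const char *s)
{
    size_t len = strlen(s);
    sb_node *n;

    if (sb->joined)
        return -1;              /* no further changes allowed */
    n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->data = malloc(len + 1);
    if (!n->data) {
        free(n);
        return -1;
    }
    memcpy(n->data, s, len + 1);
    n->len = len;
    n->next = NULL;
    if (sb->tail)
        sb->tail->next = n;
    else
        sb->head = n;
    sb->tail = n;
    sb->total += len;
    return 0;
}

/* Join all segments into one NUL-terminated string. Each byte is copied
 * exactly once here, however many appends were made. */
const char *strbuf_finish(strbuf *sb)
{
    char *p;
    sb_node *n;

    if (sb->joined)
        return sb->joined;
    sb->joined = malloc(sb->total + 1);
    if (!sb->joined)
        return NULL;
    p = sb->joined;
    for (n = sb->head; n; n = n->next) {
        memcpy(p, n->data, n->len);
        p += n->len;
    }
    *p = '\0';
    return sb->joined;
}

void strbuf_free(strbuf *sb)
{
    sb_node *n = sb->head;
    while (n) {
        sb_node *next = n->next;
        free(n->data);
        free(n);
        n = next;
    }
    free(sb->joined);
    strbuf_init(sb);
}
```

A caller would run strbuf_init, call strbuf_append as often as needed, read the result from strbuf_finish, then call strbuf_free. No guess at a "big enough" buffer is needed because the final size is known before the one join.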

Some of the above are large pieces of work and some are small, but they are all solid and many have been used for multiple years in production. These turned out to be handy datatype classes for web programming and I needed them since RDF is built with web technology.

The bonus is that all of the above is used to provide the signature features of Raptor: RDF Parsing (turning syntax into triples) and RDF Serializing (turning triples into syntax). Raptor now parses 7 syntaxes (GRDDL, N-Triples, RDF/XML, RSSes, Atom, TRiG, Turtle) and serializes to 10 (Atom 1.0, GraphViz DOT, JSON * 2, N-Triples, RDF/XML * 3, RSS 1.0, Turtle). The JSON outputs are new for 1.4.17.
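
As a toy illustration of what "turning triples into syntax" means, the sketch below writes one triple as a line of N-Triples. It is not how Raptor's serializers are implemented or called: the triple struct and triple_to_ntriples are hypothetical names, and the code assumes ASCII input, absolute URIs and plain literals only (no language tags, datatypes, blank nodes or \u escapes).

```c
#include <stddef.h>

/* Hypothetical triple type for illustration; not Raptor's statement struct. */
typedef struct {
    const char *subject;    /* absolute URI */
    const char *predicate;  /* absolute URI */
    const char *object;     /* absolute URI, or literal text */
    int object_is_literal;  /* non-zero if object is a plain literal */
} triple;

/* Append one character, keeping out NUL-terminated. Returns -1 when full. */
static int put(char *out, size_t size, size_t *pos, char c)
{
    if (*pos + 1 >= size)
        return -1;
    out[(*pos)++] = c;
    out[*pos] = '\0';
    return 0;
}

static int put_str(char *out, size_t size, size_t *pos, const char *s)
{
    for (; *s; s++)
        if (put(out, size, pos, *s))
            return -1;
    return 0;
}

/* Write a quoted literal, escaping backslash, quote and common controls. */
static int put_literal(char *out, size_t size, size_t *pos, const char *s)
{
    if (put(out, size, pos, '"'))
        return -1;
    for (; *s; s++) {
        const char *esc = NULL;
        switch (*s) {
        case '\\': esc = "\\\\"; break;
        case '"':  esc = "\\\""; break;
        case '\n': esc = "\\n";  break;
        case '\r': esc = "\\r";  break;
        case '\t': esc = "\\t";  break;
        default:   break;
        }
        if (esc ? put_str(out, size, pos, esc) : put(out, size, pos, *s))
            return -1;
    }
    return put(out, size, pos, '"');
}

/* Serialize t into out as one N-Triples line. Returns 0 on success,
 * -1 if the buffer is too small. */
int triple_to_ntriples(const triple *t, char *out, size_t size)
{
    size_t pos = 0;

    if (size == 0)
        return -1;
    out[0] = '\0';
    if (put(out, size, &pos, '<') ||
        put_str(out, size, &pos, t->subject) ||
        put_str(out, size, &pos, "> <") ||
        put_str(out, size, &pos, t->predicate) ||
        put_str(out, size, &pos, "> "))
        return -1;
    if (t->object_is_literal) {
        if (put_literal(out, size, &pos, t->object))
            return -1;
    } else {
        if (put(out, size, &pos, '<') ||
            put_str(out, size, &pos, t->object) ||
            put(out, size, &pos, '>'))
            return -1;
    }
    return put_str(out, size, &pos, " .\n");
}
```

Parsing is the same mapping run in reverse, and the real work in both directions is in the syntax details this sketch leaves out.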

So although Raptor deals with all the RDF syntax details, it does a lot more. But I'm not changing the name!