The rdf4h Haskell library is for querying structured data described
with the Resource Description Framework model, where data is a
collection of <subject,predicate,object> triples:
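As a rough illustration of the triple model, the sketch below defines a toy triple type in plain Haskell. The names `Node`, `Triple`, `tSubject`, `tPredicate`, `tObject` and `exampleTriple` are assumptions made for this example and are not rdf4h's own definitions.

```haskell
-- Toy model of an RDF triple. Illustrative only: these are NOT the
-- types exported by rdf4h.

-- A node is a URI reference, a blank node, or a plain literal.
data Node
  = UNode String   -- URI reference
  | BNode String   -- blank node identifier
  | LNode String   -- plain literal
  deriving (Eq, Ord, Show)

-- A triple is one <subject, predicate, object> statement.
data Triple = Triple
  { tSubject   :: Node
  , tPredicate :: Node
  , tObject    :: Node
  } deriving (Eq, Ord, Show)

-- Hypothetical statement: "alice knows bob".
exampleTriple :: Triple
exampleTriple =
  Triple (UNode "http://example.org/alice")
         (UNode "http://xmlns.com/foaf/0.1/knows")
         (UNode "http://example.org/bob")

main :: IO ()
main = print exampleTriple
```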
Rdf type class
The following Rdf type class methods are optimised for each graph implementation.
The Data.RDF.Query module contains more utility query functions.
Building RDF graphs interactively
An RDF graph can be constructed with empty, and its triples
modified with addTriple and removeTriple, e.g.:
Bulk RDF graphs with parsing and writing
RDF graphs can also be populated by parsing RDF content from strings,
files or URLs:
RDF graphs can also be serialised to handles with hWriteRdf:
E.g. to write an RDF graph to a file:
Supported RDF serialisation formats
The rdf4h library supports three RDF serialisations:
| Serialisation | Reading | Writing |
|---------------|---------|---------|
| NTriples      | ✓       | ✓       |
| Turtle        | ✓       | ✓       |
| RDF/XML       | ✓       | ✗       |
Type level RDF graph representations
The RDF type is a data family, for which there are a number of
instances. Those instances are type-level indexes that give the
programmer a choice of underlying in-memory graph representation.
The implementations differ in how they store RDF graphs in memory:
RDF TList stores triples as a Haskell list, i.e. [(s,p,o),..].
RDF AdjHashMap is an adjacency hash map with SPO and OPS indexes.
TList implementation
Given two triples:
The TList implementation just stores them as is, i.e.
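A minimal sketch of list-backed storage, assuming a toy model in which nodes are plain strings. The two triples and the names `t1`, `t2` and `listGraph` are hypothetical and chosen only for illustration; this is not rdf4h's actual code.

```haskell
-- Toy model: nodes are plain strings and a graph is a list of triples.
-- Illustrative only, not rdf4h's actual representation.
type Node   = String
type Triple = (Node, Node, Node)

-- Two hypothetical triples sharing a subject and a predicate.
t1, t2 :: Triple
t1 = ("ex:alice", "foaf:interest", "ex:haskell")
t2 = ("ex:alice", "foaf:interest", "ex:whisky")

-- List-backed storage keeps the triples exactly as given, in order,
-- with no index built over them.
listGraph :: [Triple]
listGraph = [t1, t2]

main :: IO ()
main = mapM_ print listGraph
```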
AdjHashMap implementation
The adjacency hash map implementation maintains two hash map indexes:
SPO: a hashed S key pointing to a value that is another hash map, whose key is a hashed P pointing to a hash set of O values.
OPS: a hashed O key pointing to a value that is another hash map, whose key is a hashed P pointing to a hash set of S values.
So our two-triple graph is stored in SPO and OPS indexes:
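The two-level index layout can be sketched as follows. This is a toy model under stated assumptions: it uses the ordered `Map` and `Set` from the containers package in place of hash maps and hash sets, nodes are plain strings, and all names (`Index`, `insertEntry`, `spoIndex`, `opsIndex`, `lookupIndex`, `twoTriples`) are hypothetical rather than taken from rdf4h.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Node   = String
type Triple = (Node, Node, Node)

-- Outer key -> predicate -> set of nodes at the other end of the edge.
-- Ordered maps stand in for the hash maps described in the text.
type Index = Map.Map Node (Map.Map Node (Set.Set Node))

-- Add one (outer key, predicate, inner value) entry to an index,
-- merging with any entries already stored under the same keys.
insertEntry :: Node -> Node -> Node -> Index -> Index
insertEntry k p v =
  Map.insertWith (Map.unionWith Set.union) k
                 (Map.singleton p (Set.singleton v))

-- SPO index: subject -> predicate -> objects.
spoIndex :: [Triple] -> Index
spoIndex = foldr (\(s, p, o) -> insertEntry s p o) Map.empty

-- OPS index: object -> predicate -> subjects.
opsIndex :: [Triple] -> Index
opsIndex = foldr (\(s, p, o) -> insertEntry o p s) Map.empty

-- All nodes stored under an outer key and a predicate.
lookupIndex :: Node -> Node -> Index -> [Node]
lookupIndex k p ix =
  maybe [] Set.toList (Map.lookup k ix >>= Map.lookup p)

-- Two hypothetical triples sharing a subject and a predicate.
twoTriples :: [Triple]
twoTriples =
  [ ("ex:alice", "foaf:interest", "ex:haskell")
  , ("ex:alice", "foaf:interest", "ex:whisky")
  ]

main :: IO ()
main = do
  -- Objects of (ex:alice, foaf:interest, ?), via the SPO index.
  print (lookupIndex "ex:alice" "foaf:interest" (spoIndex twoTriples))
  -- Subjects of (?, foaf:interest, ex:whisky), via the OPS index.
  print (lookupIndex "ex:whisky" "foaf:interest" (opsIndex twoTriples))
```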
This makes querying AdjHashMap graphs with query very efficient,
but modifying the graph with addTriple and removeTriple more
expensive than in the TList implementation, which just uses (:) and
filter respectively.
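The list-based operations mentioned above can be sketched on a toy graph of string triples. The names `addTripleL`, `removeTripleL` and `queryL` are hypothetical, and the query shape (a `Nothing` pattern acting as a wildcard) is an assumption made for this illustration, not a statement of rdf4h's implementation.

```haskell
type Node   = String
type Triple = (Node, Node, Node)

-- Adding a triple to a list-backed graph is a single cons.
addTripleL :: [Triple] -> Triple -> [Triple]
addTripleL g t = t : g

-- Removing a triple filters out every equal triple.
removeTripleL :: [Triple] -> Triple -> [Triple]
removeTripleL g t = filter (/= t) g

-- Querying scans the whole list; Nothing matches any node
-- (an assumed wildcard convention for this sketch).
queryL :: [Triple] -> Maybe Node -> Maybe Node -> Maybe Node -> [Triple]
queryL g ms mp mo = filter matches g
  where
    matches (s, p, o) = ok ms s && ok mp p && ok mo o
    ok pat x = maybe True (== x) pat

-- Hypothetical triples used to exercise the functions.
tA, tB :: Triple
tA = ("ex:alice", "foaf:interest", "ex:haskell")
tB = ("ex:bob",   "foaf:interest", "ex:whisky")

-- Graph built by two cons operations; tB ends up at the head.
graphL :: [Triple]
graphL = addTripleL (addTripleL [] tA) tB

main :: IO ()
main = do
  print (queryL graphL (Just "ex:alice") Nothing Nothing)
  print (removeTripleL graphL tA)
```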
The TList and AdjHashMap data family instances represent
application-specific trade-offs in terms of space and runtime
performance. TList performs better for query, whilst AdjHashMap
performs better for select and for modifying triples in a graph with
addTriple and removeTriple. See the criterion results for
performance benchmarks, taken in November 2016.
RDF query example
This example prints the list of Extended Semantic Web Conference 2015
programme committee members to standard output:
Here is an example of checking whether two RDF graphs are
structurally identical, using the hgal library.
Tests
This library has two test suites:
Property-based tests of the API using QuickCheck. All tests pass.
Unit tests provided by the W3C to test conformance of RDF parsers,
of which there are currently 521. Some parsing tests currently fail.
To list the available tests that can be run in isolation using a
pattern: