Getting Started With Sesame 2.2
Abstract
This document is intended for people interested in learning how to use the Sesame RDF platform. It assumes the reader has an understanding of XML.
Introduction to Sesame
Sesame is a Java platform for working with the Resource Description Framework (RDF). RDF is a W3C Recommendation for representing information about resources on the World Wide Web. Sesame provides a common API for RDF parsers, writers, and RDF stores.
Understanding URLs, URNs, URIs, and IRIs
RDF builds on URI and XML technologies, so understanding what a URI is matters for understanding RDF.
A URL (Uniform Resource Locator) is used to represent a web resource and provide a means of locating it. An example of a URL is
http://en.wikipedia.org/wiki/Uniform_Resource_Locator
Resources that can't be located by a URL can be identified by a URN (Uniform Resource Name). For example, the book "The Last Unicorn" can be identified by its ISBN:
urn:isbn:0451450523
Resources that can't be downloaded may still be identified by a URL. However, when such a URL is resolved it either returns no content, issues a "see also" redirect, or returns metadata describing the resource.
All URLs and URNs are URIs (Uniform Resource Identifiers), and any URI can also be represented as an IRI (Internationalized Resource Identifier). IRIs are meant to replace URIs; they use the Universal Character Set, which allows non-ASCII characters and makes it easier to create recognizable identifiers in any language. The Sesame framework refers to IRIs as URIs, but Sesame URIs are character-set neutral and can therefore represent IRIs as well. The remainder of this document uses the term URI, but anything that holds for a URI also holds for an IRI.
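The distinction between the two examples above can be seen with a short stand-alone Java sketch. It uses only java.net.URI from the JDK, not Sesame's own URI type, and the class and method names are made up for illustration.

```java
import java.net.URI;

// Stand-alone sketch: uses the JDK's java.net.URI, not Sesame's URI interface.
final class UriKinds {

    // Reports whether a URI is hierarchical (URL-like) or opaque (URN-like).
    static String describe(String text) {
        URI uri = URI.create(text);
        if (uri.isOpaque()) {
            // Opaque URIs such as URNs have no host or path, only a scheme-specific part.
            return uri.getScheme() + " name: " + uri.getSchemeSpecificPart();
        }
        return uri.getScheme() + " locator: host=" + uri.getHost() + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://en.wikipedia.org/wiki/Uniform_Resource_Locator"));
        System.out.println(describe("urn:isbn:0451450523"));
    }
}
```

Both strings parse as URIs; only the first carries enough information (a host and a path) to be used for locating the resource.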
URIs can be very long, and most RDF formats allow them to be abbreviated. This is commonly done by splitting the URI into a namespace and a local name and using a short prefix in place of the namespace. Most RDF formats restrict the local name to the local part of a valid QName, but Sesame has no such restriction within the repository. For example, the namespace "http://en.wikipedia.org/wiki/" can be abbreviated with the prefix "wiki", so that the local name "Uniform_Resource_Locator" is written as
wiki:Uniform_Resource_Locator
This is a short form of the URI <http://en.wikipedia.org/wiki/Uniform_Resource_Locator> when the prefix "wiki" is mapped to "http://en.wikipedia.org/wiki/".
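The expansion from prefix and local name back to a full URI is simple string concatenation. The following Java sketch illustrates the idea with a plain map; the class PrefixExpander is hypothetical and is not part of the Sesame API.

```java
import java.util.Map;

// Hypothetical helper for illustration; Sesame manages namespaces through its own API.
final class PrefixExpander {

    // Expands "prefix:localName" into namespace + localName using the given prefix map.
    static String expand(Map<String, String> namespaces, String abbreviated) {
        int colon = abbreviated.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("No prefix in: " + abbreviated);
        }
        String prefix = abbreviated.substring(0, colon);
        String namespace = namespaces.get(prefix);
        if (namespace == null) {
            throw new IllegalArgumentException("Unknown prefix: " + prefix);
        }
        return namespace + abbreviated.substring(colon + 1);
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = Map.of("wiki", "http://en.wikipedia.org/wiki/");
        System.out.println(expand(namespaces, "wiki:Uniform_Resource_Locator"));
    }
}
```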
Understanding RDF
In RDF all resources are uniquely identified by a URI. For example, if you are viewing this article in a browser, you can identify it by the URL in the address bar, such as:
http://wiki.aduna-software.org/confluence/display/SESDOC/GettingStarted
RDF is a framework for describing resources. Any resource that can be identified by a URI can be described in RDF.
To describe the author of this article one can use the statement
This article was authored by James Leigh
This statement can be broken down into a subject ("This article"), predicate ("was authored by"), and an object ("James Leigh"). The statement can then be written in RDF as:
The subject of the statement is expanded into the URI
http://wiki.aduna-software.org/confluence/display/SESDOC/GettingStarted
The predicate of the statement is expanded into the URI
http://purl.org/dc/elements/1.1/author
The object is a string "James Leigh". This is not a URI, but a literal. The object can be a literal or a URI. Here is another example describing the author of the article along with the author's name and an image depiction.
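As a minimal sketch of such a description, the Java program below prints three statements in N-Triples syntax, one per line. The author and image URIs under example.org are placeholders invented for illustration, and the helper names are hypothetical. The predicate URIs follow the dc:author URI given above and the FOAF namespace http://xmlns.com/foaf/0.1/.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: builds N-Triples lines by hand instead of using an RDF writer.
final class NTriplesExample {
    static final String DC = "http://purl.org/dc/elements/1.1/";
    static final String FOAF = "http://xmlns.com/foaf/0.1/";

    static String uri(String value) {
        return "<" + value + ">";
    }

    // Quotes a literal; only backslashes and double quotes are escaped in this sketch.
    static String literal(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    static String triple(String subject, String predicate, String object) {
        return subject + " " + predicate + " " + object + " .";
    }

    static List<String> describeArticle() {
        String article = uri("http://wiki.aduna-software.org/confluence/display/SESDOC/GettingStarted");
        String author = uri("http://example.org/people/james");     // placeholder URI
        String photo = uri("http://example.org/images/james.png");  // placeholder URI
        List<String> lines = new ArrayList<>();
        lines.add(triple(article, uri(DC + "author"), author));
        lines.add(triple(author, uri(FOAF + "name"), literal("James Leigh")));
        lines.add(triple(author, uri(FOAF + "depiction"), photo));
        return lines;
    }

    public static void main(String[] args) {
        describeArticle().forEach(System.out::println);
    }
}
```

Here the object of the first statement is a URI rather than a literal, which allows the next two statements to say more about the author.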
In the above example the predicates dc:author, foaf:name, and foaf:depiction were used. They belong to popular vocabularies: the Dublin Core Metadata Element Set and the Friend of a Friend (FOAF) RDF vocabulary. All predicates in RDF must be URIs and should belong to a vocabulary.
When resources cannot be identified (because their identity is not known or not important), a "blank node" can be used instead. For example, if the actual author of this document was not known, but the photo was available, the above example could be rewritten using a blank node locally identified as "somebody" in place of the URI.
Often it is not enough just to have statements describing resources; it's also important to retain the context in which the statements originated. This context can be identified by a URI, or be referenced and described as a blank node. In Sesame every statement has a
- subject (URI or BNode),
- predicate (URI),
- object (URI, BNode, or Literal), and a
- context (URI or BNode).
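The constraints in this list can be expressed directly in a type system. The sketch below is a stand-alone Java model written for illustration; its type names resemble Sesame's model interfaces but are simplified stand-ins, not the Sesame classes themselves. The photo URI is a placeholder, and using the article URL as the context is an arbitrary choice for the example.

```java
// Simplified stand-in types for illustration; not the Sesame model classes.
final class StatementModel {
    interface Value { String label(); }
    interface Resource extends Value { }

    static final class Uri implements Resource {
        final String value;
        Uri(String value) { this.value = value; }
        public String label() { return "<" + value + ">"; }
    }

    static final class BNode implements Resource {
        final String id;
        BNode(String id) { this.id = id; }
        public String label() { return "_:" + id; }
    }

    static final class Literal implements Value {
        final String text;
        Literal(String text) { this.text = text; }
        public String label() { return "\"" + text + "\""; } // no escaping in this sketch
    }

    // The parameter types enforce the rules: a literal can only appear as the object.
    static final class Statement {
        final Resource subject;
        final Uri predicate;
        final Value object;
        final Resource context;

        Statement(Resource subject, Uri predicate, Value object, Resource context) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.context = context;
        }

        String describe() {
            return subject.label() + " " + predicate.label() + " "
                    + object.label() + " " + context.label() + " .";
        }
    }

    // A blank node "somebody" with a depiction, recorded in the context of this article.
    static Statement example() {
        BNode somebody = new BNode("somebody");
        Uri depiction = new Uri("http://xmlns.com/foaf/0.1/depiction");
        Uri photo = new Uri("http://example.org/images/james.png"); // placeholder URI
        Uri context = new Uri("http://wiki.aduna-software.org/confluence/display/SESDOC/GettingStarted");
        return new Statement(somebody, depiction, photo, context);
    }

    public static void main(String[] args) {
        System.out.println(example().describe());
    }
}
```

With these types, passing a Literal as a subject or a BNode as a predicate is rejected by the compiler rather than at run time.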
What RDF Looks Like
RDF can be described in a number of formats. The first and most widely supported format is RDF/XML. In RDF/XML, statements are grouped by their subject. The name and depiction of an author can be written as:
Notice that in RDF/XML the nested XML tags alternate between subjects and predicates, and objects are either nested in a predicate or referenced through the attribute rdf:resource. RDF/XML allows the rdf:Description to be substituted with the type of subject as shown below.
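Both forms can be produced with the JDK's XMLStreamWriter, as in the following sketch. It writes the name and depiction of one subject, using either rdf:Description or the typed element foaf:Person. The subject and image URIs are placeholders under example.org, and the class name is hypothetical; a real application would normally use an RDF/XML writer rather than writing the XML by hand.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Illustration only: hand-written RDF/XML using the JDK's streaming XML writer.
final class RdfXmlExample {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String FOAF = "http://xmlns.com/foaf/0.1/";

    // When typed is true the subject element is foaf:Person instead of rdf:Description.
    static String describePerson(String subject, String name, String image, boolean typed) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("rdf", "RDF", RDF);
            xml.writeNamespace("rdf", RDF);
            xml.writeNamespace("foaf", FOAF);
            if (typed) {
                xml.writeStartElement("foaf", "Person", FOAF);     // subject, typed form
            } else {
                xml.writeStartElement("rdf", "Description", RDF);  // subject, generic form
            }
            xml.writeAttribute("rdf", RDF, "about", subject);
            xml.writeStartElement("foaf", "name", FOAF);           // predicate, literal object
            xml.writeCharacters(name);
            xml.writeEndElement();
            xml.writeEmptyElement("foaf", "depiction", FOAF);      // predicate, URI object
            xml.writeAttribute("rdf", RDF, "resource", image);
            xml.writeEndElement();                                 // closes the subject element
            xml.writeEndElement();                                 // closes rdf:RDF
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write RDF/XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String person = "http://example.org/people/james";     // placeholder URI
        String image = "http://example.org/images/james.png";  // placeholder URI
        System.out.println(describePerson(person, "James Leigh", image, false));
        System.out.println(describePerson(person, "James Leigh", image, true));
    }
}
```

The typed form carries one extra statement compared with the rdf:Description form: it also asserts that the subject has rdf:type foaf:Person.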
Another RDF format in XML is TriX. It has a more controlled structure and can therefore be processed more easily with other XML technologies such as XSLT. TriG is a text-based version of TriX. Both TriX and TriG include context information (called a graph).
Other formats are N3 and its subsets N-Triples and Turtle. These are all text-based representations that are easier for humans to read. The examples shown earlier use this family of RDF formats.
Installing the Sesame Server
Sesame requires an implementation of Java Servlet 2.4 and JavaServer Pages 2.0 technologies running on Java 5. Apache Tomcat 6.0 with Java SE 6.0 is recommended.
You can download Java SE 6 at http://java.sun.com/javase/downloads/
You can download Apache Tomcat 6 at http://tomcat.apache.org/download-60.cgi
The Sesame Server is included in the openrdf-sesame sdk archive available at http://www.openrdf.org/
Once you have downloaded and installed Java and Tomcat, you can install the Sesame Server in the Tomcat webapps directory. Copy the openrdf-sesame.war and openrdf-workbench.war files into the Tomcat webapps directory and restart Tomcat.
If everything installed correctly, opening your browser to http://localhost:8080/openrdf-workbench should present you with a "List of Repositories".
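The same check can be scripted. The sketch below is a hypothetical helper, not part of Sesame; it assumes Tomcat listens on localhost port 8080 as in the URL above and treats any HTTP status from 200 to 399 as success.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration; not part of the Sesame distribution.
final class WorkbenchCheck {

    // Builds the workbench address for a given host and port.
    static String workbenchUrl(String host, int port) {
        return "http://" + host + ":" + port + "/openrdf-workbench";
    }

    // Returns true when the address answers with an HTTP status from 200 to 399.
    static boolean isReachable(String address) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            connection.setConnectTimeout(2000); // milliseconds
            connection.setReadTimeout(2000);    // milliseconds
            int status = connection.getResponseCode();
            connection.disconnect();
            return status >= 200 && status < 400;
        } catch (IOException e) {
            // Connection refused, timeout, or malformed address: treat as not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        String address = workbenchUrl("localhost", 8080);
        System.out.println(address + " reachable: " + isReachable(address));
    }
}
```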
Storing RDF in Sesame
Start the Sesame Server and open your browser to the repository listing page located at
http://localhost:8080/openrdf-workbench
Create a new repository by clicking the appropriate link in the menu. Select the In Memory Store and give it the ID 'sandbox' and the title 'Getting Started'. Follow the on-screen instructions, using the default options, to create the new repository.
To add data to the repository, click the add link in the menu. Copy and paste some example RDF into the RDF Content box and be sure to select the correct data format. Navigate to the export page to download the contents of the repository in different formats.
Try uploading other data from http://www.rdfdata.org/ and using the links from the export page to navigate through the repository.
Using SPARQL with Sesame
SPARQL stands for the SPARQL Protocol and RDF Query Language. The Sesame server implements the SPARQL Protocol, and this protocol is used when uploading or exporting data from the repository.
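At the protocol level, a query is an HTTP request that carries the query text in a URL-encoded "query" parameter. The following sketch builds such a request URL. It assumes the server webapp is deployed at http://localhost:8080/openrdf-sesame (following the name of the war file), that repositories are addressed as /repositories/ followed by the repository ID, and that a repository with the ID 'sandbox' exists; the class name is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only: builds the request URL but does not send the request.
final class SparqlRequest {

    // Assumes repositories live under <serverUrl>/repositories/<repositoryId>.
    static String queryUrl(String serverUrl, String repositoryId, String sparql) {
        return serverUrl + "/repositories/" + repositoryId
                + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = queryUrl("http://localhost:8080/openrdf-sesame", "sandbox",
                "SELECT * WHERE { ?s ?p ?o }");
        System.out.println(url);
    }
}
```

Fetching the resulting URL with an HTTP GET request should return the query result; the response format depends on the Accept header sent by the client.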
To demonstrate the SPARQL query language we will use the RDF/XML data located at
http://www.daml.org/2001/01/gedcom/royal92.daml
Upload this URL from the add page of a newly created repository. You can change the namespace prefixes used to display the data on the namespaces page. Select the prefix "a" from the drop-down and clear the namespace field to delete this prefix. Now type in a new prefix "gedcom" with the namespace "http://www.daml.org/2001/01/gedcom/gedcom#", update, and then add the prefix "royal" with the namespace "http://www.daml.org/2001/01/gedcom/royal92.daml#". Then navigate to the query page to perform some SPARQL queries.
To find all the Kings of England, where and when they were born, and when they died, use the following query:
Use the result page to navigate between the relationships used in the query and find other relationships. Try creating your own SPARQL queries.
Embedding a Sesame Repository in Java
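To create a new in-memory repository from Java, construct a SailRepository around a MemoryStore and call its initialize() method before use.
To interact with the repository, a RepositoryConnection must be created using the Repository#getConnection() method. Using the connection, data can be uploaded with its add methods and retrieved with getStatements or a prepared query. Close the connection when it is no longer needed.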
Further Reading
Sesame User Guide
http://www.openrdf.org/doc/sesame2/users/
JavaDoc
http://www.openrdf.org/doc/sesame2/api/