So, linked data, you say?


In my previous blog post, I explained how to set up your own Solid pod server. Now that we have a place to store our data, we can start filling it up, right? But should you do this all willy-nilly? Wasn’t one of the key points of SOcial LInked Data that data is independent of applications and can be linked dynamically? Let’s find out.

Describe your data in three words ...

To make this happen, we should all speak the same language. And if we all need to speak it, it should be as simple as can be. Why complicate things after all?

A standard already in use to describe/model a lot of the web’s resources is RDF.

Resource Description Framework.

RDF is a family of W3C specifications. The idea is that you make a statement about a resource in the form of subject-predicate-object, also known as a triple. The subject is the resource we describe. The predicate is a property and expresses the relationship between the subject and the object. The object is the value of that property: either another resource or a literal.
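To make the triple idea concrete, here is a minimal sketch in Python (a language this post does not otherwise use). The `Triple` type and the `describe` helper are illustrative assumptions, not part of any Solid or RDF library, and the `rdf:` and `rdfs:` prefixes are assumed to be the usual W3C ones.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str


# Two statements about the same resource, written with shortened names.
statements = [
    Triple("obelisk:Obelisk", "rdf:type", "rdfs:Class"),
    Triple("obelisk:Obelisk", "obelisk:builtBy", "obelisk:Sculptor"),
]


def describe(subject: str, triples: List[Triple]) -> Dict[str, str]:
    """Collect every predicate/object pair stated about one subject."""
    return {t.predicate: t.object for t in triples if t.subject == subject}


description = describe("obelisk:Obelisk", statements)
```

Each entry in `description` is one thing we know about the subject. A real RDF store works on the same principle, just with full IRIs and many more statements.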

For example (taken from solidproject.org): in the vocabulary for obelisks we can find the following (a graphical representation):

Graph of Obelisk

So, the Obelisk subject has four predicates. The “a” predicate (or property) is shorthand for rdf:type and states that Obelisk is a “Class”. Note that there are object properties, which point to another resource, and data properties, which expect a literal.

Wait a minute, vocabulary?

I stated that we found the info in the vocabulary for obelisks. When we talk about the same subject, it seems only natural that we use the same terms. For this purpose, there are already a lot of agreements on how to describe common subjects. These are made by developers or by governments. If there’s no agreement yet, you can always publish your own.

We find the basic elements to describe vocabularies in another vocabulary: the Resource Description Framework Schema.

... or less.

In RDF, resources are identified by IRIs, which are standard web URIs that can also contain internationalized (non-ASCII) characters. Because the vocabulary itself is also a resource, we identify it by an IRI too, say http://w3id.org/obelisk/. This is the link to the actual document containing all the triples.

Different RDF formats.

Commonly, we store RDF in one of four formats: N-Triples (.nt file extension), Turtle (.ttl), JSON-LD (.jsonld) or RDF/XML (.rdf).

Reading and writing N-Triples is simple. Every line of a .nt file is a single triple, ending with a “.”. But if we write out the complete IRI every time, this quickly becomes unreadable: An http://w3id.org/obelisk/Obelisk is http://w3id.org/obelisk/builtBy a http://w3id.org/obelisk/Sculptor. That’s why the Turtle format allows prefix declarations (a concept borrowed from XML namespaces). This means we can use

@prefix obelisk: <http://w3id.org/obelisk/> .

And shorten the previous statement to An obelisk:Obelisk is obelisk:builtBy an obelisk:Sculptor.
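What a parser does with such a prefix can be sketched in a few lines of Python. This is a simplified illustration, not a real Turtle parser: the helper names `expand` and `to_ntriples_line` are made up for this post, and the sketch handles only IRIs, not literals.

```python
from typing import Dict

# Prefix table, as declared by "@prefix obelisk: <http://w3id.org/obelisk/> ."
PREFIXES = {"obelisk": "http://w3id.org/obelisk/"}


def expand(name: str, prefixes: Dict[str, str]) -> str:
    """Turn a prefixed name such as 'obelisk:Obelisk' into a full IRI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


def to_ntriples_line(s: str, p: str, o: str, prefixes: Dict[str, str]) -> str:
    """Write one triple of prefixed names as a single N-Triples line."""
    terms = (expand(term, prefixes) for term in (s, p, o))
    return " ".join(f"<{iri}>" for iri in terms) + " ."


line = to_ntriples_line(
    "obelisk:Obelisk", "obelisk:builtBy", "obelisk:Sculptor", PREFIXES
)
```

The prefixed form and the N-Triples line say exactly the same thing. The prefix only saves typing.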

Another feature of Turtle is that multiple predicates of the same subject can be grouped into one block, separated by “;”, and multiple objects of the same subject and predicate can be listed, separated by “,”.
Each block still has to end with a “.”. We’ll see an example later.
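The grouping rules can be sketched as a small Python function that folds a flat list of triples into Turtle-style blocks. This is only an illustration of the “;” and “,” shorthand: `to_turtle_block` is a made-up helper, not a library call, and the second builder `vuittonluis:me` is a hypothetical name added to show the “,” case.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def to_turtle_block(triples: Iterable[Tuple[str, str, str]]) -> str:
    """Group (subject, predicate, object) names into Turtle-style blocks."""
    grouped = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        grouped[s][p].append(o)

    blocks = []
    for s, predicates in grouped.items():
        # "," joins objects that share a subject and a predicate.
        parts = [f"{p} {', '.join(objs)}" for p, objs in predicates.items()]
        # ";" joins predicates that share a subject; "." closes the block.
        blocks.append(f"{s}\n    " + " ;\n    ".join(parts) + " .")
    return "\n".join(blocks)


triples = [
    (":actualObelisk", "a", "obelisk:Obelisk"),
    (":actualObelisk", "obelisk:builtBy", "garmani:me"),
    (":actualObelisk", "obelisk:builtBy", "vuittonluis:me"),  # hypothetical
]

turtle = to_turtle_block(triples)
```

Three separate triples collapse into one block with a single subject, which is why Turtle files tend to be shorter than their N-Triples equivalents.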

JSON-LD describes everything in JSON and RDF/XML in XML. Neither is as readable as the Turtle notation, so I won’t elaborate on them here. Moreover, Solid pods use the Turtle syntax.

So how do we describe our data?

Let’s build further on the Obelisk case. We introduce Vuittonluis and G.Armani, world-renowned obelisk builders. They describe themselves as obelisk:Sculptor on their profiles.

@prefix obelisk: <http://w3id.org/obelisk/> .
# The next line defines an empty prefix ":", which points to the current document (e.g. https://garmani.solidcommunity.net/profile/card)
@prefix : <#>.

# "a" is the standard shorthand for "rdf:type".
:me a obelisk:Sculptor.

#...

If we want to put it graphically, we could use:

In the top half, you see the vocabulary in its own document, and in the bottom half, you see an entry on G.Armani’s profile document. The profile document does not define new classes and properties but is entirely composed of data. Thus, in the document that describes the actual obelisk, you say:

@prefix obelisk: <http://w3id.org/obelisk/> .
@prefix garmani: <https://garmani.solidcommunity.net/profile/card#> .
# The empty prefix ":" again points to the current document.
@prefix : <#> .

:actualObelisk
    a obelisk:Obelisk ;
    obelisk:builtBy garmani:me .
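The reason this links two documents is that every prefixed name resolves to an absolute IRI. The Python sketch below shows that resolution with the standard library’s `urljoin`. The obelisk document URL is a hypothetical placeholder, `resolve_prefixed` is a made-up helper, and joining prefix and local name before resolving is a simplification of what a Turtle parser does.

```python
from typing import Dict
from urllib.parse import urljoin


def resolve_prefixed(name: str, prefixes: Dict[str, str], document_url: str) -> str:
    """Expand a prefixed name and resolve it against the document's URL."""
    prefix, _, local = name.partition(":")
    # A relative prefix such as "#" is resolved against the document itself.
    return urljoin(document_url, prefixes[prefix] + local)


# Hypothetical location of the document that describes the obelisk.
OBELISK_DOC = "https://example.org/obelisks/monument"
PROFILE_DOC = "https://garmani.solidcommunity.net/profile/card"

obelisk_prefixes = {
    "": "#",
    "obelisk": "http://w3id.org/obelisk/",
    "garmani": "https://garmani.solidcommunity.net/profile/card#",
}
profile_prefixes = {"": "#", "obelisk": "http://w3id.org/obelisk/"}

obelisk_iri = resolve_prefixed(":actualObelisk", obelisk_prefixes, OBELISK_DOC)
builder_iri = resolve_prefixed("garmani:me", obelisk_prefixes, OBELISK_DOC)
me_in_profile = resolve_prefixed(":me", profile_prefixes, PROFILE_DOC)
```

Here `builder_iri` and `me_in_profile` come out as the same IRI, so the obelisk document and the profile document are talking about the same person.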

Linking it all together

In my previous blog post, I said that from now on we only need one login: the WebID. In fact, this login is much more than just a username. It’s the entry point to all your data!

The profile page

The WebID is in fact the link to your profile document. Here, you can find the links to where everything is stored. Let’s take G.Armani’s profile.

G.Armani’s account

At the moment, I’m mostly interested in the prefix declarations and the storage portion of the :me anchor.

Prefixes

For storage purposes, we need the “solid”, the “sp” and the “g” prefixes. The first points to the vocabulary describing terms specific to Solid. The second points to the vocabulary describing workspaces, which is where we store data and its privacy settings. The last one, “g”, points to the root of the pod where the profile is stored. The prefix names themselves are arbitrary: they could be anything, as long as the IRIs they stand for are valid.

Storage

In the storage section, we have (in triple notation)

:me sp:storage g: .

This translates to: :me (the subject of this document) has a storage location at the root of this pod. Or: the root of this pod is a storage location for G.Armani.

Next we have

:me solid:account g: .

This means that the root of this server holds G.Armani’s account.
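An application that starts from the WebID can find the storage location by filtering the profile’s triples. The Python sketch below works on a hand-written list of triples rather than a fetched profile. The namespace IRIs for “sp” and “solid” and the pod root for “g” are assumptions, since the profile screenshot is not reproduced here, and `objects` is a made-up helper.

```python
from typing import List, Tuple

# Assumed namespace IRIs behind the "sp", "solid" and "g" prefixes.
SP = "http://www.w3.org/ns/pim/space#"
SOLID = "http://www.w3.org/ns/solid/terms#"
POD_ROOT = "https://garmani.solidcommunity.net/"

ME = "https://garmani.solidcommunity.net/profile/card#me"

# The two statements from the profile, with every name fully expanded.
profile: List[Tuple[str, str, str]] = [
    (ME, SP + "storage", POD_ROOT),
    (ME, SOLID + "account", POD_ROOT),
]


def objects(triples: List[Tuple[str, str, str]], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


storage_locations = objects(profile, ME, SP + "storage")
```

With the storage IRI in hand, an application knows where to look for the user’s data without any of that location being hard-coded.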

Then we come to the declaration of two typeIndexes. The publicTypeIndex holds the IRIs of the documents which are publicly readable and the privateTypeIndex holds the reference to the private documents.

If you want to access the public data, all you have to do is follow the link to the .ttl file.
This file then holds your data or links to other files with public data. However, this is the standard implementation. We can fully customize who (or which application) gets access to what. But that is the subject of another blog!

And now in one sentence

In Solid, we describe data using RDF vocabularies in statements. These statements take the form of subject-predicate-object. Each of these components is an IRI, and the object may also be a literal. Because an IRI is a link, everything can be linked across documents. In Solid, we are identified by the WebID, which is the IRI of the profile document, which in turn is the starting point of the references to all our data.

I hope you now have an idea of how Solid data is built and linked to each other. Later, I will explain how to read the data and how security works.

If you are eager for more information about Solid or my internship, be sure to follow us on social media (Instagram, LinkedIn, Facebook, or Twitter). And if you can’t wait for that, don’t hesitate to contact us! By the way, you can also check out Ruben Verborgh’s resources about the semantic web and linked data.
