- Use URIs for things
- Use HTTP URIs
- Make these HTTP URIs dereferenceable, returning useful information about the thing referred to
- Include links to other URIs to allow the discovery of more things.
- “People should use terms from well-known RDF vocabularies such as FOAF, SIOC, SKOS, DOAP, vCard, Dublin Core to make it easier for client applications to process Linked Data”
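As a rough illustration of what "dereferenceable" means in practice, the sketch below builds (but does not send) an HTTP request that asks for an RDF representation of a thing through content negotiation. The helper name `build_linked_data_request`, the URI `http://my.url/jane`, and the choice of `text/turtle` as the media type are assumptions made for this example; a real server may offer different formats or none at all.

```python
from urllib.request import Request


def build_linked_data_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build, without sending, a GET request that asks for an RDF representation."""
    # The principles call for HTTP URIs, so reject anything else up front.
    if not uri.startswith(("http://", "https://")):
        raise ValueError(f"not an HTTP URI: {uri!r}")
    # The Accept header tells the server which serialisation the client prefers.
    return Request(uri, headers={"Accept": media_type})


# Hypothetical URI, chosen only for illustration.
request = build_linked_data_request("http://my.url/jane")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual lookup; whether anything useful comes back depends entirely on the publisher.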
@prefix ex: <http://my.url/> .
ex:jane ex:likes ex:chocolate .
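To make the role of the prefix concrete, here is a minimal sketch of how a client might expand the prefixed names in the statement above into full HTTP URIs. The `expand` helper and the `PREFIXES` table are hypothetical names used only for this example, not part of any RDF library.

```python
# Prefix table: maps a short prefix to the namespace URI it abbreviates.
PREFIXES = {"ex": "http://my.url/"}


def expand(prefixed_name: str, prefixes: dict) -> str:
    """Expand a prefixed name such as 'ex:jane' into a full URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + local_name


# A statement is a (subject, predicate, object) triple of URIs.
triple = tuple(expand(term, PREFIXES) for term in ("ex:jane", "ex:likes", "ex:chocolate"))
print(triple)
```

Every term in the triple becomes a URI that could, in principle, be looked up over HTTP.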
The dependency tree of the open annotations ontology — typical of a linked data vocabulary
The Linked Open Data Cloud is the great achievement of the linked data movement: it shows a vast array of interconnected datasets, incorporating many of the most important and well-known resources on the internet, such as Wikipedia. On closer examination, however, it is far less impressive and useful than it might appear. Because these datasets lack well-defined common semantics, the links are little more than links: computers cannot interpret them automatically.
What’s worse, where semantic terms are used, they are very often used erroneously. For example, it is commonplace to use owl:sameAs to link instances in different datasets that are about the same real-world thing, and likewise to use owl:equivalentClass to link classes that are about the same real-world thing. These terms are well defined, but their defined meanings are quite different from how they are actually used: owl:sameAs asserts that two identifiers denote the very same individual, and owl:equivalentClass asserts that two classes have exactly the same members, so in either case the two can be unified. If we follow the correct interpretation, everything blows up, because the linked instances and classes are almost never in fact logically equivalent.
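A small sketch can show why the strict reading causes trouble. Suppose two datasets each describe "Paris", one meaning the city proper and the other the metropolitan area, and someone links them with owl:sameAs. The identifiers `a:Paris` and `b:Paris`, the property `ex:population`, and the figures are all made up for illustration, and the code assumes `ex:population` is meant to have a single value per individual. Merging the two identifiers, as the formal semantics licenses, yields a contradiction.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

# Hypothetical triples from two datasets; the numbers are illustrative only.
triples = [
    ("a:Paris", "ex:population", 2_100_000),   # dataset A: the city proper
    ("b:Paris", "ex:population", 12_000_000),  # dataset B: the metropolitan area
    ("a:Paris", SAME_AS, "b:Paris"),           # the careless link
]


def merge_same_as(triples):
    """Unify identifiers joined by owl:sameAs and pool their property values."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # First pass: every sameAs link merges two identifiers into one individual.
    for subject, predicate, obj in triples:
        if predicate == SAME_AS:
            parent[find(subject)] = find(obj)

    # Second pass: collect property values per merged individual.
    merged = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate != SAME_AS:
            merged[(find(subject), predicate)].add(obj)
    return merged


merged = merge_same_as(triples)
# A property assumed to be single-valued now carries two different values.
conflicts = {key: values for key, values in merged.items() if len(values) > 1}
print(conflicts)
```

Under the loose, human reading ("these are about the same thing") nothing is wrong here; under the logical reading the merged individual has two incompatible populations.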
This is not to say that such links are useless. It is manifestly useful to a human engaged in a data-integration task to know which data structures are about the same real-world thing. But that falls far short of the vision of the semantic web, which is supposed to be a web of knowledge for machines, not humans.
The fundamentally decentralised conception of linked data carries further challenges. It takes resources to create, curate and maintain high-quality datasets, and if we don’t continue to feed a dataset with resources, entropy quickly takes over. If we want to build an application that relies upon a third-party linked dataset, we are relying upon whoever publishes that dataset to continue to feed it with resources. This is a risky bet in general, and particularly risky when we are dealing with academic research, in which almost all resources are focused on novelty and almost none on infrastructure and maintenance. For example, the WordNet linked dataset has been extensively used in commercial applications, but the dataset maintainers at the University of Southampton have long since run out of resources to maintain it. The semantic web, as illustrated in the diagram above, is predictably littered with abandoned ontologies and datasets. Simply continuing to build linked datasets without considering the resourcing problem is building on foundations of sand.
In summary, the linked data movement as it exists suffers from two critical problems which continue to prevent adoption outside of academic research. Firstly, the specifications are too weak to enable automatic processing by machines in any meaningful way. Secondly, it creates dependencies on third parties without any sustainable economic model. Even in a perfect world without free-riders, there are many situations in which a third party might be getting much more value from a dataset than the dataset maintainer who bears the cost.
If we actually want to make linked data and the semantic web work, we need to radically reduce the ambition of the movement. Firstly, we should only create links between datasets that have well-defined, sound and mutually compatible machine-readable semantics. Secondly, we should resource the maintenance of high-quality datasets as a public service, rather than leave them subject to the novelty-loving whims of academic research.