Open Standards
KBpedia employs open standards and best practices in order to:
1) obtain the most accurate results; and 2) facilitate interoperability
with external data and systems. Our open standards are mostly based on
those from the World Wide Web Consortium (W3C), which develops and maintains
core standards for the Web, including those for Web pages and Web protocols.
Specific W3C standards used by KBpedia include:
-
Resource Description Framework (RDF, v 1.1) is the basic
data model and language for the semantic Web. A statement, which is
also an assertion, is composed as a triple of subject-predicate-object
(or s-p-o). As the standard states, "The
abstract syntax has two key data structures: RDF graphs are sets of
subject-predicate-object triples, where the elements may be IRIs, blank
nodes, or datatyped literals. They are used to express descriptions of
resources. RDF datasets are used to organize collections of RDF graphs,
and comprise a default graph and zero or more named graphs." RDF gives
us the basic scaffolding for knowledge graphs and the description of
resources
-
RDF Schema (RDFS, v 1.1) is a
data modeling vocabulary extension to RDF that gives us the constructs
for defining classes and instances, property domains and ranges, and
the subsumption hierarchy capabilities so essential to the basic logic
of knowledge graphs
-
Web Ontology Language (OWL 2, so designated because it is the second
version) is the fullest
language specification for our knowledge graphs. It provides a complete
vocabulary and grammar for constructing knowledge graphs that are
decidable and testable using description
logics. Our implementations build on RDF and RDFS, and supplement
the vocabulary with SKOS
-
Simple Knowledge Organization System (SKOS) provides a basic
vocabulary for knowledge organization systems, such as thesauri,
taxonomies, classification schemes and subject heading systems, and a
richer pool of label and annotation primitives. All of these are useful
when integrating across multiple knowledge bases and schemas
-
SPARQL (pronounced "sparkle", a recursive acronym for SPARQL Protocol
and RDF Query Language) is a set of specifications that provide a query
language and protocols to retrieve and manipulate RDF graph
content. SPARQL is typically accessed via a Web endpoint to a triple
store knowledge base. We also may use the SPARUL
extension (since standardized as SPARQL 1.1 Update) to enable the RDF
store to be updated with INSERT and DELETE operations
-
Semantic Web Rule Language (SWRL), though only a W3C submission, is
nonetheless commonly used as an extension to OWL to provide if-then rule
statements. SWRL includes a high-level abstract syntax for Horn-like
rules in OWL.
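The triple model, RDFS subsumption, and pattern-based retrieval described in the list above can be illustrated with a small sketch. This is plain Python rather than a real RDF library, and it is not KBpedia's implementation. The `ex:` identifiers, the sample data, and the function names `match`, `superclasses`, and `inferred_types` are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Set, Tuple

# A triple is (subject, predicate, object), mirroring the s-p-o model.
Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example graph; these statements are not taken from KBpedia.
GRAPH: Set[Triple] = {
    ("ex:Mammal", RDFS_SUBCLASS_OF, "ex:Animal"),
    ("ex:Dog", RDFS_SUBCLASS_OF, "ex:Mammal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard,
    loosely analogous to a variable in a SPARQL triple pattern."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def superclasses(graph: Set[Triple], cls: str) -> Set[str]:
    """Collect every class reachable from cls by following
    rdfs:subClassOf links transitively (the subsumption hierarchy)."""
    found: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for _, _, parent in match(graph, s=current, p=RDFS_SUBCLASS_OF):
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


def inferred_types(graph: Set[Triple], instance: str) -> Set[str]:
    """Return the asserted types of an instance plus all their superclasses."""
    types: Set[str] = set()
    for _, _, cls in match(graph, s=instance, p=RDF_TYPE):
        types.add(cls)
        types |= superclasses(graph, cls)
    return types


if __name__ == "__main__":
    print(sorted(inferred_types(GRAPH, "ex:rex")))
```

Although only `ex:rex rdf:type ex:Dog` is asserted, walking the `rdfs:subClassOf` links also yields `ex:Mammal` and `ex:Animal`. A production system would normally delegate this kind of inference to a triple store or reasoner rather than hand-written traversal.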
Other standards, such as HTML, are also used where appropriate. We also
employ many standard open source libraries and tools, most prominently the
Protégé ontology IDE, the OWL API, and the Lucene search engine.
In the use of these standards, we apply best practices, many of
which we have developed through our client work. These
include the use of semsets for capturing the multiple labels that might
be applied to a given thing; methods to construct and manage ontologies (also
known as knowledge graphs); support for multi-lingual capabilities; and
build and management workflows.
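As a rough illustration of the semset idea, the sketch below maps each concept to a set of labels and builds a reverse index from label to concept. The data layout (a preferred label plus alternative labels, loosely modeled on SKOS-style labeling), the `ex:` identifiers, and the function names are assumptions made for this example, not KBpedia's actual structures.

```python
from collections import defaultdict
from typing import Dict, Set

# Hypothetical semsets: each concept has one preferred label and a set of
# alternative labels. The concepts and labels are invented for illustration.
SEMSETS = {
    "ex:Automobile": {"pref": "automobile", "alt": {"car", "auto", "motorcar"}},
    "ex:Aircraft": {"pref": "aircraft", "alt": {"airplane", "plane"}},
}


def build_label_index(semsets: dict) -> Dict[str, Set[str]]:
    """Map every label (case-folded) to the set of concepts that use it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for concept, labels in semsets.items():
        for label in {labels["pref"], *labels["alt"]}:
            index[label.casefold()].add(concept)
    return index


def lookup(index: Dict[str, Set[str]], term: str) -> Set[str]:
    """Return the concepts whose semset contains the given term."""
    return index.get(term.casefold().strip(), set())


if __name__ == "__main__":
    label_index = build_label_index(SEMSETS)
    print(lookup(label_index, "Car"))
```

Because each label maps to a set of concepts, the same index can also surface ambiguous terms that appear in more than one semset.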
Most supporting KBpedia code is written in Clojure, in part due to its ability to run on
the Java virtual machine (JVM).