A blog about ideas relating to philoinformatics (or at least that have something to do with computer science or philosophy)

Wednesday, December 1, 2010

Bridging Ontologies: The Key to Scalable Ontology Design

It's been years since I've created an ontology (in the computing/informatics sense) but I'm going to give some advice on creating them anyways. When creating an ontology, it can be helpful to connect it up to other related ontologies. In fact, I think this is a requirement for building the semantic web (taking 'ontology' in a broad sense). You may want to ground your ontology (i.e. connect it to more generic or foundational ontologies; towards upper ontologies) or connect it to well known ontologies increasing the potential usefulness and adoption of your ontology. Whatever the reason, there are benefits in doing so if you want the data that your ontology schematizes to be more easily and automatically reusable. The potential downside is that you are forcing your users to endorse the ontology you're connecting to. So how exactly should one connect their ontology into the ontology ecosystem?

Most ontologies out there seem to me to be part of a stack of ontologies built by a single group of people. The ontologies tend to build directly on top of each other, meaning "lower" ontologies directly reference "upper" ontologies. Since the ontologies are developed by a single organization, it seems to make sense to directly connect to them because the organization (arguably) knows exactly what they are attempting to represent or what they mean. The fact that organizations tend to keep their ontologies rather isolated may be caused by a fear to commit to ontologies they didn't create.

The way ontologies are (or at least should be) developed allows the possibility for changes and updates. To accommodate this, one should develop ontologies with versioning. This way, someone using your ontology won't ever have it change on them and the developers can still maintain and change the ontology by introducing new versions. It's as simple as adding a version number to your ontology's url.

But this brings up the problem we face by directly referencing other ontologies. Let's imagine you have an ontology X that makes reference to another ontology Y and that ontology Y has a newer version available. You're planning on updating a term in X to make reference to essentially the same term but in the newer version of Y to keep X up to date. So you update X to the new version of Y even though it basically hasn't changed its meaning. The role an ontology fulfills is to describe a certain subject or topic and this intrinsic meaning has not changed. Yet you still need to change your ontology. Under these conditions, no matter how much concensus is formed around the accuracy of your ontology, you will never know when it is stable. In fact, this leads to a cascading of updates and changes required by upstream ontologies that reference your ontology, and so on. This is not a distributed web-scale ontology design pattern. We need a way to decouple our ontologies.

So, is there a design pattern we can use to avoid these dangers and burdens of connecting to other ontologies? Can we do better than simply identifying good stable ontologies and directly referencing only those ontologies in our own ontology? Yes!

Introducing: Content vs Bridging Ontologies

The key to scalable ontology design is what I call Bridging Ontologies. You write your intended ontology without referencing other ontologies and then create a separate ontology that is mainly made up of owl:sameAs and rdfs:subClassOf relationships between your terms and the target ontology's terms. I call these ontologies Content Ontologies and Bridging Ontologies, respectively. You only need to update your Bridging Ontology when either the source or target ontology changes. The nature of a Bridging Ontology makes it useless for anyone to reference in their own ontologies, which stops any potential cascade of changes throughout the web of ontologies. Of course users would still need to use the Bridging Ontologies and would likely need to collapse/deflate the owl:sameAs relationships into single terms for most visualizing, processing, or reasoning purposes.

I'll go out on a limb here and say that every ontology anyone creates should be isolated in this manner. The vision then becomes a web of ontologies of small Content Ontology nodes that satisfy specific "semantic roles" and then Bridging Ontology edges definied between these Content Ontologies. Since you don't need to adopt all of the Bridging Ontologies that are built for a Content Ontology, it is much easier to reach concensus on the Content Ontologies and then to pick and choose your Bridging Ontologies, choosing to commit (or not) to exactly how that content fits into the big picture. This allows for decoupled semantics rather than traditional inflexible semantics.

No comments: