Artilium Blog

Tuesday, June 02, 2009

Behavioural Processing and the Semantic Web

The dream of the Semantic Web is to have machines which can answer arbitrary questions. This is based on Sir Tim Berners-Lee’s vision of the Web as a universal medium for knowledge exchange. One of the applications usually proposed for the Semantic Web is an intelligent search engine – which provides a UI where you can ask questions, such as “When did Elvis Presley get married?” and it would answer “1st May 1967”.

One of the key advantages of the Semantic Web concept is that data is organised in a way that can be read and, more importantly, understood by other machines, so that it can be used for a multitude of new purposes. Conventional Web concepts, by contrast, organise and present data for human consumption, and the machines reading the data generally have no understanding of the meaning of the content or its relationship to other content.

An important aspect of the Semantic Web is how data is held as objects and how relationships between objects are defined. The Resource Description Framework (RDF) has been developed to allow these relationships to be described in a format that can be used by machines. An RDF statement will normally contain a subject, a verb (which RDF calls a predicate) and an object, similar to natural language. This is called a triple. Multiple objects and verbs can be related to the same subject. For example:

<#EmmaHays> <#Likes> <#RobbieWilliams>, <#Oasis>;
<#Age> <#28>.

These RDF statements tell us that Emma Hays likes Robbie Williams, that she also likes Oasis and that she is aged 28. In order to make sense of the data, the machine reading it needs to have knowledge of the subject (Emma Hays) and also understand the meaning of the verbs and objects used. Vocabularies, such as the Dublin Core, have been developed so that words can have a specific, defined meaning without ambiguity.
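As a rough illustration of how a machine might hold and query the triples above, here is a minimal Python sketch. It is not a real RDF library: the `triples` set and the `objects` helper are hypothetical names invented for this example, and the age is stored as a plain integer for simplicity.

```python
# Hypothetical in-memory triple store: each fact is a
# (subject, predicate, object) tuple, mirroring the example above.
triples = {
    ("EmmaHays", "Likes", "RobbieWilliams"),
    ("EmmaHays", "Likes", "Oasis"),
    ("EmmaHays", "Age", 28),  # assumption: age held as an integer
}


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


# What does Emma like?
print(sorted(objects("EmmaHays", "Likes")))  # ['Oasis', 'RobbieWilliams']
```

Because every fact has the same three-part shape, the same helper answers questions about any subject and any verb without knowing in advance what kinds of facts are stored.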

The 3-D behavioural processing used in Artilium’s ARTA Mobile Applications Platform uses RDF to store useful information about subscribers. The subscriber is the subject (who), and for each subject we store useful information for delivering relevant services. The verbs relate to our three dimensions (describing where, when and what). We have found it convenient to store the data relevant to a subscriber in two categories: live data and persistent data. The live data relates to where a subscriber is now and what they are currently doing. The persistent data is information that is relevant to the subscriber irrespective of their current location and context. For example, the fact that Emma Hays likes Oasis is persistent data, whilst the fact that she is currently travelling by train would be live data. In the RDF databases the persistent data changes slowly, whereas the live data may change by the minute.
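The split between live and persistent data can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the ARTA implementation: the `SubscriberProfile` class, its method names and the `TravellingBy` verb are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriberProfile:
    """Hypothetical per-subscriber store with two categories of data."""
    subject: str
    # Persistent facts accumulate and change slowly.
    persistent: set = field(default_factory=set)
    # Live facts hold one current value per verb and are overwritten often.
    live: dict = field(default_factory=dict)

    def add_persistent(self, predicate, obj):
        self.persistent.add((predicate, obj))

    def set_live(self, predicate, obj):
        # A new live value replaces the previous one for the same verb.
        self.live[predicate] = obj

    def triples(self):
        """Return all current facts as (subject, predicate, object) tuples."""
        facts = {(self.subject, p, o) for p, o in self.persistent}
        facts |= {(self.subject, p, o) for p, o in self.live.items()}
        return facts


emma = SubscriberProfile("EmmaHays")
emma.add_persistent("Likes", "Oasis")
emma.set_live("TravellingBy", "Train")
emma.set_live("TravellingBy", "Bus")  # live data changes; "Train" is replaced
```

Keeping live values in a dictionary keyed by verb means an update simply overwrites the old value, while persistent facts are only ever added to unless explicitly removed.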

Because of the way RDF information is related and organised, the information can be used for purposes that it may not have been designed for. For example, if Emma’s live location is stored as “Rosyth Rail Station”, we could still answer a query such as “Is Emma in Scotland?” The linked RDF data should already contain the knowledge that Scotland is a country, that Rosyth Rail Station is in the town of Rosyth, that Rosyth is in the region called Fife, and that Fife belongs to the country Scotland. In theory we should be able to ask the database any question about an individual subscriber or about groups of subscribers and get a useful response.
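The Rosyth example amounts to following a chain of "is located in" relationships until the target region is reached or the chain runs out. Here is a minimal Python sketch of that idea; the `located_in` and `live_location` tables and the `is_in` function are hypothetical names used only for illustration, and a real system would obtain these facts from its RDF data.

```python
# Hypothetical containment facts: each place maps to the place that contains it.
located_in = {
    "RosythRailStation": "Rosyth",
    "Rosyth": "Fife",
    "Fife": "Scotland",
}

# Hypothetical live data: where each subscriber is right now.
live_location = {"EmmaHays": "RosythRailStation"}


def is_in(subscriber, region):
    """Walk up the containment chain from the subscriber's live location."""
    place = live_location.get(subscriber)
    seen = set()  # guards against accidental cycles in the data
    while place is not None and place not in seen:
        if place == region:
            return True
        seen.add(place)
        place = located_in.get(place)
    return False


print(is_in("EmmaHays", "Scotland"))  # True
print(is_in("EmmaHays", "Wales"))     # False
```

Nothing in the stored live location mentions Scotland; the answer comes entirely from combining it with the containment facts, which is the kind of unplanned reuse described above.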

The Semantic Web is not just about building a better search engine that can answer questions intelligently. From our perspective it is about personalisation of applications, optimising the delivery of relevant services and understanding a subscriber’s context more clearly.

Posted on 06/02 at 07:39 AM
