Software Scientific: the competitive edge of intelligence.

AI Database Systems


If you put tomfoolery into a computer, nothing comes out but tomfoolery. But this tomfoolery, having passed through a very expensive machine, is somehow ennobled and no-one dares criticise it.
Pierre Gallois

Introduction

Relational (and other) databases are 'aware' of the contents of structured fields (such as price, name, or age), but they are unable to exploit the information in fields containing free text of a descriptive and unstructured nature.

Database information is 'structured upon input' - that is to say, the information is structured before or during data entry. If a different structure later turns out to be better, considerable work must be done to change it. The Concept Engine takes the opposite approach - 'structure on retrieval'. No structure is imposed until a request for information is made. Both approaches are valid and have their place, but it should be possible to construct one system which uses each approach as appropriate. Quite simply, this would be much more powerful: a database with the Concept Engine integrated into it would gain the ability to actively handle and manipulate free text.

SQL Extensions

Furthermore, the SQL (Structured Query Language) of a relational database would very naturally extend to include Concept Engine queries - no change in the SQL grammar is required. To do this we need postulate only a few extensions to SQL:

RELEVANCE(text, query) - Evaluates the relevance of some text to a natural language query. The text could be very long indeed - a complete document. RELEVANCE could even be parallel multi-lingual, so the text and the query need not be in the same language!
SUMMARY(text, n) - Returns a summary of the text, at most n sentences long.
ABSTRACT(text, query, n) - Returns an abstract of the text, biased towards the query and at most n sentences long.
CLASSIFY(text, profile) - Returns the category to which the text belongs, based upon a profile generated earlier.

It should be stressed that all these extensions could be made using existing Concept Engine technology. In the following examples it is not necessary to completely understand the SQL to see how well the Concept Engine functionality "drops in" to SQL.
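One way to see that no grammar change is needed: many SQL engines already let an application register its own functions. The sketch below uses Python's standard sqlite3 module and its create_function call to add a RELEVANCE function to SQLite. The scoring is a deliberately crude stand-in (the percentage of query words that occur in the text), not the Concept Engine, and the table rows, the titles and the 0-100 scale are all invented for illustration.

```python
import re
import sqlite3


def words(text):
    """Lower-case word set of a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevance(text, query):
    """Toy stand-in for RELEVANCE: percentage of query words found in text."""
    query_words = words(query)
    if not query_words:
        return 0.0
    return 100.0 * len(query_words & words(text)) / len(query_words)


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name RELEVANCE (2 arguments).
conn.create_function("RELEVANCE", 2, relevance)

# Invented sample data, mirroring the music table used in the examples.
conn.execute("CREATE TABLE music (title TEXT, mood TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO music VALUES (?, ?, ?)",
    [
        ("Title A", "PEACEFUL", "Tranquil organ music for quiet evenings"),
        ("Title B", "LIVELY", "Fast brass fanfare with loud percussion"),
    ],
)

# Ordinary SQL: the only new element is the function name.
rows = conn.execute(
    "SELECT title FROM music "
    "WHERE RELEVANCE(description, 'tranquil organ music') > 50"
).fetchall()
titles = [title for (title,) in rows]
print(titles)
```

The query text is parsed by an unmodified SQL parser; only the set of available functions has grown, which is the sense in which the extensions "drop in".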

Retrieving by Relevance

Consider the relational database query:

  SELECT title, description
    FROM music
   WHERE mood = "PEACEFUL"

This will retrieve the titles of all music whose mood has been described as peaceful - presumably by the original data entry operator. The problem is, of course, that one person's idea of peaceful is not the same as another's, so the music we want may be missed.

Using a Concept-enabled query one could retrieve those compositions whose free-text descriptions best matched our requirements:

  SELECT title, description
    FROM music
   WHERE RELEVANCE(description,
                   "Peaceful or tranquil orchestral or organ music") > 50%

This would retrieve music suitable for our purposes, using the original music description. (Naturally this is a subjective query - but subjective queries are exactly what the Concept Engine is designed to deal with. It uses our knowledge base, so it shares our world-view.)

Ranking by Relevance

As an alternative to the above query we could rank rather than select by relevance:

  SELECT title, description
    FROM music
   WHERE mood = "PEACEFUL"
ORDER BY RELEVANCE(description,
                   "Peaceful or tranquil orchestral or organ music") DESC

This might be a better query, as the system would only need to evaluate the relevance of those compositions whose mood is described as "peaceful".
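The saving can be made visible with a small experiment. The sketch below again uses Python's sqlite3 module with a toy word-overlap function standing in for RELEVANCE; it counts how often the function is called when the structured mood filter is applied first. The six rows and their titles are invented, and the behaviour shown is that of SQLite, so other engines may plan the query differently.

```python
import sqlite3

calls = 0  # number of times the relevance function has been evaluated


def relevance(text, query):
    """Toy stand-in for RELEVANCE that also counts its own invocations."""
    global calls
    calls += 1
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return 100.0 * len(query_words & text_words) / max(len(query_words), 1)


conn = sqlite3.connect(":memory:")
conn.create_function("RELEVANCE", 2, relevance)
conn.execute("CREATE TABLE music (title TEXT, mood TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO music VALUES (?, ?, ?)",
    [
        ("Title A", "PEACEFUL", "slow tranquil organ music"),
        ("Title B", "PEACEFUL", "gentle piano music"),
        ("Title C", "LIVELY", "fast brass fanfare"),
        ("Title D", "LIVELY", "energetic dance music"),
        ("Title E", "SOMBRE", "dark slow strings"),
        ("Title F", "SOMBRE", "mournful organ music"),
    ],
)

# The cheap structured test narrows the rows; only survivors are ranked.
ranked = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM music "
        "WHERE mood = 'PEACEFUL' "
        "ORDER BY RELEVANCE(description, 'tranquil organ music') DESC"
    )
]
print(ranked, "relevance evaluations:", calls)
```

Rows rejected by the mood test never reach the relevance function, so the expensive free-text evaluation is paid only for the compositions that survive the structured filter.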

Ranking by Similarity

We could even rank the compositions by similarity to music we already liked (using a sub-select):

  SELECT title, description
    FROM music
   WHERE mood = "PEACEFUL"
ORDER BY RELEVANCE(description, (
            SELECT  description
              FROM  music
             WHERE  title = "ALLEGRI MISERERE")) DESC

The purpose of this complex query is to show just how naturally the functions of the Concept Engine fit into the SQL paradigm. In this case the Concept Engine would automatically generate, on the user's behalf, a natural language query which described the Allegri, and use that for the ranking.
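A scalar sub-select used as a function argument is already legal in common SQL engines, so this form of query can be tried today. The sketch below runs it in SQLite from Python. The similarity measure is a toy (word-set overlap between the two descriptions) rather than the generated natural language query described above, and every description, including the one given for the Allegri, is invented for illustration.

```python
import re
import sqlite3


def words(text):
    """Lower-case word set of a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevance(text, query):
    """Toy stand-in for RELEVANCE: word-set overlap scaled to 0-100."""
    if text is None or query is None:  # e.g. the sub-select found no row
        return 0.0
    a, b = words(text), words(query)
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)


conn = sqlite3.connect(":memory:")
conn.create_function("RELEVANCE", 2, relevance)
conn.execute("CREATE TABLE music (title TEXT, mood TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO music VALUES (?, ?, ?)",
    [
        # Invented descriptions, for illustration only.
        ("ALLEGRI MISERERE", "PEACEFUL",
         "Unaccompanied choral setting, slow and serene"),
        ("Title B", "PEACEFUL", "Slow serene choral music with organ"),
        ("Title C", "PEACEFUL", "Bright lively dance for strings"),
    ],
)

# The sub-select supplies the reference description as the second argument.
ranked = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM music "
        "WHERE mood = 'PEACEFUL' "
        "ORDER BY RELEVANCE(description, ("
        "    SELECT description FROM music "
        "    WHERE title = 'ALLEGRI MISERERE')) DESC"
    )
]
print(ranked)
```

The reference piece itself naturally ranks first, since it is identical to its own description; a real query might exclude it with an extra condition in the WHERE clause.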

Retrieving Abstracts and Summaries

Suppose, however, that the description is rather long - many pages, in fact. The database combined with the Concept Engine could then simply return a summary of the description:

  SELECT title, SUMMARY(description, 3)
    FROM music
   WHERE mood = "PEACEFUL"
ORDER BY RELEVANCE(description,
                   "Peaceful or tranquil orchestral or organ music") DESC

Or we could use an abstract, biased by our original query, so that we can see how well each composition meets our requirements:

  SELECT title, ABSTRACT(description,
                         "Peaceful or tranquil orchestral or organ music", 3)
    FROM music
   WHERE mood = "PEACEFUL"
ORDER BY RELEVANCE(description,
                   "Peaceful or tranquil orchestral or organ music") DESC

(In both cases the number '3' indicates the maximum length of abstract or summary required, measured in sentences.)
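To make the difference between the two functions concrete, here is a minimal sketch with toy stand-ins registered in SQLite from Python. It assumes that SUMMARY simply keeps the first n sentences and that ABSTRACT keeps the n sentences sharing most words with the query. Both rules are illustrative guesses, far simpler than anything the Concept Engine would do, and the sample description is invented.

```python
import re
import sqlite3


def sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def summary(text, n):
    """Toy stand-in for SUMMARY: the first n sentences."""
    return " ".join(sentences(text)[:n])


def abstract(text, query, n):
    """Toy stand-in for ABSTRACT: the n sentences that share most words
    with the query, returned in their original order."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    parts = sentences(text)

    def overlap(i):
        return len(query_words & set(re.findall(r"[a-z]+", parts[i].lower())))

    best = sorted(range(len(parts)), key=overlap, reverse=True)[:n]
    return " ".join(parts[i] for i in sorted(best))


conn = sqlite3.connect(":memory:")
conn.create_function("SUMMARY", 2, summary)
conn.create_function("ABSTRACT", 3, abstract)
conn.execute("CREATE TABLE music (title TEXT, mood TEXT, description TEXT)")
conn.execute(
    "INSERT INTO music VALUES (?, ?, ?)",
    (
        "Title A",
        "PEACEFUL",
        "Recorded in a large cathedral. The organ plays a tranquil melody. "
        "Strings enter later. The finale is loud.",
    ),
)

row = conn.execute(
    "SELECT title, SUMMARY(description, 2), "
    "       ABSTRACT(description, 'tranquil organ', 1) "
    "FROM music"
).fetchone()
print(row)
```

The summary is the same whatever the user asked for, whereas the abstract changes with the query, which is why the abstract is the better guide to how well a composition meets a particular requirement.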

Data Entry

Data entry could be much faster and more consistent if CLASSIFY were used to categorise documents, as shown in this example.

   INSERT INTO music (title, mood, description)
   VALUES (:title, CLASSIFY(:description, profile), :description)

In this case CLASSIFY uses the profile to work out the mood category to which this piece belongs.

Data Validation

Assuming that PROFILE had generated a profile from the database, CLASSIFY could also be used to check for consistency. This query would retrieve every entry whose manual classification disagrees with the category CLASSIFY derives from its description:

  SELECT title, mood
    FROM music
   WHERE mood <> CLASSIFY(description, profile)
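Both uses of CLASSIFY, at data entry and at validation, can be sketched with the same toy function. In the Python/SQLite sketch below the profile is assumed to be a hand-written table of characteristic words per mood, looked up by the name 'mood'; that representation, the keyword lists and the sample rows are all hypothetical, and a real profile would be generated from the database rather than typed in.

```python
import re
import sqlite3

# Hypothetical profile store: profile name -> category -> characteristic words.
PROFILES = {
    "mood": {
        "PEACEFUL": {"tranquil", "gentle", "calm", "serene"},
        "LIVELY": {"fast", "energetic", "bright", "dance"},
    }
}


def classify(text, profile_name):
    """Toy stand-in for CLASSIFY: the category whose words overlap text most."""
    text_words = set(re.findall(r"[a-z]+", text.lower()))
    profile = PROFILES[profile_name]
    return max(profile, key=lambda category: len(profile[category] & text_words))


conn = sqlite3.connect(":memory:")
conn.create_function("CLASSIFY", 2, classify)
conn.execute("CREATE TABLE music (title TEXT, mood TEXT, description TEXT)")

# Data entry: the mood column is filled in by CLASSIFY, not by the operator.
new_title = "Title A"
new_description = "A calm and serene piece for organ"
conn.execute(
    "INSERT INTO music (title, mood, description) "
    "VALUES (?, CLASSIFY(?, 'mood'), ?)",
    (new_title, new_description, new_description),
)

# A manually classified entry whose mood contradicts its description.
conn.execute(
    "INSERT INTO music VALUES (?, ?, ?)",
    ("Title B", "PEACEFUL", "Fast energetic dance with bright brass"),
)

# Data validation: list entries whose stored mood disagrees with CLASSIFY.
inconsistent = conn.execute(
    "SELECT title, mood FROM music "
    "WHERE mood <> CLASSIFY(description, 'mood')"
).fetchall()
print(inconsistent)
```

Entries classified automatically at data entry agree with the profile by construction, so the validation query reports only the manually classified ones that disagree with it.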