XAM Query Language

As well as providing vendor-independent means of creating, retrieving, modifying and deleting XSets, the SNIA XAM v1.0 specification also defines a query language (XAM QL), based on a subset of the SQL language, for selecting and retrieving the XUIDs of XSets based on content-defined criteria.

The set of reserved words for this query language is quite small: select, where, and, or, not, like, exists, binding, readonly, typeof, length, date, TRUE, FALSE, before, after, contains, and within. By design, XAM queries look like an SQL select statement. The query language is case insensitive and uses the ASCII character set.

Here is an example of a simple XAM query:

select ".xset.xuid" where "com.example.name" = ’Tuckers Plantation'


Version 1.0 of the XAM specification defines two levels of query language support, i.e. Level 1 and Level 2. Level 1 defines queries on properties and field attributes in XSets and is mandatory. Any XSet property value that is accessible to an application program via the XAM library can be queried. Level 2 extends Level 1 to support queries on XStreams and is optional. Both levels of query are accessed through a single, defined job type that all XAM providers must support. Since no vendor that I am aware of has actually implemented level 2 queries, the remainder of this post focuses on level 1 queries.

A XAM query statement consists of a mandatory select clause followed by an optional where clause. For XAM v1.0 the only valid select clause is select '.xset.xuid'. This specifies that the application is requesting a list of XUID values. For example

select ".xset.xuid "


will return a list of every XSet that is readable at the time of the query.

The where clause is used to specify a subset of XSets to be matched. For level 1 queries it is restricted to comparisons between XSet properties and literal values and/or field attributes and literal values.

select ".xset.xuid" where ".xam.time.xuid" > date(’2009-02-01T00:00:00.0’)


will return the list of all Xsets which were created on or after February 1st 2009.

The following table shows which field and literal types can be validly compared.
[table id=14/]
The XAM library validates that strings and strings liberals are conforming UTF-8 strings. Non-conforming UTF-8 string literals generate a XAM non-fatal query syntax error. Issues such as single versus multiple glyph characters and non printable characters are unspecified and are implementation and application dependent. String comparisons are case sensitive and comparison operators operate on a byte-by-byte basis. For relational operators the relationships are defined by the byte values. All data normalization is the responsibility of the application.

The supported escape sequences are the following. Any other escape sequence will generate an non-fatal error.
[table id=15/]
String literals must be quoted with single quotes. String literals which contain one or more single quote characters must escape each single quote character using a backslash. For example, the string literal of ’This is a string literal’ would be represented in a query as

select ".xset.xuid" where "com.example.string-property" = ’This is a string literal’


To use a single quote in a string, guard it with a backslash

select ".xset.xuid" where "com.example.claimed.ownership" = ’Tucker\’s’


All instances of field names in a XAM query string must be quoted with double quotes. If a field name contains double quote characters, each double quote character must be escaped using a backslash. For example, the field name for the xam_boolean property com.example.”qstring” should be represented in a query as

select ".xam.xuid" where "com.example.\"qstring\"" = TRUE


If a field name contains a backslash character, the backslash itself must be escaped with another backslash. For example, the field name for the xam_double property com.example.file\ratio should be represented in a query as

select ".xam.xuid" where "com.example.file\\ratio" = 100.1


The query language accepts date-time and XUID-valued literals using the selector functions date() and xuid() respectively. The date() function takes a properly formed date-time value, specified as a literal string which is consistent with the XAM date-time specification.

select ".xset.xuid" where ".xam.time.xuid" = date(’2009-06-01T00:00:00.0’)


The xuid() function expects an XUID which is in the form of a base64 encoded string literal. An improperly formed string literal generates a non-fatal error during query parsing.

Field attributes are accessed using the exists, typeof, readonly, binding, and length field attribute accessor functions. The exists() function tests for the existence of an named field attribute (property) in an XSet. It evaluates to TRUE if an XSet contains the named field otherwise it evaluates FALSE.

select ".xset.xuid" where exists("com.example.name")


The typeof() function returns the MIME type of a named field in an XSet which is a string-valued property.

select ".xset.xuid" where typeof("com.example.data") = ’text/plain’
select ".xset.xuid" where typeof("com.example.data") like ’text%’


This function can be used whenever an application could use a field reference to a string-valued property. Note that comparisons with any non-string literal value generate a non-fatal error during the parsing of the query.

The readonly() function evaluates to TRUE when a field in an XSet is marked as readonly.

select ".xset.xuid" where readonly("com.example.flag")
select ".xset.xuid" where not readonly("com.example.name")


The binding() function evaluates to TRUE when a field in an XSet is marked as binding.

select ".xset.xuid" where binding("com.example.case_id")
select ".xset.xuid" where not binding("com.example.subject")


The length() function returns the length, in bytes, of a named field.

select ".xset.xuid" where length("com.example.data") > 1024


When used on property fields this function returns the length as defined for stypes. Note that this function should not be used to compare the number of characters in a string as this comparison depends on the character encoding being used.

Turning now to logical operators. Subclauses within the where clause may be combined and modified by using the logical operators and, or and not. These operators are similar to their SQL counterparts.

The and operator requires both subclauses to evaluate to TRUE before including an XSet in the results. For example

select ".xset.xuid" where typeof("com.example.stream") = ’image/gif’ and length("com.example.stream") > 4096


selects only those XSets containing GIF images whose size is greater than 4096 bytes. The or operator evaluates to TRUE if either subclause evaluates to TRUE. For example

select ".xset.xuid" where typeof("com.example.stream") = ’image/jpeg’ or typeof("com.example.stream") = ’image/gif’


selects only those XSets containing a named stream of image type JPEG or GIF. The not operator negates a boolean expression. For example

select ".xset.xuid" where not binding("com.example.property")


selects all XSets with the nonbinding property com.example.property.

As in SQL certain operators take precedence over other operators in queries. XAM QL operator precedence is as follows:
[table id=16/]
Operators of the same precedence are evaluated left to right within a query. However operator precedence may be overridden using parentheses as shown by the following example.

select ".xset.xuid" where not "com.example.bool-prop" and  "com.example.int-prop" = 64
select ".xset.xuid" where not ("com.example.bool-prop" and "com.example.int-prop" = 64)


In the first example, the not operator only applies to the com.example.bool-prop property. In the second example, the not operator applies to both com.example.bool-prop and com.example.int-prop = 42.

Now that we are somewhat familiar with XAM QL syntax, the next question to be answered is how do we tell a XAM Storage System to execute a query and return the results to us. Well it turns out that a query is executed as a particular type of XAM job called a query job. The input to a query job is an XSet which you must create. This XSet must contain two items - the field org.snia.xam.job.command which is always set to 'xam.job.query' and a XStream xam.job.query.command which is a UTF-8 text stream (MIME type 'text/plain; charset=utf-8’) containing the actual query expression string.

A query job may optionally be committed before or during the execution of the job, but this commit is not required. The ability to commit a running job is optional and implementation dependant. An application can determine if job commit is supported by checking the XSystem boolean property .xsystem.job.commit.supported. Failure to commit a query job before closing the XSet means the results of the query job are not persistently stored.

A query job is executed by invoking submitJob. If either of the two required items in the query job XSet is not correctly filled in, the standard job error field .xam.job.error is added to the query job XSet and it's value is set to either org.snia.xam::not_a_job or org.snia.xam::unspecified_command and the query job is aborted. As shown in the sample application below, the state of a query job may be determined by examining the value of .xam.job.status.

When a query job is successful, the results are stored in the query job XSet in the form of a new XStream named xam.job.query.results. This XStream has a MIME type of application/vnd.snia.xam.query.xuid_list and contains the XUIDs for the set of XSets resulting from the evaluation of the query. Each XUID is stored in a binary format as part of an 80-byte record. If an XUID is shorter than 80 bytes, the record is zero padded to 80 bytes.

A number of other properties relating to the results XStream are also set up. The xam_int property xam.job.query.results.count contains the current count of the XUID records in the results XStream. This property is updated as results are entered into the results XStream during query processing. An application can use this to provide status information to users as the query job executes. The xam_string property xam.job.query.level indicates the query level which matches the results. Its value is either org.snia.xam.job.query.level.1 or org.snia.xam.job.query.level.2.

It is important to remember that a query job operates within the roles and permissions granted to the connection. That means the results XStream only includes those XSets that are visible and accessible to an application, at least from a read perspective, according to the role under which the query job is executed.

The query job may set the following error codes in .xam.job.error:
[table id=17/]
When a query job is halted for whatever reason, the standard job health and status fields, i.e .xam.job.errorhealth and .xam.job.status, are set to appropriate values and the XAM Storage System may place zero or more additional XUIDs in the results XStream. Thus an application should always see complete XUID values in the results XStream. Resuming a halted query job is not supported.

Note that, unlike a traditional RDBMS, there is no locking of an XAM Storage System during the execution of a query job. Thus a query result is not an instantaneous snapshot of a XAM Storage System. The contents of a XAM Storage System may or may not change while a query job is executed. The general rule is that any XSet that has been stored before the start of the query job is included in the results XStream if it meets the query criteria, and any XSet that has been stored after the query job has completed is not be included in the results XStream. XSets that are stored during the query job execution may or may not be included. The same criteria applies to the results XStream. It may include XUIDs of XSets that are no longer in the XAM Storage System and XUIDs of XSets that no longer meet the search criteria due to a change in the XSet after the particular XSet was queried.

Here is a simple Java application that queries an XAM Storage System to determine it's level of support for XAM queries.

import org.snia.xam.XAMException;
import org.snia.xam.XAMLibrary;
import org.snia.xam.XSystem;
import org.snia.xam.toolkit.XAMXUID;
import org.snia.xam.util.XAMLibraryFactory;

public class CheckQuerySupport
{
private static XAMLibrary xamLib;
private static XSystem xSystem;

public static void main(String[] args)
{
String xri = "snia-xam://centera_vim!XXX.XXXX.XXX.XXX?/home/fpm/xam/xamconnect.pea";

long exitCode = 0;
boolean isSupported = false;

try
{
xamLib = XAMLibraryFactory.newXAMLibrary();
System.out.println("Connecting to XSystem: " + xri + "\n");

xSystem = xamLib.connect(xri);

// support job commit?
isSupported = xSystem.getBoolean(XSystem.XAM_XSYSTEM_JOB_COMMIT_SUPPORTED);
System.out.println("Support job commit? " + (isSupported ? "Yes" : "No"));

// support continuance of job?
isSupported = xSystem.getBoolean(XSystem.XAM_XSYSTEM_JOB_QUERY_CONTINUANCE_SUPPORTED);
System.out.println("Support job continuance? " + (isSupported ? "Yes" : "No"));

// support level 1 query?
isSupported = xSystem.getBoolean(XSystem.XAM_XSYSTEM_JOB_QUERY_LEVEL1_SUPPORTED);
System.out.println("Support level 1 query? " + (isSupported ? "Yes" : "No"));

// support level 2 query?
isSupported = xSystem.getBoolean(XSystem.XAM_XSYSTEM_JOB_QUERY_LEVEL2_SUPPORTED);
System.out.println("Support level 2 query? " + (isSupported ? "Yes" : "No"));

xSystem.close();
System.out.println("\nClosed connection to XSystem");

} catch (XAMException xe) {
exitCode = xe.getStatusCode();
System.err.println("XAM Error occured: " + xe.getMessage());
} catch (IllegalArgumentException e) {
System.out.println(e.getMessage());
e.printStackTrace();
exitCode = 1;
}

System.exit((int) exitCode);
}
}

Here is the output from this application when connected to an EMC Centera XAM Storage System

Connecting to XSystem: snia-xam://centera_vim!XXX.XXX.XXX.XXX?/home/fpm/xam/xamconnect.pea

Support job commit? No
Support job continuance? No
Support level 1 query? Yes
Support level 2 query? No

Closed connection to XSystem

Here is a longer Java application that implements the three sample XSets and the sample queries which are discussed in section 10.8 of the XAM Architecture V1.0 document.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.snia.xam.XAMException;
import org.snia.xam.XAMLibrary;
import org.snia.xam.XSet;
import org.snia.xam.XStream;
import org.snia.xam.XSystem;
import org.snia.xam.XUID;
import org.snia.xam.toolkit.XAMXUID;
import org.snia.xam.util.XAMLibraryFactory;

public class QueryExample
{
private static XAMLibrary xamLib;
private static XSystem xSystem;

private static void CreateXset(int parm1, String parm2, double parm3)
throws XAMException
{
System.out.print("Create XSet 1 .... ");

XSet xSet = xSystem.createXSet(XSet.MODE_UNRESTRICTED);
xSet.createProperty("com.example.rhc", false, true);
xSet.createProperty("com,example.foo", false, parm1);
xSet.createProperty("com.example.bar", false, parm2);
xSet.createProperty("com.example.num", false, parm3);

XUID xuID = xSet.commit();
System.out.println("XUID: " + xuID.toString());

xSet.close();
}

private static void CreateXset(int parm1, int parm2, int parm3)
throws XAMException
{
System.out.print("Create XSet 2 .... ");

XSet xSet = xSystem.createXSet(XSet.MODE_UNRESTRICTED);
xSet.createProperty("com.example.rhc", false, true);
xSet.createProperty("com,example.foo", false, parm1);
xSet.createProperty("com.example.bar", false, parm2);
xSet.createProperty("com.example.num", false, parm3);

XUID xuID = xSet.commit();
System.out.println("XUID: " + xuID.toString());

xSet.close();
}

private static void CreateXset(int parm1, int parm2)
throws XAMException
{
System.out.print("Create XSet 3 .... ");

XSet xSet = xSystem.createXSet(XSet.MODE_UNRESTRICTED);
xSet.createProperty("com.example.rhc", false, true);
xSet.createProperty("com,example.foo", false, parm1);
xSet.createProperty("com.example.num", false, parm2);

XUID xuID = xSet.commit();
System.out.println("XUID: " + xuID.toString());

xSet.close();
}

private static void QueryXsystem(String queryString)
throws XAMException
{
boolean finished = false;
final int XAM_MAX_XUID = 80;
int resultCount = 0;
String status;

System.out.println("Create query XSet. Query string: " + queryString);

XSet query = xSystem.createXSet(XSet.MODE_UNRESTRICTED);
query.createProperty(XSet.XAM_JOB_COMMAND, true, XSet.XAM_JOB_QUERY);

byte[] buffer = queryString.getBytes();
XStream queryStream = query.createXStream(XSet.XAM_JOB_QUERY_COMMAND, true, XAMLibrary.TEXT_PLAIN_MIME_TYPE);
queryStream.write(buffer);
queryStream.close();

System.out.println("Submit query job ....");
query.submitJob();

System.out.println("Wait for query job to finish ....");

while (!finished)
{
// check status of query job
status = query.getString(XSet.XAM_JOB_ERRORHEALTH);
if (status.equals(XSet.XAM_JOB_ERRORHEALTH_ERROR))
{
System.out.println("ERROR: Errorhealth - " + query.getString(XSet.XAM_JOB_ERROR));
query.haltJob();
System.out.println("ERROR: Job halted");
break;
}

status = query.getString(XSet.XAM_JOB_STATUS);
resultCount = (int) query.getLong(XSet.XAM_JOB_QUERY_RESULTS_COUNT);

// uncomment if you want continuous job status
// System.out.println("Job Status: " + status + " Result count: " + resultCount);

// exit loop if job complete
if (status.equals(XSet.XAM_JOB_STATUS_COMPLETE))
{
finished = true;
}
}

System.out.println("Query job completed ....");
System.out.println("Result set final count is: " + resultCount);

if (resultCount > 0)
{
// open query job result stream
XStream results = query.openXStream(XSet.XAM_JOB_QUERY_RESULTS, XStream.MODE_READ_ONLY);

byte rawXUID[] = new byte[XAM_MAX_XUID];
long bytesRead = 0;

// output XUIDs in result set
while( (bytesRead = results.read(rawXUID, XAM_MAX_XUID)) >= 0 )
{
XAMXUID resultXUID = new XAMXUID(rawXUID);
System.out.println("XUID: " + resultXUID.toString());
}

results.close();
}

query.close();
}

public static void main(String[] args)
{
String xri = "snia-xam://centera_vim!XXX.XXX.XXX.XXX?/home/fpm/xam/xamconnect.pea";
long exitCode = 0;

try
{
xamLib = XAMLibraryFactory.newXAMLibrary();

// uncomment next 2 lines if you want extensive XAM Library logging
// xamLib.setProperty(XAMLibrary.XAM_LOG_VERBOSITY, 1);
// xamLib.setProperty(XAMLibrary.XAM_LOG_LEVEL, 5);

System.out.println("Connecting to XSystem " + xri + "\n");

xSystem = xamLib.connect(xri);
xSystem.setProperty(".xsystem.log.verbosity", 1);
xSystem.setProperty(".xsystem.log.level", 5);

CreateXset(1, "string", 123.55);
CreateXset(77, 42, 100);
CreateXset(6, 200);

xSystem.close();

System.out.println("\nClose/reopen XSystem\n");

xSystem = xamLib.connect(xri);

// uncomment one or more of the following queries
// QueryXsystem("select \".xset.xuid\"");
QueryXsystem("select \".xset.xuid\" where (\"com.example.foo\" > 0) and (\"com.example.foo\" < 50)");
// QueryXsystem("select \".xset.xuid\" where (\"com.example.bar\" > 0) and (\"com.example.bar\" < 100)");
// QueryXsystem("select \".xset.xuid\" where exists (\"com.example.bar\")");
// QueryXsystem("select \".xset.xuid\" where \"com.example.bar\" like '%ing%'");
// QueryXsystem("select \".xset.xuid\" where \"com.example.num\" >= 124");
// QueryXsystem("select \".xset.xuid\" where \"com.example.num\" >= 124.6 ");
// QueryXsystem("select \".xset.xuid\" where (\"com.example.num\" >= 123) and typeof(\"com.example.num\") = 'application/vnd.snia.xam.int'");

xSystem.close();
System.out.println("\nClosed connection to XSystem");

} catch (XAMException xe) {
exitCode = xe.getStatusCode();
System.err.println("XAM Error occured: " + xe.getMessage() + "("
+ exitCode + ")");
} catch (IllegalArgumentException e) {
System.out.println(e.getMessage());
e.printStackTrace();
exitCode = -1;
}

System.exit((int) exitCode);
}
}

Note that a results XStream produced by a query job may be consumed by an application before the XAM Storage System has finished storing all of the XUIDs in the results XStream if XStream.asyncRead is used instead of XStream.read.

Well I think I have provided enough information and examples to enable you to go play with the XAM query language by yourself. It you need help the I am sure the XAM Developer Group on Google Groups will be more than happy to help you.

In parting I would point out that since XAM Storage System providers are currently only required to implement support for queries over the content metadata, I suspect that support for actual full content-based searches using XAM may not occur for some time.
 

0 comments:

Post a Comment