Thursday 8 April 2010

MarkLogic: Query Optimisation Notes (Part One: getting started with xdmp:query-meters())

Here are some brief notes on query performance tuning in MarkLogic using xdmp:query-meters().

A simple starting point involves using xdmp:estimate to find out how many documents you currently have in MarkLogic:
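Something along these lines would do it. This is a minimal sketch rather than a definitive listing: the qm prefix is assumed to be bound to the query-meters namespace, and fn:doc() is assumed to be an acceptable way of addressing every document in the database.

```xquery
xquery version "1.0-ml";

(: assumption: qm is bound to the namespace of the query-meters element :)
declare namespace qm = "http://marklogic.com/xdmp/query-meters";

(: document count, resolved from the indexes :)
xdmp:estimate(fn:doc()),

(: just the elapsed time reported by the query meters :)
xdmp:query-meters()/qm:elapsed-time/text()
```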

As this is an estimate - and as such is resolved directly from the indexes - it's going to return a count almost instantaneously. Changing the xdmp:estimate to fn:count will take fractionally longer to compute a result. Also, removing the /qm:elapsed-time/text() step will give you the full breakdown of where the indexes and caches are being hit (and where they're not).
You could express that like so:
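A sketch of that variation, assuming both changes are applied together (fn:count in place of xdmp:estimate, and the whole query-meters element instead of just the elapsed time):

```xquery
xquery version "1.0-ml";

(: count the documents rather than estimating from the indexes :)
fn:count(fn:doc()),

(: the full query-meters element: elapsed time plus cache hit and miss counters :)
xdmp:query-meters()
```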

xdmp:query-meters tends to become really useful when you use it with cts:search. A simple example of such use would be:
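For instance, a minimal sketch in which the search term "example" is purely a placeholder and fn:doc() is assumed as the search scope:

```xquery
xquery version "1.0-ml";

(: return every document containing the (hypothetical) word "example" :)
cts:search(fn:doc(), cts:word-query("example")),

(: then report the meters for the work done so far :)
xdmp:query-meters()
```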

On a big result set, however, this could take a while, as it will return all of the matched XML documents in that set. What normally happens is that the query resolves quickly and the rest of the time is taken up sending vast quantities of XML back to the browser over the network. This can - and with big result sets often does - cause both your machine and your browser to become unresponsive, so it's always best to estimate the size of the result set before you attempt this!

So for situations where you want the output from query meters but don't want MarkLogic to stream megabytes of XML back to cq/DQ (or your middle tier layer), you can use fn:count. After all, in most cases when you're getting query stats you're probably more interested in the timings than in the result set itself. So here's the query rewritten to return just the number of records (and not the results themselves):
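A sketch of that rewrite, again with "example" as a placeholder search term and fn:doc() as an assumed scope:

```xquery
xquery version "1.0-ml";

(: only the number of matches comes back, not the documents themselves :)
fn:count(cts:search(fn:doc(), cts:word-query("example"))),

(: meters for the search and the count :)
xdmp:query-meters()
```

If you only care about the timing, you could append /qm:elapsed-time/text() to the last line, provided the qm prefix has been declared.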

Part two will discuss example usage(s) for the xdmp:query-trace(true()) and xdmp:query-trace(false()) functions.

1 comment:

Brajendu Kumar Das said...

Please find an interesting solution:

There's no need to separately delete a document in order to replace it. The xdmp:document-insert() function does that automatically. Remember: the format is just an aspect of a document's content—a description of what type of node the document node has as its child: element, text, or binary. It's not a separate piece of metadata that has to be updated.

Since xdmp:document-insert() takes a document node argument (rather than a string or file name), the format of the document is already determined, because the document has already been constructed. Only the functions that are responsible for constructing documents (loading and parsing a string or file into a tree structure) need be concerned about what format should be used to interpret the input. Those would include xdmp:document-get(), xdmp:document-load() (which combines document-get with document-insert), and xdmp:unquote() (whose format option defaults to "format-xml").

Brajendu Kumar Das