Wednesday 27 April 2011

VirtualBox: Minimal Install of Fedora - no Network on restart

If you've successfully installed Fedora (Minimal network install) and find you have no network connection on first boot, try this from the shell:

Java: Handling Large XML Documents with VTD-XML

Here's a working example of how VTD-XML can be used to parse a large (~330 MB) XML file. In this example, I'm using AutoPilotHuge to match all page elements in the file. For each element encountered (there are tens of thousands), the element data is written to a file, using the element's index within the document as the filename:

Tuesday 19 April 2011

XQuery: Replace example

The general format for the replace function in XQuery:
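fn:replace takes the input string, a regular expression pattern, and a replacement string, in that order. As a minimal sketch, assuming a hypothetical <date> element holding an ISO-style date (the element name and value are illustrative only), the leading four digits can be kept and the rest discarded like this:

```xquery
xquery version "1.0";

(: Hypothetical node, used only for illustration :)
let $node := <date>2007-04-19</date>

(: General form: fn:replace($input, $pattern, $replacement)
   The pattern captures the first four digits as group 1 and matches
   the remainder of the string; "$1" substitutes the captured group. :)
return fn:replace($node, "^(\d{4}).*$", "$1")
```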

Returns 2007 (the four digits in the node)

Monday 11 April 2011

MarkLogic: Performing an operation across two databases using xdmp:eval

There may be situations where you want to perform an operation across two databases. One example is quickly cloning a batch of documents into a second database.

This can be achieved using xdmp:eval.

In this example, suppose you have several documents in a database called "DatabaseA" (selected as your Content Source in CQ). An xdmp:eval statement can be used to write those documents into DatabaseB like so:
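A minimal sketch of that approach, assuming both databases already exist and the query is run with DatabaseA as the content source. The limit of ten documents and the external variable names ($uri, $doc) are choices made for illustration:

```xquery
xquery version "1.0-ml";

(: Runs against DatabaseA. Each document is handed to a separate
   xdmp:eval call that executes against DatabaseB. :)
for $doc in (fn:doc())[fn:position() le 10]  (: illustrative batch size :)
return
  xdmp:eval(
    'xquery version "1.0-ml";
     declare variable $uri as xs:string external;
     declare variable $doc as node() external;
     xdmp:document-insert($uri, $doc)',
    (: external variables are passed as alternating QName/value pairs :)
    (xs:QName("uri"), xdmp:node-uri($doc),
     xs:QName("doc"), $doc),
    <options xmlns="xdmp:eval">
      <database>{xdmp:database("DatabaseB")}</database>
    </options>
  )
```

Each document keeps its original URI in DatabaseB. Because every eval runs as its own transaction, a failure part-way through leaves the earlier documents already inserted.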

Thursday 7 April 2011

MarkLogic: Sending and Receiving XML Content over HTTP

A fairly common requirement for an application built on MarkLogic Server could be to:

1. Wrap up content in an XML element and send it to a web service
2. On receipt, extract it from the request-body for further processing (insert, validate, transform etc)

To encode the XML element for transmission, combine the appropriate content-type option for xdmp:http-post (or xdmp:http-put) with xdmp:quote on the element data:
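As a sketch of the sending side, assuming a hypothetical receiving endpoint at http://localhost:8003/receive.xqy and a made-up <message> payload (neither comes from a real application):

```xquery
xquery version "1.0-ml";

(: Hypothetical payload, for illustration only :)
let $payload := <message><title>Hello</title></message>
return
  xdmp:http-post(
    (: Hypothetical endpoint; replace with your own service URL :)
    "http://localhost:8003/receive.xqy",
    <options xmlns="xdmp:http">
      <headers>
        <content-type>application/xml</content-type>
      </headers>
      <!-- xdmp:quote serialises the element to a string for the request body -->
      <data>{xdmp:quote($payload)}</data>
    </options>
  )
```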


For the service receiving the data, you use a combination of xdmp:unquote and xdmp:quote to extract the element from the request-body, but bear in mind that unquote returns a document node rather than the element itself. This is easily resolved by selecting the first node() in the document, like so:
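A minimal sketch of the receiving module, assuming the request arrives with an XML content type and that the body is read explicitly in "xml" format. The variable names are illustrative:

```xquery
xquery version "1.0-ml";

(: Read the POSTed body as XML :)
let $body := xdmp:get-request-body("xml")

(: Round-trip through a string; xdmp:unquote returns a document node :)
let $doc := xdmp:unquote(xdmp:quote($body))

(: Step down to the first child node of the document,
   which is the element that was sent :)
return $doc/node()[1]
```

From here the element can be inserted, validated, or transformed as required.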