Friday, 16 September 2011

XQuery/MarkLogic: pulling data out of an Excel spreadsheet

Excel spreadsheets can be saved in an XML format. When they are, it's simple to paste the XML into a CQ buffer (if you're using MarkLogic) and parse the values held in adjacent column cells.

In this example, I'm using a very simple two-column spreadsheet to illustrate the procedure:

Val 1a Val 1b
Val 2a Val 2b
... ...

Below is an example of how to parse the XML, pull the values out of the relevant cells, and strip whitespace for good measure:
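A minimal sketch of that procedure follows. It assumes the spreadsheet was saved with Excel's "XML Spreadsheet 2003" option, which uses the urn:schemas-microsoft-com:office:spreadsheet namespace and a Workbook/Worksheet/Table/Row/Cell/Data structure. The inline sample workbook and the output element names (rows, row, a, b) are hypothetical and only there for illustration; in practice you would paste your own saved XML in place of the sample.

```xquery
xquery version "1.0";

(: Namespace used by Excel's "XML Spreadsheet 2003" format (assumed here). :)
declare namespace ss = "urn:schemas-microsoft-com:office:spreadsheet";

(: Hypothetical sample data; replace with the XML saved from Excel. :)
let $workbook :=
  <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
            xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
    <Worksheet ss:Name="Sheet1">
      <Table>
        <Row>
          <Cell><Data ss:Type="String"> Val 1a </Data></Cell>
          <Cell><Data ss:Type="String">Val 1b</Data></Cell>
        </Row>
        <Row>
          <Cell><Data ss:Type="String">Val 2a</Data></Cell>
          <Cell><Data ss:Type="String">  Val 2b</Data></Cell>
        </Row>
      </Table>
    </Worksheet>
  </Workbook>
return
  <rows>{
    (: Visit each row and read its first two cells by position. :)
    for $row in $workbook/ss:Worksheet/ss:Table/ss:Row
    return
      <row>
        <a>{ fn:normalize-space($row/ss:Cell[1]/ss:Data) }</a>
        <b>{ fn:normalize-space($row/ss:Cell[2]/ss:Data) }</b>
      </row>
  }</rows>
```

fn:normalize-space trims leading and trailing whitespace and collapses internal runs of whitespace to a single space, so it covers the "strip whitespace" step. Note that this sketch addresses cells by position; Excel may skip empty cells and mark the next one with an ss:Index attribute, so a sparse sheet would need extra handling.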

You should end up with something like this:
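As an illustration, assuming the query wraps the two trimmed values of each row in hypothetical row, a and b elements, the two sample rows shown earlier would come out along these lines (the exact indentation depends on the serializer):

```xml
<rows>
  <row><a>Val 1a</a><b>Val 1b</b></row>
  <row><a>Val 2a</a><b>Val 2b</b></row>
</rows>
```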

