Searching data in XML efficiently

Sunday, October 16th, 2011

There are many ways to search data in XML documents.

For example, I needed to find the Spanish subtitles in the following XML:

<?xml version="1.0" encoding="UTF-8"?> 
<import>
	<video>
		<title>Movie_1</title>
		<id>123</id>
		<meta_data>
			<duration>1:59:12</duration>
		</meta_data>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/123/subtitles_fi.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_pl.sub</subtitle>
		</subtitles>
	</video>
</import>

The easiest way is, of course:

/**
 * Returns a callback that checks whether a file URL contains the given language code
 */
$fFindSubtitlesByLang = function ($lang) {
	$lang .= '.'; //append a dot so that e.g. 'pl' does not match inside the word 'example'
	return function ($fileUrl) use ($lang) {
		//strpos() returns false when nothing is found, so compare strictly
		return strpos((string) $fileUrl, $lang) !== false;
	};
};
 
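//normal usage of SimpleXMLElement
$subtitles = (array) $xml->video->subtitles;
//reset() takes its argument by reference, so the filtered array needs its own variable
$matches = array_filter($subtitles['subtitle'], $fFindSubtitlesByLang('es'));
$spanishSubtitles = reset($matches);
echo $spanishSubtitles . PHP_EOL;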

It works, great. But it is also possible to use XPath:

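$subtitles = $xml->xpath('/import/video/subtitles/subtitle');
//reset() takes its argument by reference, so the filtered array needs its own variable
$matches = array_filter($subtitles, $fFindSubtitlesByLang('es'));
$spanishSubtitles = reset($matches);
echo $spanishSubtitles . PHP_EOL;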

In my opinion this solution looks much better, because a function designed specifically for traversing XML (xpath) does the work. Unfortunately, this approach is slower than the previous example, which is not good.
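
To check the speed difference on your own data, a rough timing loop is usually enough. The block below is only a sketch: the shortened sample document, the $measure helper and the 10000 iterations are assumptions made for this illustration, and the numbers will vary with the PHP version and the size of the document.

```php
<?php
// Shortened sample document with the same structure as the XML above
$source = '<import><video><subtitles>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_fi.sub</subtitle>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>'
	. '</subtitles></video></import>';
$xml = simplexml_load_string($source);

$isSpanish = function ($fileUrl) {
	return strpos((string) $fileUrl, 'es.') !== false;
};

// Approach 1: cast the SimpleXMLElement to an array and filter it
$viaCast = function () use ($xml, $isSpanish) {
	$subtitles = (array) $xml->video->subtitles;
	$found = array_filter($subtitles['subtitle'], $isSpanish);
	return (string) reset($found);
};

// Approach 2: fetch all subtitle nodes with xpath() and filter them
$viaXpath = function () use ($xml, $isSpanish) {
	$found = array_filter($xml->xpath('/import/video/subtitles/subtitle'), $isSpanish);
	return (string) reset($found);
};

// Hypothetical helper: seconds needed to call $fn $iterations times
$measure = function ($fn, $iterations) {
	$start = microtime(true);
	for ($i = 0; $i < $iterations; $i++) {
		$fn();
	}
	return microtime(true) - $start;
};

printf("cast + filter:  %.4f s\n", $measure($viaCast, 10000));
printf("xpath + filter: %.4f s\n", $measure($viaXpath, 10000));
```

Both closures return the same URL, so only the elapsed time differs between the two lines of output.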

So let's try modifying the XPath expression a little:

$spanishSubtitles = current($xml->xpath('/import/video/subtitles/subtitle[contains(., "es.")]')); //current() accepts a function result, reset() needs a variable
echo $spanishSubtitles . PHP_EOL;

Voilà! Fast, readable, only 2 lines of code, and no need for an anonymous function. Adding extra logic (e.g. specifying the id of the element I am looking for) is also extremely easy.
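
As an illustration of that extra logic, the expression can be narrowed to a single video by adding a predicate on its id. This is a minimal sketch: the second video (id 456) and the findSubtitle() helper are made up for this example and are not part of the original files.

```php
<?php
// Sample document with the same structure as above; video 456 is hypothetical
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<import>
	<video>
		<id>123</id>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>
		</subtitles>
	</video>
	<video>
		<id>456</id>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/456/subtitles_es.sub</subtitle>
		</subtitles>
	</video>
</import>
XML;

$xml = simplexml_load_string($source);

// Hypothetical helper: first subtitle URL of one video for one language, or null
function findSubtitle(SimpleXMLElement $xml, $videoId, $lang)
{
	// The values are pasted into the expression as-is, so they must be
	// trusted input that contains no double quotes
	$query = sprintf(
		'/import/video[id="%s"]/subtitles/subtitle[contains(., "%s.")]',
		$videoId,
		$lang
	);
	$matches = $xml->xpath($query);
	return $matches ? (string) $matches[0] : null;
}

echo findSubtitle($xml, '456', 'es') . PHP_EOL;
```

The [id="456"] predicate keeps only the video whose id child has that value, and the rest of the path works exactly as before.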

More information about the available functions can be found on W3Schools: http://www.w3schools.com/xpath/xpath_functions.asp
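
One caveat when reading such function lists: PHP's xpath() is built on libxml2, which implements XPath 1.0, so newer functions such as ends-with() are not available. The sketch below shows one way to get the same effect with substring() and string-length(); the shortened sample document and the '_es.sub' suffix are assumptions made for this illustration.

```php
<?php
// Shortened sample document with the same structure as the XML above
$source = '<import><video><subtitles>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>'
	. '</subtitles></video></import>';
$xml = simplexml_load_string($source);

$suffix = '_es.sub';

// XPath 1.0 has no ends-with(), so compare the tail of each value with the suffix.
// substring() is 1-based: the tail starts at length(value) - length(suffix) + 1.
$query = sprintf(
	'//subtitle[substring(., string-length(.) - string-length("%1$s") + 1) = "%1$s"]',
	$suffix
);

$matches = $xml->xpath($query);
echo $matches[0] . PHP_EOL;
```

Matching on the end of the URL is stricter than contains(), because the language code can no longer match somewhere in the middle of the path.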

All files are available here: https://github.com/mrok/xpath-examples