Searching data in XML efficiently

There are many ways to search for data in XML documents.

For example, I needed to find the Spanish subtitles in the following XML:

<?xml version="1.0" encoding="UTF-8"?> 
<import>
	<video>
		<title>Movie_1</title>
		<id>123</id>
		<meta_data>
			<duration>1:59:12</duration>
		</meta_data>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/123/subtitles_fi.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_pl.sub</subtitle>
		</subtitles>
	</video>
</import>
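
The snippets that follow use an $xml variable. Here is a minimal sketch of how it could be created, assuming the document is available as a string. The $source name is made up for this illustration, and the document is shortened to two subtitles:

```php
<?php
// Hypothetical variable holding the XML document (shortened here).
$source = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<import>
	<video>
		<title>Movie_1</title>
		<id>123</id>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>
		</subtitles>
	</video>
</import>
XML;

// simplexml_load_string() returns a SimpleXMLElement, or false on a parse error.
$xml = simplexml_load_string($source);
if ($xml === false) {
	throw new RuntimeException('Could not parse the XML document');
}

echo $xml->video->title . PHP_EOL; // Movie_1
```

For a document stored on disk, simplexml_load_file() could be used in the same way.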

The easiest way is of course:

/**
 * Returns a callback that checks whether a file URL contains the given language version.
 */
$fFindSubtitlesByLang = function ($lang) {
	$lang .= '.'; // append the dot so that e.g. 'es' does not match inside 'subtitles'
	return function ($fileUrl) use ($lang) {
		// strpos() returns 0 for a match at the start, so compare against false explicitly
		return strpos((string) $fileUrl, $lang) !== false;
	};
};
 
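// normal usage of SimpleXMLElement
$subtitles = (array) $xml->video->subtitles;
// reset() takes its argument by reference, so the filtered array needs its own variable
$matches = array_filter($subtitles['subtitle'], $fFindSubtitlesByLang('es'));
$spanishSubtitles = reset($matches);
echo $spanishSubtitles . PHP_EOL;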

It works, great. But it is also possible to use XPath:

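$subtitles = $xml->xpath('/import/video/subtitles/subtitle');
// reset() takes its argument by reference, so the filtered array needs its own variable
$matches = array_filter($subtitles, $fFindSubtitlesByLang('es'));
$spanishSubtitles = reset($matches);
echo $spanishSubtitles . PHP_EOL;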

In my opinion this solution looks much better, because a function designed specifically for traversing XML (xpath) does the work. Unfortunately this approach is slower than the previous example, which is not good.
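
One way to check such a speed difference is a rough micro-benchmark. The sketch below is only an illustration: the $measure helper and the iteration count are assumptions made here, the document is shortened, and the timings will vary with machine and PHP version.

```php
<?php
// Shortened copy of the document above.
$xml = simplexml_load_string(
	'<import><video><subtitles>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>'
	. '</subtitles></video></import>'
);

// Hypothetical helper: runs $callback $iterations times and returns the elapsed seconds.
$measure = function (callable $callback, $iterations = 10000) {
	$start = microtime(true);
	for ($i = 0; $i < $iterations; $i++) {
		$callback();
	}
	return microtime(true) - $start;
};

// Fetch the subtitle list by casting the element to an array.
$viaCast = $measure(function () use ($xml) {
	$subtitles = (array) $xml->video->subtitles;
	return $subtitles['subtitle'];
});

// Fetch the same list with an XPath query.
$viaXpath = $measure(function () use ($xml) {
	return $xml->xpath('/import/video/subtitles/subtitle');
});

printf("array cast: %.4f s, xpath: %.4f s%s", $viaCast, $viaXpath, PHP_EOL);
```

Only the lookup itself is timed here; the filtering step is the same in both variants.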

So let's try modifying the XPath expression a little:

$spanishSubtitles = $xml->xpath('/import/video/subtitles/subtitle[contains(., "es.")]');
echo reset($spanishSubtitles) . PHP_EOL;

Voilà! Fast, readable, only two lines of code, and no need for an anonymous function. Adding extra logic (e.g. specifying the id of the element I am looking for) is also extremely easy.
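
As a sketch of that kind of extra logic, the query below restricts the search to the video with a given id. The second video (id 456) is made up for this example so that the filter has something to exclude:

```php
<?php
// Shortened document with two videos; the second one (id 456) is invented for this example.
$xml = simplexml_load_string(
	'<import>'
	. '<video><id>123</id><subtitles>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>'
	. '<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>'
	. '</subtitles></video>'
	. '<video><id>456</id><subtitles>'
	. '<subtitle>ftp://user:pass@example.com/456/subtitles_es.sub</subtitle>'
	. '</subtitles></video>'
	. '</import>'
);

// [id="123"] keeps only the video whose <id> child equals 123;
// contains(., "es.") keeps only subtitles whose text contains "es.".
$matches = $xml->xpath('/import/video[id="123"]/subtitles/subtitle[contains(., "es.")]');

echo reset($matches) . PHP_EOL; // ftp://user:pass@example.com/123/subtitles_es.sub
```

The id condition sits in a predicate on the video step, so the subtitle part of the expression stays unchanged.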

More information about the available XPath functions can be found at W3Schools: http://www.w3schools.com/xpath/xpath_functions.asp

All files are available here: https://github.com/mrok/xpath-examples
