Searching data in xml efficiently

October 16th, 2011

There are many ways to search data in XML documents.

For example, I needed to find the Spanish subtitles in the following XML:

<?xml version="1.0" encoding="UTF-8"?> 
<import>
	<video>
		<title>Movie_1</title>
		<id>123</id>
		<meta_data>
			<duration>1:59:12</duration>
		</meta_data>
		<subtitles>
			<subtitle>ftp://user:pass@example.com/123/subtitles_fi.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_es.sub</subtitle>
			<subtitle>ftp://user:pass@example.com/123/subtitles_pl.sub</subtitle>
		</subtitles>
	</video>
</import>

The easiest way is, of course:

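/**
 * Returns a callback that checks whether a file URL contains the given language code.
 */
$fFindSubtitlesByLang = function ($lang) {
	$lang .= '.'; // append a dot so that 'es' does not match unrelated parts of the URL
	return function ($fileUrl) use ($lang) {
		// strpos() returns 0 for a match at position 0, so compare against false explicitly
		return strpos((string) $fileUrl, $lang) !== false;
	};
};
 
// normal usage of SimpleXMLElement; $xml holds the parsed document shown above
$subtitles = (array) $xml->video->subtitles;
// current() rather than reset(): reset() expects a variable passed by reference
$spanishSubtitles = current(array_filter($subtitles['subtitle'], $fFindSubtitlesByLang('es')));
echo $spanishSubtitles . PHP_EOL;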

It works, great. But it is also possible to use XPath:

$subtitles = $xml->xpath('/import/video/subtitles/subtitle');
// current() rather than reset(): reset() expects a variable passed by reference
$spanishSubtitles = current(array_filter($subtitles, $fFindSubtitlesByLang('es')));
echo $spanishSubtitles . PHP_EOL;

In my opinion this solution looks much better, because a function designed specifically for traversing XML (xpath) is used. However, this attempt is slower than the previous example, which is not good.

So let's try modifying the XPath expression a little:

$spanishSubtitles = current($xml->xpath('/import/video/subtitles/subtitle[contains(., "es.")]'));
echo $spanishSubtitles . PHP_EOL;

Voilà! Fast, readable, only 2 lines of code, and no need for an anonymous function. Adding extra logic (e.g. specifying the id of the element I am looking for) is also extremely easy.

More info about the available functions can be found on the W3Schools pages: http://www.w3schools.com/xpath/xpath_functions.asp
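As a rough illustration of that extra logic, the sketch below restricts the search to the video with a given id before matching the language suffix. It uses a trimmed copy of the sample document (credentials removed), and the variable names are only illustrative.

```php
<?php
// Trimmed copy of the sample document from this post (credentials removed).
$source = <<<XML
<import>
	<video>
		<title>Movie_1</title>
		<id>123</id>
		<subtitles>
			<subtitle>ftp://example.com/123/subtitles_en.sub</subtitle>
			<subtitle>ftp://example.com/123/subtitles_es.sub</subtitle>
		</subtitles>
	</video>
</import>
XML;

$xml = simplexml_load_string($source);

// Only look inside the video whose <id> is 123, then match the language suffix.
$matches = $xml->xpath('/import/video[id="123"]/subtitles/subtitle[contains(., "_es.")]');

// xpath() yields an empty result when nothing matches, so guard before indexing.
$spanishSubtitles = $matches ? (string) $matches[0] : null;

echo $spanishSubtitles . PHP_EOL;
```

Matching on "_es." instead of "es." is a slightly stricter variant, assuming the file names always follow the subtitles_<lang>.sub pattern seen in the sample.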

All files are available here: https://github.com/mrok/xpath-examples

Storing sessions in memcache(d)

October 15th, 2011

Sometimes I hear the idea of storing user session data in memcache. It sounds like a great idea: memcache is fast, easy to scale, and cached data can be shared between many servers, so there is no need to worry about advanced load balancing. Yeah, but it is a trap. Even the official Memcached FAQ says so:

http://code.google.com/p/memcached/wiki/NewProgrammingFAQ#Why_is_memcached_not_recommended_for_sessions?_Everyone_does_it!
http://dormando.livejournal.com/495593.html

To prove that the described situation can easily happen, I wrote some example code, available here: https://mrok@github.com/mrok/memcache_test.git. Download it, fix the memcache configuration section, run it and wait for the first missing key. Do you still think it is a good idea?
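The underlying reason is that memcached is a cache with bounded memory: when it fills up, older entries are evicted to make room. The sketch below is not the code from the repository and does not talk to memcached at all; TinyLruCache is a hypothetical in-memory stand-in with a simple least-recently-used policy, used only to show how a session key can disappear because of unrelated cache traffic.

```php
<?php
// Hypothetical stand-in for a memory-bounded cache; NOT the real memcached client.
class TinyLruCache
{
    private $capacity;
    private $items = array();

    public function __construct($capacity)
    {
        $this->capacity = $capacity;
    }

    public function set($key, $value)
    {
        unset($this->items[$key]);   // re-inserting moves the key to the "newest" end
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            reset($this->items);
            unset($this->items[key($this->items)]); // evict the least recently used entry
        }
    }

    public function get($key)
    {
        if (!array_key_exists($key, $this->items)) {
            return false;            // cache miss
        }
        $value = $this->items[$key];
        unset($this->items[$key]);   // a read also refreshes the entry
        $this->items[$key] = $value;
        return $value;
    }
}

$cache = new TinyLruCache(3);
$cache->set('session:alice', 'cart=2 items');

// Unrelated cache traffic fills the available space.
foreach (array('page:1', 'page:2', 'page:3') as $key) {
    $cache->set($key, 'rendered html');
}

// The session entry was the oldest one, so it has been evicted.
var_dump($cache->get('session:alice'));
```

Real memcached manages memory in a more involved way, so treat this only as a model of the effect: the user is silently logged out even though nothing failed.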

Using Zend Framework components separately

August 7th, 2011

Have you ever worked on a project that does not use Zend Framework (ZF) and needed functionality that is already implemented in ZF? I have, and I do not like to reinvent the wheel. This is what I did:

Task: Automatically detect the user's preferred language and set the proper month names for a select box and the proper currency sign.

What should be done:

1. Add the needed component and its dependencies to the project structure. The best way, IMHO, is to create a ‘lib’ directory and put all third-party libraries there.

2. Setup autoloader for ZF.

set_include_path(get_include_path() . PATH_SEPARATOR . getcwd() . '/library');
require getcwd() . '/library/Zend/Loader/Autoloader.php';

$autoloader = Zend_Loader_Autoloader::getInstance();

If your additional libraries do not follow a common namespace structure, you can use:

$autoloader->setFallbackAutoloader(true);

or write your own autoloader and register it with the pushAutoloader method.

3. Use ZF as you like:

$locale = new Zend_Locale();
echo 'Language: ' . $locale->getLanguage() . '<br />';   // language part of the locale, e.g. "da" for da_DK
$list = $locale->getTranslationList('Month', $locale);
var_dump($list);

Output of var_dump($list) for the da_DK locale (values only):

string 'januar' (length=6)
string 'februar' (length=7)
string 'marts' (length=5)
string 'april' (length=5)
string 'maj' (length=3)
string 'juni' (length=4)
string 'juli' (length=4)
string 'august' (length=6)
string 'september' (length=9)
string 'oktober' (length=7)
string 'november' (length=8)
string 'december' (length=8)

$currency = new Zend_Currency($locale);

var_dump($currency->getShortName()); // returns DKK
var_dump($currency->getName()); // returns Dansk krone
var_dump($currency->getSymbol()); // returns kr

P.S. You should also add Currency.php and the Currency directory to your lib dir (they are not mentioned in the first step).

Symfony 2.0 released

July 28th, 2011

I have waited a long time for this moment. And now it is here: http://symfony.com/blog/symfony-2-0

More than 2 years after the PHP 5.3 release, with all its namespace, closure and GC goodies, the PHP community gets the first* modern framework for this language.

* I am not sure whether Lithium was not faster ;)  http://lithify.me/en

Work effectively with mocked Web Services using SoapUI

July 18th, 2011

Today at work we got an interesting case: how to test a web service (WS) client when only the WSDL description is available and the WS itself is still in development?

A WSDL document is rather self-descriptive, so it is not a tragedy. At least we know what response to expect for a given request.

With the help of SoapUI, life becomes easier.

Let's try to create a simulation of a web service when only the WSDL file is available.

1. Create a new project (I use SoapUI v4.0.0):

- for test purposes I chose the Allegro Web API, the biggest Polish online auction system

You have probably also noticed the option “Create a Web Service Simulation …”; do not forget to check it.

2. The next form lets us generate mocks for the chosen services:

I chose the doLogin method to avoid passing 25 different parameters.

3. Click OK, provide a name for your newly created mockService, and click OK again.

4. If everything went OK, your screen should look like the image below:

 

5. Double-click the “do Login” service to open the MockResponses list.

6. Right-click ‘Response 1’ and choose Show MockResponse Editor.

In the editor we can write the response text as we like. For example:

<soapenv:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:urn="urn:AllegroWebApi">
   <soapenv:Header/>
   <soapenv:Body>
      <urn:doLoginResponse soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
         <session-handle-part xsi:type="xsd:string">abcdef098765</session-handle-part>
         <user-id xsi:type="xsd:long">345</user-id>
         <server-time xsi:type="xsd:long">12345678</server-time>
      </urn:doLoginResponse>
   </soapenv:Body>
</soapenv:Envelope>

7. Run the server.

8. Right-click the ‘Response 1’ mock and choose the ‘Open request’ option.

Then, by clicking the green arrow, you can test your MockServer. In the response window you should receive the same XML that you provided in step 6.

Let's try this outside SoapUI and write some Python code (Suds module available here):

from suds.client import Client

WSDLURL = 'http://szymon:8088/mockAllegroWebApiBinding?WSDL'

if __name__ == '__main__':
    client = Client(WSDLURL)

    params = {'user-login' : 'testUser',
              'user-password' : 'pass',
              'country-code' : 12,
              'webapi-key': 'xxxxYYYYY',
              'local-version' : 32546}
    response = client.service.doLogin(**params)
    print(response)

Response

(reply){
   session-handle-part = "abcdef098765"
   user-id = 345
   server-time = 12345678
 }

Do not forget to change the value of WSDLURL.

It is a very easy way to test a web service that does not exist yet. Manually defined responses are useful with the following modes:

  • SEQUENCE
  • RANDOM
  • QUERY_MATCH
  • XPATH

More information about them is available on the soapUI project page.

What if we need more logic to prepare the response, and the response should depend on a request parameter? That is not a problem either. All we need to do is set the response mode to SCRIPT and write the logic in Groovy.

For example, let's return user-id as the doubled country code.

def util = new com.eviware.soapui.support.GroovyUtils(context)
def xml = util.getXmlHolder(mockRequest.requestContent)
context.userId = xml.getNodeValue("//country-code").toInteger() * 2

Then, in the response editor, change

         <user-id xsi:type="xsd:long">345</user-id>

into      

         <user-id xsi:type="xsd:long">${userId}</user-id>

Now the returned response (for country-code = 12) looks like:

(reply){
   session-handle-part = "abcdef098765"
   user-id = 24
   server-time = 12345678
 }

It works! ;)
Here is an article about more advanced response creation in Groovy and the currently available API.

Http build query

July 18th, 2011

Today, during a code review, Olle pointed out that there is a much simpler way to build the query part of a URL than concatenating strings in a loop.

For example, instead of:

$aParams = array('id' => 12,
	'method' => 'post',
	'callback_function_name' => 'add');
$sQuery = '?';
foreach ($aParams as $key => $value) {
	$sQuery .= $key .'=' .$value .'&';
}
$sQuery = substr($sQuery, 0, -1); //to remove last &
$sUrl = $sHost .$sQuery;

The solution proposed by Olle was:

$aParams = array('id' => 12,
	'method' => 'post',
	'callback_function_name' => 'add');
$sQuery = http_build_query($aParams, '', '&');
$sUrl = $sHost . '?' . $sQuery; // http_build_query() does not add the leading '?'

I checked that function in the manual and it looks quite powerful. One thing that concerned me was performance. Of course native C functions are usually faster than a PHP loop, but isn't this function too powerful, and therefore too slow, to use in a critical part of the code under heavy load?

Let's check:

$aParams = array('id' => 12,
	'user_id' => 234,
	'method' => 'post',
	'callback_function_name' => 'add');

$fStarttime = microtime(true);

for ($i = 0; $i < 100000; $i++) {
	$sQuery = http_build_query($aParams, '', '&');
}
echo 'http_build_query: ' . (microtime(true) - $fStarttime) .PHP_EOL;

$fStarttime = microtime(true);

for ($i = 0; $i < 100000; $i++) {
	$sQuery = '';
	foreach ($aParams as $key => $value) {
		$sQuery .= $key . '=' . $value . '&';
	}
	$sQuery = substr($sQuery, 0, -1); //to remove last &
}
echo 'concatenation: ' . (microtime(true) - $fStarttime);

And the results (in seconds, for 100000 iterations each):

http_build_query: 0.69601511955261
concatenation:    1.1222679615021

As you can see, it is definitely the easier and faster way, and it needs fewer lines of code.

Tip: there is an extension, http://dk.php.net/manual/en/book.http.php, which provides more useful functions.

In PHP 5.4 the second approach will be faster thanks to the caches introduced to eliminate repeated run-time binding of functions, classes, constants, methods and properties; more info about this feature here
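Speed aside, there is a correctness argument as well: http_build_query() URL-encodes keys and values and understands nested arrays, while the plain concatenation above does neither. A minimal sketch, with parameter names invented just for this example:

```php
<?php
$aParams = array(
    'id'    => 12,
    'title' => 'Tom & Jerry',   // contains spaces and an ampersand
    'tags'  => array('a', 'b'), // nested array
);

// Values are URL-encoded, so the '&' inside the title cannot break the query.
$sQuery = http_build_query($aParams, '', '&');
echo $sQuery . PHP_EOL;

// parse_str() reverses the encoding, so the original values round-trip.
parse_str($sQuery, $aDecoded);
var_dump($aDecoded['title'] === 'Tom & Jerry');
var_dump($aDecoded['tags'] === array('a', 'b'));
```

With the hand-written loop, the same title would split the query at the raw ampersand, and the nested array would not be serialized in a usable form.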