Parallel run-tests.php – development environment

In 2009 Stefan Priebsch, a GSoC student and I wrote a version of run-tests.php that runs PHPTs in parallel. This reduced run times on multi-core machines by a decent amount. In September 2009 IBM wanted me to work on OSGi (a fun job, but completely impossible to combine with PHP development), Stefan was busy with thePHP.cc, and the GSoC student took the money and ran. So development stopped, which was a pity because all we really had left to do was implement --REDIRECTTEST--.

 

A couple of weeks ago I decided to see whether the code still ran. It does, but a few fixes are needed, which I am working on now. After that I’ll implement the missing sections.

 

Having not worked on PHP for a while, I found that the most painful thing was recreating the development environment that I had in 2009. Here is a brief list of what I had to do to set things up on a Mac. It assumes that everything necessary to compile C (e.g. gcc, make, autoconf…) and an svn client are already installed.

  1. Get PHP from http://www.php.net/downloads.php (I used the PHP 5.4.0 tar file).
  2. Extract it.
  3. Run ./configure --with-zlib --exec-prefix=/usr/local/php540
  4. make
  5. sudo make install

This installs the PHP executables in /usr/local/php540/bin/.

After that I installed:

    1. PHPUnit (using pear)
    2. Phing (pear again)
    3. Xdebug 2.2.0 (downloaded, built and installed).
    4. Doxygen (download tar, configure; make;  sudo make install;)

The next step was to check out the parallel run-tests.php code:

svn co https://svn.php.net/repository/php/phpruntests/trunk phpruntests

cd phpruntests

Before trying to run anything there is a configuration file that needs to be set up, ‘phpdefinitions.txt’, which can be found in the phpruntests directory. Open it up and modify it as indicated to add the paths to various PHP executables.

You may wonder why we don’t just set the environment variables TEST_PHP_EXECUTABLE and TEST_PHP_CGI_EXECUTABLE. The answer is that some of the unit tests exercise the code that decides which PHP executable to use, so the unit tests ignore any pre-existing settings. When running unit tests, we specify which version of PHP to use with the -p flag.

The PHP paths are read from phpdefinitions.txt in tests/rtTestBootstrap.php and stored in global variables called RT_PHP_PATH and RT_PHP_CGI_PATH.

The phpdefinitions.txt file is also used as a phing properties file. The PHP executables in this case are used in the QA target.

Unit tests can be run using

phing test 

A more extensive set of tests can be run by using this command:

phing qa 

Running this runs the unit tests, generates test coverage and doxygen documentation, and compares the output from a set of tests run using the new code with the same set run using the old code. It all takes a while to run: 15 minutes on my Mac.

To make life slightly easier for myself I downloaded Eclipse PDT, added Subclipse to it and added PHPUnit as an external tool. This last bit means that I can highlight unit tests and run them from Eclipse.

Next steps – work through my code and see if I can remember how it works.

Running PHP tests – the basics

Before diving into how the parallel code works it is worth just covering how run-tests.php works – at a very simple level.

The easiest way to run a PHP test case is from the command line, like this:

The run-tests.php code parses the file “mytest.phpt”, finds the php code in the FILE section and runs it like this:

The run-tests.php code goes on to compare the output from the above with some expected test output. If they are the same the test passes; if not, it fails.
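As a rough illustration of the parse-and-compare idea, here is a minimal Python sketch. It is not the run-tests.php implementation: the function names `parse_phpt` and `outputs_match` are hypothetical, the sample test is made up, and trimming surrounding whitespace before comparing is an assumption. Actually running the FILE section through a PHP executable is left out.

```python
import re

# A section marker is a line such as --FILE-- or --EXPECT--.
SECTION_RE = re.compile(r"^--([A-Z_]+)--\s*$")


def parse_phpt(text):
    """Split PHPT text into a dict mapping section name to section body."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


def outputs_match(actual, expected):
    """Pass if actual and expected output agree after trimming whitespace
    (the trimming rule is an assumption made for this sketch)."""
    return actual.strip() == expected.strip()


# Made-up test case used only for illustration.
SAMPLE = """--TEST--
Hello world
--FILE--
<?php echo "Hello world"; ?>
--EXPECT--
Hello world
"""

sections = parse_phpt(SAMPLE)
print(sections["FILE"])
print(outputs_match("Hello world\n", sections["EXPECT"]))
```

In a real run the first argument to `outputs_match` would be whatever the PHP executable printed when given the FILE section.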

As I said, this is a very simple example. The run-tests.php code can be thought of in three parts:

  1. The code required to set up the test environment and “php_options”.
  2. Code that handles running tests.
  3. Retrieving, comparing and displaying output.

Over the next few posts I’ll work through each of these stages, explaining how they are handled by the new parallel run-tests code.

Running ‘redirected’ PHP tests

The first step in implementing the REDIRECTTEST section in parallel runtests code is to get some redirected tests running on my Mac. I’ve used PDO and MySQL, principally because I already had MySQL installed so it seemed the easiest place to start.

This is what I had to do to get them to run:

$cd php/source/code
$./configure --with-zlib --with-pdo-mysql \
         --with-mysql-sock=/var/mysql/mysql.sock
$make

If the location of mysql.sock is missing, tests are skipped with errors like this: “SQLSTATE[HY000] [2002] No such file or directory”. It took me a while to work out what it meant.

The PDO tests assume the following if you do not specify properties using environment variables:

  1. The MySQL database has a user called ‘root’
  2. The ‘root’ user has no password
  3. There is a database called ‘test’

The alternative to using the defaults is to assign some environment variables as follows:

$export PDO_MYSQL_TEST_DSN="mysql:host=localhost;dbname=test"
$export PDO_MYSQL_TEST_USER=your_mysql_uid
$export PDO_MYSQL_TEST_PASS=your_mysql_pwd
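The fallback from environment variables to defaults can be sketched as follows. This is a minimal Python illustration, not the code the PDO tests use: the function name `pdo_mysql_settings` is hypothetical, and the default DSN (in particular host=localhost) is an assumption based on the example above; only the ‘root’ user, empty password and ‘test’ database come from the list of defaults.

```python
import os

# Defaults mirroring the assumptions listed above: user 'root', no password,
# database 'test'. The host part of the DSN is an assumption.
DEFAULTS = {
    "PDO_MYSQL_TEST_DSN": "mysql:host=localhost;dbname=test",
    "PDO_MYSQL_TEST_USER": "root",
    "PDO_MYSQL_TEST_PASS": "",
}


def pdo_mysql_settings(environ=None):
    """Return the test settings, preferring values found in the environment."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


# With nothing exported, every setting falls back to its default.
print(pdo_mysql_settings({}))
# An exported variable overrides only its own setting.
print(pdo_mysql_settings({"PDO_MYSQL_TEST_USER": "your_mysql_uid"}))
```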

After setting those things up I was able to run:

$export TEST_PHP_EXECUTABLE=/php/to/test/php
$php run-tests.php ext/pdo_mysql/tests/common.phpt

The second command assumes that I’m in the top level directory of the PHP source code. The summary output is as follows:

Number of tests :   62                60
Tests skipped   :    2 (  3.2%) --------
Tests warned    :    0 (  0.0%) (  0.0%)
Tests failed    :    0 (  0.0%) (  0.0%)
Expected fail   :    0 (  0.0%) (  0.0%)
Tests passed    :   60 ( 96.8%) (100.0%)

Which is good enough to start work with. I guess the next thing is to work out what the REDIRECTTEST part actually does.

 

What REDIRECT does

It’s really simple and very useful for PDO where there is a requirement to run the same set of tests against a range of different databases.
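As far as I understand it, a redirecting test supplies a directory of tests to run plus some environment settings to run them with, and every test in that directory is then run with those settings. The Python sketch below models just that expansion step. The function name `expand_redirect`, the test file names and the exact shape of the configuration (a ‘TESTS’ directory and an ‘ENV’ map) are assumptions for illustration, not a description of the real code.

```python
def expand_redirect(config, available_tests):
    """Return one (test_path, env) job per test under config['TESTS']."""
    prefix = config["TESTS"].rstrip("/") + "/"
    env = dict(config.get("ENV", {}))
    return [(path, env) for path in sorted(available_tests) if path.startswith(prefix)]


# Hypothetical configuration, as one database driver's redirect test might give it.
redirect = {
    "TESTS": "ext/pdo/tests",
    "ENV": {"PDOTEST_DSN": "mysql:host=localhost;dbname=test"},
}

# Hypothetical test files; only those under ext/pdo/tests should be selected.
all_tests = [
    "ext/pdo/tests/example_001.phpt",
    "ext/pdo/tests/example_002.phpt",
    "ext/standard/tests/example_003.phpt",
]

jobs = expand_redirect(redirect, all_tests)
for path, env in jobs:
    print(path, env["PDOTEST_DSN"])
```

Running the same shared directory against a different database would then only need a different ‘ENV’ map.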

Here is a picture:

Parallel run-tests performance

The development of parallel run-tests.php is moving along. Unfortunately paid work is going to get in the way a bit for the next few weeks so things will move a little slowly.

I’m still thinking about the REDIRECTTEST implementation. I don’t think it’s that hard. The other thing that needs to be thought through is exactly how to avoid clashes between, for example, mysql and mysqli tests. I think that the latter is best addressed by having a configuration file of test directories that have to be run in sequence. In a parallel run, one processor would be set to work through these while the other tests were scheduled randomly to the remaining processors. Anyway, it needs some thought, and the parallel execution code is the part I’m least familiar with.
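To make the scheduling idea concrete, here is a minimal Python sketch of one way it could work. It is only an illustration of the proposal, not the phpruntests scheduler: the function name `schedule` is hypothetical, the directories are examples, and shuffling followed by round-robin assignment stands in for "scheduled randomly".

```python
import random


def schedule(test_dirs, sequential_dirs, workers, seed=0):
    """Assign test directories to workers.

    Directories listed in sequential_dirs all go to worker 0, in the order
    given, so they can never run at the same time as each other. Everything
    else is shuffled and dealt out to the remaining workers.
    """
    if workers < 1:
        raise ValueError("need at least one worker")
    plan = {worker: [] for worker in range(workers)}
    plan[0] = [d for d in sequential_dirs if d in test_dirs]
    others = [d for d in test_dirs if d not in sequential_dirs]
    random.Random(seed).shuffle(others)
    # With a single worker there are no "remaining" workers, so use worker 0.
    targets = list(range(1, workers)) or [0]
    for index, directory in enumerate(others):
        plan[targets[index % len(targets)]].append(directory)
    return plan


dirs = ["ext/mysql/tests", "ext/mysqli/tests", "ext/json/tests", "ext/spl/tests"]
must_be_sequential = ["ext/mysql/tests", "ext/mysqli/tests"]
plan = schedule(dirs, must_be_sequential, workers=3)
for worker, assigned in plan.items():
    print(worker, assigned)
```

A real scheduler would probably hand out work as each process becomes free rather than fixing the plan up front. The sketch only shows how the sequential directories are kept together.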

The other thing that really needs to be done is some more extensive performance work. I have done a couple of runs on a dual core Mac, just to get some data points.

The tests in the timing benchmark (phpruntests/QA/QATimedBucket.tgz) are taken from the PHP development stream and are all tests under:



ctype date dom ereg fileinfo filter iconv json libxml pcre phar 
posix reflection session spl sqlite3 standard tokenizer xml 
xmlreader xmlwriter zlib

 

I executed these tests three times. Run 1 uses the current PHP development stream version of run-tests.php. Run 2 uses the parallel version of run-tests.php run in sequential mode and Run 3 uses the parallel version run over two processors.

Here are the times:


Run 1 298 seconds
Run 2 293 seconds
Run 3 207 seconds
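For comparison, the times can be turned into a speed-up and a fraction of the sequential time. The short Python sketch below does the arithmetic on the three figures above. The choice of Run 2 as the baseline, and the names used, are mine.

```python
# Wall-clock times in seconds, taken from the three runs above.
TIMES = {
    "run1_standard": 298,
    "run2_parallel_code_sequential": 293,
    "run3_parallel_code_two_processors": 207,
}

PROCESSORS = 2

# Use the new code in sequential mode as the baseline, so that only the
# effect of parallel execution is measured.
baseline = TIMES["run2_parallel_code_sequential"]
parallel = TIMES["run3_parallel_code_two_processors"]

fraction_of_sequential = parallel / baseline
speedup = baseline / parallel
efficiency = speedup / PROCESSORS

print(f"fraction of sequential time: {fraction_of_sequential:.2f}")
print(f"speed-up: {speedup:.2f}")
print(f"efficiency per processor: {efficiency:.2f}")
```

On these numbers the two-processor run takes about 0.71 of the sequential time, a speed-up of about 1.42.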

There are some minor differences in the test results between running the standard and new versions of run-tests. In summary these are:


             Run 1        Runs 2&3
PASS         6151          6129
SKIP          446           447
XFAIL          28            27
WARN            0             2
FAIL           11            12
BORK            0            19

These differences are mainly accounted for by the new version of run-tests.php being much stricter in what it allows in a test case. For example, any empty section will cause a ‘BORK’, and any section it doesn’t recognise will also cause a ‘BORK’. The next job on my list is to go through the tests and fix either the tests or my code so that the old and new versions give exactly the same results.
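The two strictness rules can be sketched in a few lines of Python. This is an illustration only, not the phpruntests code: the function name `check_phpt` is hypothetical, the list of known sections is a deliberately small subset chosen for the example, and the sample test is made up.

```python
import re

# Illustrative subset only; the real list of permitted sections is longer.
KNOWN_SECTIONS = {"TEST", "FILE", "EXPECT", "EXPECTF", "SKIPIF", "INI", "REDIRECTTEST"}

SECTION_RE = re.compile(r"^--([A-Z_]+)--\s*$")


def check_phpt(text):
    """Return the reasons a test case would be rejected (empty list if none)."""
    problems = []
    bodies = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            bodies[current] = []
            if current not in KNOWN_SECTIONS:
                problems.append(f"unknown section {current}")
        elif current is not None:
            bodies[current].append(line)
    # A known section with nothing but whitespace in it counts as empty.
    for name, body in bodies.items():
        if name in KNOWN_SECTIONS and not "".join(body).strip():
            problems.append(f"empty section {name}")
    return problems


# Made-up test case with one empty section (INI) and one unknown one (FOO).
SAMPLE = """--TEST--
Strictness demo
--INI--
--FOO--
bar
--FILE--
<?php echo 1; ?>
--EXPECT--
1
"""

problems = check_phpt(SAMPLE)
print("BORK" if problems else "OK", problems)
```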

After that I’d like to find an 8-way machine and get some better data points. Any offers?

Update – parallel run-tests.php

A few more people sent me performance figures for parallel run-tests.php last night. Thanks to Olivier Doucet, Stefan Marr, Sean Coates and Chris Jones we have a few good points on the graph.

A few things to note:

  1. We are mainly running different sets of tests. The biggest runs are mine and Sean’s, which have 6636 tests and run as 47 groups. This set of tests can be found in ~/phpruntests/QA/QATimedBucket.tgz.
  2. I’ve removed any variation in CPU and processor type by just plotting times as a fraction of the sequential time.
  3. Sean ran the 47 groups over 47 processes; the time for this was 0.28 of the single process time. It looks from these results as though distributing between 4 processors is optimal.

Thanks very much for running these!