Perl/PHP Benchmarking Tools -- Overview

I created the 2 files included in the above download for the company I work for. They were nice enough to grant me permission to release them on my website. These files are primarily intended for benchmarking production code which interfaces with a database or other external data store. When you have a front-end PHP or back-end Perl script that has performance problems in production, it can be difficult to track down exactly which SQL query or function block is responsible. These 2 files provide a very simple way of starting and stopping multiple timers throughout your code. These timers have millisecond precision (via the Time::HiRes CPAN module in Perl) and are logged to a tab-delimited .tsv file for eventual analysis.

By allowing the benchmarking code to run in production for several hundred or several thousand iterations, you can collect statistics on what your programs are doing in real-life situations. Once the .tsv file has collected a sufficient amount of data, you can get average run-times for each of your timers via a simple awk shell command (or by opening the .tsv file in a spreadsheet program).

Source Code for the PHP Benchmark class

Usage (PHP)

<?php
/**
 * This file demonstrates usage of the benchmark.php
 * Benchmarking Class in a hypothetical PHP Session
 * login script.
 */

// First, include the benchmark class.
// It's a singleton, so there's no need to instantiate it.
require_once('/path/to/benchmark.php');

// Then, if desired, you can change the filenames and file locations
// of the header file and analysis file.
Benchmark::$logFilename    '/tmp/customer_login_analysis.tsv';
Benchmark::$headerFilename '/tmp/customer_login_header.tsv';

// Make a note in the benchmark file to record who is logging in
Benchmark::injectValue($loginName'Customer Login Name');

// Set your initial timer to time the entire login process
Benchmark::timerStart('timer1''Login process: overall');

// Next, we time our first SQL query
Benchmark::timerStart('timer2''SQL query 1');
$sql "SELECT name, address, email, phone
          FROM accounts
         WHERE login = '
{$loginName}'";
list(
$name$address$email$phone) = custom_fetch($sql);

// End the timer2 timer
Benchmark::timerStop('timer2');

// Next, we time an SQL insert
Benchmark::timerStart('timer3''SQL query 2');
$sql "INSERT INTO login_history 
                    (login, timestamp)
             VALUES ('
{$loginName}', CURRENT_TIMESTAMP)";
custom_insert($sql);

// End the timer3 timer
Benchmark::timerStop('timer3');

// End the original timer for the overall login process
Benchmark::timerStop('timer1');

// Finally, write all the benchmarking data to a file
Benchmark::printStats();
?>

Calling Benchmark::printStats() creates 2 files:

  • The header file lists the descriptions entered for each column of data in the analysis file.
  • The analysis file contains the actual data from the timers — in this example, let's assume the login script generated 5 different lines of data from 5 different logins.

Contents of customer_login_header.tsv:
Customer Login Name   Login process: overall   SQL query 1   SQL query 2

Contents of customer_login_analysis.tsv:
Johnny_Locke   1.1345   0.8374   0.2971
Hurley1976   1.2309   0.8005   0.4304
BeLinus   1.1675   0.9172   0.2503
Anna-Lucia334   1.0458   0.7892   0.2566
CrankshaftRocks   1.1933   0.8305   0.3628

Using awk to Analyze the Results

Once you have data loaded in your analysis files, you can use the awk shell program to loop through the file and find an average for each of your timers. A sample awk command to pull averages for our sample customer_login_analysis.tsv file is provided below. awk comes pre-installed on all Linux and OS-X systems. If you're using Windows, you can try using gawk.

Sample awk command:

awk '{ col2 += $2; col3 += $3; col4 += $4 } END { print "---- Login process: overall ----\naverage: ", col2/NR, " samples: ", NR,"\n---- SQL query 1 ----\naverage: ", col3/NR, " samples: ", NR,"\n---- SQL query 2 ----\naverage: ", col4/NR, " samples: ", NR }' /tmp/benchmark_analysis.tsv

Output from the above command:

---- Login process: overall ----
average: 1.1544 samples: 5
---- SQL query 1 ----
average: 0.83496 samples: 5
---- SQL query 2 ----
average: 0.31944 samples: 5

Bingo! It looks like the culprit for the slow login process is that first SELECT query.

Usage (Perl)

The Perl version works basically the same as the PHP version, except that some of the options available in the PHP version – such as injectValue() – aren't yet in the Perl version. Maybe I'll bring the Perl version up-to-date one of these days.

Python? Ruby? FORTRAN?

If you're interested in porting my PHP Benchmark class to another language, feel free to Contact Me.
dlg
kruger