Monday, November 12, 2012

PDQ 6.0 is On Its Way

PDQ (Pretty Damn Quick) version 6.0.β is in the QA pipeline. Although this is a major release, cosmetically, things won't look any different when it comes to writing PDQ models. All the big changes have taken place under the hood in order to make PDQ more consistent with the R statistical environment.

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

> library(pdq)
> source("/Users/njg/PDQ/Test Suites/R-Test/mm1.r")
                ***************************************
                ****** Pretty Damn Quick REPORT *******
                ***************************************
                ***  of : Thu Nov  8 17:42:48 2012  ***
                ***  for: M/M/1 Test                ***
                ***  Ver: PDQ Analyzer 6.0b 041112  ***
                ***************************************
                ***************************************
...

The main trick is that the Perl and Python versions of PDQ will remain entirely unchanged while at the same time invisibly incorporating significant changes to accommodate R.

Motivation (added 11/25/12)

The PDQ library is written in C and uses the *nix interfaces stdlib and stdio. To date, we have used SWIG to generate the other language variants: Perl, Python, R, etc. In general, this works but:
  • R Console: Since PDQ-R was released within PDQ 5, it has been observed that it can sometimes crash the R-console, e.g., when an error condition is issued.
  • CRAN: That and some other issues have meant that the eventual goal of making PDQ-R available on CRAN has always been thwarted. It was not even worth attempting to check it in.

There has always been a barrier for people wanting to use PDQ on Windows, and especially PDQ-R under Windows R. The fact that PDQ-R was unfit to deposit on CRAN also denied R a flexible queueing network analysis package.

Although I have become more and more enamored with R for computer system performance analysis and data modeling (especially with PDQ-R), I am not an R developer, so things have continued to languish because:

  • The reason for the R-console crashes was not easy to determine.
  • I was traveling under the mistaken impression that R core development had taken a direction more favorable to the C++ language.
  • An early proposal that would have made PDQ CRAN-ready (and possibly solved some of the other issues) was to port the PDQ library to C++ and apply the Rcpp interface. SWIG also handles C++. Although Rcpp would certainly have facilitated this approach, I was philosophically opposed to using the C++ language. As Alan Kay said: "I invented the term 'Object-Oriented', and I can tell you I did not have C++ in mind."

More recently, however, it was realized that we could keep the C code and apply the official foreign language extensions in the Rlib API. Additionally, a pathway was found for making PDQ-R available in Windows environments.

Example Problem

Suppose you define a queueing node called "node" where the arrival rate λ = 1.5 of the work stream "work" is greater than the service rate μ = 1.0 in the following PDQ-R code:

CreateOpen("work", 1.5)
CreateNode("node", CEN, FCFS)
SetDemand("node", "work", 1.0)

The condition λ > μ is equivalent to having a node utilization ρ > 1, which corresponds to an unstable queue. PDQ detects this error, reports it and stops the computation, which would otherwise produce bogus results. That is how it works in the Perl and Python versions of PDQ: diagnostic error messages appear in the terminal window where you are running your models.
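
To make the stability condition concrete, here is a minimal sketch in plain Python. It does not use PDQ at all: the function name mm1_metrics and the wording of its error message are hypothetical, chosen only to echo the kind of check described above. It assumes a single open M/M/1 queue, for which the utilization is ρ = λS and the residence time is S / (1 − ρ).

```python
def mm1_metrics(arrival_rate, service_demand):
    """Return (utilization, residence_time) for a single open M/M/1 queue.

    Hypothetical helper for illustration only; this is not part of PDQ.
    """
    if arrival_rate <= 0 or service_demand <= 0:
        raise ValueError("arrival rate and service demand must be positive")

    saturation_thruput = 1.0 / service_demand    # the service rate mu
    utilization = arrival_rate * service_demand  # rho = lambda / mu

    # No steady state exists once rho reaches 1, so refuse to compute
    # anything rather than return a meaningless (negative) residence time.
    if utilization >= 1.0:
        raise ValueError(
            "Arrival rate %.3f is not below saturation thruput %.3f"
            % (arrival_rate, saturation_thruput)
        )

    residence_time = service_demand / (1.0 - utilization)
    return utilization, residence_time
```

Calling mm1_metrics(1.5, 1.0) with the values from the example raises an error instead of returning a result, whereas a stable case such as mm1_metrics(0.5, 1.0) returns a utilization of 0.5 and a residence time of 2.0.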

In the current 5.0.4 release of PDQ, however, such an error causes the R-console to crash, so that this diagnostic error message


> source("/Users/njg/PDQ/Test Suites/R-Test/errmsg.r")
Error in Solve(CANON) : ERROR in procedure 'canonical()': 
Arrival rate 1.500 for stream 'work' exceeds saturation thruput 1.000 of node 'node' with demand 1.000

is never seen, and is therefore not diagnostic—not to mention that crashing the R-console is a bit uncool. Similarly for any other PDQ 5 error messages. This will no longer happen in PDQ version 6.

Solutions

We have Paul Puglia (read more details) and Denny Chen to thank for resolving these issues. The new modifications apply the correct R wrappers to replace the following C functions:
  1. Replacing exit() eliminates crashes of the R-console. (Denny and Paul)
  2. Replacing fprintf() allows error messages to appear correctly in the R-console. (Denny and Paul)
  3. Replacing malloc() enables PDQ to run under Windows memory management. (Paul)
  4. Optimized distribution path for the PDQ-R build on Windows platforms. (Paul)
These changes have also allowed us to continue to use SWIG to produce all the language variants for PDQ from the same C-code library. Otherwise, we would have been required to stick a fork in it.

Demo Models

In addition to the code modifications, Paul has also ported many of the models in my Perl::PDQ book to the R language. They can be accessed using the demo command:

> demo(package="pdq")
Demos in package ‘pdq’:
cluster     From examples/ppdq_2005/pdq_models/cluster.pl
diskoptim   From examples/misc/diskoptim.pl
ebiz        From examples/ppdq_2005/pdq_models/ebiz.pl
elephant    From examples/ppdq_2005/pdq_models/elephant.pl
feedforward From examples/ppdq_2005/pdq_models/feedforward.pl
florida     From examples/ppdq_2005/pdq_models/florida.pl
httpd       From examples/ppdq_2005/pdq_models/httpd.pl
mmm         From examples/MultiServer/mmm.py
moreq       Example script 1 from original package
morez       Example script 2 from original package
mm1         Example Script 3 from original package
spamcan1    From examples/Linux Magazine/spamcan1.py
spamcan2    From examples/Linux Magazine/spamcan2.py

Release Date

We don't have an exact release date yet, primarily because PDQ 6.0 needs more testing under different variants of the Windows OS. We are also still discovering little wrinkles, such as the fact that builds using R 2.15.2 have to be compiled with a gcc 4.2 compiler.

Nonetheless, we anticipate that PDQ 6.0 will make a nice Xmas present.

1 comment:

Neil Gunther said...

Dirk Eddelbuettel, an author and maintainer of the Rcpp package, expressed a concern that the wording in my original post had unfairly implicated Rcpp as the culprit preventing PDQ-R from being accepted on CRAN. It certainly was not my intention to cast aspersions on Rcpp, and no one else who reviewed my draft saw it that way either, especially since Rcpp has never been tried with PDQ. In any case, any unfortunate misinterpretations can mostly be attributed to excessive brevity on my part. Hopefully, the expanded Motivation section now delineates the PDQ issues more clearly.