## Monday, June 29, 2009

### PDQ 5.0 Test Suite or ... How I Spent My Weekend

I was planning to blog about the amazing time I had at Velocity 2009 last week, when this landed in my mailbox (edited for space and privacy):

Subject: Seeking help with PDQ-R ...
Date: Thu, 25 Jun 2009 15:51:21 -0500

My name is James and I've been trying to learn to properly use PDQ after reading two of your books, "Guerrilla Capacity Planning" and "Analyzing Computer System Performance with Perl::PDQ." I'm still getting a grip on PDQ-R. ... I decided to set about re-creating the queue circuit in the study with PDQ-R as an exercise. ...
The output of my code yields:
[1] "Manual response time for class 1 is 0.864179 seconds"
[1] "PDQ-R response time for class 1 is 0.313637 seconds"
[1] "Manual response time for class 2 is 6.105397 seconds"
[1] "PDQ-R response time for class 2 is 3.552873 seconds"
[1] "Manual response time for class 3 is 4.535833 seconds"
[1] "PDQ-R response time for class 3 is 4.535833 seconds"
If you could give my code a look over and give me some hints I would really appreciate it.

It turns out that James N. had discovered a bug (gasp!) in PDQ, which is why we have users. (jk) The above output refers to a simple model of a database system comprising 3 resources (call them: cpu, disk1 and disk2) and 3 transaction streams (work1, work2, work3) and no limit on the queue lengths, i.e., an open queueing network or circuit. Here's what my rendition looks like:

    # PDQ-R model
    library(pdq)

    # Request rates of the 3 transaction streams into the DBMS
    Xsys <- c(50/150, 80/150, 70/150)

    # Service demands at each resource
    Dcpu <- c(0.096, 0.615, 0.193)
    Ddk1 <- c(0.088, 0.683, 0.763)
    Ddk2 <- c(0.119, 0.795, 0.400)

    # Start PDQ code with Init call
    Init("James' DB Model");

    # Define the 3 transaction workloads
    workname <- 1:3
    for (w in 1:3) {
        workname[w] <- sprintf("work%d", w)
        CreateOpen(workname[w], Xsys[w])
    }

    # Define the 3 resources
    CreateNode("cpu", CEN, FCFS)
    CreateNode("dk1", CEN, FCFS)
    CreateNode("dk2", CEN, FCFS)

    for (w in 1:3) {
        SetDemand("cpu", workname[w], Dcpu[w])
        SetDemand("dk1", workname[w], Ddk1[w])
        SetDemand("dk2", workname[w], Ddk2[w])
    }

    Solve(CANON)
    Report()

To hunt down the problem, I rewrote the PDQ-R model in C, just in case there were any translation problems with SWIG-ing PDQ/lib into PDQ-R, Perl PDQ, PyDQ, etc.

    /*
        multiclass-open.c

        Created by NJG on Thursday, June 25, 2009
        Updated by NJG on Sunday, June 28, 2009
    */

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <math.h>
    #include "PDQ_Lib.h"

    int main(void) {
        extern char     s1[];   // scratch string buffer from the PDQ library

        char            *p;     // dummy pointer for names
        char            *devname[3];
        char            *workname[3];
        int             i, j, k, n, s, w;
        double          actualtR[4][3];

        // Expected RT values: rows 0-2 are residence times at cpu, dk1, dk2;
        // row 3 is the system response time. Columns are work1..work3.
        double  expectR[4][3] = {
            {0.174, 1.118, 0.351},
            {0.351, 2.734, 3.054},
            {0.340, 2.270, 1.142},
            {0.865, 6.122, 4.546}
        };

        // Request rates of the 3 transaction streams into the DBMS
        double Xsys[] = {50.0/150, 80.0/150, 70.0/150};

        // Service demands
        double Dcpu[] = {0.096, 0.615, 0.193};
        double Ddk1[] = {0.088, 0.683, 0.763};
        double Ddk2[] = {0.119, 0.795, 0.400};

        // Name the workloads
        for (w = 0; w < 3; w++) {
            resets(s1);
            sprintf(s1, "work%d", w + 1);
            // +1 leaves room for the terminating NUL
            if ((p = (char *) malloc((strlen(s1) + 1) * sizeof(char))) != NULL) {
                strcpy(p, s1); // copy into assigned storage
                workname[w] = p;
            } else {
                printf("malloc failed!\n");
                exit(-1);
            }
        }

        // Name the resources
        for (k = 0; k < 3; k++) {
            resets(s1);
            if (k == 0) sprintf(s1, "%s", "cpu");
            if (k == 1) sprintf(s1, "%s", "dk1");
            if (k == 2) sprintf(s1, "%s", "dk2");
            if ((p = (char *) malloc((strlen(s1) + 1) * sizeof(char))) != NULL) {
                strcpy(p, s1); // copy into assigned storage
                devname[k] = p;
            } else {
                printf("malloc failed!\n");
                exit(-1);
            }
        }

        /************************** Start PDQ code **********************/
        PDQ_Init("Multiclass Test Model");

        // Create workloads
        for (w = 0; w < 3; w++) {
            s = PDQ_CreateOpen(workname[w], Xsys[w]);
        }

        // Create resources
        n = PDQ_CreateNode("cpu", CEN, FCFS);
        n = PDQ_CreateNode("dk1", CEN, FCFS);
        n = PDQ_CreateNode("dk2", CEN, FCFS);

        // Assign demands
        for (w = 0; w < 3; w++) {
            PDQ_SetDemand("cpu", workname[w], Dcpu[w]);
            PDQ_SetDemand("dk1", workname[w], Ddk1[w]);
            PDQ_SetDemand("dk2", workname[w], Ddk2[w]);
        }

        PDQ_Solve(CANON);

        printf("Expected Response Times\n");
        for (i = 0; i < 4; i++) {
            for (j = 0; j < 3; j++) {
                printf("%4.3f\t", expectR[i][j]);
            }
            printf("\n");
        }
        printf("--------------------------\n");

        printf("Actual Response Times\n");
        for (i = 0; i < 4; i++) {
            // System response times for QNM
            if (i == 3) {
                for (w = 0; w < 3; w++) {
                    printf("%4.3f\t",
                        actualtR[i][w] = PDQ_GetResponse(TRANS, workname[w]));
                }
            }

            // Residence times per resource
            if (i < 3) {
                for (w = 0; w < 3; w++) {
                    printf("%4.3f\t",
                        actualtR[i][w] = PDQ_GetResidenceTime(devname[i], workname[w], TRANS));
                }
            }
            printf("\n");
        }

        // Release the name storage only after PDQ is done with it
        for (w = 0; w < 3; w++) free(workname[w]);
        for (k = 0; k < 3; k++) free(devname[k]);

        return 0;
    } // main

This code also compares the actual values (meaning, those computed by PDQ) with the expected values (embedded as a 2-d array) of the residence times due to each workload at each resource, with the system response times in the last row. The "expected" values can come from any number of sources: measurements, other models, other tools, etc. This comparison forms the basis of the test code approach.
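The comparison need not be done by eye. As a minimal sketch (this is not part of the PDQ library; the helper name `count_mismatches` and the choice of a relative tolerance are assumptions made here for illustration), a test could count how many actual values stray too far from their expected counterparts:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the PDQ library.
 * Counts the entries where |actual - expect| exceeds rel_tol * |expect|.
 * A return value of 0 means every entry agrees within the tolerance.
 */
int count_mismatches(const double *expect, const double *actual,
                     size_t n, double rel_tol)
{
    int bad = 0;
    size_t i;

    assert(expect != NULL && actual != NULL && rel_tol >= 0.0);

    for (i = 0; i < n; i++) {
        if (fabs(actual[i] - expect[i]) > rel_tol * fabs(expect[i])) {
            bad++;
        }
    }
    return bad;
}
```

In the listing above it could be called once per row, e.g. `count_mismatches(expectR[i], actualtR[i], 3, 0.01)`. Applied to the system response times alone, the PDQ-R values from James's email (0.313637, 3.552873, 4.535833) checked against the expected last row (0.865, 6.122, 4.546) give two mismatches at a 1% tolerance, namely classes 1 and 2.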

The problem seen by James turns out to arise from a conflict between the way resource utilizations are computed for the new multi-server queues (released in PDQ 5.0.1) and multi-class workloads. When the PDQ lib is corrected, the agreement can be observed in this output:

    [njg]~/PDQ/Test Suite/C-PDQ% ./mulclass-open
    Expected Response Times
    0.174    1.118    0.351
    0.351    2.734    3.054
    0.340    2.270    1.142
    0.865    6.122    4.546    <--
    --------------------------
    Actual Response Times
    0.175    1.118    0.351
    0.352    2.728    3.048
    0.340    2.274    1.144
    0.866    6.120    4.543    <--

The last line in each of the above tables corresponds to the "manual" values that James was reporting in his email.
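Those values can be reproduced by hand. For an open network of single-server queues, the usual canonical result is that the residence time of a workload at a resource is its service demand divided by one minus the total utilization of that resource, R = D / (1 - U), where U is the sum over all workloads of arrival rate times demand. The sketch below is a hand calculation under that assumption, not PDQ library code, and the function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Total utilization of one resource: sum over workloads of rate * demand. */
double utilization(const double *rate, const double *demand, size_t nwork)
{
    double u = 0.0;
    size_t w;

    assert(rate != NULL && demand != NULL);

    for (w = 0; w < nwork; w++) {
        u += rate[w] * demand[w];
    }
    return u;
}

/*
 * Residence time of workload w at one resource, R = D / (1 - U).
 * Returns -1.0 if the resource is saturated (U >= 1), where the
 * open-network formula no longer applies.
 */
double residence_time(const double *rate, const double *demand,
                      size_t nwork, size_t w)
{
    double u;

    assert(w < nwork);

    u = utilization(rate, demand, nwork);
    if (u >= 1.0) {
        return -1.0;
    }
    return demand[w] / (1.0 - u);
}
```

With the rates and demands from the model above, the cpu utilization comes to about 0.45, so work1 spends roughly 0.096 / 0.55 ≈ 0.175 seconds there. Adding its dk1 and dk2 residence times gives about 0.866 seconds, in line with the last row of the actual table.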

The PDQ test cases had not kept up with new code developments, which is one of the hazards of having only severely punctuated time to work on PDQ. The new release, PDQ 5.0.2, should be available for download later this week. I'll send out an email notice at that time.