Thursday, June 26, 2014

Load Average in FreeBSD

In my Perl PDQ book there is a chapter, entitled Linux Load Average, where I dissect how the load average metric (or metrics, since there are three reported numbers) is computed in shell commands like uptime, viz.,
[njg]~/Desktop% uptime
16:18  up 9 days, 15 mins, 4 users, load averages: 2.11 1.99 1.99

For the book, I used Linux 2.6 source because it was accessible on the web with convenient hyperlinks to navigate the code. Somewhere in the kernel scheduler, the following C code appeared:


#define FSHIFT    11          /* nr of bits of precision */ 
#define FIXED_1   (1<<FSHIFT) /* 1.0 as fixed-point */ 
#define LOAD_FREQ (5*HZ)      /* 5 sec intervals */ 
#define EXP_1     1884        /* 1/exp(5sec/1min) fixed-pt */ 
#define EXP_5     2014        /* 1/exp(5sec/5min) */ 
#define EXP_15    2037        /* 1/exp(5sec/15min) */ 

#define CALC_LOAD(load,exp,n) \
 load *= exp; \
 load += n*(FIXED_1-exp); \
 load >>= FSHIFT;

where the C macro CALC_LOAD computes the following equation

Friday, June 6, 2014

The Visual Connection Between Capacity And Performance

Whether or not computer system performance and capacity are related is a question that comes up from time to time, especially from those with little experience in either discipline. Most recently, it appeared on a Linked-in discussion group:
"...the topic was raised about the notion that we are Capacity Management not Performance Management. It made me think about whether performance is indeed a facet of Capacity, or if it belongs completely separate."

As a matter of course, I address this question in my Guerrilla training classes. There, I like to appeal to a simple example—a multiserver queue—to exhibit how the performance characteristics are intimately related to system capacity. Not only are they related but, as the multiserver queue illustrates, the relationship is nonlinear. In terms of daily operations, however, you may choose to focus on one aspect more than the other, but they are still related nonetheless.

Thursday, June 5, 2014

Importing an Excel Workbook into R

The usual route for importing data from spreadsheet applications like Excel or OpenOffice into R involves first exporting the data in CSV format. A newer and more efficient CRAN package, called XLConnect (c. 2011), facilitates reading an entire Excel workbook and manipulating worksheets and cells programmatically from within R.

XLConnect doesn't require a running installation of Microsoft Excel or any other special drivers to be able to read and write Excel files. The only requirement is a recent version of a Java Runtime Environment (JRE). Moreover, XLConnect can handle older .xls (BIFF) as well as the newer .xlsx (Office XML) file formats. Internally, XLConnect uses Apache POI (Poor Obfuscation Implementation) to manipulate Microsoft Office documents.

As a simple demonstration, the following worksheet, from a Guerrilla Capacity Planning workbook, will be displayed in R.

First, the Excel workbook is loaded as an R object:

Monday, June 2, 2014

Guerrilla Training in July 2014

Time to register online for the upcoming GBoot (July 17) and GCaP (July 21) training classes. This is your fast track for learning how to do enterprise performance and capacity management.

Entrance Larkspur Landing hotel Pleasanton California

Classic topics include:

  • How performance metrics are related to each another
  • Queueing theory for those who can't wait
  • How to quantify scalability with the Universal Scalability Law (USL)
  • IT Infrastructure Library (ITIL) for Guerrillas
  • The Virtualization Spectrum from hyperthreads to hyperservices

All classes are held at the Larkspur Landing hotel in Pleasanton, California. Directions are available on the registration page. Larkspur Landing also provides free wi-fi Internet in their residence-style rooms as well as the training room.

Saturday, April 26, 2014

Guerrilla Data Analysis Class, May 19th

This is your fast track to enterprise performance and capacity management with an emphasis on applying R statistical tools to your performance data.

Discounts are available for 3 or more people from the same company. Enquire for details.

Entrance Larkspur Landing hotel Pleasanton California

All classes are held at our lovely Larkspur Landing location in Pleasanton. Larkspur Landing also provides free Wi-Fi Internet access in their residence-style rooms, as well as our classroom.

Some of the topics covered will include:

  • Introduction to using R
  • How to Detect Bad Data
  • Measurement Errors and Analysis
  • Multivariate Linear and Nonlinear Regression
  • Quantitative Scalability Analysis
  • Statistical forecasting techniques for CaP
  • Machine learning algorithms applied to CaP Data
  • Wild Not Mild Data: SQL access, web traffic, data recovery
  • Taming the Data Torrent with PCA and PerfViz techniques

Attendees should bring their laptops to the class as course materials are provided on a flash drive and calculational tools like OpenOffice, Excel and R will be useful.

Before registering online, take a look at what former students have said about their Guerrilla training experience.

Tuesday, April 1, 2014

Melbourne's Weather and Cross Correlations

During a lunchtime discussion among recent GCaP class attendees, the topic of weather came up and I casually mentioned that the weather in Melbourne, Australia, can be very changeable because the continent is so old that there is very little geographical relief to moderate the prevailing winds coming from the west.

In general, Melbourne is said to have a mediterranean climate, but it can also be subject to cold blasts of air coming up from Antarctic regions at any time, but especially during the winter. Fortunately, the island state of Tasmania acts as something of a geographical barrier against those winds. Understanding possible relationships between these effects presents an interesting exercise in correlation analysis.

Thursday, March 13, 2014

Performance of N+1 Redundancy

How can you determine the performance impact on SLAs after an N+1 redundant hosting configuration fails over? This question came up in the Guerrilla Capacity Planning class, this week. It can be addressed by referring to a multi-server queueing model.

N+1 = 4 Redundancy

We begin by considering a small-N configuration of four hosts where the load is distributed equally to each of the hosts. For simplicity, the load distribution is assumed to be performed by some kind of load balancer with a buffer. The idea of N+1 redundancy is that the load balancer ensures all four hosts are equally utilized prior to any failover.

Consider the constraint that none of the hosts should use more than 75% of their available capacity: the blue areas on the left side of Fig. 1. The total consumed capacity is assumed to be $4 \times 3/4 = 3$ or 300% of the total host configuration (rather than all 4 hosts or 400% capacity). Then, when any single host fails, its lost capacity is compensated by redistributing that same load across the remaining three available hosts (each running 100% busy after failover). As we shall show in the next section, this concept involves a misconception.

The circles in Fig. 1 represent hosts and boxes represent incoming requests buffered at the load-balancer. The blue area in the circles signifies the available capacity of a host, whereas white signifies unavailable capacity. When one of the hosts fails, its load must be redistributed across the remaining three hosts. What Fig. 1 doesn't show is the performance impact of this capacity redistribution.