Thursday, June 26, 2014

Load Average in FreeBSD

In my Perl PDQ book there is a chapter, entitled Linux Load Average, where I dissect how the load average metric (or metrics, since there are three reported numbers) is computed in shell commands like uptime, viz.,
[njg]~/Desktop% uptime
16:18  up 9 days, 15 mins, 4 users, load averages: 2.11 1.99 1.99

For the book, I used Linux 2.6 source because it was accessible on the web with convenient hyperlinks to navigate the code. Somewhere in the kernel scheduler, the following C code appeared:

#define FSHIFT    11          /* nr of bits of precision */ 
#define FIXED_1   (1<<FSHIFT) /* 1.0 as fixed-point */ 
#define LOAD_FREQ (5*HZ)      /* 5 sec intervals */ 
#define EXP_1     1884        /* 1/exp(5sec/1min) fixed-pt */ 
#define EXP_5     2014        /* 1/exp(5sec/5min) */ 
#define EXP_15    2037        /* 1/exp(5sec/15min) */ 

#define CALC_LOAD(load,exp,n) \
 load *= exp; \
 load += n*(FIXED_1-exp); \
 load >>= FSHIFT;

where the C macro CALC_LOAD computes the following equation \begin{equation} Q_m(t) = Q_m(t-1) \; e^{-5/(60 m)} + n(t) \; \big[ 1 - e^{-5/(60 m)} \big] \label{eqn:ema} \end{equation} This equation, it turns out, corresponds to the exponential moving average (EMA) of the run-queue length. The EMA is generally more familiar in the context of financial analysis (from whence the idea for the OS performance metric was lifted circa 1965). The CALC_LOAD macro is updated (internally) every 5 seconds and the $m$-index in eqn.\eqref{eqn:ema} refers to the weight associated with each of the 1, 5, or 15 minute averaging windows (the reason there are three numbers displayed in uptime).

The backstory is that at the time I was writing the book chapter (about ten years ago), Linux was potentially at risk of becoming swept up in the infamous SCO Group law suit that was aimed to collect licensing fees because SCO (thought it) had inherited the original AT&T Unix license from its acquisition of Novell and Unixware (insert long story here).

Eventually, Mr. Torvalds went public to assert that Linux was not Unix since he had written all the code himself. Although this was certainly true in the beginning, by that time Linux was open source and complicated (like Unix). What if someone had contributed the load average code by lifting it from some Unix source? Would he know or recall that detail? And anyway, why reinvent the wheel? A lot of modern programming is about reuse. More importantly, I was writing about Linux code and I didn't want the publication of my book to inadvertently get me sucked into the legal vortex: not that there would've been any rational cause, but rationality did not seem to be SCO Group's strong suit. Echoes of their law suit are still present today, even though SCO Group went bankrupt.

Motivated by all these machinations at that time, I decided to see if I could find other kernel code and compare the implementation of load average with the Linux source. If the codes were different, it would be pretty clear that every Unix and Linux variant implements their own code. Although I had Solaris running at the time, I didn't have access to the source, so FreeBSD was the obvious other candidate.

The completely unexpected result of all this investigation was that although FreeBSD had hooks for calling the load average function, there was no implementation! Apparently, it was not considered important by FreeBSD developers. For example:

Subject: Re: System call for querying system load?
Newsgroups: comp.unix.programmer,
Date: 1996/09/15

In <51cjj1$> writes:
> I'm looking for a system call/subroutine that will 
> let me query the system load. ...

There isn't any.  Retrieving the load average is a 
very non-portable operation, apparently nobody in the 
standards committees thought about a system call for this.
Nonetheless, this NULL result convinced me that each Unix and Linux variant differs and therefore Linus Torvalds would be vindicated in court—if it ever actually got that far. Fortunately, it didn't and my book was published in 2004 without incident.

Fast forward to yesterday and Guerrilla alumnus Stefan Parvu was telling me that he recently needed to replace Linux with FreeBSD to achieve throughput performance in certain storage systems (insert another long story). In a fit of deja vu, I asked him if the load average was now implemented in FreeBSD and he told me it was. Half jokingly, I suggested that while he was down in the basement of the kernel perhaps he could send me the code. He did and here it is:

* Compute a tenex style load average of a quantity on
* 1, 5 and 15 minute intervals.
static void
loadav(void *arg)
 int i, nrun;
 struct loadavg *avg;

 nrun = sched_load();
 avg = &averunnable;

 for (i = 0; i < 3; i++)
 avg->ldavg[i] = (cexp[i] * avg->ldavg[i] +
 nrun * FSCALE * (FSCALE - cexp[i])) >> FSHIFT;

 * Schedule the next update to occur after 5 seconds, 

Clearly different code from Linux but it still ends up calculating eqn.\eqref{eqn:ema}. I'll discuss this revelation and more in the upcoming Guerrilla classes.

No comments: