Friday, April 16, 2010

Significant Figures in R and Rounding

This is a follow-on to my previous post about determining significant digits, or sigdigs, in performance and capacity management calculations. See Significant Figures in R and Info Zeros.

Once we know how to identify significant digits, we will inevitably be faced with rounding the result of a calculation to the least number of sigdigs. Whereas the signif() function in R suffered from truncating trailing info-zeros in measured values, when it comes to rounding, signif shines. Better yet, it agrees with Algorithm 3.2 in my GCaP book. Let's see how well it does.

Example (Old Rule): Consider the number

7.245

which, from the previous post, we know has 4 sigdigs, and now we wish to round it to 3 sigdigs. The 3rd sigdig is '4' and the 4th sigdig (the one we plan to drop) is '5'. According to the old-school rule, the '5' tells us we should round the preceding '4' up, i.e., increment the '4' to a '5', and indeed that is what Excel does. The function ROUND(7.245,2) produces:

7.25

The second argument in ROUND (the '2') refers to the 2nd decimal place, rather than the number of rounded sigdigs.
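
To relate the two conventions, here is a small R sketch that converts a number of sigdigs into the equivalent decimal-place argument. The helper name sigdigs_to_places is hypothetical (it is not part of R or Excel), and the sketch assumes a nonzero, finite number.

```r
# Hypothetical helper (not part of R or Excel): the number of decimal places
# that corresponds to keeping n significant digits of a nonzero number x.
sigdigs_to_places <- function(x, n) {
  n - 1 - floor(log10(abs(x)))
}

sigdigs_to_places(7.245, 3)    # 2, the second argument used in ROUND above
sigdigs_to_places(24.8514, 3)  # 1
```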

The old-school rule is now considered to be biased by the odd-even parity of nearby digits. As explained in Chapter 3, the new-school rule takes that effect into account. Suppose we have a generic number nn..nnXYZ and we want to round it to the position where the digit X now sits. How do we do it? In pseudocode, the new rule (Algorithm 3.2) tells us how:
a. Scan from left to right and examine digit Y
b. If Y < 5 then goto (i)
c. If Y > 5 then set X = X + 1 and goto (i)
d. If Y == 5 then examine Z
e. If Z >= 1 then set Y = Y + 1 and goto (a)
f. If Z is blank or a string of zeros then
g. Examine the parity of X (odd/even)
h. If X is odd then set X = X + 1
i. Drop Y and all trailing digits
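
The steps above can be sketched in R roughly as follows. This is only an illustration under stated assumptions, not the Perl implementation mentioned in this post: the function name round_half_even_sig is hypothetical, the number is passed as a plain decimal string (no exponent notation) so that binary floating point cannot disturb the digits, and steps (d) and (e) are collapsed into a single test for any nonzero digit after Y.

```r
# Hypothetical sketch of the rounding steps listed above.
# s: the number as a decimal string, e.g. "7.245"; n: sigdigs to keep.
round_half_even_sig <- function(s, n) {
  chars <- strsplit(s, "")[[1]]
  pos   <- which(chars %in% as.character(0:9))  # positions of the digits in s
  d     <- as.integer(chars[pos])               # digits with the point removed
  ix    <- which(d != 0)[1] + n - 1             # index of digit X
  if (is.na(ix) || ix >= length(d)) return(as.numeric(s))  # nothing to drop
  y  <- d[ix + 1]                               # digit Y
  z  <- d[-seq_len(ix + 1)]                     # digits Z (possibly none)
  # Round up if Y > 5, or if Y == 5 and either Z is nonzero or X is odd
  up <- y > 5 || (y == 5 && (any(z != 0) || d[ix] %% 2 == 1))
  d[(ix + 1):length(d)] <- 0L                   # step (i): drop Y and the rest
  j <- ix
  while (up && j >= 1) {                        # X = X + 1, carrying if X was 9
    d[j] <- (d[j] + 1L) %% 10L
    up   <- d[j] == 0L
    j    <- j - 1
  }
  chars[pos] <- as.character(d)
  if (up) chars <- append(chars, "1", after = pos[1] - 1)  # e.g. 9.96 -> 10.0
  as.numeric(paste(chars, collapse = ""))
}

round_half_even_sig("7.245", 3)      # 7.24 (X = 4 is even, so it stays)
round_half_even_sig("726.83500", 5)  # 726.84 (X = 3 is odd, so it goes up)
```

Dropped digits are replaced by zeros rather than deleted, so digits to the left of the decimal point keep their place value.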

I have also implemented this algorithm in Perl.

Example (New Rule): Round 7.245 to 3 sigdigs using the new rule. Start by rewriting the number without the decimal point. To make the steps clearer, I'll write three rows, with the first row enumerating the position of each digit of our number, which is shown in the second row:

1 2 3 4 5
7 2 4 5 _
_ _ x y z

The third row indicates the alignment of the X, Y and Z labels in the rounding algorithm with X in the 3rd position because we want to round to 3 sigdigs.

From step (a), Y equals 5, so step (d) says to look at Z, which is blank. Since it's blank, step (g) says to look at the parity of X = 4, which is even, so step (h) leaves X unchanged. Finally, step (i) tells us to drop everything from digit Y to the right. The result of rounding to 3 sigdigs in this way is therefore:

7.24

which is different from the Excel result. Let's compare signif in R with Algorithm 3.2:

> signif(7.245,3)
[1] 7.24

which is in agreement with the new rule, in this case. Unlike Excel ROUND, the second argument in signif is the number of sigdigs to be displayed. Let's try some other examples in R.

Example (R signif): The test numbers in this table

      Number  SD   Algor
1   62.53470   4   62.53
2    3.78721   3    3.79
3  726.83500   5  726.84
4   24.85140   3   24.90

show the value to be rounded in the first column, the number of rounded significant digits (SD) in the second column, and the rounded result (Algor) obtained by applying Algorithm 3.2.

The following R code appends the value obtained with signif

# tabl is assumed to be a data frame holding the Number, SD and Algor columns above
for (i in seq_len(nrow(tabl))) {
    tablstr <- sprintf("%d\t%8.4f\t%d\t%6.2f\t%6.2f\n",
                       i, tabl$Number[i], tabl$SD[i], tabl$Algor[i],
                       signif(tabl$Number[i], tabl$SD[i]))
    cat(tablstr)
}
to the Rsn column in this table:

     Number  SD   Algor     Rsn
1   62.5347   4   62.53   62.53
2    3.7872   3    3.79    3.79
3  726.8350   5  726.84  726.84
4   24.8514   3   24.90   24.90
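
The same comparison can also be written without an explicit loop. The sketch below is one possible way to do it, assuming the test values are held in a data frame named tabl with the column names used above; the values are copied from the table, and the Rsn column is filled by a single vectorized call.

```r
# Test values copied from the table above; tabl is an assumed data frame name.
tabl <- data.frame(
  Number = c(62.5347, 3.78721, 726.835, 24.8514),
  SD     = c(4, 3, 5, 3),
  Algor  = c(62.53, 3.79, 726.84, 24.90)
)

# signif recycles its digits argument, so each Number is rounded to its own SD.
tabl$Rsn <- signif(tabl$Number, tabl$SD)

# TRUE only if every signif result matches the hand-rounded Algor value.
isTRUE(all.equal(tabl$Rsn, tabl$Algor))
print(tabl)
```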

Since the Algor and Rsn values agree for all four test numbers, it looks like signif incorporates the new rounding rule, so it can be used as is, straight out of the box. Nice.
