Showing posts with label storage. Show all posts

Wednesday, May 15, 2013

Exponential Cache Behavior

Guerrilla grad Gary Little observed certain fixed-point behavior in simulations where disk IO blocks are updated randomly in a fixed size cache. For his python simulation with 10 million entries (corresponding to an allocation of about 400 MB of memory) the following results were obtained:
  • Hit ratio (i.e., occupied) = 0.3676748
  • Miss ratio (i.e., inserts) = 0.6323252

In other words, only 63.23% of the blocks will ever end up inserted into the cache, irrespective of the actual cache size. Gary found that WolframAlpha suggests the relation: \begin{equation} \dfrac{e-1}{e} \approx 0.6321 \label{eq:walpha} \end{equation} where $e = \exp(1)$. The question remains, however, where does that magic fraction come from?
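One minimal model consistent with the numbers quoted (an assumption on my part, not necessarily Gary's exact simulation) is N uniformly random block references into a cache with N slots: the fraction of references that land on an empty slot converges on 1 - 1/e, and the fraction landing on an already-occupied slot on 1/e. A quick sketch:

```python
import random

def cache_fixed_point(n=1_000_000, seed=1):
    """Simulate n random block updates into a cache with n slots.

    Returns (hit_ratio, miss_ratio), where a 'hit' finds the slot
    already occupied and a 'miss' inserts into an empty slot.
    """
    random.seed(seed)
    occupied = [False] * n
    hits = inserts = 0
    for _ in range(n):
        slot = random.randrange(n)
        if occupied[slot]:
            hits += 1
        else:
            occupied[slot] = True
            inserts += 1
    return hits / n, inserts / n

hit, miss = cache_fixed_point()
# miss converges on 1 - 1/e ~ 0.6321 and hit on 1/e ~ 0.3679,
# matching the fixed point observed above
```

The limit follows from the fact that a given slot is missed by all n references with probability (1 - 1/n)^n, which tends to 1/e as n grows.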

Wednesday, August 1, 2012

Little's Law and IO Performance

Next Tuesday, August 7th, I'll be presenting at the Northern California CMG meeting*. My talk will be about Little's law and its implications for storage IO performance.

As a performance analyst or capacity planner, you already know all about Little's law—it's elementary. Right? Therefore, you completely understand:

  1. How Little's law relates inventory and manufacturing cycle time
  2. John Little (now 84) is not a performance analyst
  3. John Little did not invent Little's law
  4. Little's law was known to A. K. Erlang more than 100 years ago
  5. That there are actually three (not two) versions of Little's law
  6. Little's law is not based on queueing theory
  7. Little's law expresses the fact that response time decreases with increasing throughput
  8. However, on the SPEC website you'll see that response time increases with increasing throughput. WTF !!!?

If you're feeling slightly bewildered about all this, you really should come along to my talk (assuming you're in the area). Otherwise, you can read the slide deck embedded below.


3-dimensional view of Little's law

I'll show you how I discovered the resolution to the apparent contradiction between items 7 and 8 (above) by representing Little's law in 3 dimensions. It's very cool! Even John Little doesn't know about this.
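For concreteness, here's one standard way the tension between items 7 and 8 can arise (this is a sketch with made-up illustrative numbers, not the 3-dimensional construction from the talk): in a closed system the response time law R = N/X - Z makes R fall as throughput X rises, while for an open M/M/1 queue R = S/(1 - X S) makes R climb as X rises.

```python
def closed_R(N, X, Z):
    """Response time law for a closed system: R = N/X - Z,
    with N requests in the system and think time Z."""
    return N / X - Z

def open_R(S, X):
    """M/M/1 residence time: R = S / (1 - X*S),
    valid for throughput X < 1/S."""
    return S / (1 - X * S)

# Closed view: N = 20 requests, Z = 5 s think time.
# Higher throughput => lower response time.
closed = [closed_R(20, X, 5.0) for X in (1.0, 2.0, 3.0)]  # 15.0, 5.0, 1.67

# Open view: S = 0.5 s service time.
# Higher throughput => higher response time (queueing delay grows).
opened = [open_R(0.5, X) for X in (0.5, 1.0, 1.5)]  # 0.667, 1.0, 2.0
```

Same law, two different operational settings, opposite-looking trends.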

Oh yeah, and I'll also explain how Little's law reveals why it's possible to make your application IOs go 10x to 100x faster. IOPS bandwidth has become irrelevant.

Some of these conclusions are based on recent work I've been doing for Fusion-io. You might've heard of their billion IOPS benchmark and, more recently, of their association with SSDAlloc software from Princeton University.


* If you're not a ncCMG member, it's a one-time $25 entry fee, which then makes you a life member. See the bottom of their web page for payment and contact details.

Wednesday, March 7, 2012

The SSD World Will End in 2024

So says the Non-Volatile Systems Lab at UC San Diego. The claim is that, in order to achieve higher densities, flash manufacturers must sacrifice both read and write latencies. I haven't had time to explore this claim in any detail, but I thought it might be useful for you to know about it. Some highlights include:
  • They tested 45 different NAND flash chips from six vendors that ranged in size from 72 nm circuitry to the current 25 nm technology.
  • They then took their test results and extrapolated them to the year 2024, when NAND flash development road maps show flash circuitry is expected to be only 6.5 nm in size. At that point, read/write latency is expected to increase by a factor of two or more.
  • They did not use specialized NAND flash controllers such as those used by Intel, OCZ or Fusion-io. Their results can be viewed as "optimistic" because they didn't include latency added through error correction or garbage collection algorithms.
  • Considering the diminishing returns on performance versus capacity, Grupp said, "it's not going to be viable to go past 6.5 nm ... 2024 is the end."

The technical paper, entitled The Bleak Future of NAND Flash Memory (PDF), was presented and published at the FAST'12 conference held in San Jose, CA on February 14–17, 2012.

Related post: Green Disk Sizing

Friday, February 3, 2012

Green Disk Sizing

I finally got around to completing item 5 on my 2011 list concerning electrical power consumed by a magnetic hard disk drive (HDD). The semi-empirical statement is:

Power ∝ N_platters × Ω^2.8 × D^4.6    . . .    (1)

where N_platters is the number of platters on the spindle, Ω is the rotational speed in revolutions per minute (RPM), and D is the platter diameter in inches. The power consumed is then measured in Watts.

In principle, this makes (1) valuable for doing green HDD storage capacity planning. The bad news is, it is not in the form of an equation but a statement of proportionality, so it can't be used to calculate anything as it stands. More on that shortly. The good news is that all of the quantities in (1) can be read off from the data sheet of the respective disk vendor. Note that the disk capacity, e.g., GB (the usual capacity planning metric) does not appear in (1).
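Although (1) can't produce absolute Watts, the unknown constant of proportionality cancels in a ratio, so you can still compare two candidate drives directly from their data-sheet values. A sketch (the two example drives below are hypothetical):

```python
def relative_power(n1, rpm1, d1, n2, rpm2, d2):
    """Ratio of HDD power draw for drive 1 vs drive 2 using the
    proportionality Power ~ N_platters * RPM**2.8 * D**4.6.
    The unknown constant cancels, so only the ratio is meaningful."""
    power = lambda n, rpm, d: n * rpm**2.8 * d**4.6
    return power(n1, rpm1, d1) / power(n2, rpm2, d2)

# e.g. a 4-platter 15,000 RPM 3.5" drive vs a
# 2-platter 5,400 RPM 2.5" drive (illustrative numbers)
ratio = relative_power(4, 15000, 3.5, 2, 5400, 2.5)
# the big drive draws on the order of 160x the power
```

Note how hard the exponents bite: spinning 2.8x faster on platters 1.4x wider costs two orders of magnitude in power.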

The outstanding question is: where do those funny non-integral exponents come from?

Sunday, January 1, 2012

My Year in Review 2011

Some days I wonder if I ever actually accomplish anything anymore. Maybe it's time to just pack it in and become a greeter at Walmart. I know a bit about how queues work, so that should put me a few notches ahead of the competition. And I would expect the competition to be fierce because it's a pretty cushy job; but not every day, apparently.

Before taking the big leap, I decided it might be a good idea to note down some of the technical projects I've worked on this year (over and above the daily grind):

Tuesday, September 6, 2011

How Much Wayback for CaP?

How much data do you need to retain for meaningful capacity planning and performance analysis purposes? Sounds like one of those "how long is a piece of string?" questions, and I've never really thought about it in any formal way, but it occurred to me that 5 years is not an unreasonable archival period.

Mister Peabody and Sherman in front of the WABAC machine

My reasoning goes like this:

Monday, October 26, 2009

This Apple Does Fall Far From The B-Tree

The old adage that the apple doesn't fall far from the tree doesn't apply to Apple the corporation. According to ArsTechnica today, Apple abruptly abandoned its open-source project to port Sun's ZFS as the filesystem for Mac OS X on October 23rd.

The speculation is that Sun licensing fees may have been viewed as a roadblock to adoption, or possibly that there are growing concerns that Oracle's acquisition of Sun could cause other problems. In the meantime, Apple is hiring engineers to build its own advanced filesystem, instead of adopting either ZFS or its Linux counterpart, Btrfs.

Saturday, January 3, 2009

JournalSpace.gone

So, this is Web 2.0? And I'm supposed to put my entire existence on a (black) Cloud!? Do you know who is managing your web services? This is how you might find out.

Part of me still wants to believe it's a slightly premature April Fool hoax. I mean, just look at the filename on that HTML page. But I checked slashdot and it's still there. So, it must be true. :-\ The earlier innuendo (on "Tuesday") that it might have been MacOS X going nutzoid, has now been narrowed to a (the?) sysadm going postal on the database. Can you say, "Secondary storage"?

Monday, January 28, 2008

Cisco Systems: "It's the switch, stupid!"

Today, Cisco Systems (San Jose, California) announced the mother of all switching platforms, the Nexus 7000 Series, aimed at what it calls Data Center 3.0 (analogous to Web 2.0, I presume. I missed Data Center 2.0). Cisco is essentially trying to eliminate the need for separate storage networks, server networks, routing, switching and virtualization, by combining them all into a single unified fabric and managing it through Cisco's new proprietary NX-OS ("nex-os", get it?) operating system.

The Nexus 7000 will deliver up to 15 Tbps of switching capacity in a single chassis, with 512 ports for 10 Gbps ethernet, and eventually it is slated to be delivered with 40 Gbps and 100 Gbps ports. Some of the claimed performance speeds-and-feeds appear rather breathtaking:

  • Copy the entire Wikipedia database in 10 milliseconds.
  • Copy the entire searchable Internet in 7.5 minutes.
  • Download all 90,000 Netflix movies in 38.4 seconds.
  • Send a high-resolution 2 megapixel photo to everyone on earth in 28 minutes.
  • Add a Web server in 9 seconds rather than 90–180 days.
  • Transmit the data in all U.S. academic research libraries (estimated at more than 2,000 TB) in 1.07 seconds.

If nothing else does it, the 3 significant digits in the last claim tell you this is marketing-speak (read: calculated using max bandwidth assumptions), so a liberal dusting of sodium chloride is recommended.
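In fact, a back-of-envelope check suggests the salt shaker is warranted. Even granting the full rated 15 Tbps end to end (a generous assumption), moving 2,000 TB takes closer to 18 minutes than 1.07 seconds:

```python
# Back-of-envelope check of the last claim, assuming the full
# 15 Tbps switching capacity is available for a single transfer.
data_bits = 2_000e12 * 8   # 2,000 TB (decimal terabytes) in bits
rate_bps = 15e12           # 15 Tbps
seconds = data_bits / rate_bps
# seconds ~ 1067, i.e. about 17.8 minutes, not 1.07 seconds
```

Make of the three-orders-of-magnitude gap what you will.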

The concept of a "data center" is currently undergoing a serious transformation and it will be interesting to see how this kind of mega-switch stacks up against alternative approaches, such as Google's Data Center in a Box.

Thursday, March 1, 2007

Disk Storage Myth Busters

Interesting myth-busting synopsis on disk drive technologies entitled: "Everything You Know About Disks Is Wrong" over at StorageMojo. Some of the key myths busted by extensive research at Google and CMU include:

  • Disk drives have a field failure rate 2 to 4 times greater than shown in vendors' specs.

  • Reliability of both cheap and expensive drives is comparable.

  • Disk drive failures are highly correlated, thus violating one of the assumptions behind RAID Level-5.

  • Disk drive failure rates rise steadily with age rather than following the so-called bathtub curve.


Storage vendors such as NetApp and EMC have responded. David Morgenstern asks in eWeek, why would anyone have trusted the MTBFs that appear in vendor glossies in the first place?

Background on failure rate analysis can be found in Chapter 1 ("Time: The Zeroth Performance Metric") of my Perl::PDQ book.