CMG’09: Solaris/Linux Performance Measurement and Tuning (part 2)

10 12 2009

Adrian Cockcroft (Netflix)

http://www.slideshare.net/adrianco/solaris-linux-performance-tools-and-tuning

My notes:

  • Netflix releases every 2 weeks, first to beta, and tracks everything
  • Everything at Netflix (or in web-land in general) is instrumented, in libraries, so that instrumentation comes for free
  • Beware of kernel tweaks: they were good for older kernels, but kernels are now a lot more auto-tuned
  • On Solaris, microstate data very useful
  • With Poisson arrivals, steady state and N identical servers, response time is approximately R = S / (1 - utilization^N), where S = service time and utilization = throughput * S (per server)
  • Issues with this simplistic model: traffic is bursty, service time varies, the N servers don’t all process the same thing, and virtual hardware makes it a lot harder to figure out
  • Measurement errors (especially around measuring time)
  • So don’t bother about utilization
  • Load average on Linux is broken: it includes tasks blocked on disk activity, not just runnable ones
  • I/O wait is fundamentally broken: the CPU never waits for I/O per se
  • Cockcroft Headroom Plots: 99th-%ile against response time
  • On Linux, the best way to track I/O per process is with SystemTap
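
To get a feel for the response-time approximation in these notes, here is a minimal Python sketch. It assumes the usual multi-server form R = S / (1 - U^N), with U the per-server utilization between 0 and 1. The function name and the sample values are mine, not from the talk.

```python
def response_time(service_time, utilization, servers=1):
    """Approximate mean response time for N identical servers.

    Assumes Poisson arrivals and steady state; utilization is the
    per-server busy fraction, 0 <= utilization < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    if servers < 1:
        raise ValueError("servers must be >= 1")
    return service_time / (1 - utilization ** servers)


# Illustrative values only: 10 ms service time at increasing load.
for u in (0.5, 0.8, 0.95):
    print(u, [round(response_time(0.010, u, n), 4) for n in (1, 2, 8)])
```

With a single server the response time doubles at 50% utilization, while with more servers the curve stays flat longer and then rises sharply, which is one reason a raw utilization number says little on its own.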




CMG’09: “How ‘normal’ is your IT Data?”

9 12 2009

Dr. Mazda Marvasti

My notes on this very informative talk (the best I’ve seen today). The goal of the study was to test the normal-distribution assumption built into the newer IT monitoring tools, which rely on it to create dynamic thresholds for the various metrics they collect.

  • Analyzed 4 workloads: ad-serving on LAMP, bond processing, stock trades and some online application
  • Test for normal distribution: Kolmogorov-Smirnov, as it makes no assumption about the distribution of the data
  • Used average shifted histograms for the test
  • Results: none of the basic metrics (OS, applications, business-oriented) are normally distributed, neither are their averages, when looking at blocks of 1 hour
  • For instance Monday 9am does not look at all like Tuesday 9am
  • Also, Mondays 9am don’t converge on average, meaning that their averages are not independent and/or not identically distributed
  • Business cycles matter very much in analysis, spectral analysis can help!
  • Correlations examined using Spearman’s rank correlation coefficient (though results not presented).
  • Conclusion: go for non-parametric analysis, known distributions don’t really apply
  • If you enable dynamic thresholds based on normal distribution assumptions, expect a 10x increase in the number of alerts, though it’s possible to mitigate this with topology rules (e.g. “don’t alert me if event 1 and event 2 coöccur”)
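
As a rough illustration of the normality check, here is a Python sketch of the Kolmogorov-Smirnov statistic against a normal fitted to the sample. It only computes the statistic (the largest gap between the empirical and fitted CDFs). Turning it into a proper test needs critical values that account for the fitted parameters, which I leave out. The data are synthetic and the function name is mine; this is not the procedure used in the study.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_statistic_vs_normal(samples):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(samples)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Synthetic data: evenly spaced normal quantiles, and the same points
# exponentiated to produce a long right tail.
n = 200
bell = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(x) for x in bell]

print(round(ks_statistic_vs_normal(bell), 3))
print(round(ks_statistic_vs_normal(skewed), 3))
```

The long-tailed sample sits much further from its fitted normal than the bell-shaped one, which is the kind of gap that makes mean-plus-sigma thresholds fire too often.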

My take on this: IT data analysis is challenging. One question is: how much is it worth, i.e. at what scale do you get your money back (and more) by getting this type of fairly sophisticated analysis and what kind of return can you expect of it? While the answer depends on the nature of the business conducted, I’m curious to see whether it’s bigger shops with expensive applications, cloud-scale companies or whether this is going to percolate toward the smaller web shops, integral to an Infrastructure-as-a-Service offering?

Stay tuned…





CMG’09: “How do you analyze 100,000s of servers?”

9 12 2009

Charles Loboz (Microsoft)

  • No homogeneous software/hardware/applications
  • Access is often limited (e.g. Hotmail servers are off-limits)
  • In the old days, 1 server analyzed per day
  • Stopped using averages and stddev (because data are not normal)
  • Built 10-bin histograms for utilization
  • Even that is limited, because long tails are the ones triggering issues (e.g. bad queries triggering load, then all queries will pile up)
  • No one cares about utilization (except data geeks), only performance matters
  • Estimate utilization impact on performance with a “Performance Impact Factor” (PIF): a weighted average over the histogram bins, with the high-utilization bins weighted more heavily to make long tails more obvious; one PIF each for CPU, network and I/O

Recipe

  • Compute histograms
  • Compute PIFs for each server
  • Cross-tabulate PIFs to server names to tag servers as underused, overloaded, etc.
  • Store everything in a database
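
The recipe above can be sketched in a few lines of Python. The talk’s actual weights and thresholds are not in my notes, so the quadratic weights, the cut-off values and the server names below are all assumptions for illustration.

```python
def utilization_histogram(samples, bins=10):
    """Fraction of samples per bin, for utilization values in 0..100."""
    counts = [0] * bins
    width = 100 / bins
    for value in samples:
        index = min(int(value // width), bins - 1)  # 100% goes in the top bin
        counts[index] += 1
    total = len(samples)
    return [c / total for c in counts]


def performance_impact_factor(histogram):
    """Weighted average of the bins, favoring high utilization.

    The quadratic weights are an assumption for illustration.
    """
    bins = len(histogram)
    weights = [((i + 1) / bins) ** 2 for i in range(bins)]
    return sum(h * w for h, w in zip(histogram, weights))


def tag_server(pif, cold=0.05, hot=0.5):
    """Hypothetical thresholds; tune them against real data."""
    if pif < cold:
        return "underused"
    if pif > hot:
        return "overloaded"
    return "normal"


# Made-up CPU samples (percent busy) for two hypothetical servers.
servers = {
    "web-01": [3, 5, 4, 6, 2, 8, 5, 4],
    "db-01": [40, 55, 95, 98, 97, 60, 99, 96],
}
for name, samples in servers.items():
    pif = performance_impact_factor(utilization_histogram(samples))
    print(name, round(pif, 3), tag_server(pif))
```

The same three functions would be run once per resource (CPU, network, I/O), giving each server one tag per resource to cross-tabulate.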

Pitfalls

  • PIF averages don’t mean anything
  • It’s good at spotting a “dead-cold” server, but it can’t tell you that you have an issue, only that you have to investigate




At CMG’09 today

9 12 2009

On paper it looked like a scientific approach to performance management, born in the mainframe days when computers were expensive. Now it’s cloud-scale that matters (and an ailing world economy if you’re not a bank) so managing capacity rigorously (and in an automated fashion) makes sense.

So far no breakthrough though, it’s a bit too applied for my taste. Let’s see what the next sessions hold in store.





Looking into system performance of an Oracle data warehouse

9 10 2009

Introduction

This is the start of an ongoing investigation into the system performance of an Oracle 10.2 data warehouse while it is being loaded. The database server has 2 real storage volumes (called dw-clear and dw-encrypt) and 1 virtual one (dw-encrypt-u) used to decrypt data on the fly. Most of the data and the I/O are on the dw-clear volume.

System performance data have been collected via sadc -d to capture per-device statistics. The data are then extracted using sadf -d filename -- -d -b -d. The summary is available here as a CSV. It’s a large table of block I/O stats, CPU stats and per-device I/O stats, suitable for importing into R.
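
For readers who would rather not go through R, here is a small Python sketch of reading that kind of semicolon-separated export. It assumes each block starts with a ‘#’-prefixed header line naming the columns, which may differ between sysstat versions. The sample rows and the host name are made up and are not from the real data set.

```python
import csv
import io


def parse_sadf(text):
    """Yield one dict per data row of semicolon-separated sadf output.

    Assumes each block starts with a '#'-prefixed header line naming
    the columns; a new header line switches the column names.
    """
    header = None
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row:
            continue
        if row[0].startswith("#"):
            row[0] = row[0].lstrip("# ")
            header = row
        elif header is not None:
            yield dict(zip(header, row))


# Hypothetical sample, not taken from the real data set.
sample = """\
# hostname;interval;timestamp;CPU;%user;%nice;%system;%iowait;%steal;%idle
dwhost;600;2009-10-09 10:00:00 UTC;-1;12.50;0.00;3.10;8.40;0.00;76.00
dwhost;600;2009-10-09 10:10:00 UTC;-1;2.00;0.00;1.00;0.50;0.00;96.50
"""

rows = list(parse_sadf(sample))
# Keep only the intervals where the CPUs were doing user work.
busy = [r for r in rows if float(r["%user"]) > 5]
print(len(rows), len(busy), busy[0]["%iowait"])
```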

The system characteristics are as follows.

  • Sun X4150 with 64GB RAM, 2×4 cores (Xeon X5450), one 4Gb/s QL2462 HBA with 2 ports.
  • 3 device-mapper devices, 2 using a round-robin multipath (v1, v2), 1 using an on-the-fly cipher to decode encrypted data (v3).
  • 3PAR S400 with 10k drives and 4Gb/s HBAs.
  • Out of the 64GB, 8GB are set aside as HugePages to serve as memory pages for the SGA.

The goal of this investigation is to understand what the bottleneck is in the processing and what can be done to remove it.

Let’s start with cpu utilization.

Distribution of CPU time spent in userland when not idle


Not terribly loaded (I’m filtering out the long idle portions with user > 5). How about I/O?

% of CPU spent waiting on IO


Interesting, iowait is not negligible. Is it correlated to anything in particular? First of all, let’s see how iowait varies with device utilization of v1.

IOWait against v1 device utilization

v1 is slowly but surely bringing iowait higher, to the point that more than one processor ends up waiting on I/O.
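
One way to put a number on “is it correlated” without assuming a linear relationship is a rank correlation. The Python sketch below computes Spearman’s coefficient as the Pearson correlation of the ranks. The sample values are made up, not the measurements from this server, and the helper names are mine.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared
        start = end + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up samples: device utilization (%) and iowait (%).
util = [5, 20, 35, 50, 65, 80, 95]
iowait = [0.5, 1.0, 1.2, 3.0, 4.5, 9.0, 14.0]
print(spearman(util, iowait))
```

A value close to 1 means iowait rises whenever utilization does, even if the relationship is curved rather than a straight line.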

To be continued…





Blog battle on the storage appliance front

3 09 2009

Backblaze has started an interesting conversation by detailing how they get to $117,000 per PB, down to the type and number of SATA cards used in their design. A great PR move for a company in the crowded personal backup space. Of course, publishing comparisons with Dell, Sun, NetApp and EMC at 8x, 10x, 30x the price is a sure way to stir people’s emotions. The first to publish a lengthy response (that StorageMojo could find) is Joerg Moellenkamp, in a blog post. It is laudable for pointing out design flaws, but these are fundamentally 2 different markets. Sure, Sun’s hardware is a great piece of engineering, squarely aimed at the enterprise market. Which, incidentally, is not buying in droves, and Sun’s financials clearly reflect that. Backblaze took the Google route for storage and it’s hard to see, given the competitive pressure, how they would be better off spending their margin on Sun hardware. The era of gold-plated hardware is slowly drawing to a close and I can’t say I oppose that change.





Netflix describes its culture

4 08 2009





Catching up on Velocity 09

29 06 2009

This year I could not attend Velocity so I decided to catch up via http://velocityconference.blip.tv. Here are a few notes on the sessions I have been able to see so far.

John Allspaw (Ops) & Paul Hammond (Dev): 10+ Deploys per day: Dev/Ops coöperation at Flickr

This is a topic dear to my heart: changing the culture shared (or not) by dev and ops.

  • Contrary to popular wisdom, ops’ real mission is not to keep the service stable per se, but to enable the business.
  • Business requires change
  • Build the tools and the culture that allow repeated change with minimal uncertainty.
  • Automate your infrastructure
  • Use one source control system shared between devs and ops, so that everyone on the team knows where to look
  • Reduce all manual steps down to one, that of deciding to build and deploy
  • Small frequent changes better than fewer large changes
  • Use “feature flags”, i.e., use code to enable features, rather than branches
  • Ship TRUNK so that everyone knows what gets released
  • Feature flags allow for private betas and reduce uncertainty
  • Dark launches: enable the feature to exercise the data path but don’t present the results to the end-user
  • Metrics, metrics, metrics
  • Add context to the metrics, such as the last time something was deployed
  • We use IRC and IM bots to bring system updates into the conversation between dev and ops in real time, then push the logs into a search engine
  • Develop respect and trust between devs and ops
  • Have a healthy attitude toward failure (don’t blame, fix the problem first)
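
To make the feature-flag and dark-launch ideas concrete, here is a minimal Python sketch. It is not Flickr’s code: the flag table, the user names and the search functions are all hypothetical.

```python
# Hypothetical flag table; in practice this would live in shared config.
FLAGS = {
    "new_search": {"enabled": False, "beta_users": {"alice"}, "dark": True},
}


def flag_state(name, user):
    """Return 'on', 'dark' or 'off' for a feature and a user."""
    flag = FLAGS.get(name)
    if flag is None:
        return "off"
    if flag["enabled"] or user in flag["beta_users"]:
        return "on"
    return "dark" if flag["dark"] else "off"


def old_search(query):
    return ["old result for " + query]


def new_search(query):
    return ["new result for " + query]


def search(query, user):
    state = flag_state("new_search", user)
    if state == "on":
        return new_search(query)
    if state == "dark":
        # Dark launch: exercise the new code path, discard its output.
        try:
            new_search(query)
        except Exception:
            pass  # a dark feature must never break the live one
    return old_search(query)


print(search("cats", "alice"))  # private beta user
print(search("cats", "bob"))    # everyone else
```

Both code paths ship from trunk; flipping "enabled" releases the feature to everyone without a new deploy.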




Started a friendfeed webops public group

29 06 2009

Feel free to join: http://friendfeed.com/web-ops





#structure09 Hosting on commodity hardware

25 06 2009

I just got out of the panel on commodity hardware and did not get a chance to participate so here’s my take on it.

The panel started with an opening question: Google, Amazon and the like run at a huge scale on commodity hardware, yet enterprise vendors still push customized, and expensive, hardware. Why?

To me the answer is pretty obvious: enterprise hardware is for the most part sold to people who don’t know how to architect and design software on a commoditized stack. Let’s be honest, look at most “enterprise” hardware/software literature: it’s just noise and a waste of both the writer’s and the reader’s time. And by stack I mean from the server, all the way up to the application code.

If you constrain yourself to buy servers that cost no more than $5k, buying high-end database software makes little sense. Rather you recognize that low-end compute is how you get economies of scale and you apply the same reasoning to your networking gear, storage systems, database software, load balancing software, etc.

Google, from its earlier papers, seems to be the first to have understood that, rejecting the usual marketing garbage from large vendors. And for that we should be grateful.