Velocity: Panel, a survival guide

24 06 2008

Panelists: presented by Adam Jacob (HJK Solutions), Shayan Zadeh (Zoosk, Inc.), Brian Moon (dealnews.com), Don MacAskill (SmugMug), John Allspaw (Flickr (Yahoo!)), Michael Halligan (BitPusher, LLC) and a gentleman (Fotolog)

Don MacAskill: Rafael Nadal's fan club was on the site when he started to win Roland-Garros. He won the Open, which created a huge spike, and comments had to be turned off for the site to survive. The next year he won again and stats had to be turned off. For his third victory the servers did not collapse. This year he won and it did not even register.

John Allspaw: code gets pushed 20 to 30 times a day… Major events triggered traffic spikes.

Don would love to not operate a data center anymore, despite their expertise.

John: DB problems are hard [everyone in agreement, myself included]

[Discussion follows on scalability: do not optimize for scale too early]

Don: EC2 is not worth it for servers that run around the clock, but it can be if you’re good at shutting down instances that you don’t need.





Velocity: Sean Quinlan @google, Storage at scale

24 06 2008

Strategy: buy lots of commodity hardware, because problems tend to be too big for their problem space. Highly reliable hardware is not that useful either, because it’s expensive.

[Showing the same pictures over and over again, someone from Google PR, please authorize the release of newer pictures]

[A GFS description follows, nothing new so far, read the papers on the topic]

[A BigTable description follows, same deal]

I wish this talk had some new information…





Velocity: Brent Chapman @great circle, what can IT professionals learn from emergency services?

23 06 2008

Example: a car hits a fire hydrant. Lots of agencies are involved (fire department, ambulance, police, electrical company). How do they coördinate all that?

Incident Command System is the protocol used in pretty much all emergency situations (courses available here).

I’ll put up a pointer to the slides; the example used in the talk is good. The Wikipedia article is supposedly good, and this article from ham radio operators is a good introduction.





Velocity: Luiz Barroso @google, efficient energy ops

23 06 2008

Hypothetical energy cost extrapolations: 5 years from now, hardware could be only 20-50% of the total energy costs.

Efficiency is defined as computing speed divided by power. It can be broken down further: (computing speed / power provided to chip) x (power provided to chip / power provided to server) x (power provided to server / power provided to data center).

  • Data center efficiency, PUE around 1.83, worse if data center is underutilized
  • Server energy efficiency, 25% dissipated by power supply
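
As a rough illustration of the breakdown above, the chain of ratios can be multiplied out in a few lines of Python. The function name and the way the two quoted figures are mapped onto stages are assumptions made for this sketch, not something from the talk: a PUE of 1.83 is read as 1/1.83 of facility power reaching the servers, and the 25% power-supply loss as 75% of server power getting past the supply.

```python
def delivered_fraction(pue, psu_loss):
    """Fraction of facility power that gets past the server power supply.

    pue      -- power usage effectiveness (facility power / IT power)
    psu_loss -- share of server input power dissipated by the power supply
    """
    datacenter_efficiency = 1.0 / pue   # power to servers / power to data center
    psu_efficiency = 1.0 - psu_loss     # power past the supply / power to server
    return datacenter_efficiency * psu_efficiency


# Figures quoted in the notes: PUE around 1.83, 25% lost in the power supply.
fraction = delivered_fraction(pue=1.83, psu_loss=0.25)
print(f"{fraction:.0%} of facility power gets past the power supply")
```

With those two figures the product comes out at roughly 41%, before counting any losses further down the chain (voltage regulators, the chip itself).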

From the Uptime Institute: 10-year energy costs are $9/W for consumption and $10-22/W for data center build-out.
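
To make the per-watt figures concrete, here is a minimal sketch of the arithmetic. The 250 W server is a hypothetical value chosen only for illustration; the per-watt rates are the ones quoted above.

```python
CONSUMPTION_USD_PER_WATT = 9       # 10-year energy consumption, from the notes
BUILDOUT_USD_PER_WATT = (10, 22)   # 10-year data center build-out range, from the notes


def ten_year_cost(watts):
    """Return (consumption, build-out low, build-out high) in dollars."""
    low, high = BUILDOUT_USD_PER_WATT
    return (watts * CONSUMPTION_USD_PER_WATT, watts * low, watts * high)


# Hypothetical 250 W server, not a figure from the talk.
consumption, buildout_low, buildout_high = ten_year_cost(250)
print(consumption, buildout_low, buildout_high)
```

For that hypothetical server, the build-out share ($2,500 to $5,500) is at least as large as the electricity itself ($2,250), which is the point of quoting both numbers.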

Rough cost breakdown: 50% on hardware, 22% on energy, 28% on data center (assumptions: dual-socket x86, 4-year depreciation, 70% load at peak).

How to be more efficient:

  1. consolidate workloads
  2. measure actual power usage rather than rely on nameplates
  3. investigate oversubscription

Oversubscription potential rises as the number of machines grows, so oversubscribe at the data center level. Also mix workloads and be ready to kill instances if you get close to the limit.
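
The "be ready to kill instances" part can be sketched as a simple shedding policy. Everything here is an assumption made for illustration (the function name, the 95% threshold, the instance names and wattages); the talk did not describe a specific mechanism.

```python
def instances_to_shed(draw_watts, killable, budget_watts, threshold=0.95):
    """Pick killable instances to stop until total draw is under the limit.

    draw_watts   -- dict mapping instance name to measured power draw (W)
    killable     -- names of low-priority instances that may be stopped
    budget_watts -- provisioned power for the whole group
    threshold    -- fraction of the budget treated as "close to the limit"
    """
    limit = budget_watts * threshold
    total = sum(draw_watts.values())
    shed = []
    # Stop the hungriest killable instances first so fewer need to die.
    for name in sorted(killable, key=lambda n: draw_watts[n], reverse=True):
        if total <= limit:
            break
        total -= draw_watts[name]
        shed.append(name)
    return shed


# Hypothetical measurements: 1,030 W drawn against a 1,000 W budget.
draws = {"web-1": 300, "web-2": 280, "batch-1": 250, "batch-2": 200}
print(instances_to_shed(draws, killable=["batch-1", "batch-2"], budget_watts=1000))
```

Note that the input is measured draw, not nameplate ratings, in line with the advice in the list above.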

Source: Energy-proportional computing

Consider a data center as a single device (5,000 machines): its utilization distribution has 2 peaks, one at 5% utilization, another around 30%.

Power efficiency of a typical server: a machine running at a load of 0.3 is at 60% power efficiency, while a fully loaded machine is at 100% power efficiency. Sadly, data centers are very rarely at 100%, as seen before.

The idea behind energy-proportional computing: a generally proportional relation between work and power. Idleness in a server is scarce. Proportionality should happen in the electronics, because in software it’s much harder (think of the kernel getting interrupts all the time).
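
A minimal sketch of what proportionality buys, assuming a linear power model (idle power plus a part that scales with load). The model and the 50% idle fraction are assumptions made for illustration, not measurements from the talk.

```python
def relative_efficiency(load, idle_fraction):
    """Work done per watt, relative to the same server at full load.

    load          -- utilization, greater than 0 and at most 1
    idle_fraction -- idle power as a share of peak power (assumed, not measured)
    """
    # Linear power model: idle power plus a part that scales with load.
    power = idle_fraction + (1.0 - idle_fraction) * load
    return load / power


for load in (0.05, 0.3, 1.0):
    typical = relative_efficiency(load, idle_fraction=0.5)       # hypothetical server
    proportional = relative_efficiency(load, idle_fraction=0.0)  # ideal case
    print(f"load {load:.2f}: typical {typical:.0%}, proportional {proportional:.0%}")
```

Under this linear model, the 60% efficiency at a load of 0.3 mentioned above would correspond to an idle draw of a little under 30% of peak; that is a property of the sketch, not a number from the talk. A perfectly proportional machine stays at 100% at every load.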

If you break down power by component, you find that the CPU is much more proportional than the rest of the components, so even when powering down the CPU the total savings are still between 10% and 20% of power.

Still CPUs have 2 important power-usage features:

  1. wide dynamic power range (RAM, disks and network devices remain in a much narrower power range)
  2. active low-power modes, where the CPU can still do work

People, who average around 120W, have a 20x dynamic power range, compared to 2x for a PC.

In conclusion, write fast code (biggest contribution to energy efficiency), consider reduction of all energy-related costs (provisioning), and demand energy-proportionality from equipment manufacturers.

Plug: http://climatesaverscomputing.org





Velocity: John Fowler (Sun), Innovation That Drives Opportunity for the Web Infrastructure

23 06 2008

John is responsible for hardware @Sun.

Web is built on a new software stack (varnish, rails, memcache, hadoop, etc.)

Trends:

  1. 16 cores per socket for 2009, Sun, AMD and Intel on the same track. Clock rates will remain the same.
  2. Application memory capacity increasing, working to get 1TB of RAM at commodity prices
  3. ZFS and SSD, enterprise SSD: $0.08 per IOPS compared to $2.43 per IOPS for HDD

[Sun is clearly attacking the storage market by pushing for commoditization of software, as opposed to proprietary systems such as 3PAR, EMC, etc.] Sun is building something like the x4500, using an x4450 with 1 32GB ZIL SSD, 1 80GB SSD and 5 slow SATA drives: same capex, 3 times the throughput.





Velocity: Artur Bergman

23 06 2008

Artur works for Wikia. WoWWikia is the 2nd largest wiki around.

Value of performance and reliability is around

WoW: $520 MM of profit per year, 99% reliable but users expect it, so it’s really about setting expectations.

Operations is about using resources efficiently and reliably, and has to be measured against revenue from users and the value of downtime (which must be computed): e.g. cost per page served is vital to guide decisions.
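
The two metrics mentioned above are simple ratios. A minimal sketch follows; the function names and every figure are hypothetical and only illustrate the arithmetic, they are not Wikia's numbers.

```python
def cost_per_page(monthly_ops_cost, monthly_pageviews):
    """Operations cost attributed to a single page served."""
    return monthly_ops_cost / monthly_pageviews


def downtime_cost(revenue_per_hour, hours_down):
    """Revenue lost while the site is unavailable (ignores goodwill effects)."""
    return revenue_per_hour * hours_down


# All figures below are hypothetical.
page_cost = cost_per_page(monthly_ops_cost=50_000, monthly_pageviews=100_000_000)
outage = downtime_cost(revenue_per_hour=2_000, hours_down=3)
print(page_cost, outage)
```

Comparing the cost per page with the revenue per page is what tells you whether a given performance or reliability project is worth funding.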

Example from wikia: 20% of all wiki pages went up from 200ms to 15s to load, 35% of pages were slow [per session] but that led to a 15% reduction of “fast” pages viewed, which has a clear cost.

Launched a project with 3 engineers for 4 weeks to improve page performance. It yielded good results, but the ad network is slowing down the whole thing. Since ads use document.write, the wiki overrides it to allow pages to load without waiting for ads to finish loading. This led to more pageviews, but about 20% of ads are not even loaded (network time-out, user clicks away).





Cookie crumbles

22 04 2008

On Monday’s front page of the Financial Times one could read “Google resolve crumbles on ‘cookies’ pledge”, an interesting piece on how earlier inquiries about the role of cookies in “behavioural targeting” had been gently pushed aside after the acquisition of DoubleClick had started, with the apparent benediction or at least indifference of regulatory bodies. As the paper puts it,

Some Google insiders say that as the company’s understanding of “behavioural targeting” has grown, some of its earlier fears about cookies have turned out to seem simplistic, and it has become less clear that the practice raises big privacy concerns.

As much as I like Google’s services and applications, I find it disconcerting, to say the least, that the assessment about privacy cannot be clearly and publicly stated (and I doubt, though it is possible, that the paper would not have cited its sources if it could). More importantly, this much-needed assessment could not be conducted by an independent body. Protection of trade secrets, I’m told.

It is also for the sake of trade secrets that the “market” for online advertising is run without any real auditing of any kind. In other industries, even with “independent” auditors, quite a few irregularities manage to sneak through (see Enron, Countrywide, etc.), so I can only imagine what skeletons we will find in the closet of a company that won’t let anyone look at how its main inventory is assessed, counted and verified. It is a true instance of self-regulation, back to the meaning of self. But hey, who can argue against a license to print a few billion dollars per quarter? Might is right, right?





twiki is great… twiki is not so great.

2 02 2008

To organize our internal IT information we have been using twiki. It is a very flexible tool by virtue of being a wiki and has two critical features out of the box that other wikis seem to lack:

  1. Forms
  2. and a fairly interesting search directive (%SEARCH%)

If you are not using these with twiki you are missing out; the analogy is using Word without styles. You can do without but life is so much easier with them.

Forms bring structure to wiki pages and allow us to treat each wiki page as a structured record (the form) with a big, free-form description field (the page). For instance, our twiki implementation records hosts, hardware items, services, change requests and incident tickets, all with the use of custom forms, so as to produce a pseudo-relational database on which we build reports.

Examples of reports:

  1. list of all change requests awaiting peer review before approval
  2. list of all hosts assigned to a given project
  3. list of all hosts running on a given piece of hardware

The list goes on. Then we start having questions such as “which are the hardware pieces whose leases end in the next 3 months?” or “how many hosts run RedHat 4.5?”. And that is when twiki breaks… Its reliance on a file-based scheme (and rcs) to maintain relationships imposes some unwelcome limitations, not to mention a level of performance that is difficult to accept on a daily basis (I know that caching is in the works but it is just not built to scale).

Case in point: we define hosts (think linux hosts) as compute resources that execute on some physical substrate (think IBM x3550) so it is only natural that the host form has a mention for the hardware item it executes on. In other words there is a one-to-n relationship between hardware item and host. On the hardware item form we do not feature the list of hosts that live on that hardware item because chances of dangling pointers are too great. We used to have it and quickly we ended up with hosts that point to a piece of hardware, which itself does not point back to these hosts.

In other words we have had to limit the type of reports we can run because the underlying data implementation of twiki is lacking. Questions such as “Which hardware items are home to more than 3 hosts?” become unnecessarily complicated, whereas with the proper framework it becomes as simple as:

select hi.name, count(*)
from hardware_item hi
join hardware_host ho on (hi.sid = ho.hardware_sid)
group by hi.name
having count(*) > 3;
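
To try the hardware query above without setting up a database server, here is a small sketch using Python's built-in sqlite3 module. The column types and the sample inventory are assumptions made for illustration, and the comparison is written as a strict "more than 3" to match the question as worded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table hardware_item (sid integer primary key, name text);
    create table hardware_host (hardware_sid integer references hardware_item(sid),
                                host_sid integer);
""")

# Hypothetical inventory: one box with four hosts, one with two.
conn.executemany("insert into hardware_item values (?, ?)",
                 [(1, "x3550-a"), (2, "x3550-b")])
conn.executemany("insert into hardware_host values (?, ?)",
                 [(1, 10), (1, 11), (1, 12), (1, 13), (2, 20), (2, 21)])

crowded = conn.execute("""
    select hi.name, count(*)
    from hardware_item hi
    join hardware_host ho on (hi.sid = ho.hardware_sid)
    group by hi.name
    having count(*) > 3
""").fetchall()
print(crowded)
```

Because the host-to-hardware relationship is stored once, in the link table, there is no second copy on the hardware side that can go stale, which is the dangling-pointer problem described earlier.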

How about the list of potential single points of failure for a given service:

select max(h.name) as hostname, hc.name as host_class
from service s
join service_host so on (s.sid = so.service_sid)
join host h on (so.host_sid = h.sid)
join host_class hc on (h.class_sid = hc.sid)
where s.name = 'My critical service'
group by hc.name
having count(*) < 2;
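
The single-point-of-failure query can be exercised the same way with Python's built-in sqlite3 module. The table layouts and sample rows below are assumptions inferred from the joins in the query, not an actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table service (sid integer primary key, name text);
    create table host_class (sid integer primary key, name text);
    create table host (sid integer primary key, name text,
                       class_sid integer references host_class(sid));
    create table service_host (service_sid integer references service(sid),
                               host_sid integer references host(sid));
    -- Hypothetical data: two web hosts but a single database host.
    insert into service values (1, 'My critical service');
    insert into host_class values (1, 'web'), (2, 'db');
    insert into host values (1, 'web01', 1), (2, 'web02', 1), (3, 'db01', 2);
    insert into service_host values (1, 1), (1, 2), (1, 3);
""")

spof = conn.execute("""
    select max(h.name) as hostname, hc.name as host_class
    from service s
    join service_host so on (s.sid = so.service_sid)
    join host h on (so.host_sid = h.sid)
    join host_class hc on (h.class_sid = hc.sid)
    where s.name = ?
    group by hc.name
    having count(*) < 2
""", ("My critical service",)).fetchall()
print(spof)
```

The service name is passed as a bound parameter rather than pasted into the SQL string, which also sidesteps the quoting problems that typographic quotes cause.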

Now, assuming I have such a relational database that twiki can query via a sql module, how different is it from the database that my monitoring package is based upon? In the ideal world my data model presents something that:

  • monitoring can use (service dependencies, host maps, etc.)
  • configuration management can use (change requests bound to given hosts, software items, etc.)
  • asset management can use
  • finance can use

The key properties that I would want such a system to keep are the ease with which it can be manipulated (nothing more cumbersome than twiki) and its accuracy (no duplicate data). At the same time I have not found any product out there that fits the bill (a monitoring package that has a solid data model that can be extended for other uses). So I might just bite the bullet and build a prototypical ERP for IT.

Stay tuned.





A useful presentation on mysql performance

31 01 2008

Some of the slides are quite mysql-specific but a lot applies to all databases (at least to Oracle as well).