Determining Application Performance Profiles in the Cloud

I want to know how to characterize my workloads in the cloud. With that, I should be able to find both over-provisioned and resource-starved systems to aid in right-sizing and capacity planning. CloudForms by Red Hat can do this at the system level, which is where you would most likely take any action, but I want to see if there’s any additional value in understanding workloads at the aggregate level. We’ll work backwards for the impatient. I found 7 unique workload types by clustering cpu, mem, disk, and network use with k-means on the short-term data from CloudForms (see the RGB/Gray graph nearby). The cluster numbers are arbitrary, but ordered by median cpu usage from least to most.

From left to right, rough characterizations of the clusters are:

  1. idle
  2. light use, memory driven
  3. light use, cpu driven
  4. moderate use
  5. moderate-high everything
  6. high cpu, moderate mem, high disk
  7. cpu bound, very high memory
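As a rough illustration of the approach (not the actual analysis), here is a minimal k-means sketch in plain Python. The sample utilization numbers, the function names, and the farthest-first starting centers are all assumptions made for the example; the real data came from CloudForms and produced 7 clusters, while this toy set only has 3. It also assumes all four metrics are on the same percent scale, so no rescaling is done.

```python
from statistics import median

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic starting centers: begin with the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns one raw cluster label (0..k-1) per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iters):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def order_by_median_cpu(points, labels, cpu_index=0):
    """Renumber clusters 1..k from lowest to highest median cpu."""
    clusters = sorted(set(labels))
    med = {c: median(p[cpu_index] for p, l in zip(points, labels) if l == c)
           for c in clusters}
    rank = {c: i + 1 for i, c in enumerate(sorted(clusters, key=med.get))}
    return [rank[l] for l in labels]

# Hypothetical (cpu, mem, disk, net) utilization percentages per system.
usage = [
    (2, 5, 1, 1), (3, 6, 2, 1), (1, 4, 1, 2),               # near idle
    (90, 85, 70, 60), (95, 80, 75, 65), (88, 90, 72, 58),   # busy
    (40, 45, 30, 20), (45, 50, 28, 22), (38, 42, 33, 18),   # moderate
]

ordered = order_by_median_cpu(usage, kmeans(usage, k=3))
print(ordered)  # [1, 1, 1, 3, 3, 3, 2, 2, 2]
```

Renumbering by median cpu is what makes the otherwise arbitrary cluster ids readable from left to right on a chart.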

Graphing Wall Street with LittleSis.org

With a goal of transparency, LittleSis.org has started collecting peer-membership information for public figures of many sorts.  Just the stuff made for social graphs!

This image represents the social networks of the CEOs of the American Wall Street companies, from the info at LittleSis.  Red nodes are the CEOs (Thain is included), and green nodes are organizations.

The data is a work in progress, as it only represents a few organizations these folks are involved with; but a work in progress is progress indeed.

P.S. LittleSis: API pretty please!

Demographics Fail

We forget, now that our reach is wide, that all purchasing is done by individuals.  Since we don’t know the individuals, and locating and selling to each and every one of them (us) is too expensive, we developed marketing to help us select the people, the individuals, most likely to purchase whatever we are selling.  We do that by carving up the population into demographic segments.  We do that by creating images and messages our testing tells us will appeal to those demographics.  As you may have noted, I am using the word “demographics” loosely, as it can just as easily mean single white 18-24 year-old men when selling video games as it can mean general practitioners in the rural parts of beef-exporting states when selling Lipitor.

But why is this important?  Demographics provide us with statistically probable individuals.  Using these expected values is a great way to describe groups, but the value breaks down when talking about individuals.  We all know the story about the man who drowns crossing the river that is, on average, six inches deep.

The second failing in demographics is the pure focus on the individual.  If the goal of sales and marketing is to convince individuals to take action (purchase, vote, visit, etc.), demographics alone do not provide the context under which we, as social animals, make decisions.

The number one factor that we as consumers use in making purchase decisions, whether in consumer packaged goods, automotive, or anything else, is our peers.  The younger we are, the better demographics reflect our peers, but that starts to break down rapidly once we leave school and enter the work force.

One place where we, as marketers, do a great job taking peer context into account is children’s toys.  Think about how they are advertised.  Is the latest and greatest StarBot 7000 action figure advertised with a static image of the figure with a voiceover talking about the high-durability injection-molded plastic construction and the die-cast elbows capable of withstanding 30,000 hours of continuous play in -40°C conditions?  No, they show a bunch of kids running around having a great time with the StarBot.  Children do not have long-standing deep networks of peers, so advertisers create a potential peer group in the advertisements.  Even as children get older, more media savvy, and create deeper relationships with their peers, all parents will recognize the plaintive cry of, “But, Billy has one!”

The Never Ending Quest for Data


Finding good data in this field is difficult; even most of the academic literature references relatively small networks of fewer than 100 or so individuals. I suggest that the academic research is just starting to take off now (although the field is very far from new) because of the large real-world datasets available from the social networking sites.

Nathan Eagle (Reality Mining at MIT) was kind enough to share 330,000 hours of proximity and cell phone communications data he and the team collected from volunteers over the course of the project. To say I am quite excited about digging into it would be a dramatic understatement.

For other large data sets, Duncan Watts is spending his sabbatical over at Yahoo!, and I can only hope there are other people looking really hard at the data available there and at Facebook, Hi5, Google, and many more. Research into people’s behavior, especially in a commercial setting, is not only a great thing for the unprecedented data; at least as important, it also brings the ethical implications to the fore.

[Image: radial representation of Luc Legay’s Facebook network]