Joel's SharePoint Architect Blog

SharePoint 2010, MOSS & WSS Tips and Consultancy Tales


Fri, 9 November 2007 10:45 GMT

Pat Helland

Microsoft Corporation

 

A cracking session from Pat. My notes don’t do it justice.

 

Forces, Objects and Evolution

  • Economic and technical forces driving our industry
  • Forces are influencing the way we build our applications
  • This drives new models for building applications
  • Causes evolution in computing models – new models must respect legacy stuff

 

Forces in Processors

  • Moore’s Law Continues!
    • The CPU Wall
      • # Transistors still doubles every 2 years
      • Voltage isn’t dropping as fast; power consumption now up at 150 Watts
      • CPU Frequency is plateau-ing at 4GHz
      • CPU performance hits the "Power Wall"
        • "Static Power" vs "Dynamic Power" – Static Power increases as performance increases
        • As transistors get smaller, they "leak"
      • Clock Frequency causes a rise in "Dynamic Power"
      • Vicious Circle: Faster Chips get hotter; Hotter Chips use More Power; Hotter Chips…
    • The Memory Wall
      • Plateau at 60ns access
      • More bandwidth is doable, but not shorter latency
        • Faster CPUs wait more
      • Speculative Execution
        • Guesses what memory will be needed in advance
        • "Out of order processing" to bring data in to cache early
        • Fractionally more performant; 5 times more complicated
      • In Order Execution
        • Simple 1 Instruction at a time
        • Slower clock speed equals fewer waits
        • Slower clock speed can be almost as fast as Speculative Execution
  • Many Core CPUs
    • Can’t currently make faster CPUs
    • Solution is to put multiple CPUs per chip
    • The future is moving towards 500 Cores per chip in the next decade
    • Our software is not ready for this! It doesn’t take advantage of 2 cores properly yet!
    • On-Chip Memory Cache – shared across multi core (locking issues?)
  • Parallel Processing will be orders of magnitude cheaper than sequential
    • How can we take advantage of parallel processing?
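
The Power Wall and Memory Wall points above can be put into rough numbers. This is only a sketch: it assumes the textbook approximation that dynamic power scales with capacitance × voltage² × frequency, and it assumes (my simplification, not something from the session) that voltage can be lowered in step with frequency and that the work splits perfectly across cores. The 20% figures are hypothetical.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to a baseline core, using P ~ C * V^2 * f."""
    return voltage_scale ** 2 * frequency_scale


def stall_cycles(memory_latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles that pass while one memory access is outstanding."""
    return memory_latency_ns * clock_ghz


# Hypothetical design point: each core runs 20% slower at 20% lower voltage.
per_core_power = relative_dynamic_power(0.8, 0.8)
two_core_power = 2 * per_core_power
two_core_throughput = 2 * 0.8  # idealised: assumes perfectly parallel work

print(f"one slower core:  {per_core_power:.3f}x baseline power")
print(f"two slower cores: {two_core_power:.3f}x power, "
      f"{two_core_throughput:.1f}x throughput")
print(f"60 ns memory access at 4 GHz: {stall_cycles(60, 4):.0f} cycles waiting")
print(f"60 ns memory access at 2 GHz: {stall_cycles(60, 2):.0f} cycles waiting")
```

Under these assumptions two slower cores deliver more throughput for about the same power as one fast core, and a slower clock wastes fewer cycles per memory access, which is the argument for many simple cores.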

 

Forces in Data-Centres

  • Buildings outlast servers
  • Currently we overestimate requirements
  • Reducing power saves air-conditioning
    • Double savings
  • Backup power is 20% of datacentre cost
    • Batteries for a while
    • Backup generators cost $2M
  • Trends in datacentres
    • Datacentre in a shipping container!
    • Fail-in-place
    • Don’t use backup power: use multiple datacentres!
    • Only works with applications that will scale out
    • Move towards stateless, composable, distributed applications
    • Future will have a mixture of traditional datacentres and low-power datacentres in shipping containers

Forces in Storage

  • Disk Is Tape
    • The pipe to the disk is getting smaller
    • Capacity increases with areal density
    • Read/Write bandwidth increases only with linear density, so it lags behind capacity
    • 10+ Terabyte disks projected for 2010 for $100 or so
      • 5-15 hours to read sequentially
      • 15-150 days to read randomly
  • Flash Is Disk
    • Moore’s Law Drives Flash RAM Capacity
    • Low power, low temperature
    • Not constrained by "disk pipe" issues
    • By 2012 Flash will be same price as cheapest disks
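
The "Disk Is Tape" read times above are easy to sanity-check with back-of-envelope arithmetic. The transfer rate, block size and access time below are my own assumed figures, not numbers from the session; they are chosen only to show how a gap of that size arises.

```python
TB = 10 ** 12
MB = 10 ** 6


def sequential_read_hours(capacity_bytes: float, bytes_per_second: float) -> float:
    """Time to stream the whole disk from start to finish."""
    return capacity_bytes / bytes_per_second / 3600


def random_read_days(capacity_bytes: float, block_bytes: float,
                     seconds_per_access: float) -> float:
    """Time to read the whole disk one randomly placed block at a time."""
    accesses = capacity_bytes / block_bytes
    return accesses * seconds_per_access / 86400


capacity = 10 * TB

# Assumed: 200 MB/s sustained transfer; 8 KB blocks at 5 ms per random access.
hours = sequential_read_hours(capacity, 200 * MB)
days = random_read_days(capacity, 8 * 1024, 0.005)

print(f"sequential: {hours:.1f} hours")
print(f"random:     {days:.1f} days")
```

With these assumptions the results land inside the ranges quoted in the notes: roughly 14 hours sequentially and roughly 70 days randomly.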

 

Forces in Communication

  • Bandwidth and Latency
    • Datacentres dark fibre bandwidths
    • Total Bandwidth triples every 12 months – exceeds Moore’s Law
    • Latency reductions are limited – by the speed of light
  • Wireless Everywhere – Mostly
    • Applications need to exhibit "Always Offline" behaviour
    • Don’t resort to the hourglass
    • Useful work always continues
    • Data gets less stale
  • It is easier to move a "bit" than it is to move a "watt"
    • Datacentres moving to be close to Hydro-Electric dams
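
To put a number on "latency is limited by the speed of light", here is a rough lower bound on round-trip time over fibre. It assumes light in glass fibre travels at about two thirds of its vacuum speed, and it ignores routing, switching and queuing, so real latencies will be higher.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBRE_VELOCITY_FACTOR = 2 / 3  # assumed approximation for glass fibre


def min_round_trip_ms(distance_km: float) -> float:
    """Best-case round trip over a straight fibre run of the given length."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBRE_VELOCITY_FACTOR)
    return 2 * one_way_seconds * 1000


for km in (100, 1_000, 10_000):
    print(f"{km:>6} km: at least {min_round_trip_ms(km):.1f} ms round trip")
```

No amount of extra bandwidth reduces these figures, which is why the notes push towards keeping data close to where it is used.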

 

Forces in the Cloud

  • Videos: Software and Services
  • Application State must be separated from the machine
    • Per user, Per app state
    • Safety and sandboxing
    • Controlled and safe sharing across apps
    • Controlled and safe sharing across users
  • Parallelism
    • Pipeline parallelism
    • Partitioned Parallelism
    • Bulky XML? Not a problem considering trends in computing
    • Problems today
      • Servers: big databases
      • Clients: big exes
  • Gain speed by bringing data close
    • Principle of locality of data
  • It’s OK to have copies of data close by
    • Read-Only Reference Data
    • Divergent Changes of Copies
  • Defy Authority
    • Multiple changes to multiple copies
    • No Single Source of Truth
    • Who to believe?
    • Historic trust
  • How can we build apps out of small, independent and UNRELIABLE pieces?

 

Demand for disconnection, scaling, cheap computers and cheap datacentres

 

The Movable Objects

Admitting we’re confused

  • Even if the computer is accurate
    • Data are entered by people
    • Data are entered by sensors
    • Decisions are made
  • Guessing and partial knowledge
    • Separated from the real world
    • May be separated from other replicas
  • Computers do not make decisions
    • They *try* to make decisions
    • Good guesses, bad guesses but no certainty
  • Memories and Sharing
    • Remember your guesses
    • Sharing your memories is useful
    • Fidelity of memories tightly bound to cost
    • More memories = longer latency
  • Investing in remembering well is a business decision
  • Screw-ups and apologies happen – it’s OK to be decisive and wrong
  • Airlines, bookshops – many businesses take advantage of this

 

Working in the here and now

  • Smaller computers offer more "bang for the buck"
  • Smaller datacentres offer more "bang for the buck"
  • Smaller datasets frequently offer more "bang for the buck"
  • OK to have copies of data
    • There is no authoritative copy
    • Versioning and change history show what was intended
    • Application design for independence is required
    • Big websites have large caches of product catalogues and price lists
      • Computing with versions
    • Demand for cheap datacentres adds to this need

 

Cutting the work into little pieces

  • Scaling with local transactions
    • Assume your computation must be done, it can’t wait to coordinate, and your partner is likely to be remote
    • You must do local work
  • Solution: Uniquely keyed objects and partitioning
    • Each object must be identified by a unique key
    • Transactions cannot occur across
  • Queries are different
    • No transactional queries
    • May not be on same machine
    • No remote transactions
    • Can query stale copies
      • What does stale mean?
  • Alternate indices are different!
  • Fine-Grained workflow
    • These objects and their data are not like traditional DBs
    • This is traditional workflow but with fine-grained participants
    • Separate Transactions on Little Objects
      • Smaller is better
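
The "uniquely keyed objects and partitioning" idea above can be sketched as follows. This is my own illustration, not code from the session: `partition_for`, `Partition` and `update_object` are hypothetical names, and an in-memory dict stands in for whatever store each machine would really use.

```python
import hashlib
from dataclasses import dataclass, field


def partition_for(key: str, partition_count: int) -> int:
    """Map an object's unique key to the one partition that owns it."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


@dataclass
class Partition:
    objects: dict = field(default_factory=dict)

    def update(self, key: str, changes: dict) -> None:
        """A 'local transaction': touches exactly one keyed object here."""
        current = dict(self.objects.get(key, {}))
        current.update(changes)
        self.objects[key] = current  # single assignment stands in for a commit


partitions = [Partition() for _ in range(4)]


def update_object(key: str, changes: dict) -> int:
    """Route the change to the owning partition; no cross-partition work."""
    index = partition_for(key, len(partitions))
    partitions[index].update(key, changes)
    return index


first = update_object("order-42", {"status": "paid"})
second = update_object("order-42", {"shipped": True})
print(first == second)  # the same key always lands on the same partition
print(partitions[first].objects["order-42"])
```

Because every change names a single key, each update stays on one machine, which is what lets the partitions scale out independently.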

 

Independent Changes to Little Pieces

  • Subjective Consistency
    • Given what I know here and now, make a decision – REMEMBER THAT INFORMATION!
    • Other copies of the object may make divergent decisions
    • Ambassadors had authority: before radio
  • Eventual Consistency
    • Eventually all the copies of the object share their changes
    • Now apply subjective consistency
    • Given the same knowledge, produce the same result
    • Everyone sharing their knowledge leads to the same result

 

  • Idempotence, commutativity and associativity of the operations are all implied by this requirement
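
One way to see why those three properties fall out of "given the same knowledge, produce the same result" is a pair of replicas that each remember their own operations and later share them. This is a sketch of my own, not something shown in the session: the `Replica` class and the operation ids are hypothetical, and set union is used as the merge because it is idempotent, commutative and associative.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # Each remembered operation is (unique operation id, amount).
    operations: set = field(default_factory=set)

    def record(self, op_id: str, amount: int) -> None:
        """Make a local decision and remember it."""
        self.operations.add((op_id, amount))

    def merge(self, other: "Replica") -> None:
        """Share knowledge: set union, so order and repetition do not matter."""
        self.operations |= other.operations

    def balance(self) -> int:
        """Derive the result purely from what this replica knows."""
        return sum(amount for _, amount in self.operations)


london, seattle = Replica(), Replica()
london.record("london-1", 100)
seattle.record("seattle-1", -30)
seattle.record("seattle-2", 5)
print(london.balance(), seattle.balance())  # divergent while disconnected

london.merge(seattle)
seattle.merge(london)
london.merge(seattle)  # merging again changes nothing
print(london.balance(), seattle.balance())  # identical once knowledge is shared
```

The unique operation ids are what make a repeated merge harmless; without them, a re-delivered operation would be counted twice.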

 

  • The CAP conjecture
    • Consistency
    • Availability
    • Partition Tolerance
    • …PICK ANY TWO!

 

  • Subjective Consistency plus Eventual Consistency means it’s OK to have some screw-ups

 

Interoperate and Entice

  • Must interoperate with the existing software investments
  • A lot of code exists and it is very important
  • Entice into the new world
    • Drawing people in will happen through financial and business drivers

 

Conclusions

  • Hardware is changing
    • Lots of CPUs; no faster though
    • Power and heat
    • Lots of devices
    • Lots of bandwidth -> intermittent device connectivity

  • Economics are changing
    • Small is beautiful
    • Unreliable is cheaper
    • Configuring costs more than the devices
    • An onslaught of data
  • Components must change
    • They need

 

  • ACID  - Atomic, Consistent, Isolated, Durable
    • Goal for transactional ACID was to make the insanely complex look like a single machine
  • ACID for objects – Associative, Commutative, Idempotent, Spread out and independent
    • The drive to stateless computing
    • To recognise that the work will be done by lots of unreliable machines
    • Associative: (A + B) + C = A + (B + C)
    • Commutative: A + B = B + A
    • Idempotent: repeatable; doing it again has no further effect
    • Distributed: Spread out and autonomous
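
The three algebraic properties in the "ACID for objects" list can be spot-checked for any candidate operation. The helpers below are hypothetical names of my own, and they only test a handful of sample values, so a pass is evidence rather than proof. Idempotence is taken here in the binary-operation sense, where combining a value with itself gives the same value back.

```python
from itertools import product


def is_associative(op, samples) -> bool:
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(samples, repeat=3))


def is_commutative(op, samples) -> bool:
    return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))


def is_idempotent(op, samples) -> bool:
    return all(op(a, a) == a for a in samples)


def check(name, op, samples) -> tuple:
    result = (is_associative(op, samples),
              is_commutative(op, samples),
              is_idempotent(op, samples))
    print(f"{name:<9} associative={result[0]} "
          f"commutative={result[1]} idempotent={result[2]}")
    return result


samples = [0, 1, 5, 12]
max_laws = check("max", max, samples)
add_laws = check("add", lambda a, b: a + b, samples)
sub_laws = check("subtract", lambda a, b: a - b, samples)
```

On these samples `max` satisfies all three, addition fails only idempotence, and subtraction fails all three. That is why plain "add to the balance" messages need something extra, such as unique operation ids, before they are safe to replay on unreliable machines.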

 

  • Not for everything
    • Not for real time systems (closed queuing networks)
      • e.g. Fly by wire; nuclear power plants
    • Not for very expensive apologies
      • "Oh crap, I just launched the space shuttle"

 

  • The new ACID
    • Huge Numbers of Relatively Small Operations
    • Cost and Probability of apologies is small
    • Essential for service and product providers

 

Technorati Tags: Microsoft, SOA, TechEd 2007

 
