Irresistible Forces Meet the Movable Objects – Pat Helland
Fri, 9 November 2007 10:45 GMT
Pat Helland
Microsoft Corporation
A cracking session from Pat. My notes don’t do it justice.
Forces, Objects and Evolution
- Economic and technical forces driving our industry
- Forces are influencing the way we build our applications
- This drives new models for building applications
- Causes evolution in computing models – new models must respect legacy stuff
Forces in Processors
- Moore’s Law Continues!
- The CPU Wall
- Number of transistors still doubles every 2 years
- Voltage isn’t dropping as fast; power consumption now up at 150 Watts
- CPU Frequency is plateauing at 4GHz
- CPU performance hits the "Power Wall"
- "Static Power" vs "Dynamic Power" – Static Power increases as performance increases
- As transistors get smaller, they "leak"
- Clock Frequency causes a rise in "Dynamic Power"
- Vicious Circle: Faster Chips get hotter; Hotter Chips use More Power; Hotter Chips…
- The Memory Wall
- Plateau at 60ns access
- More bandwidth is doable, but not shorter latency
- Faster CPUs wait more
- Speculative Execution
- Guesses what memory will be needed in advance
- "Out of order processing" to bring data in to cache early
- Fractionally more performant; 5 times more complicated
- In Order Execution
- Simple 1 Instruction at a time
- Slower clock speed equals fewer waits
- Slower clock speed can be almost as fast as Speculative Execution
- The CPU Wall
- Many Core CPUs
- Can’t currently make faster CPUs
- Solution is to put multiple CPUs per chip
- The future is moving towards 500 Cores per chip in the next decade
- Our software is not ready for this! It doesn’t take advantage of 2 cores properly yet!
- On-Chip Memory Cache – shared across multi core (locking issues?)
- Parallel Processing will be orders of magnitude cheaper than sequential
- How can we take advantage of parallel processing?
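One possible answer to that last question is to split the work into independent partitions, hand each one to a worker, and combine the partial results. The sketch below is mine, not from the talk: `partition`, `sum_of_squares` and `parallel_sum_of_squares` are hypothetical names, and the default thread pool is only an assumption to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts roughly equal, contiguous chunks."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Work done on one partition; it touches no shared state."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=4, executor_cls=ThreadPoolExecutor):
    """Map each partition to a worker, then combine the partial results."""
    with executor_cls(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, partition(items, workers)))
    return sum(partials)
```

Because the partitions share nothing, no locking is needed and adding cores only means adding partitions. In CPython, threads will not speed up CPU-bound work like this; passing `ProcessPoolExecutor` as `executor_cls` is the usual way to spread it across cores.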
Forces in Data-Centres
- Buildings outlast servers
- Currently we overestimate requirements
- Reducing power saves air-conditioning
- Double savings
- Backup power is 20% of datacentre cost
- Batteries for a while
- Backup generators cost $2M
- Trends in datacentres
- Datacentre in a shipping container!
- Fail-in-place
- Don’t use backup power: use multiple datacentres!
- Only works with applications that will scale out
- Move towards stateless, composable, distributed applications
- Future will have a mixture of traditional datacentres and low-power datacentres in shipping containers
Forces in Storage
- Disk Is Tape
- The pipe to the disk is getting smaller
- Capacity increases with areal density
- Read/Write time with linear density
- 10+ Terabyte disks projected for 2010 for $100 or so
- 5-15 hours to read sequentially
- 15-150 days to read randomly
- Flash Is Disk
- Moore’s Law Drives Flash RAM Capacity
- Low power, low temperature
- Not constrained by "disk pipe" issues
- By 2012 Flash will be same price as cheapest disks
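The "Disk Is Tape" read times can be sanity-checked with some back-of-the-envelope arithmetic. This is a rough sketch under my own assumptions, none of which come from the talk: a sequential throughput of 200 to 500 MB/s, and random reads of 8 KB to 64 KB blocks costing about 10 ms each.

```python
TB = 10**12  # decimal terabyte, in bytes


def sequential_read_hours(capacity_bytes, throughput_bytes_per_s):
    """Hours to stream the whole disk at a steady throughput."""
    return capacity_bytes / throughput_bytes_per_s / 3600


def random_read_days(capacity_bytes, block_bytes, seconds_per_io):
    """Days to read the whole disk one randomly placed block at a time."""
    ios = capacity_bytes / block_bytes
    return ios * seconds_per_io / 86400


if __name__ == "__main__":
    disk = 10 * TB
    # Assumed throughputs (bytes per second); not figures from the talk.
    for throughput in (200e6, 500e6):
        print(f"sequential: {sequential_read_hours(disk, throughput):.1f} hours")
    # Assumed block sizes and a 10 ms cost per random I/O; also assumptions.
    for block in (8 * 1024, 64 * 1024):
        print(f"random: {random_read_days(disk, block, 0.010):.0f} days")
```

With those assumptions the results fall inside the 5-15 hour and 15-150 day ranges quoted above, which is the point of the slogan: a disk that large is only practical to read sequentially.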
Forces in Communication
- Bandwidth and Latency
- Datacentres dark fibre bandwidths
- Total Bandwidth triples every 12 months – exceeds Moore’s Law
- Latency reductions are limited – by the speed of light
- Wireless Everywhere – Mostly
- Applications need to exhibit "Always Offline" behaviour
- Don’t resort to the hourglass
- Useful work always continues
- Data gets less stale
- It is easier to move a "bit" than it is to move a "watt"
- Datacentres moving to be close to Hydro-Electric dams
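The speed-of-light limit on latency is easy to put a number on. A minimal sketch, assuming light in optical fibre travels at roughly two thirds of its vacuum speed and ignoring routing, switching and queuing delays; the function name and the 5,570 km distance are illustrative, not from the talk.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum


def min_round_trip_ms(distance_km, velocity_factor=2 / 3):
    """Lower bound on round-trip time over a straight link of distance_km.

    velocity_factor is the assumed fraction of vacuum light speed reached
    in the medium (about 2/3 is a common rule of thumb for fibre).
    """
    speed = SPEED_OF_LIGHT_KM_PER_S * velocity_factor
    return 2 * distance_km / speed * 1000


if __name__ == "__main__":
    # Illustrative distance only: roughly a transatlantic hop.
    print(f"{min_round_trip_ms(5570):.1f} ms")
```

No amount of extra bandwidth moves this floor, which is why the notes push towards keeping copies of data close and letting useful work continue while offline.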
Forces in the Cloud
- Videos: Software and Services
- Application State must be separated from the machine
- Per user, Per app state
- Safety and sandboxing
- Controlled and safe sharing across apps
- Controlled and safe sharing across users
- Parallelism
- Pipeline parallelism
- Partitioned Parallelism
- Bulky XML? Not a problem considering trends in computing
- Problems today
- Servers: big databases
- Clients: big exes
- Gain speed by bringing data close
- Principle of locality of data
- It’s OK to have copies of data close by
- Read-Only Reference Data
- Divergent Changes of Copies
- Defy Authority
- Multiple changes to multiple copies
- No Single Source of Truth
- Who to believe?
- Historic trust
- How can we build apps out of small, independent and UNRELIABLE pieces?
Demand for disconnection, scaling, cheap computers and cheap datacentres
The Movable Objects
Admitting we’re confused
- Even if the computer is accurate
- Data are entered by people
- Data are entered by sensors
- Decisions are made
- Guessing and partial knowledge
- Separated from the real world
- May be separated from other replicas
- Computers do not make decisions
- They *try* to make decisions
- Good guesses, bad guesses but no certainty
- Memories and Sharing
- Remember your guesses
- Sharing your memories is useful
- Fidelity of memories tightly bound to cost
- More memories = longer latency
- Investing in remembering well is a business decision
- Screw-ups and apologies happen – it’s OK to be decisive and wrong
- Airlines, bookshops – many businesses take advantage of this
Working in the here and now
- Smaller computers offer more "bang for the buck"
- Smaller datacentres offer more "bang for the buck"
- Smaller datasets frequently offer more "bang for the buck"
- OK to have copies of data
- There is no authoritative copy
- Versioning and change history show what was intended
- Application design for independence is required
- Big websites have large caches of product catalogues and price lists
- Computing with versions
- Demand for cheap datacentres adds to this need
Cutting the work into little pieces
- Scaling with local transactions
- Assume your computation must be done, you can’t wait to coordinate, and your partner is likely to be remote
- You must do local work
- Solution: Uniquely keyed objects and partitioning
- Must identify objects with a unique key as its identity
- Transactions cannot occur across objects
- Queries are different
- No transactional queries
- May not be on same machine
- No remote transactions
- Can query stale copies
- What does stale mean?
- Alternate indices are different!
- Fine-Grained workflow
- These objects and their data are not like traditional DBs
- This is traditional workflow but with fine-grained participants
- Separate Transactions on Little Objects
- Smaller is better
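To make "uniquely keyed objects and partitioning" a little more concrete, here is a minimal in-memory sketch. It is my own illustration, not something shown in the session: `Partition` and `PartitionedStore` are hypothetical names, each partition stands in for one machine, and a lock stands in for a local transaction.

```python
import hashlib
import threading


class Partition:
    """Stand-in for one machine: it holds objects and runs local transactions."""

    def __init__(self):
        self._objects = {}
        self._lock = threading.Lock()

    def update(self, key, fn, default=None):
        """Apply fn to the single object named by key, atomically and locally."""
        with self._lock:
            new_value = fn(self._objects.get(key, default))
            self._objects[key] = new_value
            return new_value

    def read(self, key):
        with self._lock:
            return self._objects.get(key)


class PartitionedStore:
    """Routes every operation to the one partition that owns the key."""

    def __init__(self, n_partitions):
        self._partitions = [Partition() for _ in range(n_partitions)]

    def partition_for(self, key):
        # A stable hash, so the same key always lands on the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._partitions)
        return self._partitions[index]

    def update(self, key, fn, default=None):
        return self.partition_for(key).update(key, fn, default)

    def read(self, key):
        return self.partition_for(key).read(key)
```

Note what the interface leaves out: there is no call that updates two keys at once, so a transaction can never span partitions. Anything that touches several objects has to be expressed as separate steps, which is where the fine-grained workflow comes in.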
Independent Changes to Little Pieces
- Subjective Consistency
- Given what I know here and now, make a decision – REMEMBER THAT INFORMATION!
- Other copies of the object may make divergent decisions
- Ambassadors had authority: before radio
- Eventual Consistency
- Eventually all the copies of the object share their changes
- Now apply subjective consistency
- Given the same knowledge, produce the same result
- Everyone sharing their knowledge leads to the same result
- Idempotence, commutativity and associativity of the operations are all implied by this requirement
- The CAP conjecture
- Consistency
- Availability
- Partition Tolerance
- …PICK ANY TWO!
- Subjective Consistency plus Eventual Consistency means it’s OK to have some screw-ups
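The "given the same knowledge, produce the same result" rule can be sketched with replicas that only ever add to a set of remembered facts and merge by set union. This is one simple way to get a merge that is associative, commutative and idempotent; it is an assumption of mine for illustration, not the design presented in the talk, and `Replica` is a hypothetical name.

```python
class Replica:
    """One copy of an object that decides locally and shares what it knows."""

    def __init__(self, name):
        self.name = name
        self.known = frozenset()

    def record(self, fact):
        """Subjective consistency: remember what was decided here and now."""
        self.known = self.known | {fact}

    def merge(self, other_known):
        """Eventual consistency: absorb another replica's knowledge.

        Set union gives the same answer whatever the order or grouping of
        merges, and merging the same knowledge twice changes nothing.
        """
        self.known = self.known | other_known

    def decide(self):
        """A deterministic function of knowledge alone, not of arrival order."""
        return sorted(self.known)


if __name__ == "__main__":
    a, b, c = Replica("a"), Replica("b"), Replica("c")
    a.record("x")
    b.record("y")
    c.record("z")
    a.merge(b.known)
    a.merge(c.known)
    a.merge(b.known)  # duplicate delivery is harmless
    c.merge(a.known)
    b.merge(c.known)
    print(a.decide(), b.decide(), c.decide())
```

Each replica receives the knowledge in a different order, one of them receives it twice, and all three still end up making the same decision.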
Interoperate and Entice
- Must interoperate with the existing software investments
- A lot of code exists and it is very important
- Entice into the new world
- Drawing people in will happen through financial and business drivers
Conclusions
- Hardware is changing
- Lots of CPUs; no faster though
- Power and heat
- Lots of devices
- Lots of bandwidth -> intermittent device connectivity
- Economics are changing
- Small is beautiful
- Unreliable is cheaper
- Configuring costs more than the devices
- An onslaught of data
- Components must change
- They need
- ACID - Atomic, Consistent, Isolated, Durable
- Goal for transactional ACID was to make the insanely complex look like a single machine
- ACID for objects – Associative, Commutative, Idempotent, Spread out and independent
- The drive to stateless computing
- To recognise that the work will be done by lots of unreliable machines
- Associative: (A + B) + C = A + (B + C)
- Commutative: A + B = B + A
- Idempotent: repeatable; doing the same operation twice has the same effect as doing it once
- Distributed: Spread out and autonomous
- Not for everything
- Not for real time systems (closed queuing networks)
- e.g. Fly by wire; nuclear power plants
- Not for very expensive apologies
- "Oh crap, I just launched the space shuttle"
- The new ACID
- Huge Numbers of Relatively Small Operations
- Cost and Probability of apologies is small
- Essential for service and product providers
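When work is done by lots of unreliable machines, the same request may be delivered more than once, so the receiver has to make retries harmless. One common way to get idempotence, sketched here as an assumption rather than anything shown in the session, is to tag each operation with a unique id and ignore ids already seen. `IdempotentCounter` and the `op_id` values are hypothetical.

```python
class IdempotentCounter:
    """A running total that tolerates duplicated and reordered operations."""

    def __init__(self):
        self.total = 0
        self._seen = set()  # ids of operations already applied

    def apply(self, op_id, amount):
        """Apply an operation at most once; return False for a duplicate."""
        if op_id in self._seen:
            return False
        self._seen.add(op_id)
        # Addition is associative and commutative, so arrival order is irrelevant.
        self.total += amount
        return True


if __name__ == "__main__":
    counter = IdempotentCounter()
    counter.apply("op-2", 3)
    counter.apply("op-1", 5)
    counter.apply("op-1", 5)  # retry of the same operation is ignored
    print(counter.total)
```

The sender can now retry freely after a timeout instead of coordinating with the receiver, which is what makes lots of small, cheap operations workable.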