> Highly Available Systems == Scalable Systems
In his QCon talk “Building Highly Available Systems In Erlang”, Joe Armstrong, creator of the Erlang programming language and a highly entertaining speaker, asserts that if you build a highly available software system, scalability comes along for “free”. Say what? At first, I wanted to ask Joe what he was smoking, but after reflecting on his assertion and his supporting evidence, I think he’s right.
In his inimitable presentation, Joe postulates that there are 6 properties of Highly Available (HA) systems:
- Isolation (of modules from each other – a module crash can’t crash other modules).
- Concurrency (need at least two computers in the system so that when one crashes, you can fix it while the redundant one keeps on truckin’).
- Failure Detection (in order to fix a failure at its point of origin, you gotta be able to detect it first)
- Fault Identification (need post-failure info that allows you to zero-in on the cause and fix it quickly)
- Live Code Upgrade (for zero downtime, need to be able to hot-swap in code for either evolution or bug fixes)
- Stable Storage (multiple copies of data; distribution to avoid a single point of failure)
By design, all 6 HA properties are directly supported by Erlang. No, not in third-party libraries/frameworks/toolkits, but in the language, its VM, and the standard distribution that ships with it:
- Isolation: Erlang processes are isolated from each other by the VM (Virtual Machine); one process cannot damage another and processes have no shared memory (look, no locks mom!).
- Concurrency: All spawned Erlang processes run in parallel – either virtually on one CPU, or really, on multiple cores and processor nodes.
- Failure Detection: An Erlang process can tell the VM that it wants to detect failures in the processes it spawns. When a parent process spawns a child process, in one line of code it can “link” to the child and be auto-notified by the VM of a crash.
- Fault Identification: In Erlang, out-of-band error signals containing error descriptors (the reason for the crash) are propagated to linked processes at runtime.
- Live Code Upgrade: Erlang application code can be modified in real-time as it runs (no, it’s not magic!)
- Stable Storage: Erlang ships with a highly configurable, comically named database called “Mnesia”.
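The Failure Detection and Fault Identification bullets above are easier to believe once you see how little code they take. Here is a minimal sketch, not taken from Joe's talk: the module name `link_sketch`, the function `demo/0`, and the crash reason `out_of_coffee` are all made up for illustration. The parent links to a child that immediately dies, and the VM delivers the crash reason to the parent as an ordinary message.

```erlang
-module(link_sketch).
-export([demo/0]).

%% Hypothetical demo: link to a child that crashes and report why it died.
demo() ->
    %% Turn incoming exit signals into {'EXIT', Pid, Reason} messages
    %% instead of letting them take this process down as well.
    %% process_flag/2 returns the previous setting so we can restore it.
    Old = process_flag(trap_exit, true),

    %% spawn_link/1 creates the child and links to it in one atomic step.
    %% The child exits right away with a made-up reason.
    Child = spawn_link(fun() -> exit(out_of_coffee) end),

    Result = receive
                 %% Failure detected (the message arrived) and
                 %% identified (Reason says what went wrong).
                 {'EXIT', Child, Reason} -> {child_crashed, Reason}
             after 1000 ->
                 timeout
             end,

    process_flag(trap_exit, Old),
    Result.
```

Calling `link_sketch:demo()` from the Erlang shell should return `{child_crashed, out_of_coffee}`. Note that the child's death never touches the parent's state; the parent just gets told about it, which is the Isolation bullet at work too.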
The punch line is that systems composed of entities that are isolated (property 1) and concurrently runnable (property 2) are automatically scalable. Agree?
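To make the punch line a bit more concrete, here is a rough sketch of a parallel map. It assumes a hypothetical module `pmap_sketch` with a helper `pmap/2` (neither is a standard library function). Each list element is handled by its own isolated process, so the same code can keep more cores busy when the VM has more schedulers available, without any changes.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Hypothetical helper: apply F to every element of List, one isolated
%% process per element, and collect the results in the original order.
%% Simplification: if a worker crashes, the matching receive waits forever;
%% a real version would link to or monitor the workers.
pmap(F, List) ->
    Parent = self(),
    %% Tag each worker with a unique reference so replies can be matched up.
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Selective receive on each Ref in turn preserves the input order,
    %% no matter which worker finishes first.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For example, `pmap_sketch:pmap(fun(X) -> X * X end, [1, 2, 3])` returns `[1, 4, 9]`. The workers share no memory and talk only by message passing (property 1), and they can all run at the same time (property 2), which is why there are no locks to rework when more cores show up.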