Adders
How digital circuits add two numbers — starting from a single XOR gate, ending with why the obvious design gets slower the wider your inputs are.
The half adder
The smallest possible adder takes two bits and produces two bits: a sum and a carry. With two single-bit inputs there are four cases — the sum is 1 when exactly one input is 1 (that’s XOR), and the carry is 1 only when both inputs are 1 (that’s AND).
Toggle the two switches. The first LED is the sum bit, the second is the carry. The highlighted row in the truth table follows whichever combination you’ve set.
Handling carry-in: the full adder
A half adder is enough to add a single bit, but the moment you want to add multi-bit numbers, you need a way to receive a carry from the stage below. That third input is called carry-in (cin), and an adder that handles all three inputs is called a full adder.
The trick: a full adder is just two half-adders stacked. The first half-adder adds a + b; the second adds that partial sum to cin. Either half-adder can produce a carry, and the full adder's carry-out is the OR of both.
Try all eight combinations — the result is the binary encoding of how many of the three inputs are 1.
Chaining them: ripple-carry
To add two 8-bit numbers, chain eight full adders together. The carry-out of bit 0 becomes the carry-in of bit 1; the carry-out of bit 1 becomes the carry-in of bit 2; and so on. This is called a ripple-carry adder because the carry signal ripples from the low bit to the high bit, one stage at a time.
Change the inputs below and watch the result. It works correctly for every pair of values — addition is addition. But there's a subtle cost to this structure that doesn't show up in any single-cycle simulation.
Why ripple-carry is slow
Every gate takes some real, physical time to switch. The clock speed of a chip is bounded by the longest combinational path between any two registers — whatever signal has to travel farthest sets the upper limit on how fast the clock can tick.
In a ripple-carry adder, the high-bit's sum can't settle until the carry has propagated all the way from bit 0 through every intermediate stage. An 8-bit adder needs the carry to traverse eight full-adders before the answer is valid. A 64-bit adder needs 64. Depth grows linearly with width — a problem that gets worse exactly as your inputs get wider.
Carry-lookahead: faster addition through parallelism
The fix is to stop waiting. For each bit position, compute two signals in parallel: generate (g_i = a_i AND b_i, this stage definitely produces a carry) and propagate (p_i = a_i XOR b_i, this stage will pass through whatever carry it receives). With those, every carry can be computed in terms of every prior g and p at once — no ripple required.
“Carry-lookahead” is really a family of designs that share this idea, at different points on a depth/area tradeoff curve. Single-level CLA groups bits into chunks of k (typically 4) and gets depth proportional to n/k — faster than ripple, but still linear. Multi-level CLA applies the lookahead idea recursively across groups. And parallel-prefix networks — Kogge-Stone, Brent-Kung, Han-Carlson — arrange the g/p combinations as a tree, achieving true O(log n) depth.
For a 32-bit adder, a parallel-prefix structure means depth ~5 instead of depth 32. That’s the actual reason a real CPU can add two 64-bit numbers in a single clock cycle — every modern ALU you have ever used is some variant of this family. The naive ripple-carry from earlier in this page is correct, but it’s not what’s sitting inside your laptop.