Building a CPU from Scratch

From a single NAND gate to a working 6502 processor running C code — every circuit is live and interactive. Click the switches. Watch the signals propagate. Build intuition for how computers actually work.

Interactive tutorial / ~15 min read / Built with Simten

Starting from Nothing: The NAND Gate

Every computer ever built — from the Apollo Guidance Computer to the M4 chip in your MacBook — can be constructed from a single type of logic gate: NAND.

A NAND gate outputs 0 only when both its inputs are 1. That’s it. From this one building block, we can create every other logic gate, and from those gates, an entire computer.
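As a rough sketch in TypeScript (the language the simulator on this page is written in), a NAND gate is a one-line function. The `Bit` type and the `nand` name are illustrative choices, not part of the interactive circuits.

```typescript
type Bit = 0 | 1;

// NAND: the output is 0 only when both inputs are 1.
const nand = (a: Bit, b: Bit): Bit => (a === 1 && b === 1 ? 0 : 1);

// Enumerate all four input combinations to print the truth table.
const inputs: Bit[] = [0, 1];
const nandTable: string[] = [];
for (const a of inputs) {
  for (const b of inputs) {
    nandTable.push(`${a} NAND ${b} = ${nand(a, b)}`);
  }
}
console.log(nandTable.join("\n"));
```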

Let’s start by building the basic gates. Click the switches to toggle inputs and watch the output LED respond.

NOT — The Inverter

Wire both NAND inputs together. When the input is 1, both NAND inputs are 1, so the output is 0. Inversion!

NOT Gate

Toggle the switch to see the output invert

AND Gate

NAND followed by NOT. The double negation cancels out, giving us a gate that outputs 1 only when both inputs are 1.

AND Gate

Output is ON only when both inputs are ON

OR Gate

De Morgan’s theorem in action: NOT each input, then NAND the results. The output is 1 when either input is 1.
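The three constructions so far (NOT, AND, OR) can be written as plain function composition. This is a minimal sketch; the function names are assumptions for illustration, and each one calls nothing but `nand`.

```typescript
type Bit = 0 | 1;
const nand = (a: Bit, b: Bit): Bit => (a === 1 && b === 1 ? 0 : 1);

// NOT: feed the same signal into both NAND inputs.
const not = (a: Bit): Bit => nand(a, a);

// AND: a NAND followed by a NOT (the two inversions cancel).
const and = (a: Bit, b: Bit): Bit => not(nand(a, b));

// OR: invert each input, then NAND. By De Morgan, NOT(NOT a AND NOT b) = a OR b.
const or = (a: Bit, b: Bit): Bit => nand(not(a), not(b));

console.log(not(1), and(1, 1), or(0, 1)); // 0 1 1
```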

OR Gate

Output is ON when either input is ON

XOR — Exclusive OR

The “difference detector” — outputs 1 only when inputs are different. Built from 4 NAND gates. This one is essential for arithmetic.
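One well-known way to wire the four NAND gates is sketched below. The intermediate names `n1` to `n3` are hypothetical labels, and the circuit on this page may be wired differently.

```typescript
type Bit = 0 | 1;
const nand = (a: Bit, b: Bit): Bit => (a === 1 && b === 1 ? 0 : 1);

// XOR from exactly four NAND gates.
const xor = (a: Bit, b: Bit): Bit => {
  const n1 = nand(a, b);  // shared first stage
  const n2 = nand(a, n1); // 0 only when a = 1 and b = 0
  const n3 = nand(b, n1); // 0 only when b = 1 and a = 0
  return nand(n2, n3);    // 1 when exactly one of n2, n3 is 0
};

console.log(xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)); // 0 1 1 0
```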

XOR Gate

Output is ON when inputs differ

Composition: Building Arithmetic

Now for the magic trick of digital design: composition. We take the gates we just built and wire them together into bigger circuits. Those bigger circuits become building blocks for even bigger ones.

Let’s build an adder — the circuit that lets a CPU do math.

Half Adder

Adds two single bits. The sum output is the XOR (are the bits different?), and the carry output is the AND (are both bits 1?). Try it: 1+1 = 10 in binary — sum is 0, carry is 1.
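In code, the half adder is just those two gates side by side. This sketch uses TypeScript's `^` and `&` operators as stand-ins for the XOR and AND gates; `halfAdder` is an illustrative name.

```typescript
type Bit = 0 | 1;

const halfAdder = (a: Bit, b: Bit): { sum: Bit; carry: Bit } => ({
  sum: (a ^ b) as Bit,   // XOR: are the bits different?
  carry: (a & b) as Bit, // AND: are both bits 1?
});

const onePlusOne = halfAdder(1, 1);
console.log(`${onePlusOne.carry}${onePlusOne.sum}`); // "10", i.e. 2 in binary
```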

Half Adder

Adds two bits: produces sum and carry

Full Adder

The real workhorse. A full adder handles three inputs: A, B, and a carry-in from the previous column. Chain 8 of these together and you can add two bytes. Chain 64 and you have the same basic function as the adder in a modern CPU, which uses faster carry logic but computes the same sums.
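A common textbook construction, assumed here since the text does not spell out the wiring, is two half adders plus an OR gate for the carry. The names below are illustrative.

```typescript
type Bit = 0 | 1;
type AdderOut = { sum: Bit; carry: Bit };

const halfAdder = (a: Bit, b: Bit): AdderOut => ({
  sum: (a ^ b) as Bit,
  carry: (a & b) as Bit,
});

const fullAdder = (a: Bit, b: Bit, cin: Bit): AdderOut => {
  const first = halfAdder(a, b);            // add the two operand bits
  const second = halfAdder(first.sum, cin); // then add the carry-in
  // At most one of the two half adders can produce a carry, so OR combines them.
  return { sum: second.sum, carry: (first.carry | second.carry) as Bit };
};

const allOnes = fullAdder(1, 1, 1);
console.log(`${allOnes.carry}${allOnes.sum}`); // "11": 1 + 1 + 1 = 3
```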

Full Adder

Adds three bits: a, b, and carry-in

Multiplexer

A data selector: the sel switch chooses which input (A or B) passes through to the output. Muxes are everywhere in CPUs — they’re how the control unit routes data between components.
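A 2:1 mux reduces to one Boolean expression: (A AND NOT sel) OR (B AND sel). The sketch below follows the widget's convention that sel = 0 picks A; the function name is an assumption.

```typescript
type Bit = 0 | 1;

// sel = 0 passes a through, sel = 1 passes b through.
const mux = (a: Bit, b: Bit, sel: Bit): Bit =>
  ((a & (sel ^ 1)) | (b & sel)) as Bit;

console.log(mux(1, 0, 0)); // 1 (input A selected)
console.log(mux(1, 0, 1)); // 0 (input B selected)
```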

2:1 Multiplexer

sel=OFF picks A, sel=ON picks B

Memory: Teaching Circuits to Remember

Everything so far has been combinational — the outputs depend only on the current inputs. But a computer needs to remember things. To store a bit, we need feedback: a circuit whose output connects back to its own input.

This is where the clock enters the picture. Sequential circuits use a clock signal to synchronize state changes. Click the Tick button to advance the clock by one cycle.

SR Latch

The simplest memory cell: two NOR gates cross-coupled. Toggle S (Set) to store a 1, toggle R (Reset) to clear it. Notice how the output stays after you release the input — that’s memory!
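The feedback loop can be modeled by re-evaluating the two NOR gates until their outputs stop changing. This is a simplified sketch: the `SRLatch` class and the fixed iteration count are assumptions, and real gate-level simulators handle feedback more carefully.

```typescript
type Bit = 0 | 1;
const nor = (a: Bit, b: Bit): Bit => (a === 0 && b === 0 ? 1 : 0);

class SRLatch {
  private q: Bit = 0;
  private qBar: Bit = 1;

  // Apply S and R, then let the cross-coupled gates settle.
  // (S = R = 1 is the classic forbidden input and is not handled specially.)
  update(s: Bit, r: Bit): Bit {
    for (let i = 0; i < 4; i++) {
      const nextQ = nor(r, this.qBar); // each gate reads the other's output
      const nextQBar = nor(s, nextQ);
      this.q = nextQ;
      this.qBar = nextQBar;
    }
    return this.q;
  }
}

const latch = new SRLatch();
console.log(latch.update(1, 0)); // 1: set
console.log(latch.update(0, 0)); // 1: still remembered after S is released
console.log(latch.update(0, 1)); // 0: reset
console.log(latch.update(0, 0)); // 0: still remembered after R is released
```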

SR Latch

Set stores a 1, Reset clears to 0

D Flip-Flop

The workhorse of digital memory. The D flip-flop captures whatever value is on the D input when the clock ticks, and holds it until the next tick. Set the switch, then click Tick to capture the value.
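At the behavioral level (ignoring the gates inside), a D flip-flop is a stored value that only changes when `tick()` is called. The class below is a minimal sketch with hypothetical names.

```typescript
type Bit = 0 | 1;

class DFlipFlop {
  d: Bit = 0;             // the input: may change freely between ticks
  private state: Bit = 0; // the captured value

  // Clock edge: capture whatever is on d right now.
  tick(): void {
    this.state = this.d;
  }

  // The output always shows the last captured value.
  q(): Bit {
    return this.state;
  }
}

const ff = new DFlipFlop();
ff.d = 1;
console.log(ff.q()); // 0: the input changed, but no tick yet
ff.tick();
console.log(ff.q()); // 1: captured on the tick
```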

D Flip-Flop

Captures input on each clock tick

4-Bit Register

Four D flip-flops in parallel, sharing a clock. Set some switches, click Tick, and the register captures all four bits at once. This is exactly how CPU registers work — just wider (8, 16, 32, or 64 bits).
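Behaviorally, a register is the same capture-on-tick idea applied to several bits at once. A small sketch, with `Register4` as an assumed name:

```typescript
type Bit = 0 | 1;

class Register4 {
  input: Bit[] = [0, 0, 0, 0];          // the four switches
  private stored: Bit[] = [0, 0, 0, 0]; // the four flip-flop states

  // Shared clock: all four bits are captured on the same tick.
  tick(): void {
    this.stored = this.input.slice();
  }

  read(): string {
    return this.stored.join("");
  }
}

const reg = new Register4();
reg.input = [1, 0, 1, 1];
console.log(reg.read()); // "0000": nothing captured yet
reg.tick();
console.log(reg.read()); // "1011"
```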

4-Bit Register

Stores 4 bits simultaneously on each clock tick

Putting It Together: A Counter

Now we combine everything. A counter uses flip-flops, NOT gates, XOR gates, and AND gates working together. Bit 0 always toggles. Bit 1 toggles when bit 0 is 1. Bit 2 toggles when bits 0 and 1 are both 1. The AND gates form a carry chain — the same idea as addition.

Click Tick repeatedly or hit Auto to watch it count. The four LEDs show the binary value: 0000, 0001, 0010, 0011, ... up to 1111 (15), then it wraps around.
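The toggle rule can be sketched in a few lines: XOR flips a bit when its toggle signal is 1, and an AND chain decides whether the next bit toggles. `Counter4` and its method names are illustrative, not taken from the circuit on this page.

```typescript
type Bit = 0 | 1;

class Counter4 {
  private bits: Bit[] = [0, 0, 0, 0]; // bits[0] is the least significant bit

  tick(): void {
    let toggle: Bit = 1; // bit 0 always toggles
    const next: Bit[] = [];
    for (const bit of this.bits) {
      next.push((bit ^ toggle) as Bit); // XOR flips the bit when toggle is 1
      toggle = (toggle & bit) as Bit;   // next bit toggles only if all lower bits were 1
    }
    this.bits = next;
  }

  value(): number {
    return this.bits.reduce<number>((acc, bit, i) => acc + (bit << i), 0);
  }
}

const counter = new Counter4();
const seen: number[] = [];
for (let i = 0; i < 17; i++) {
  seen.push(counter.value());
  counter.tick();
}
console.log(seen.join(" ")); // counts 0 through 15, then wraps back to 0
```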

A program counter in a CPU works just like this — it counts through memory addresses, fetching one instruction at a time. The only difference is width (16 bits for the 6502) and the ability to load a new value (for jumps and branches).

4-Bit Counter

Click Tick or Auto to watch it count in binary

Scaling Up: A 4-Bit Adder

We built a full adder that adds three single bits. But a CPU needs to add numbers. The trick is chaining: connect the carry-out of each full adder to the carry-in of the next. Four full adders in a row give you a 4-bit adder that can add numbers 0–15.

This is called a ripple-carry adder because the carry “ripples” from the least significant bit to the most significant. The 6502 uses the same pattern, just wider: its ALU adds 8 bits at a time, and 16-bit address arithmetic is done one byte at a time through that 8-bit adder.

Try it: set the A switches (a3–a0) to 0011 (3) and the B switches to 0101 (5). You should see the sum LEDs show 1000 (8).
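The same 3 + 5 example can be traced in code. This sketch chains four full adders in a loop, passing each carry-out into the next stage; `add4` and `fullAdder` are assumed names.

```typescript
type Bit = 0 | 1;

const fullAdder = (a: Bit, b: Bit, cin: Bit): { sum: Bit; carry: Bit } => {
  const axb = (a ^ b) as Bit;
  return {
    sum: (axb ^ cin) as Bit,
    carry: ((a & b) | (cin & axb)) as Bit,
  };
};

// Add two 4-bit numbers, one bit position at a time, LSB first.
function add4(a: number, b: number): { sum: number; carryOut: Bit } {
  let carry: Bit = 0;
  let sum = 0;
  for (let i = 0; i < 4; i++) {
    const aBit = ((a >> i) & 1) as Bit;
    const bBit = ((b >> i) & 1) as Bit;
    const stage = fullAdder(aBit, bBit, carry);
    sum |= stage.sum << i;
    carry = stage.carry; // the carry ripples into the next stage
  }
  return { sum, carryOut: carry };
}

console.log(add4(0b0011, 0b0101)); // { sum: 8, carryOut: 0 }
console.log(add4(0b1111, 0b0001)); // { sum: 0, carryOut: 1 }: overflow past 4 bits
```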

4-Bit Ripple-Carry Adder

Set A and B in binary, watch the carry propagate

The ALU: A Calculator Chip

An adder can only add. A real CPU needs to do logic too — AND, OR, XOR. The Arithmetic Logic Unit (ALU) computes all of these in parallel and uses a multiplexer to pick the result based on a control signal.

Below is a 1-bit ALU slice. It has an adder plus AND, OR, and XOR gates, all wired to the same inputs. The two op switches select which result passes through: 00 = ADD, 01 = AND, 10 = OR, 11 = XOR.

Chain 8 of these slices (with carry linking the adders) and you have the core of the 6502’s ALU; the real chip also handles shifts and decimal arithmetic. The CPU’s control unit just sets the op bits based on which instruction it decoded. Same circuit, different operation: that’s what makes it programmable.
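A sketch of one slice and of chaining eight of them, using the op encoding from the widget (00 = ADD, 01 = AND, 10 = OR, 11 = XOR). The names `aluSlice` and `alu8` are assumptions, and this models behavior rather than the exact gate wiring.

```typescript
type Bit = 0 | 1;
type Op = 0 | 1 | 2 | 3; // 0b00 ADD, 0b01 AND, 0b10 OR, 0b11 XOR

function aluSlice(a: Bit, b: Bit, cin: Bit, op: Op): { out: Bit; cout: Bit } {
  // Every unit computes its result in parallel...
  const xorOut = (a ^ b) as Bit;
  const addOut = (xorOut ^ cin) as Bit;
  const cout = ((a & b) | (cin & xorOut)) as Bit;
  const andOut = (a & b) as Bit;
  const orOut = (a | b) as Bit;
  // ...and a 4:1 multiplexer, indexed by op, picks one of them.
  const results: [Bit, Bit, Bit, Bit] = [addOut, andOut, orOut, xorOut];
  return { out: results[op], cout };
}

// Eight slices side by side, with the adder carries linked LSB to MSB.
function alu8(a: number, b: number, op: Op): number {
  let carry: Bit = 0;
  let out = 0;
  for (let i = 0; i < 8; i++) {
    const aBit = ((a >> i) & 1) as Bit;
    const bBit = ((b >> i) & 1) as Bit;
    const slice = aluSlice(aBit, bBit, carry, op);
    out |= slice.out << i;
    carry = slice.cout;
  }
  return out;
}

console.log(alu8(0x0f, 0x01, 0).toString(16)); // "10" (ADD)
console.log(alu8(0xf0, 0x3c, 1).toString(16)); // "30" (AND)
console.log(alu8(0xf0, 0x3c, 2).toString(16)); // "fc" (OR)
console.log(alu8(0xf0, 0x3c, 3).toString(16)); // "cc" (XOR)
```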

1-Bit ALU Slice

op: 00=ADD 01=AND 10=OR 11=XOR

RAM: Read/Write Memory

Registers store a few values. A CPU needs thousands of addressable bytes. That’s RAM — an array of memory cells, each with an address. You put an address on the bus and the data at that address appears on the output.

The key insight: reads are instant (combinational) but writes need a clock tick. Change the addr input and data_out updates immediately. To write: set addr, set data_in, turn we (write-enable) ON, then Tick. Turn we OFF and change the address to read it back.

Try it: write the value 42 to address 1, then write 7 to address 2. Switch between addresses to see that both values are remembered. A 6502 system with 2 KB of RAM wires it to the address bus the same way, just with 2,048 locations instead of 256.
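The read/write asymmetry can be sketched as a class whose read is a plain lookup and whose write only happens inside `tick()`. The field names mirror the widget's signals (`addr`, `data_in`, `we`), but the class itself is an illustrative model, not the simulator's code.

```typescript
class Ram256x8 {
  addr = 0;
  dataIn = 0;
  we = false; // write-enable
  private cells = new Uint8Array(256);

  // Combinational read: follows addr immediately, no clock needed.
  dataOut(): number {
    return this.cells[this.addr & 0xff];
  }

  // Clocked write: only takes effect on a tick with we = true.
  tick(): void {
    if (this.we) {
      this.cells[this.addr & 0xff] = this.dataIn & 0xff;
    }
  }
}

const ram = new Ram256x8();
ram.we = true;
ram.addr = 1; ram.dataIn = 42; ram.tick();
ram.addr = 2; ram.dataIn = 7;  ram.tick();
ram.we = false;

ram.addr = 1;
console.log(ram.dataOut()); // 42
ram.addr = 2;
console.log(ram.dataOut()); // 7
```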

256×8 RAM

Reads are instant. Writes happen on Tick with we=ON.

The 6502: A Real CPU

Everything we’ve built (gates, adders, registers, counters, and an ALU) is a building block of a real processor. The MOS 6502 (1975) and its close variants powered the Apple II, Commodore 64, and NES. It has just 3,510 transistors and an elegant instruction set. Its ALU is wider (8 bits), its program counter longer (16 bits), and it has a control unit that decodes 56 instructions, but the pieces are the same.

Below is a complete 6502 system simulated at the gate level — over 5,500 lines of TypeScript, compiled and running in your browser. It has a CPU, RAM, ROM, and a memory-mapped console output at address $F000.

The ROM is pre-loaded with C programs compiled with cc65, a C compiler targeting the 6502. Click Run to watch the CPU execute real compiled C code, one cycle at a time.

What you just saw is the same process that happens billions of times per second in the device you’re reading this on. A clock ticks. The program counter increments. An instruction is fetched from memory. The control unit decodes it. The ALU computes. Results are stored. Repeat.
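That fetch, decode, execute loop can be sketched at the instruction level in a few lines. This is a toy, not the gate-level simulator above: it understands only three real 6502 opcodes ($A9 LDA immediate, $8D STA absolute, $00 BRK), treats BRK as "halt", and starts at a hypothetical origin of $0200 instead of reading the reset vector. The console address $F000 matches the system described above.

```typescript
function run(program: number[], origin = 0x0200): string {
  const mem = new Uint8Array(0x10000); // 64 KB address space
  mem.set(program, origin);
  let pc = origin; // program counter
  let a = 0;       // accumulator
  let output = "";

  // Fetch one byte and advance the program counter.
  const fetch = (): number => {
    const byte = mem[pc];
    pc = (pc + 1) & 0xffff;
    return byte;
  };

  for (let step = 0; step < 1000; step++) {
    const opcode = fetch();  // fetch
    switch (opcode) {        // decode
      case 0xa9:             // LDA #imm: load the next byte into A
        a = fetch();
        break;
      case 0x8d: {           // STA abs: store A at a 16-bit little-endian address
        const lo = fetch();
        const hi = fetch();
        const addr = lo | (hi << 8);
        if (addr === 0xf000) output += String.fromCharCode(a); // console port
        else mem[addr] = a;
        break;
      }
      case 0x00:             // BRK: treated as halt in this toy
        return output;
      default:
        throw new Error(`unsupported opcode $${opcode.toString(16)}`);
    }
  }
  return output;
}

// LDA #$48; STA $F000; LDA #$69; STA $F000; BRK
const hello = [0xa9, 0x48, 0x8d, 0x00, 0xf0, 0xa9, 0x69, 0x8d, 0x00, 0xf0, 0x00];
console.log(run(hello)); // "Hi"
```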

The only difference between this 6502 and a modern CPU is scale: more transistors, wider buses, deeper pipelines, more cache. But the fundamentals — NAND gates all the way down — haven’t changed.
