Breakout in Hardware
A complete Breakout game built entirely from logic gates, registers, and memory — a 3-pixel paddle, a bouncing ball, 16 destructible bricks, and a 10-phase rendering pipeline. No CPU, no software, just digital circuits.
Steve Wozniak designed the original Breakout prototype for Atari entirely in hardware: no CPU, just TTL chips. The arcade game was released in 1976.
Paddle Input
The paddle is 3 pixels wide on row 7. Its center position is stored in a register, and keyboard scan codes (75 for left, 77 for right) feed into comparators that produce a movement delta: −1, 0, or +1.
The delta is added to the current position, then clamped to the range 1–6 so the 3-pixel paddle (center ± 1) always stays on screen. Two comparators check the boundaries, and two muxes override the result if it would go out of bounds.
This is a common pattern in hardware input handling: a comparator bank decodes the input code, combinational logic computes the new position, and boundary clamping prevents invalid state. There are no if-statements and no software, just gates selecting between values.
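As a rough software model of the paddle datapath described above, the sketch below mirrors the comparators and muxes one-to-one. The scan codes and the 1–6 clamp range come from the text; the function names (`mux`, `paddle_delta`, `next_paddle_center`) are hypothetical and only illustrate the structure, not the actual circuit's node names.

```python
LEFT_SCAN, RIGHT_SCAN = 75, 77   # scan codes from the text
PADDLE_MIN, PADDLE_MAX = 1, 6    # clamp range from the text


def mux(select: bool, when_true: int, when_false: int) -> int:
    """Model of a 2-to-1 multiplexer: forwards one input based on `select`."""
    return when_true if select else when_false


def paddle_delta(scan_code: int) -> int:
    """Two equality comparators decode the scan code into -1, 0, or +1."""
    is_left = scan_code == LEFT_SCAN
    is_right = scan_code == RIGHT_SCAN
    return mux(is_left, -1, mux(is_right, +1, 0))


def next_paddle_center(center: int, scan_code: int) -> int:
    """Adder, then two boundary comparators driving two override muxes."""
    candidate = center + paddle_delta(scan_code)
    too_low = candidate < PADDLE_MIN
    too_high = candidate > PADDLE_MAX
    return mux(too_low, PADDLE_MIN, mux(too_high, PADDLE_MAX, candidate))
```

Calling `next_paddle_center(1, LEFT_SCAN)` leaves the center at 1, because the low-boundary mux overrides the out-of-range candidate 0.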
Ball Physics & Clock Divider
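The ball has four registers: X position, Y position, X velocity, and Y velocity. Each velocity is stored as 1 (right or down) or 255 (left or up); 255 is −1 in 8-bit two's complement, so adding it to a position with 8-bit wraparound subtracts one. Each frame, the next position is computed by adding velocity to position.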
But the ball would be impossibly fast if it moved every frame. A clock divider — a counter that counts 0, 1, 2, 3, then resets — generates an enable signal that fires once every 4 frames. The ball’s position registers only update when this enable signal is high. The paddle updates every frame, giving the player a speed advantage.
Clock dividers are fundamental hardware, and most digital systems use them: a CPU's peripheral bus often runs slower than the core, full-speed USB's 12 Mbit/s signaling is commonly derived from a 48 MHz clock, and VGA timing derives from a pixel clock. Here, it's just a counter and a comparator: when the counter equals the max value, the enable goes high and the counter resets.
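The divider can be modeled in a few lines. This is a minimal sketch assuming the counter starts at 0 and that the enable is asserted on the tick where the count equals the max value (3); the class name `ClockDivider` and its `tick` method are made up for illustration.

```python
class ClockDivider:
    """Counter plus comparator: enable goes high once every max_value + 1 ticks."""

    def __init__(self, max_value: int = 3) -> None:
        self.max_value = max_value
        self.count = 0  # assumed reset value

    def tick(self) -> bool:
        # Comparator: is the counter at its max value?
        enable = self.count == self.max_value
        # Counter: reset when the comparator fires, otherwise increment.
        self.count = 0 if enable else self.count + 1
        return enable


divider = ClockDivider()
pulses = [divider.tick() for _ in range(8)]
```

Over eight frames the enable is high on the fourth and eighth, so a register gated by it updates once every 4 frames.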
Wall bouncing is handled before computing the next position. If the ball is at X=0 moving left, the velocity flips to +1 first, then the new position is computed. This prevents the ball from wrapping around to the other side of the screen.
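To see why the flip has to happen before the add, here is a small sketch of one axis of the ball update using 8-bit arithmetic. The left-wall case follows the text; the right-wall case at X=7 and the helper names (`add8`, `step_axis`) are assumptions added for symmetry, not taken from the circuit.

```python
VEL_POS = 1     # moving right (or down)
VEL_NEG = 255   # -1 in 8-bit two's complement: moving left (or up)


def add8(a: int, b: int) -> int:
    """8-bit adder: the result wraps modulo 256, as a hardware adder would."""
    return (a + b) & 0xFF


def step_axis(pos: int, vel: int, max_pos: int = 7) -> tuple:
    """Flip velocity at a wall first, then add it to the position."""
    at_low_wall = pos == 0 and vel == VEL_NEG
    at_high_wall = pos == max_pos and vel == VEL_POS  # assumed symmetric case
    if at_low_wall:
        vel = VEL_POS
    elif at_high_wall:
        vel = VEL_NEG
    return add8(pos, vel), vel
```

Without the flip, `add8(0, VEL_NEG)` evaluates to 255, which is the wraparound the bounce check is there to prevent; with it, `step_axis(0, VEL_NEG)` returns position 1 and velocity 1.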
Brick Collision
The 16 bricks live in the first two rows of the framebuffer RAM (addresses 0–15). A brick is “alive” if its RAM cell contains 1, and “destroyed” if 0.
Collision detection reads the RAM at the ball’s next position during phase 0 of the pipeline. If the value is non-zero and the next Y position is in the brick rows (Y < 2), it’s a brick hit. The Y velocity flips, and during phase 4, the brick’s RAM cell is written to 0 — the brick disappears from the screen.
This is elegant because the framebuffer is the collision map. There’s no separate data structure tracking which bricks are alive — the same RAM that the screen reads for display is the RAM that the ball reads for collision. One DualPortRAM serves both purposes: port B for the screen, port A for game logic.
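The framebuffer-as-collision-map idea can be sketched with a plain list standing in for the RAM. The row-major address mapping (`y * 8 + x`) is an assumption that fits bricks occupying addresses 0–15 on an 8-pixel-wide screen; the function names are hypothetical.

```python
WIDTH = 8        # 8x8 screen
BRICK_ROWS = 2   # bricks occupy rows 0 and 1


def address(x: int, y: int) -> int:
    """Assumed row-major mapping from pixel coordinates to a RAM address."""
    return y * WIDTH + x


def check_brick(framebuffer: list, next_x: int, next_y: int, vel_y: int) -> tuple:
    """Phase-0 style read: a hit is a non-zero cell inside the brick rows."""
    addr = address(next_x, next_y)
    hit = framebuffer[addr] != 0 and next_y < BRICK_ROWS
    # Negating in 8 bits swaps 1 and 255, which flips the Y direction.
    new_vel_y = (-vel_y) & 0xFF if hit else vel_y
    return hit, addr, new_vel_y


def clear_brick(framebuffer: list, addr: int) -> None:
    """Later write phase: zeroing the cell removes the brick from the display."""
    framebuffer[addr] = 0


# 16 live bricks in rows 0-1, the rest of the screen empty.
fb = [1] * 16 + [0] * 48
hit, addr, vel_y = check_brick(fb, next_x=3, next_y=1, vel_y=255)
if hit:
    clear_brick(fb, addr)
```

After the clear, the same read at that position no longer reports a hit, with no separate brick table to keep in sync.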
The 10-Phase Rendering Pipeline
The DualPortRAM has one write port. But each frame needs to clear 4 old pixels (ball + 3 paddle) and draw 4 new pixels (ball + 3 paddle), plus optionally clear a hit brick. That's up to 9 writes per frame, so the frame is split into 10 phases, each doing one RAM operation.
A 4-bit phase counter cycles 0–9 and resets. Each phase drives a chain of muxes that select the RAM address and data. Clears write 0, draws write 1. The write-enable signal is only high during phases 1–8 — phase 0 is read-only (brick check) and phase 9 commits the new register values.
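The control side of the pipeline, meaning the phase counter and the signals decoded from it, can be sketched as below. Only the behavior stated above is modeled (read-only at phase 0, write-enable during phases 1–8, commit at phase 9); which pixel each write phase targets is not modeled, and the signal names are illustrative assumptions.

```python
NUM_PHASES = 10  # phases 0-9


def next_phase(phase: int) -> int:
    """Phase counter: counts 0..9, then resets to 0."""
    return 0 if phase == NUM_PHASES - 1 else phase + 1


def decode_phase(phase: int) -> dict:
    """Comparators on the phase value produce the control signals."""
    return {
        "brick_check": phase == 0,         # read-only phase
        "write_enable": 1 <= phase <= 8,   # RAM write phases
        "commit": phase == 9,              # registers take their new values
    }


def run_frame() -> list:
    """Step through one full frame and collect the decoded signals."""
    phase = 0
    signals = []
    for _ in range(NUM_PHASES):
        signals.append(decode_phase(phase))
        phase = next_phase(phase)
    return signals


frame = run_frame()
```

In the circuit, the same phase value would also drive the mux chain that picks each write's address and data (0 for a clear, 1 for a draw).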
This is the same basic idea a GPU uses to schedule memory operations: break each frame into steps, each with a specific memory operation. The main difference is scale. A real GPU handles millions of pixels with far deeper pipelines and many parallel units; this one has 64 pixels and 10 phases. But the principle is the same.
The Full Game
All the pieces come together: paddle input, ball physics, brick collision, and the 10-phase rendering pipeline. 16 bricks across two rows, a ball that bounces off walls, paddle, and bricks, and a 3-pixel paddle you control with arrow keys.
Press Run and use the arrow keys to move the paddle. The ball moves every 4th frame (40 clock ticks) — a hardware clock divider that gives you time to react. When the ball hits a brick, it disappears and the ball bounces back.
Everything you see is running on the same simulator that powers all the circuits in this blog. ~120 nodes, ~200 connections, no CPU, no software — just gates, registers, and one DualPortRAM.
The circuit uses the same fundamental building blocks as everything else on this site: registers for state, comparators for collision detection, muxes for selecting between addresses, and adders for position arithmetic. The 10-phase pipeline is just a counter driving a chain of muxes — the same pattern a GPU uses to schedule memory operations, scaled down to 8×8 pixels.