rebuildin' the wheel
everyone says "don't reinvent the wheel." but how do you know what a wheel even does if you never built one from scratch?
will any of these be production ready? absolutely not. will i mess up and learn? absolutely yes.
the idea is to build somethin' that already exists, not to replace it, but to finally understand what's goin' on under the hood.
you don't truly get how a database works until you've written a btree at 2am and questioned everythin'.
it's like cookin' instant noodles your whole life and then tryin' to make pasta from scratch. suddenly you respect the process.
- oltp database - postgres didn't build itself. btrees, write-ahead logs, query parsers, buffer pools. i wanna feel that pain firsthand.
- olap database - columnar storage, vectorized execution, compression. basically build a baby clickhouse and cry tryin'.
- programmin' language - lexer, parser, ast, interpreter. doesn't need to be useful. just needs to run "hello world" and make me proud.
- llm from scratch - not fine-tunin'. not prompt engineerin'. actually trainin' a transformer from randomly initialized weights. even if it only learns to say "the" repeatedly.
- vector database - embeddin' storage, approximate nearest neighbor search, hnsw index. it's just math with extra steps.
- in-memory database - basically redis but way worse. key-value store, expiration, pub/sub. how hard can it be? (famous last words.)
- website without any framework - oh wait, i already did this one. you're lookin' at it.
- web framework - routing, middleware, templatin'. build the thing i refuse to use on this website.
- container runtime - cgroups, namespaces, union filesystems. docker is just linux features in a trench coat.
- git - objects, trees, blobs, refs. understand why merge conflicts exist and hate it even more.
- search engine - crawlin', indexin', rankin'. google started in a garage, i'll start in a terminal.
- load balancer - round robin, least connections, health checks. nginx is just a for loop with extra steps. (it's not. i'll find out.)
- message queue - producers, consumers, topics, acknowledgments. kafka but way simpler and way buggier.
- operating system - bootloader, kernel, scheduler, filesystem. this one might actually take a lifetime.
- compiler - tokenizer, parser, code gen. turn text into machine code and feel like a wizard.
- http server - sockets, tcp, request parsin', response buildin'. the thing that runs the internet and is surprisingly simple at its core.
- dns resolver - recursive queries, cachin', root servers. the phonebook of the internet that nobody thinks about.
- blockchain - hashing, consensus, merkle trees. not for crypto bro reasons, just to understand the actual data structure.
- shell - read, eval, print, loop. pipin', redirections, job control. bash is just a while loop readin' your commands.
- regex engine - nfa, dfa, backtracking. the thing that makes every developer question their career choices.
- 3d renderer - ray tracin', rasterization, shaders. make pixels appear on screen and feel like god.
- bittorrent client - peer discovery, piece selection, chokin' algorithms. for educational purposes only, obviously.
- neural network - forward pass, backprop, gradient descent. no pytorch, no tensorflow, just raw matrix math and tears.
- text editor - gap buffers, syntax highlightin', undo/redo. vim started somewhere too.
- web browser - html parser, css layout engine, js interpreter. the most complex piece of software most people use daily.
- template engine - parsin' mustache/jinja style templates. it's just string replacement with opinions.
- physics engine - collision detection, rigid body dynamics, gravity. make things fall and bounce realistically.
- emulator - cpu cycles, memory mappin', instruction decoding. make a gameboy run on your laptop.
- memory allocator - malloc/free from scratch. understand why memory leaks happen and respect garbage collectors forever.
- network stack - tcp/ip from raw sockets. syn, syn-ack, ack to open a connection, fin to close it. the handshake that holds the internet together.
- bot framework - event loops, command parsin', state management. discord bots are just glorified if-else chains.
- diffusion model - noise schedulin', u-net, denoising. generate images from static and question reality.
- rag pipeline - chunkin', embeddin', retrieval, generation. the thing every ai startup is sellin' as magic.
- processor - logic gates, alu, registers, instruction set. build a cpu and finally understand what "clock speed" means.
- random number generator - deterministic code can't produce true randomness on its own. it's all pseudorandom math pretendin' to be chaotic. build one and realize most of the "random" things you've ever seen were a lie.
- hash map - arrays, hashing, collision resolution. the data structure that makes everything O(1) and then doesn't.
- package manager - dependency resolution, version lockin', conflict handling. npm does this and somehow still breaks.
- cron job scheduler - parse time expressions, schedule tasks, handle misses. time is harder than you think.
- garbage collector - mark, sweep, compact. understand why your java app pauses for no reason.
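to make one item from the list concrete: the random number generator idea can be sketched as a linear congruential generator in python. this is just a minimal sketch under my own assumptions, not how any real library does it. the class name `TinyLCG` and its methods are made up for illustration, and the constants are the ones commonly attributed to "numerical recipes". don't use it for anythin' security related.

```python
class TinyLCG:
    """hypothetical toy generator: state = (A * state + C) % M."""

    # constants commonly attributed to "numerical recipes" (an assumption for this demo)
    A = 1664525
    C = 1013904223
    M = 2**32

    def __init__(self, seed: int) -> None:
        # the seed fully determines every number that follows
        self.state = seed % self.M

    def next_int(self) -> int:
        # advance the state with plain arithmetic; nothing random happens here
        self.state = (self.A * self.state + self.C) % self.M
        return self.state

    def next_float(self) -> float:
        # scale the integer state into the range [0, 1)
        return self.next_int() / self.M


if __name__ == "__main__":
    a = TinyLCG(seed=42)
    b = TinyLCG(seed=42)
    # same seed, same "random" sequence
    print([a.next_int() for _ in range(3)] == [b.next_int() for _ in range(3)])
```

two generators with the same seed spit out the exact same numbers, which is the whole "pretendin' to be chaotic" point.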
if you're insane enough to build any of these, hit me up.
there's a github repo that explored this space a bit, check it out.