From cd9422355a2a7a9ea9aa8cd7902b680e6fa0371c Mon Sep 17 00:00:00 2001
From: retoor <retoor@molodetz.nl>
Date: Wed, 7 May 2025 22:38:02 +0200
Subject: [PATCH] Added regex.

---
 regex.md | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 regex.md

diff --git a/regex.md b/regex.md
new file mode 100644
index 0000000..8c6154b
--- /dev/null
+++ b/regex.md
@@ -0,0 +1,99 @@
+# Regex
+
+## Obsession
+
+If you've had a good look around my repositories, you've probably seen that I have a special thing for regex interpreters.  
+I love writing them. Writing one from scratch is the most underestimated skill there is.
+
+Yes, you can follow some basic tutorial on the internet and learn how to do it the way everyone does.  
+But the real game? It's writing something you can't find anywhere else.
+
+And I've done that. Several times.
+
+Compiled ones, bytecode ones, even one that used regex itself as bytecode; that one was very special.  
+Nice interpreters, fast interpreters, winning, losing... But the end product is not the interpreter.  
+It's your own brain.  
+
+## Why Do It?
+
+Thinking and problem solving are among the best things there are.  
+And by problem solving, I do **not** mean solving it using Google or some book.  
+Pure thinking. With a good understanding of your language's basics, you're able to write a regex interpreter.  
+It takes a serious amount of time (do not underestimate it).  
+
+But you don’t need more than being hardcore at the basics (yes, that's a thing).  
+The beautiful thing is, once you get into it, you can keep going without having to Google or read a book.  
+It's all in your head.
+
+## The Trap of Research
+
+The most fun is when you haven’t researched regex or interpreters beforehand.  
+It makes you **extra creative** and lets your brain think freely.  
+
+Solutions from others can be inspiring... but they can also *pollute* your thought process.  
+You can get stuck in someone else's way of thinking and end up building the same thing they did.
+
+For me, the target is not to create a regex engine that beats everyone else's.  
+That depends on many factors. In certain scenarios, I've even beaten the original glibc regex.  
+Cool? Sure. But not the point.
+
+The goal is: **write something decent and unique**.  
+Own design. No influence from others. That's it.
+
+## Questions Worth Asking
+
+Do you know what an AST is?  
+Will you use one? Or will you just interpret the regex directly?
+
+The easiest way must be the fastest, right?  
+Actually... no.
+
+I've benchmarked interpreters a lot, and performance really depends on the regexes themselves.  
+There's no one-size-fits-all solution.
+
+An advanced bytecode-compiled one with JIT will usually be slower on the first pass than a dumb interpreter that just walks character by character, because it pays its compilation cost up front.  
+But after parsing several lines? That JIT version takes the lead.
+
+## Performance Myths
+
+Validating strings is actually such a small task for a computer.  
+When it comes to performance, for most users, **it doesn’t matter** which parser you pick.  
+
+That’s probably why everyone just uses the one bundled with their favorite programming language.
+
+But I had a parser that could parse an entire book.  
+We can’t say that for every engine (looking at you, glibc regex interpreter).  
+That one dies at around 10MB of content, if I remember correctly. Something like that.  
+So yeah, even things like that can be a target.
+
+## Wild Ideas
+
+What else could be fun?  
+Using a parser that validates while walking a file descriptor.
+
+By doing that, you can parse files of unlimited size, or even live network streams.  
+James Bond stuff. Real-time regex over TCP. Tapping into streaming data.
+
+And now we’re getting close to my next hobby: **protocol design**.  
+But that’s a story for another time.
+
+---
+
+I don’t even expect people to read this far.  
+
+---
+
+## Code You Should Never Read
+
+At least, not until you’re ready.
+
+I'm talking about a basic regex interpreter in C, written in around 30 lines.
+I read it in a book called Beautiful Code. The code was written by Rob Pike; the chapter describing it is by Brian Kernighan.
+
+I'm not posting the source because it would probably destroy your
+creativity. It’s easy to find if you want to.
+
+Once you’ve seen it, you can’t unsee it.
+
+What he built? That’s the level I aim for.
+