Paint The Code
Background
At some point I fell (again) into the creative coding rabbit hole.
I re-read “The Importance of Sketching with Code” and it really clicked this time:
instead of thinking “I need a big serious project”, just sketch. make tiny things. weird experiments. save them.
So I wanted a small sketch that:
- uses code as data instead of something to execute
- feels a bit like generative art
- doesn’t require a browser or shaders or anything fancy
The result was this little side project: a Python source visualizer that turns any .py file into a panel of abstract rectangles.
Nothing “useful”, but very fun.
Idea
The core idea is:
take a Python file → tokenize it → let those tokens influence how a rectangle is recursively split → color each piece depending on what kind of token it came from.
So comments, strings, numbers, keywords, etc. all get their own “visual personality”.
In the end you get something that kind of looks like a Mondrian painting that’s been hit with a syntax highlighter.
Reading the file safely
First tiny function:
def read_text(path):
    """Read the file as plain text, never executing it."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return f.read()
Nothing fancy, but the important bit for me: I never import or exec the file.
It’s just bytes → text. That’s it.
I also explicitly set errors="replace" so if the file has weird encoding issues, the visualizer still works and just throws in some replacement characters (U+FFFD). Glitch-friendly.
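A quick illustration of what that buys you (my own throwaway example, not from the project):

# Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
raw = b"x = 1  # caf\xe9\n"  # latin-1 bytes, not valid UTF-8
print(raw.decode("utf-8", errors="replace"))
# -> x = 1  # caf�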
Tokenizing: turning code into little categories
This is where the “code as data” part starts:
import io
import keyword
import tokenize

def tokenize_source(text):
    """
    Tokenize Python source and group tokens into semantic categories.
    """
    result = []
    reader = io.StringIO(text).readline
    try:
        for tok in tokenize.generate_tokens(reader):
            tok_type, tok_str, start, end, line = tok
            if tok_type in (tokenize.ENCODING, tokenize.NL, tokenize.ENDMARKER):
                continue
            if tok_type == tokenize.COMMENT:
                group = "comment"
            elif tok_type == tokenize.STRING:
                group = "string"
            elif tok_type == tokenize.NUMBER:
                group = "number"
            elif tok_type == tokenize.OP:
                group = "op"
            elif tok_type == tokenize.NAME:
                if keyword.iskeyword(tok_str):
                    group = "keyword"
                else:
                    group = "name"
            else:
                group = "other"
            weight = max(1, len(tok_str))
            result.append({"text": tok_str, "group": group, "weight": weight})
    except (tokenize.TokenError, IndentationError):
        # Broken/half-written source: fall back to one big blob.
        result.append({"text": text, "group": "other", "weight": len(text) or 1})
    return result
A couple of fun bits here:
- I’m using Python’s built-in tokenize module instead of splitting on characters myself.
- Every token falls into one of a small set of groups: keyword, name, string, number, comment, op, other.
- Each token gets a weight that is roughly len(tok_str). Longer tokens = more “influence” later.
The try/except is there so that if tokenization fails (e.g., half-written files or snippets), I just treat the whole thing as one big "other" block. The sketch should never crash just because the code is ugly.
This matches that “sketching” mindset: it’s allowed to be broken, the tool should still respond somehow.
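To make the categories concrete, here's a quick check (my own example, assuming tokenize_source is importable):

from collections import Counter

sample = "x = 1  # the answer\nname = 'hi'\n"
print(Counter(t["group"] for t in tokenize_source(sample)))
# Something like:
# Counter({'name': 2, 'op': 2, 'other': 2, 'number': 1, 'comment': 1, 'string': 1})

(The "other" entries are logical NEWLINE tokens, which I don't skip, only NL.)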
Chopping the canvas into rectangles
This is the heart of the visualizer:
import random

def build_rectangles(tokens, max_rects=400, min_size=0.02, margin=0.03, rng=None):
    """
    Recursively slice up a big rectangle based on tokens.
    """
    if rng is None:
        rng = random.Random()
    rects = [
        {
            "x": margin,
            "y": margin,
            "w": 1.0 - 2 * margin,
            "h": 1.0 - 2 * margin,
            "group": "background",
            "token": None,
            "depth": 0,
        }
    ]
    if not tokens:
        return rects
    total_weight = sum(t["weight"] for t in tokens) or 1.0
    # Expand tokens into a weighted "deck": heavier tokens appear more often.
    expanded = []
    target_len = max_rects * 2
    for t in tokens:
        share = t["weight"] / total_weight
        copies = max(1, int(share * target_len))
        expanded.extend([t] * copies)
    rng.shuffle(expanded)
    ...
The picture is:
- Start with one big rectangle (our “canvas”).
- For each token (biased by that weight), pick a rectangle and split it in two.
- Repeat until we hit max_rects or things get too tiny.
The part I like most is how it picks which rectangle to split:
for t in expanded:
    if len(rects) >= max_rects:
        break
    areas = [r["w"] * r["h"] for r in rects]
    total_area = sum(areas)
    if total_area <= 0:
        break
    pick = rng.random() * total_area
    acc = 0.0
    idx = 0
    for i, a in enumerate(areas):
        acc += a
        if acc >= pick:
            idx = i
            break
    rect = rects.pop(idx)
    ...
- Every existing rectangle has a probability proportional to its area.
- Bigger empty spaces get refined first.
- Tiny rectangles are skipped once they fall under min_size.
So you get this organic subdivision where some areas are super detailed, others stay chunky.
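As an aside: that linear scan is a classic weighted pick, and the standard library can do the same thing in one line (same rects and rng as above):

# Equivalent area-weighted pick via random.choices (returns a list, hence [0]).
areas = [r["w"] * r["h"] for r in rects]
idx = rng.choices(range(len(rects)), weights=areas, k=1)[0]
rect = rects.pop(idx)

I kept the explicit loop in the project because it reads like what it does, but both behave the same.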
Orientation depends on token type
I also let the token group influence orientation:
group = t["group"]
if group in ("comment", "string"):
    orientation = 1 if rng.random() < 0.7 else 0  # more horizontal
elif group in ("keyword", "op"):
    orientation = 0 if rng.random() < 0.7 else 1  # more vertical
else:
    orientation = rng.randint(0, 1)
- comments & strings → mostly horizontal cuts
- keywords & operators → mostly vertical
- everything else → whatever
Purely an aesthetic choice, but it makes different files feel different:
- comment-heavy scripts generate these long horizontal bands
- math-y / expression-heavy code leans more into vertical chopping
Split ratio with jitter
Then I decide where to cut:
base = {
    "keyword": 0.35,
    "comment": 0.65,
    "string": 0.55,
    "number": 0.45,
    "name": 0.5,
    "op": 0.4,
}.get(group, 0.5)
jitter = (rng.random() - 0.5) * 0.3  # ±0.15
ratio = min(0.8, max(0.2, base + jitter))
So:
- each token group has a typical split ratio (e.g. comments are a bit 65/35-ish),
- then I nudge it randomly within bounds.
This is one of those tiny details that doesn’t matter logically, but visually it changes things a lot.
The layouts feel less “perfect grid” and more “hand-tuned but slightly drunk”.
Every resulting rectangle keeps track of:
- its (x, y, w, h) in [0, 1] space
- its group
- the token text that spawned it (not used visually yet, but could be)
- its depth, so I can adjust styling based on how many splits it went through.
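The split step itself got elided above, but here's roughly what it has to do (my reconstruction, not the original code): carve one rectangle into two children along the chosen orientation, tagging both with the token's group and an incremented depth.

def split_rect(rect, token, orientation, ratio):
    """Hypothetical reconstruction of the elided split step."""
    x, y, w, h = rect["x"], rect["y"], rect["w"], rect["h"]
    meta = {"group": token["group"], "token": token["text"], "depth": rect["depth"] + 1}
    if orientation == 1:  # horizontal cut -> two stacked rectangles
        a = dict(meta, x=x, y=y, w=w, h=h * ratio)
        b = dict(meta, x=x, y=y + h * ratio, w=w, h=h * (1 - ratio))
    else:  # vertical cut -> two side-by-side rectangles
        a = dict(meta, x=x, y=y, w=w * ratio, h=h)
        b = dict(meta, x=x + w * ratio, y=y, w=w * (1 - ratio), h=h)
    return a, b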
Colors, palettes, and a bit of determinism
I hardcoded a couple of palettes:
PALETTES = [
    {
        "label": "midnight",  # "label", not "name": it would collide with the name token group
        "background": "#050816",
        "keyword": "#ff6b81",
        "name": "#4dabf7",
        "string": "#ffe066",
        "number": "#b197fc",
        "comment": "#868e96",
        "op": "#ff922b",
        "other": "#e9ecef",
    },
    {
        "label": "pastel",
        "background": "#f8f9fa",
        ...
    },
    ...
]
Nothing algorithmic here; I just fiddled with colors until the outputs looked pleasant enough.
To keep things interesting but reproducible, I do this:
import hashlib

def choose_palettes(text):
    """
    Pick 3 distinct palettes in a deterministic way based on the file contents.
    """
    # Built-in hash() is salted per process in Python 3, which would break
    # "same file -> same picture" across runs; a stable hash fixes that.
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:4], "big")
    rng = random.Random(seed)
    indices = list(range(len(PALETTES)))
    rng.shuffle(indices)
    chosen = [PALETTES[i] for i in indices[:3]]
    return chosen, rng
- I hash the file contents to seed a local RNG.
- That RNG:
  - chooses 3 palettes for the 3 panels
  - is passed into build_rectangles so splits are deterministic too
So: same file → same picture every time (unless you change the code).
This is that “keep track of the random seed” lesson but kind of smuggled into the design.
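An easy way to convince yourself (a throwaway check, using the label key from the palettes above):

# Same text -> same seed -> same palette order,
# and thanks to the stable hash, the same across separate runs too.
p1, _ = choose_palettes(text)
p2, _ = choose_palettes(text)
assert [p["label"] for p in p1] == [p["label"] for p in p2]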
Drawing panels
The function that actually paints rectangles:
from matplotlib.patches import Rectangle

def draw_panel(ax, rects, palette, line_mode="thin"):
    ax.set_facecolor(palette["background"])
    if not rects:
        return
    max_depth = max(r["depth"] for r in rects) or 1
    for r in rects:
        group = r["group"]
        color = palette.get(group, palette["other"])
        # Deeper rectangles (more splits) render more opaque.
        depth_factor = (r["depth"] + 1) / (max_depth + 1)
        alpha = 0.4 + 0.6 * depth_factor
        if line_mode == "none":
            lw = 0.0
            edgecolor = None
        elif line_mode == "thick":
            lw = 1.5 + 1.5 * depth_factor
            edgecolor = "#000000"
        else:
            lw = 0.4 + 0.6 * depth_factor
            edgecolor = palette["background"]
        rect_patch = Rectangle(
            (r["x"], r["y"]),
            r["w"],
            r["h"],
            linewidth=lw,
            edgecolor=edgecolor,
            facecolor=color,
            alpha=alpha,
        )
        ax.add_patch(rect_patch)
Fun bits:
- alpha scales with depth, so rectangles that survived more splits read as more solid
- the edges depend on line_mode: thin background-colored seams, thick black outlines, or no lines at all

I then use three subplots to show three different “moods” of the same layout:
import matplotlib.pyplot as plt

def make_figure(rects, palettes, title=None):
    fig, axes = plt.subplots(1, 3, figsize=(15, 5), constrained_layout=True)
    draw_panel(axes[0], rects, palettes[0], line_mode="thin")
    draw_panel(axes[1], rects, palettes[1], line_mode="thick")
    draw_panel(axes[2], rects, palettes[2], line_mode="none")
    ...
    return fig
Same structure, different outfits.
Putting it together: CLI
The rest is just a small command-line wrapper:
import argparse

def main():
    parser = argparse.ArgumentParser(
        description=(
            "Visualize a Python source file as abstract rectangles.\n"
            "The file is never executed, only read as plain text."
        )
    )
    parser.add_argument("source", help="Path to the .py file to visualize")
    parser.add_argument(
        "-o",
        "--output",
        help="Output image filename (e.g. out.png). "
        "If omitted, the window is just shown.",
    )
    parser.add_argument(
        "--max-rects",
        type=int,
        default=400,
        help="Maximum number of rectangles to generate (default: 400)",
    )
    args = parser.parse_args()

    text = read_text(args.source)
    tokens = tokenize_source(text)
    palettes, rng = choose_palettes(text)
    rects = build_rectangles(tokens, max_rects=args.max_rects, rng=rng)

    title = f"Visualization of: {args.source}"
    fig = make_figure(rects, palettes, title=title)

    if args.output:
        fig.savefig(args.output, dpi=300)
    else:
        plt.show()
So running:
python visualizer.py my_script.py -o my_script.png
spits out a PNG of your code.
Running it without -o just pops up a Matplotlib window.
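A fun sanity check is pointing it at its own source (the output name is arbitrary):

python visualizer.py visualizer.py -o self_portrait.png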
Takeaways
- Code is a great medium to sketch with, even when it’s not doing “real work”.
- Treating source code as raw data (instead of something to execute) is oddly refreshing.
- Randomness is fun, but deterministic randomness is much more usable.
- Letting token types leak into visuals (orientation, ratios, colors) makes each file feel like it has its own personality.
- And finally, once again: sketching is worth it. This started as a “let’s see what happens if…” evening and now I kinda want to build a whole series of tools like this.