Twenty-two years ago, I wrote some Perl scripts to test Redland RDF library builds across multiple machines with SSH. Two months ago, I asked an LLM to turn those scripts into a modern Python application. The resulting Redland Forge application evolved from simple automation into a full terminal user interface for monitoring parallel builds - a transformation that shows how LLMs can accelerate development from years into weeks.
The Shell Script Years (2003-2023)
The project originated from the need to build and test Redland, an RDF library with language bindings for C, C#, Lua, Perl, Python, PHP, Ruby, TCL and others. The initial scripts handled the basic workflow: SSH into remote machines, transfer source code, run the autoconf build sequence, and collect results.
Early versions focused on the fundamentals (a minimal sketch of the core workflow follows this list):

- Remote build execution via SSH
- Basic timing and status reporting
- Support for the standard autoconf pattern: configure, make, make check, make install
- JDK detection and path setup for Java bindings
- Cross-platform compatibility for various Unix systems and macOS
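To illustrate that core loop, here is a minimal, hypothetical Python sketch of the equivalent workflow: copy a tarball to a host, then run the standard autoconf sequence over SSH. The host names, tarball and helper function are placeholders; the original scripts were Perl and shell, so this is an approximation rather than the actual code.

```python
import os
import subprocess

def remote_build(host: str, tarball: str) -> bool:
    """Copy a source tarball to `host` and run the autoconf build sequence.

    A rough, hypothetical Python equivalent of the original Perl/shell workflow.
    """
    name = os.path.basename(tarball)        # e.g. redland-1.0.17.tar.gz
    srcdir = name.removesuffix(".tar.gz")   # e.g. redland-1.0.17
    # Transfer the source code to the remote machine
    subprocess.run(["scp", tarball, f"{host}:/tmp/"], check=True)
    # Run the standard autoconf pattern: configure, make, make check, make install
    build_cmd = (
        f"cd /tmp && tar xzf {name} && cd {srcdir} && "
        "./configure && make && make check && make install"
    )
    return subprocess.run(["ssh", host, build_cmd]).returncode == 0

# Build on several machines in sequence (the early scripts were sequential too)
for host in ["build1.example.org", "build2.example.org"]:
    ok = remote_build(host, "redland-1.0.17.tar.gz")
    print(f"{host}: {'ok' if ok else 'FAILED'}")
```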
Over the years, the scripts grew more features (two of them are sketched after this list):

- Automatic GNU make detection across different systems
- Berkeley DB version hunting (supporting versions 2, 3, and 4)
- CPU core detection for parallel make execution
- Dynamic library path management for different architectures
- Enhanced error detection and build artifact cleanup
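As a hedged illustration of the first and third items, this is roughly how the modern Python version might detect GNU make and the core count; the original logic was shell, and the function names here are hypothetical.

```python
import os
import shutil

def find_gnu_make() -> str:
    """Prefer 'gmake' on systems where GNU make is installed under that name (e.g. BSDs)."""
    return shutil.which("gmake") or shutil.which("make") or "make"

def parallel_make_command() -> str:
    """Build a 'make -j N' command using the detected CPU core count."""
    cores = os.cpu_count() or 1
    return f"{find_gnu_make()} -j {cores}"

print(parallel_make_command())   # e.g. "make -j 8"
```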
The scripts ended up handling everything from locating config.guess to folding compiler output into the build summaries.
The Python Conversion (2024)
The scripts remained largely the same until 2024, when I decided to revisit them. It was time to move on from Perl and shell, and the emerging LLM coding agents seemed like a good opportunity to do that with a simple prompt. The conversion was relatively easy; I forget which LLM I used, but it was probably Gemini.
The conversion to Python brought:
- Type hints and modern Python 3 features.
- Proper argument parsing with argparse instead of manual option handling.
- Pathlib for cross-platform file operations.
- Structured logging with debug and info levels.
- Better error handling and user feedback.
The user experience improved as well:

- Intelligent color support that detects terminal capabilities.
- Host file support with comment parsing (sketched below).
- Build summaries with success/failure statistics and emojis. I'm not sure if that's absolutely an improvement, but 🤷
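Parsing a hosts file that allows blank lines and `#` comments is only a few lines; this is a hypothetical sketch, not the tool's actual parser.

```python
from pathlib import Path

def read_hosts(path: Path) -> list[str]:
    """Read host names from a file, skipping blank lines and '#' comments."""
    hosts = []
    for line in path.read_text().splitlines():
        # Strip trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts

# hosts.txt might contain:
#   build1.example.org   # Linux x86_64
#   build2.example.org   # macOS arm64
print(read_hosts(Path("hosts.txt")))
```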
Terminal User Interface (2025)
A year later, in July 2025, with LLM technology advancing almost weekly, I was inspired to make a bigger change: prompting the tool into a full text user interface, with the parallel builds visible interactively in the terminal.
Continuing from the Python foundation, the tool gained a full terminal user interface. The TUI could monitor multiple builds simultaneously, showing real-time progress across different hosts.
One of the first prompts was to identify which existing Python TUI and SSH libraries should be used, and this quickly led to blessed for the TUI and paramiko for SSH.
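As a hedged sketch of the paramiko side, running one build step on a remote host and streaming its output back might look roughly like this; the host, command and host-key handling are placeholder choices, not the tool's actual code.

```python
import paramiko

def run_remote(host: str, command: str) -> int:
    """Run a command over SSH and stream its output line by line."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # placeholder policy
    client.connect(host)  # assumes SSH keys/agent are already set up
    try:
        stdin, stdout, stderr = client.exec_command(command)
        for line in stdout:
            print(f"[{host}] {line.rstrip()}")   # fed into the TUI in the real tool
        return stdout.channel.recv_exit_status()
    finally:
        client.close()

status = run_remote("build1.example.org", "cd /tmp/redland-1.0.17 && ./configure && make")
print("exit status:", status)
```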
A lot of the early work was making the TUI behave properly in a real terminal, so that drawing the UI did not cause scrolling or overflow, and text wrapping and truncation worked correctly. Once something worked, prompting the LLM to write unit tests for it was very helpful in avoiding backsliding.
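A minimal blessed sketch of the kind of constraint involved: every drawn line must be clipped to the terminal width and rows must stay inside the terminal height, otherwise the screen scrolls. This is an illustrative snippet, not Redland Forge's renderer.

```python
from blessed import Terminal

term = Terminal()

def draw_line(row: int, text: str) -> None:
    """Draw one line, clipped to the terminal width and height to avoid scrolling."""
    if row >= term.height:
        return                      # never draw past the bottom row
    clipped = text[: term.width]    # naive clipping; real code must handle wide characters
    print(term.move_xy(0, row) + clipped + term.clear_eol, end="", flush=True)

with term.fullscreen(), term.hidden_cursor(), term.cbreak():
    draw_line(0, "host1: make check ... ok")
    draw_line(1, "host2: configure ..." + "x" * 500)  # long text gets truncated
    term.inkey(timeout=2)  # pause briefly so the screen is visible
```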
As it grew, the architecture became much more modular:

- SSH connection management with parallel execution (a rough sketch follows this list)
- A blessed-based terminal interface for responsive updates
- Statistics tracking and build step detection
- Keyboard input handling and navigation
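As a rough sketch of the parallel-execution piece, assuming a `remote_build(host, tarball)` helper like the earlier one, builds can be fanned out across hosts with a thread pool while results are collected; the names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def build_all(hosts: list[str], tarball: str, max_parallel: int = 4) -> dict[str, bool]:
    """Run remote builds on many hosts in parallel, collecting per-host results."""
    results: dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(remote_build, host, tarball): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            results[host] = future.result()
            # In the real tool this is where the TUI gets updated, not a print()
            print(f"{host}: {'ok' if results[host] else 'FAILED'}")
    return results
```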
Each of those came from prompting the LLM to refactor large classes, sometimes identifying which ones to attack by prompting it to analyze the code and suggest candidates, and sometimes by running an external code complexity tool, in this case Lizard.
The features grew quickly at this stage:

- Live progress updates driven by an event loop.
- Adaptive layouts that resize with the terminal.
- Automatic build phase detection (extract, configure, make, check, install), sketched below.
- Color-coded status indicators, both as builds ran and afterwards.
- Host visibility management for large deployments, so if the window was too small you would see a subset of hosts building in the window.
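Build phase detection is essentially pattern-matching on the output stream. A hypothetical sketch follows; the real tool's markers and phase names may differ.

```python
import re

# Ordered (phase, pattern) pairs matched against each line of build output
PHASE_PATTERNS = [
    ("extract",   re.compile(r"^tar\b|^x ")),
    ("configure", re.compile(r"^(checking |configure:)")),
    ("make",      re.compile(r"^(gcc|cc|clang|g\+\+|  CC\b|  CCLD\b)")),
    ("check",     re.compile(r"make +check|^(PASS|FAIL|ok \d)")),
    ("install",   re.compile(r"make +install|/usr/bin/install\b")),
]

def detect_phase(line: str, current: str) -> str:
    """Return the build phase implied by an output line, else keep the current phase."""
    for phase, pattern in PHASE_PATTERNS:
        if pattern.search(line):
            return phase
    return current

phase = "extract"
for line in ["checking for gcc... yes", "gcc -O2 -c rdf_node.c", "PASS: test_node"]:
    phase = detect_phase(line, phase)
    print(f"{phase:>10}: {line}")
```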
The design used established design patterns: the observer pattern for state changes, the strategy pattern for layouts, and a manager (factory) pattern for connections. Most of these were picked by whichever LLM was in use at the time, with occasional guidance such as "make a configuration class".
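As an illustration of the observer idea (class and method names here are hypothetical, not the tool's actual API), build objects notify registered listeners, such as the TUI, whenever their state changes.

```python
from typing import Callable

# A listener receives (host, new_state), e.g. the TUI redraw code
Listener = Callable[[str, str], None]

class BuildState:
    """Holds per-host build state and notifies observers on every change."""

    def __init__(self) -> None:
        self._states: dict[str, str] = {}
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_state(self, host: str, state: str) -> None:
        self._states[host] = state
        for listener in self._listeners:
            listener(host, state)   # e.g. trigger a TUI redraw

builds = BuildState()
builds.subscribe(lambda host, state: print(f"redraw: {host} -> {state}"))
builds.set_state("build1.example.org", "configure")
builds.set_state("build1.example.org", "make")
```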
Completing the Application (September 2025)
The final phase built the tool into a more complete application, adding release-focused features and additional testing. It transformed from an internal development utility into something that could be shared and be useful to anyone with an autoconf project tarball and SSH access.
Major additions included:

- A build timing cache with persistent JSON storage, so it could remember previous build times (sketched below).
- Intelligent progress estimation based on the cached times.
- Configurable auto-exit functionality with a countdown display.
- Keyboard-based navigation of hosts and logs, with a full-screen host mode and interactive menus.
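A minimal sketch of the timing-cache idea, assuming a simple JSON file keyed by host (the file name, keys and estimation rule are hypothetical): record how long each host's build took, and estimate progress on the next run as elapsed time over the previous duration.

```python
import json
import time
from pathlib import Path

CACHE_FILE = Path("~/.redland-forge-timings.json").expanduser()  # hypothetical path

def load_timings() -> dict[str, float]:
    if CACHE_FILE.exists():
        return json.loads(CACHE_FILE.read_text())
    return {}

def save_timing(host: str, seconds: float) -> None:
    timings = load_timings()
    timings[host] = seconds
    CACHE_FILE.write_text(json.dumps(timings, indent=2))

def estimate_progress(host: str, started_at: float) -> float | None:
    """Estimate build progress (0.0-1.0) from the previous duration, if known."""
    previous = load_timings().get(host)
    if not previous:
        return None                # no history yet, show an indeterminate spinner
    elapsed = time.time() - started_at
    return min(elapsed / previous, 0.99)   # never claim 100% until the build finishes
```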
The testing at this point was quite substantial:

- Over 400 unit tests covering all components.
- Mock-based testing for external dependencies (a sketch follows).
- Integration tests and edge cases.
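Mock-based testing here means stubbing out the SSH layer so the build logic can be tested without real hosts. A hedged example using `unittest.mock`, assuming the earlier hypothetical `build_all` and `remote_build` live in a placeholder module called `forge`:

```python
import unittest
from unittest.mock import patch

from forge import build_all   # hypothetical module name

class TestBuildAll(unittest.TestCase):
    def test_failure_is_reported_per_host(self):
        # Pretend the remote build succeeds on one host and fails on the other,
        # without opening any real SSH connections.
        with patch("forge.remote_build", side_effect=[True, False]) as mock_build:
            results = build_all(
                ["good.example.org", "bad.example.org"],
                "redland-1.0.17.tar.gz",
                max_parallel=1,
            )
        self.assertEqual(mock_build.call_count, 2)
        self.assertTrue(results["good.example.org"])
        self.assertFalse(results["bad.example.org"])

if __name__ == "__main__":
    unittest.main()
```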
At this point it was doing the job fully and seemed complete, and of broader use than just Redland development work.
Learnings
Redland Forge demonstrates how developer tools evolve. What started as pragmatic shell and Perl scripts for a specific need grew into a sophisticated application. Each phase built on the previous one, with the Python conversion serving as the catalyst that enabled the terminal interface.
It also demonstrates how LLMs in 2025 can act as a leverage multiplier for productivity, when used carefully. I did spend a lot of time pasting terminal outputs for debugging the TUI boxes and layout. I made lots of Git commits and tags when the tool worked; I even developed a custom command to make the commits in the way I preferred, avoiding the hype that some tend to add, but that's another story or blog post. When the LLMs made mistakes, I could always go back to the previous working commit (git reset --hard), or ask them to try again, which worked more often than you'd expect. Or try a different LLM.
I found that coding LLMs can work on their own to a degree, depending on the LLM in question. Some regularly prompt for permissions or end their turn after making some progress, whereas others just keep coding without checking back with me. This allowed some semi-asynchronous development where a chunk of work was done, then I reviewed it and adjusted. I did review the code, since I know Python well enough.
The skill I think I learnt the most was writing prompts, or what is now being called spec-driven development, for the later, larger changes. I described what I wanted to one LLM and had it write a markdown specification, sometimes asking a different LLM to review it for gaps, before one of them implemented it. I often asked the LLM to update the spec as it worked, since the LLMs sometimes crashed, hung, or looped producing the same output; if the spec was up to date, the chat could be killed and restarted. Sometimes just telling the LLM it was looping was enough.
I'm happy with the final application, and it is nearly 100% written by the LLMs, including the documentation, tests, and configuration, although it was 100% prompted by me, 100% tested by me, and 100% of the commits were reviewed by me. After all, it is my name on the commit messages.
Disclaimer: I currently work for Google who make Gemini.