Spec coding
Like everyone else, I've been playing with LLMs as an app-generation tool and, like most people, I have been disappointed. The code being spit out has not been very good. And like many others, I was interested in finding ways to improve code creation.
Gemini.md
I had stumbled upon this Medium article and then took a look at the Gemini CLI GitHub repo. Much like Claude, Cursor, and GitHub Copilot, it is possible to have standing instructions for Gemini. Just write up instructions on how you want your projects laid out, which language and libraries you want to use, and how you want the LLM to prioritize things, and in theory Gemini should build an app closer to your spec.
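The standing-instructions file is just markdown. A minimal sketch of what such a file might contain (the section names and specifics here are illustrative, not the exact file I used):

```markdown
# Project conventions

## Layout
- Separate models, routes, and utilities into their own modules.
- Keep configuration in a dedicated config.py.

## Language and tooling
- Python 3.12, Flask, Flask-SQLAlchemy.
- Manage dependencies with uv and pyproject.toml.

## Style
- Follow PEP 8.
- Add docstrings to public functions and modules.
```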
After using gemini-cli for a few weeks, I started to gather a few prompt sets from reading what others have done with Claude and Cursor. Laying it out in a markdown file was easy enough, but I was curious whether it really had any effect.
As a test, I decided to run the same prompt twice: once without a gemini markdown file and once with. This is nowhere near a full and exhaustive test; it's just one LLM, one app, one framework, one prompt, one style of app, and a single gemini.md file.
The Prompt
Without prompt assistance
The version without a gemini.md did not work after generation, and it was a mess to deal with. Gemini was unable to repair the template issues. The app.py included models, routes, and utility functions. Everything sat in the application root directory, with the exception of the HTML templates. The database configuration lived in the main file as well. After loading it into an IDE and running static analysis and style checks, I wasn't impressed at all. It is a Flask app, but it isn't a good Flask app. Having an import for the forms in the middle of the app.py file was particularly bad.
With prompt assistance
Gemini did not provide a working example for this test on the first shot. It took two additional prompts to fix problems:
the endpoint "/user_checkouts" has an error "jinja2.exceptions.UndefinedError: 'now' is undefined" please diagnose and fix
the endpoint "/book_popularity" has an error "TypeError: '<' not supported between instances of 'method-wrapper' and 'method-wrapper'" please diagnose and fix
With those two additional prompts, Gemini was able to diagnose and fix both errors. With that done, I had a working app: a database-driven webapp built from a spec in less than ten minutes, with the agent fixing its own bugs.
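Gemini's actual patches aren't shown here, but both errors are common Flask/Jinja mistakes: the first usually means a template references `now` without it being injected into the render context, and the second typically comes from sorting on an uncalled method so Python tries to compare method objects. A sketch of the likely shape of the fixes, with hypothetical names (`Book`, `checkout_count`) that may not match the generated code:

```python
from datetime import datetime, timezone

# Fix 1: "jinja2.exceptions.UndefinedError: 'now' is undefined".
# Registering this with @app.context_processor makes `now` available
# in every template render (decorator omitted since no app is created here).
def inject_now():
    return {"now": datetime.now(timezone.utc)}

# Fix 2: "'<' not supported between instances of 'method-wrapper' ...".
# This family of errors appears when a sort key returns a method object
# instead of the value the method would return.
class Book:
    def __init__(self, title, checkouts):
        self.title = title
        self._checkouts = checkouts

    def checkout_count(self):
        return self._checkouts

books = [Book("A", 3), Book("B", 7)]

# Buggy version: key=lambda b: b.checkout_count  (compares methods, not ints)
popular = sorted(books, key=lambda b: b.checkout_count(), reverse=True)
print([b.title for b in popular])  # → ['B', 'A']
```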
What does the code look like? Better. It follows PEP 8. Models, routes, and utilities are separated into different files. There is a config.py file. There are missing docstrings, but those are easy enough to fill in as you test. There are no unit tests, but I refuse to let AI write my tests anyway. If I were to work on this app, I would add pytests as I went along while manually refactoring and updating the app. Most importantly, it manages dependencies via uv and pyproject.toml, so I could easily add dev-time dependencies for testing and deployment image creation. It didn't use pydantic for validation, and there are no type hints on functions, but overall it wasn't bad, as it used Flask and db.Model features for type checking.
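For reference, a uv-managed project of this kind has a pyproject.toml roughly of this shape (a hand-written approximation, not the generated file; package names and versions are assumptions):

```toml
[project]
name = "library-app"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = [
    "flask",
    "flask-sqlalchemy",
]

# Dev-time dependencies live in a PEP 735 dependency group,
# added with `uv add --dev pytest`.
[dependency-groups]
dev = [
    "pytest",
]
```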
One little thing that impressed me was the seed.py file to populate the database with test data. That was a nice touch.
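A seed script of this kind is easy to reproduce. A minimal sketch using only the standard library's sqlite3 (the generated seed.py went through the app's SQLAlchemy models instead, and the table and column names here are made up):

```python
import sqlite3

def seed(conn):
    """Create a books table and insert a few rows of test data."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
    )
    rows = [
        ("The Pragmatic Programmer", "Hunt & Thomas"),
        ("Fluent Python", "Luciano Ramalho"),
        ("Flask Web Development", "Miguel Grinberg"),
    ]
    conn.executemany("INSERT INTO books (title, author) VALUES (?, ?)", rows)
    conn.commit()
    # Return the row count so callers can verify the seed worked.
    return conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]

if __name__ == "__main__":
    count = seed(sqlite3.connect(":memory:"))
    print(f"seeded {count} rows")  # → seeded 3 rows
```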
You can see the code here. A copy of the gemini.md file I used is in library2.
Final thoughts
While I did get it to work and spit out a good starting point for an app, the approach doesn't seem that much better when compared to AirTable, AppSheet, or similar no-code tooling for database apps.