AI Refactoring AI: Codex Improves ChatGPT Generated Code

Earlier this year, I wrote about an experiment where I attempted to create an in-browser game using only ChatGPT. This lightweight web game had a fun twist to the classic four-in-a-row gameplay: after each turn, there was a random chance the entire board would rotate 90 degrees, causing pieces to fall and settle into new positions. While this rotation mechanic worked well, my original vision was even more ambitious—I wanted the board to occasionally rotate 45 degrees, creating a diamond orientation and causing the game pieces to bounce down the board “Plinko-style.” At the time, however, existing AI tools weren’t quite up to the challenge. With OpenAI’s recent introduction of Codex, I decided it was the perfect opportunity to revisit this project. Codex promises streamlined refactoring, robust test generation, and effective bug fixing—the exact tools I needed to finally realize my original vision.
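The 90-degree mechanic is easy to picture as two steps on a 2-D array: rotate the matrix, then re-apply gravity column by column. Here is a minimal sketch of the idea (my own illustration with hypothetical names, not the game’s actual code):

```javascript
// Sketch of the rotation mechanic: rotate the board matrix 90 degrees
// clockwise, then let pieces fall to the new bottom.
// `null` marks an empty cell; any other value is a player's piece.

function rotateClockwise(board) {
  const rows = board.length;
  const cols = board[0].length;
  const rotated = Array.from({ length: cols }, () => Array(rows).fill(null));
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      // Row r from the top becomes column (rows - 1 - r) from the left.
      rotated[c][rows - 1 - r] = board[r][c];
    }
  }
  return rotated;
}

function applyGravity(board) {
  const rows = board.length;
  const cols = board[0].length;
  for (let c = 0; c < cols; c++) {
    // Collect the column's pieces top-to-bottom, then repack them at the bottom.
    const pieces = [];
    for (let r = 0; r < rows; r++) {
      if (board[r][c] !== null) pieces.push(board[r][c]);
    }
    const empties = rows - pieces.length;
    for (let r = 0; r < rows; r++) {
      board[r][c] = r < empties ? null : pieces[r - empties];
    }
  }
  return board;
}

// After each turn, a random roll decides whether to run:
// applyGravity(rotateClockwise(board))
```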

What is Codex?

Codex is OpenAI’s latest cloud-based AI software engineering assistant, built on the codex-1 model. It performs coding tasks within isolated sandbox environments, directly integrated with your code repository. Available to ChatGPT Pro, Team, Plus, and Enterprise users, Codex is marketed as excelling at:

  • Writing and refactoring code to closely match human styles and preferences.
  • Iteratively running tests until it achieves passing results.
  • Proposing detailed pull requests for human review.
  • Answering targeted questions about your codebase.
  • Generating initial drafts of technical documentation.

Goals for Revisiting the Project

I had three clear goals in revisiting the original project:

  1. Introduce a Testing Framework: The initial project lacked any automated tests, making it challenging to confidently fix bugs or implement new mechanics.
  2. Fix Existing Bugs: Specifically, the game logic failed to correctly determine a winner after pieces fell into new positions following a rotation.
  3. Implement 45-Degree Rotations: This feature would require significant game logic changes, making it a great challenge for Codex’s refactoring capabilities.
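The bug in goal 2 stems from a common shortcut: checking for a winner only around the most recently placed piece. Once a rotation moves every piece, the check has to cover the whole board. A sketch of that kind of full-board scan (my own illustration with hypothetical names, not the game’s actual code):

```javascript
// Scan every cell in all four directions for a run of four.
// `null` marks an empty cell; returns the winning piece or null.
function findWinner(board) {
  const rows = board.length;
  const cols = board[0].length;
  const dirs = [[0, 1], [1, 0], [1, 1], [1, -1]]; // right, down, two diagonals
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const piece = board[r][c];
      if (piece === null) continue;
      for (const [dr, dc] of dirs) {
        let count = 1;
        while (
          count < 4 &&
          r + dr * count >= 0 && r + dr * count < rows &&
          c + dc * count >= 0 && c + dc * count < cols &&
          board[r + dr * count][c + dc * count] === piece
        ) {
          count++;
        }
        if (count === 4) return piece;
      }
    }
  }
  return null;
}
```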

My Experience with Codex

I began by using Codex’s sidebar integration within ChatGPT, assigning specific tasks directly via the “Code” option. Initial results were mixed: without internet access, Codex struggled to integrate a valid testing framework, running into limitations similar to those I had previously encountered with AI-generated code (particularly around the 45-degree rotation feature).

However, once I enabled Codex’s controlled internet access (specifically to the preset list of common dependencies recommended by OpenAI), results improved significantly. Codex integrated Jest for testing and quickly produced a suite of unit tests for the existing functionality. With those tests in place, it resolved the post-rotation win-condition bug in a single iteration and even added test coverage for the fix.
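Codex’s actual suite isn’t shown here, but the Jest specs it generated looked broadly like this (function names are hypothetical; the gravity helper is defined inline so the snippet stands alone):

```javascript
// board.test.js -- an illustrative Jest spec, not Codex's actual output.

function applyGravity(board) {
  const rows = board.length;
  const cols = board[0].length;
  for (let c = 0; c < cols; c++) {
    const pieces = [];
    for (let r = 0; r < rows; r++) {
      if (board[r][c] !== null) pieces.push(board[r][c]);
    }
    const empties = rows - pieces.length;
    for (let r = 0; r < rows; r++) {
      board[r][c] = r < empties ? null : pieces[r - empties];
    }
  }
  return board;
}

// The guard lets the file load outside a Jest runner as well.
if (typeof test === 'function') {
  test('pieces settle at the bottom of each column after a rotation', () => {
    expect(applyGravity([
      ['R', null],
      [null, 'Y'],
    ])).toEqual([
      [null, null],
      ['R', 'Y'],
    ]);
  });
}
```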

With a stable test suite in place, I turned again to the ambitious 45-degree rotation mechanic.

[Animation: a game of four-in-a-row being played, with the board rotating 45 degrees while still supporting gameplay.]

This time, Codex showed much greater potential, successfully building logic to support a diamond-shaped orientation and preliminary “Plinko-style” piece movements. Although not yet perfect, the progress was notably greater than anything I achieved a year prior using previous AI models. With continued iterations and further AI-generated pull requests, achieving the originally elusive 45-degree rotation mechanic now seems genuinely attainable.
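Part of what makes the 45-degree version so much harder is that “down” becomes diagonal: with a corner of the board pointing down, each cell rests on two neighbors below it, and an unsupported piece can bounce either way. A rough sketch of that settling step (my own illustration of the idea, not the code Codex produced):

```javascript
// With the board rotated 45 degrees so corner (rows-1, cols-1) points down,
// the two cells beneath (r, c) in screen space are the grid neighbors
// (r + 1, c) and (r, c + 1). A piece keeps sliding into a free lower
// neighbor (choosing randomly when both are free, Plinko-style) until it
// is supported. `null` marks an empty cell.
function settleDiamond(board, rng = Math.random) {
  const rows = board.length;
  const cols = board[0].length;
  const occupied = (r, c) =>
    r >= rows || c >= cols || board[r][c] !== null; // edges act as walls
  let moved = true;
  while (moved) {
    moved = false;
    // Scan diagonals from the bottom corner upward so lower pieces settle first.
    for (let sum = rows + cols - 2; sum >= 0; sum--) {
      for (let r = Math.min(rows - 1, sum); r >= 0 && sum - r < cols; r--) {
        const c = sum - r;
        if (board[r][c] === null) continue;
        const downA = !occupied(r + 1, c);
        const downB = !occupied(r, c + 1);
        if (!downA && !downB) continue; // supported on both sides
        const [nr, nc] = downA && downB
          ? (rng() < 0.5 ? [r + 1, c] : [r, c + 1]) // Plinko bounce
          : downA ? [r + 1, c] : [r, c + 1];
        board[nr][nc] = board[r][c];
        board[r][c] = null;
        moved = true;
      }
    }
  }
  return board;
}
```

Passing a deterministic `rng` makes the bounce reproducible, which is what makes a mechanic like this testable at all.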

Conclusion

Returning to my AI-driven “Four-in-a-Row” game project with Codex highlighted how far AI development tools have come. Codex streamlined adding essential tests and quickly fixed lingering logic bugs through testing. It also made previously unattainable features like diamond-shaped orientations seem feasible.

The Codex user experience was also a marked improvement over working with a simple chatbot. Logs outlining each iteration made it easy to see how Codex was working through a task, and jumping between branches via the automated PR workflow made progress easy to track. I especially appreciated being able to spin up multiple tasks and watch different requests advance in parallel.

[Screenshot of the Codex chat interface.]

Although Codex required some external guidance (i.e. internet access for dependencies) and a handful of iterations to get fully functioning results across the board, its ability to enhance the project’s stability and innovate gameplay mechanics has left me optimistic about the future of AI-assisted software development.

 
