Article summary
Collaborating on complex algorithms or numerical models can be difficult, especially when collaborators have highly asymmetric knowledge and responsibilities. On a recent custom software project, we had a client team member who was deeply familiar with a computational model that is core to their business. He could write Python code in a Jupyter notebook to explore and implement that model, though he often reminded us he's not a programmer. Atomic Object's developers are responsible for delivering a robust, high-quality solution for our clients. That normally includes responsibility for well-factored code backed by a thorough automated test suite.
We faced an interesting question: How can we take advantage of this opportunity to empower our client and collaborate on the model while meeting our goals of quality and a good experience for the collaborators?
The short answer is we decided to adopt parts of our client team’s toolset, Python and Jupyter notebooks, so that we could directly use the model they created and share responsibility for the behavior and correctness of that code. Below are more details about how we diversified developer tooling and how it worked out for us.
Context & Deciding Factors
Though I made the decision sound easy in the introduction, we could have gone in a different direction. The main part of our application was built in Next.js to take advantage of our team's deep experience with React and TypeScript. We could have rewritten the Python model in TypeScript to align with the rest of the application, or switched platforms and built a Python Django application instead. We did neither.
The application architecture we arrived at is a Next.js application, a Postgres database, and a Python Flask API running in Heroku. Fragmenting an application's tooling isn't a decision we arrived at without due consideration. Below are some of the key factors that led us in this direction.
- Time to first feedback – Rewriting all the Python code would have taken significant time, and we wanted to keep the time to first feedback from our client as short as possible.
- Flexibility – We were aware that the model is likely to evolve and wanted the ability to incorporate the changes from our client team directly.
- Opportunity to collaborate – We saw the opportunity presented by Jupyter notebooks and the Python models our client could develop.
- Right tools for the job – Our team wanted to use the best tools for each part of the job. Next.js works exceptionally well for the web UI, and Python has tools our client is comfortable with for psychrometry, graphing, and numerical modeling (see the sketch after this list).
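To give a flavor of that last point, here's a minimal sketch of the kind of calculation this enables. The project's actual model is private, so the open-source PsychroLib package stands in here, and the specific quantities computed are illustrative rather than taken from our client's code.

```python
# A minimal sketch, not our client's model: psychrolib is a stand-in for
# "a psychrometric library the client is comfortable with."
import psychrolib

psychrolib.SetUnitSystem(psychrolib.SI)

dry_bulb_c = 24.0          # dry-bulb temperature, deg C
relative_humidity = 0.55   # 55% RH, expressed as a fraction
pressure_pa = psychrolib.GetStandardAtmPressure(0.0)  # sea-level pressure, Pa

wet_bulb_c = psychrolib.GetTWetBulbFromRelHum(dry_bulb_c, relative_humidity, pressure_pa)
hum_ratio = psychrolib.GetHumRatioFromRelHum(dry_bulb_c, relative_humidity, pressure_pa)

print(f"Wet-bulb temperature: {wet_bulb_c:.2f} C")
print(f"Humidity ratio: {hum_ratio:.5f} kg water / kg dry air")
```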
Benefits
Overall, we're quite happy with the outcome. It feels like we're using the right tools for each part of this application.
The ability to collaborate directly on the same code with our client has made it possible to evolve their most critical algorithms with shared responsibility for the correctness and quality of that part of the solution. The Atomic Object team has clear ownership of the Next.js UI and followed web best practices to deliver a modern, best-of-breed application experience.
It was fairly simple to add the new Python API to our monorepo, get the tests running in CI, and deploy another Heroku app alongside our Next.js app using heroku-buildpack-monorepo.
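For a rough sense of the shape of that API, here's a minimal sketch. The route, payload, and `run_model` function are placeholders I've invented for illustration; only the Flask-on-Heroku arrangement reflects the actual project.

```python
# app.py -- a minimal sketch of a Flask API wrapping the client-authored model.
# The route, payload fields, and run_model() are illustrative placeholders.
from flask import Flask, jsonify, request

from model import run_model  # hypothetical module holding the model code

app = Flask(__name__)


@app.post("/api/model-run")
def model_run():
    params = request.get_json()
    result = run_model(**params)  # same function the Jupyter notebooks call
    return jsonify(result)
```

The Next.js app calls the endpoint over HTTP, while the Jupyter notebooks import the same model function directly, so both paths exercise the same code.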
We’ve directly incorporated changes from our client team in most cases. When there are problems or complexities, we can sit down and walk through the application code together.
We've avoided confusion that could have crept in from tooling differences between our client's code and the application code, e.g., a JavaScript psychrometric library that rounds or handles unit conversions differently than the Python one.
We were able to use a charting library in Python that our client can test out in their Jupyter notebooks. In our deployed system, the charts render server-side in Python, and they were easy to incorporate into the PDF downloads we provide from the system.
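As a rough sketch of what that server-side rendering looks like (with Matplotlib as a stand-in for the charting library, and the chart contents invented for illustration), the chart gets drawn with a non-interactive backend and returned as PNG bytes the PDF generation step can embed:

```python
# A sketch of server-side chart rendering; Matplotlib is an assumed stand-in
# for whichever Python charting library a project actually uses.
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend for server-side rendering
import matplotlib.pyplot as plt


def render_chart_png(xs: list[float], ys: list[float]) -> bytes:
    """Render a simple line chart and return it as PNG bytes for PDF embedding."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(xs, ys)
    ax.set_xlabel("Dry-bulb temperature (°C)")
    ax.set_ylabel("Humidity ratio")
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=150)
    plt.close(fig)
    return buf.getvalue()
```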
Challenges
While there have been many benefits, not everything has been easy.
We discovered early on that we were using different versions of the Python runtime to run the Jupyter notebook, and that caused some problems. I haven't yet figured out how to get my VS Code Jupyter notebook runtime to consistently play nice with my terminal environment and asdf-managed Python installs. I'm sure it's possible, but my searches didn't lead to an obvious solution.
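One small sanity check that's easy to run: ask the kernel itself which interpreter it's using, then compare that against what the terminal reports (for example, via `asdf which python`):

```python
# Run this in a notebook cell to see which interpreter the Jupyter kernel is
# actually using; compare against `asdf which python` in your terminal.
import sys

print(sys.executable)  # path to the Python binary backing this kernel
print(sys.version)     # full version string
```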
Sharing different versions of the library code, Jupyter notebooks, and data sets needed to run the algorithm has been a challenge. Our client hasn't used git or other version control, so we've resorted to sharing files with dates in their names. If this collaboration continues longer-term, we'll need to find something that works better.
Deciding when to refactor the code our client wrote and when to leave it alone is tough. If we refactor, we create more cognitive load and learning burden for our client (who has limited time), but if we don't, we have to live with more duplicated code and poorer developer ergonomics. We've generally chosen not to refactor in these situations.
Notes for Next Time
I see value in this pattern, but I’d do some things a bit better the next time we have this type of opportunity.
Introduce a shared Jupyter notebook earlier that includes a simple boundary- and error-check test suite for the core algorithm being developed. Include tests that run the client's functions the same way the application does, and host the notebook in a version control system or shared online environment.
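Something along these lines, where `run_model` and the specific bounds are invented placeholders for whatever core function the client is developing:

```python
# Sketch of simple boundary and error checks that could live in a shared
# notebook; run_model() and the bounds shown are hypothetical placeholders.
from model import run_model

# A typical input should land in a physically plausible range.
result = run_model(dry_bulb_c=24.0, relative_humidity=0.55)
assert 0.0 < result["humidity_ratio"] < 0.05

# Out-of-range inputs should fail loudly rather than return nonsense.
try:
    run_model(dry_bulb_c=24.0, relative_humidity=1.5)
    raise AssertionError("expected a ValueError for relative_humidity > 1")
except ValueError:
    pass

print("All boundary checks passed.")
```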
Run a short session early in the collaboration to introduce a few key developer ergonomics practices, like type hints, descriptive variable and function naming, and a consistent parameter order across functions. Then do the refactoring work to establish a reasonable baseline of practice.
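As a made-up illustration of the kind of baseline I mean (the names and parameters here are invented):

```python
# Illustrative only -- names and parameters are invented. The point is the
# conventions: type hints, descriptive names, and the same parameter order
# across related functions.

# Before: terse names, no hints, argument order that varies between functions.
def hr(t, p, rh):
    ...


# After: readable to both the client and the application code.
def humidity_ratio(dry_bulb_c: float, pressure_pa: float, relative_humidity: float) -> float:
    ...
```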
In short, I would absolutely make this decision again. Fragmented developer tooling has downsides, but in this case the ability to collaborate more directly with our client, plus preferred tooling for the web UI, outweighs the value of tool unification.
I’d love to hear how others have taken advantage of opportunities to collaborate with domain experts who can write enough code to express key algorithms but not well enough to commit changes directly. What have you found helpful in these situations?