For the past couple of years I’ve been building AI products.
In this post, I wanted to share some of the things that I’ve learned with regard to Agile workflows for AI.
The notion of ‘research’ can seem quite daunting to founders, managers, or anyone concerned with business metrics — those who may not have a PhD.
At first glance, research might seem to go against everything that is Agile, with its long lead times and uncertain outcomes. AI products often take on the appearance of an academic research project, with flexible deadlines and unlimited or open-ended budgets. Failing slow is the norm, and products are prioritised not by potential business impact but by team preference. Just like the age-old ‘waterfall process’, AI products can take many months, or even years, before the results are put into production.
Agile was introduced to evolve the waterfall approach — giving end users working, tested, and deployable software on an incremental basis.
Thankfully, workflows for AI are now making this transition too.
Taking a brief pause, it’s worth mentioning that AI products can differ greatly from one another. There are many variables that make each unique, so please keep this in mind as you read this post. I could be ‘overfitting’ my commentary to the AI products that I’ve helped build. Also, I should note that the views expressed do not represent those of my employer.
So, can an AI-product team be Agile?
Undoubtedly, the answer is yes (in my opinion).
As I touched on in the introduction, there are quite a few nuances in adapting Agile to AI products. But over the past couple of years I have not encountered an insurmountable problem, and nothing has yielded a better team dynamic or better results for everyone involved than ‘Agile AI’.
To leave little room for ambiguity, it is worth a brief detour to define what an ‘AI product team’ is.
The key roles when building an AI product are: Sponsor (if within a large corporate), Product Manager, Data Engineer, UX/UI Designer, Machine Learning Engineer, and Data Scientist. This core team can be augmented with Software Engineers, DevOps Engineers, and other roles as needed, based on the product definition.
So you’ve got your team, you have a problem to explore, and you want to be Agile — how do you go about it?
An evolving process
Below is a process that I’ve found to work:
It’s a simple workflow, but one that I’ve found encourages creativity and exploration — yet, also maintains the scientific rigour.
PMs, CEOs and Founders get their ‘focus on delivery’, whilst Scientists and Engineers get the freedom to explore, test datasets, and benchmark existing models, amongst other things. Furthermore, there are circuit breakers for ‘go-no-go’ decisions throughout, so we don’t end up failing slow as mentioned in the introduction.
Throughout this process the whole team, including end users and stakeholders, is able to clearly understand the problem and solution space. There should be no ambiguity.
The process also caters to the fact that research and building / deployment should be separated.
In research, failing to prove feasibility is an acceptable outcome. Failing to deliver a feature is less acceptable.
Counting on the research to result in a feasible solution (in order to immediately build / deploy it) by definition means that you will most likely end up not delivering.
Let’s jump into each step in a little more detail…
This ‘pre-planning’ phase is where the business problem and target variables are chosen. It is typically done collectively by the Scientists, the Product Manager, and Strategy or Design.
Initial research protocol (IRP) & go-no-go
This step focuses on a document that should take 3-4 days to write up (typically by the Scientific Lead), and it basically gives the Scientists, PMs, and Strategy freedom to explore. This is akin to a Founder creating a pitch deck for a VC: they are walking everyone through their thinking, their business case, their expected output, the required inputs, etc.
Everyone should have an opportunity to provide feedback and discuss the IRP as a group. The desired outcome is a go-no-go decision: as a group, do we think that committing resources to building this will result in us hitting the desired goals and KPIs?
In more detail, what should the IRP include?
Overall goal and scope of the proposed ML model – prioritised on the basis of cost, feasibility, and benefits to the final product.
Data – a basic list of data sources or vendors, along with their availability and suitability for the model. Note whether annotations are needed.
Modelling – this covers the core modelling tasks: any models currently used within the company, a literature review (published work on similar problems in relevant areas), and finally your own ideas and proposals.
Usefulness and impact of ML prediction – here you should list the expected improvements in performance of existing (current) business models or baseline models. This should be based on research in the literature review or exploration of your own ideas. Next, the estimated business value of model improvements — obviously this will be difficult, but work with your stakeholders or sponsors to take an educated guess. Just as important, the estimated impact of incorrect or biased predictions. Finally, how frequently does the model need to be correct to be useful?
Next your IRP should cover the computational resources, as well as briefly describing model feasibility. And finally, a timeline for the proof of concept.
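One rough way to approach the question above of how frequently the model needs to be correct is a break-even calculation. The sketch below assumes, purely for illustration, that acting on a prediction has a fixed cost and that each correct prediction brings a fixed benefit. The function names and the figures are hypothetical, not part of any particular IRP template.

```python
def break_even_precision(benefit_per_hit, cost_per_action):
    """Precision at which acting on predictions neither gains nor loses money.

    Expected value per action = precision * benefit_per_hit - cost_per_action,
    which is zero when precision = cost_per_action / benefit_per_hit.
    """
    if benefit_per_hit <= 0:
        raise ValueError("benefit_per_hit must be positive")
    return cost_per_action / benefit_per_hit


def expected_value_per_action(precision, benefit_per_hit, cost_per_action):
    """Average value of acting on one positive prediction."""
    return precision * benefit_per_hit - cost_per_action


# Hypothetical figures: a retention offer costs 20 and a retained
# customer is worth 100 to the business.
threshold = break_even_precision(benefit_per_hit=100, cost_per_action=20)
value_at_half = expected_value_per_action(0.5, 100, 20)
print(threshold, value_at_half)
```

Under these assumed figures the model only has to be right about one flagged customer in five to break even, which is the kind of number worth agreeing with stakeholders before any modelling starts.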
The aim should be to build a minimal viable model (MVM) or minimal viable product (MVP). This is a model or product that is just good enough to put into action. It provides good benchmarks and enables you to gather feedback from the end user sooner. This is one of the core rationales for Agile – you’re able to understand quicker and iterate, or change direction.
Let’s say you build a churn model which the sales department uses for customer retention. As soon as they start reviewing your MVM, they might find that the interval over which you predict is too short, because many customers have already cancelled their subscription. Instead of further optimising this model, you would focus on predicting further ahead.
What an MVM looks like is product-dependent of course, but in many cases it would probably make sense to define it using a regular statistical measure. If the model is replacing a business rule that has been in place for many years, the MVM is ready as soon as the model outperforms the business rule.
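The idea of declaring the MVM ready once it beats the business rule can be written down as a simple check. This is a minimal sketch that assumes accuracy on a hold-out set is the agreed statistical measure; the data, the function names and the `min_gain` parameter are all hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def mvm_is_ready(model_preds, rule_preds, labels, min_gain=0.0):
    """True when the model beats the business rule by more than min_gain."""
    gain = accuracy(model_preds, labels) - accuracy(rule_preds, labels)
    return gain > min_gain


# Hypothetical hold-out data: 1 = customer churned, 0 = customer stayed.
labels      = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
rule_preds  = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]  # legacy business rule
model_preds = [1, 0, 0, 1, 1, 0, 1, 1, 0, 0]  # candidate MVM

print(accuracy(rule_preds, labels))    # 0.6
print(accuracy(model_preds, labels))   # 0.9
print(mvm_is_ready(model_preds, rule_preds, labels))  # True
```

In practice the measure and the required margin would be whatever the team agreed in the IRP; accuracy is used here only because it is the simplest to read.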
I’m not going to go into a huge amount of detail, but the technical report should include everything you’d expect: a summary of the data, covering quality, coverage, issues, etc. Also make sure to cover performance baselines, including simple models or heuristics. If you need to compare against human performance, give that summary. And finally, map your model KPIs to business KPIs.
Peer review & go-no-go
This simply means that you should have other Scientists within the team review and give feedback on the technical report. The report should be of a standard that you could submit to a top journal or conference.
Collectively the team, along with stakeholders, now needs to make the critical decision of whether or not to go ahead and deploy the model into production.
Some advice: start with simpler models and incrementally transition to more complex models as necessary, weighing the balance between model performance, speed, explainability, and ease of deployment. Next, develop a simple model pipeline and avoid creating a pipeline jungle. Follow the best practice of your modelling tool to serialise the entire model workflow (pre-processing, feature engineering, etc.), not only the model algorithm.
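To illustrate what serialising the entire workflow means, here is a deliberately tiny sketch using only the Python standard library. It assumes a toy workflow (standardise one feature, then apply a learned cut-off) and stores the pre-processing parameters and the model parameters in a single JSON artifact. The functions `fit_workflow` and `predict` are hypothetical; a real project would use its modelling tool's own serialisation mechanism instead.

```python
import json
import statistics


def fit_workflow(values, labels):
    """Fit a toy workflow: standardise one feature, then learn a cut-off."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values) or 1.0  # avoid dividing by zero
    scaled = [(v - mean) / std for v in values]
    pos = [s for s, y in zip(scaled, labels) if y == 1]
    neg = [s for s, y in zip(scaled, labels) if y == 0]
    # Cut-off halfway between the class means of the scaled feature.
    threshold = (statistics.mean(pos) + statistics.mean(neg)) / 2
    # Pre-processing and model parameters travel together in one artifact.
    return {"preprocessing": {"mean": mean, "std": std},
            "model": {"threshold": threshold}}


def predict(workflow, value):
    """Apply the stored pre-processing, then the stored model."""
    pre, model = workflow["preprocessing"], workflow["model"]
    scaled = (value - pre["mean"]) / pre["std"]
    return int(scaled > model["threshold"])


values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]  # hypothetical feature values
labels = [0, 0, 0, 1, 1, 1]

workflow = fit_workflow(values, labels)
artifact = json.dumps(workflow)   # what would be written to disk
restored = json.loads(artifact)   # what production would load

print(predict(restored, 8.5), predict(restored, 1.5))
```

Because the scaling parameters are saved alongside the cut-off, the restored workflow cannot be fed raw values that were scaled differently from the training data, which is the failure mode that serialising only the model algorithm invites.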
How the concepts of Scrum map to building AI products
To change gears slightly, it’s worth covering some of the ‘ceremonies’ involved in Agile; specifically, Scrum.
Product backlog: this is a prioritised list of stories that are created from, and mapped to, the IRP. They should be split out by function, for example ‘Engineering’, ‘Machine Learning’ or ‘Design’. The PM is ultimately responsible for the order in which they are tackled, making sure they map to the roadmap and are updated as and when issues arise, for example around data availability or stakeholder feedback.
Sprint: the sprint is the key unit of iterative development in the AI product. It is typically 2 weeks long and follows sprint planning, a time-boxed working session that lasts a couple of hours. In sprint planning, the entire team agrees to complete a set of product backlog items. This agreement defines the sprint backlog and is based on the team’s velocity or capacity and the length of the sprint.
Sprint review: In the sprint review meeting the team shares an update with product stakeholders. The topics include recently engineered features, an update on model accuracy, and progress made toward the business outcome. This is also a great chance to review an updated feature importance ranking (shows which inputs are most helpful in predicting the target variable) to begin building trust in the product.
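A feature importance ranking can be produced in several ways, and this post does not prescribe one. As a minimal sketch, the snippet below uses permutation importance: shuffle one input column at a time and measure how much accuracy drops. The toy model, the feature names and the data are made up for illustration.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Average drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Toy churn model that only looks at months_inactive."""
    return int(row[0] > 2)


# Hypothetical features per customer: [months_inactive, support_tickets].
names = ["months_inactive", "support_tickets"]
rows = [[0, 5], [1, 3], [4, 4], [5, 1], [0, 2], [6, 5], [1, 1], [7, 3]]
labels = [toy_model(r) for r in rows]  # toy labels the model fits perfectly

scores = permutation_importance(toy_model, rows, labels)
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

Here `support_tickets` scores exactly zero because the toy model ignores it, while shuffling `months_inactive` hurts accuracy. A ranking like this is easy for stakeholders to sanity-check against their own intuition about the business.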
I always forget to sign off my blog posts, so I’m attempting to break that trend. I hope you’ve found this post useful in illustrating how and why building AI products should be tackled with Agile in mind. I could have gone into a lot more depth, but I’ve reserved that for my upcoming book, as mentioned in the introduction.
If you have enjoyed the post, please be sure to share it with your colleagues or friends and link them here. Thanks!