Agile Data Science: Does it really work?

Fidan Musazade
5 min read · Dec 6, 2020
To be, or not to be, that is the question

Agile has become something of a hype lately. Many companies have a growing number of vacancies for Agile Coaches, Scrum Masters, etc.

However, most of these specialists are hired to work in Software Development departments. Now, with the growing popularity of data science, the question arises: do we also need to implement Agile in that department?

Overview of Agile

First, let’s decide on what Agile is and what it is not. According to the Agile Alliance, “agile is the ability to create and respond to change”. This is a philosophy rather than some technology. Agile concerns the way of thinking and aims at changing the values and beliefs of people.

What we think of when we hear the word “philosophy”

This is a very powerful instrument in helping companies to efficiently work in unstable environments. However, Agile is not a magical instrument. Let’s start by examining what it is not.

  1. Agile is not a tool to completely omit planning. It still requires teams to efficiently plan sprints and have a global plan for the project.
  2. Agile is not undisciplined. It does not eliminate deadlines and responsibilities. Even though the deadlines are set by people who work on them, they should still be reasonable.
  3. Agile is not new and unproven. There is plenty of evidence of the effects of Agile, and the concept emerged long before the Agile Manifesto was created.
History of Agile

Now, on to what Agile is. Agile is for all departments, not only for Software Development. Yes, most of the evidence comes from development teams, but it can also be applied to other departments.

Data Science vs Software Development

As opposed to Software Development, it is hard to show results in Data Science projects. Most of the work done consists of research and implementation of certain techniques, which may have unclear results for people outside the team.

This creates an issue for Agile rituals such as demos. Every two or three weeks, teams using Agile methodology hold a demo, where they show the stakeholders what they have done during the sprint. The period may be longer or shorter depending on the team, but it is generally around two weeks.

How data scientists show analysis results

In a software development project, it is rather easy to show a result. For example, in this sprint a button was added that was not there before, or a bug was fixed. Pretty straightforward.

On a software development project you know what you are working with: there are clear expectations and deliverables that you need to meet. However, DS projects are like the research tasks you get at university. No one knows what you will get at the end.

Here, Forrest Gump’s mom got it right, when she said “Data Science is like a box of chocolates. You never know what you’re gonna get”. xD

Data Science and Agile

So let’s talk about why you should use Agile for DS projects if there is so much trouble associated with it.

Sprint Planning in Real Life (or not :D)

First of all, Agile brings mandatory planning to the process. At the beginning of each sprint, teams have a Sprint Planning ritual, which makes the team think of tasks for this sprint and define deadlines. This positively affects the projects as they become more organized and less messy.

Secondly, the team can clearly see what stage they are at, what blockers they face, and what they should do to move from one stage to the next. Agile teams usually have boards that show the tasks, the category of each task, and the stage it is at. For example, say a specialist is working on model validation but needs the approval of another department to finalize it. This task can then be moved to the “blocked” column. This visualization helps the team spot bottlenecks in a timely manner and fix them.
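To make the board idea concrete, here is a minimal Python sketch of such a board. It is only an illustration, not how any particular tool works: the `Board` class, its methods, and the column names are all made up for this example.

```python
from dataclasses import dataclass, field

# Column names are illustrative; real boards name their stages differently.
COLUMNS = ("to do", "in progress", "blocked", "done")


@dataclass
class Board:
    # Maps each column name to the list of task titles currently in it.
    columns: dict = field(
        default_factory=lambda: {name: [] for name in COLUMNS}
    )

    def add(self, task, column="to do"):
        """Put a new task into the given column."""
        self.columns[column].append(task)

    def move(self, task, target):
        """Move a task from whichever column holds it to the target column."""
        if target not in self.columns:
            raise ValueError(f"unknown column: {target}")
        for tasks in self.columns.values():
            if task in tasks:
                tasks.remove(task)
                self.columns[target].append(task)
                return
        raise KeyError(f"task not on the board: {task}")

    def bottlenecks(self):
        """Return the tasks that are currently blocked."""
        return list(self.columns["blocked"])


board = Board()
board.add("model validation", column="in progress")
board.add("feature engineering")
# Validation is waiting for approval from another department.
board.move("model validation", "blocked")
print(board.bottlenecks())
```

Listing the “blocked” column is the code equivalent of glancing at the board during a stand-up: whatever sits there is what the team needs to unblock first.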

Thirdly, having retrospectives is beneficial for the team. By recapping everything that happened during a sprint, the team is able to look back and identify what went wrong, what could be better, and what actually stood out. This ritual benefits DS projects, as it can help the team spot problems earlier.

Feedback

Another benefit of Agile is getting feedback at the end of each iteration. It is vital to know what you are doing wrong in order to be able to fix it. Sometimes, your research may go wrong in the beginning and the whole project could depend on that error. Fixing it at the end could be a real pain, so spotting it early on is definitely a win.

But you have to keep in mind some challenges that come with these benefits. Teams have to be careful when loosening up the deadlines and letting everyone decide on their own timeframes. Some could exploit it.

It can also be hard to define deadlines, as most of the challenges are hard to spot before starting the task. For example, the data might not be as clean as expected and might need additional processing. In these cases, extending the deadline is reasonable because more work has come up.

Overall, Agile works not only for developers but also for data scientists. Even though there are some practical challenges, the benefits usually outweigh them. So to the question “does Agile work for DS projects?” the answer is YES. But remember to be careful ;)

Fidan Musazade

Data Scientist & Machine Learning Engineer @ The International Bank of Azerbaijan