Andy McMahon

Data scientist


Congratulations on winning Data Scientist of the Year 2019. What did you do to qualify?

Thanks very much! Basically I was looking for some award schemes for our wider team to apply to and I stumbled across these awards run by the Data Science Foundation (which I’m a member of and would recommend joining if you’re in this space!). I mentioned them to my boss and she put me forward, which was nice. 

Then I had to answer a lot of questions about my background, how I’d got to where I was and what stood out as a big data science achievement for 2019. That year for me was really about consolidating a lot of work I and others had been doing into something more strategic.

Essentially I helped to solve a relatively complicated data science question, managed the productionisation of it and led the stakeholder management piece, which was crucial.

I had also done a lot of work to support our wider data strategy, especially the elements around data science and machine learning. I think it was a good year and I was really pleased to be recognised with the award.

In your role, what's the split between defining what your organisation needs and actually designing and delivering the solution? 

With the type of positions I’ve held I think I’ve had to tackle both of these in relatively equal measure. Data science and machine learning are extremely powerful tools, but a lot of organisations have struggled to gain traction and generate value in this space. I think that is due to a few misunderstandings about how to approach these projects.

For example, there’s often a misconception that you get some smart people in a room and some magic happens.

In the organisations I’ve been part of, I’ve spent a lot of time educating whoever I can that it’s really still about the same things as for other tech projects: process, clarity of purpose and solid business cases.

For the UX design and research community: what part do we play in creating machine learning technology? For example - interviewing end users or designing visualisations. 

I think this is a massively overlooked area. A lot of things I build are backend processes that feed into other systems but at the end of it a user has to interact with the result in some way or another. You could argue this is just standard UX for whatever that solution is, but I think the unique aspects of ML have to be incorporated so that you don’t lose users on their journey.

For example, how do you communicate effectively that an ML algorithm’s results are not certainties but often probabilistic in some sense? How do you get users to provide feedback to use as more training data? How do you ensure that any suggestions or insights being provided augment their workflow and don’t detract from it? And of course, like you mentioned, how do we effectively visualise outputs so that anyone can understand what the results say? I think there’s an assumption that this is easy and it can be incorporated into the jobs of the people building the algorithms, but that’s a mistake in my view. Give it to the experts, with the time, tools and appropriate focus and you’ll create something special.

I'm personally interested in learning https://ml5js.org/ - an open source framework for ML which includes image recognition and NLP. What tools or languages would you recommend for designers? Do you use any yourself?

That framework looks really interesting and it’s great that it’s JavaScript based. I am a massive Python head but JavaScript is excellent as another relatively “low barrier to entry” language for machine learning. 

For recommendations it depends what you want to do. If you are looking to embed ML in front end applications then frameworks like ml5js or similar fit the bill. If you want to extend this out to the typical ML workflows you’d be looking at the usual Python/Scala (and now languages like Julia) ecosystems. For large scale compute it’s all about Spark and that ecosystem. As I say, it depends what you are after.

As I mentioned I’m very much a backend person who works on building pipelines, jobs and microservices, so I’ve not had a massive need for the JavaScript based frameworks. I use all the other tools I’ve mentioned, especially Spark and related tools.

Having said all that, what I am always super interested in is building amazing services that other solutions can consume, including beautiful front ends and visualisation tools. So I’m always keen to hear if anyone has tips for making this interaction more seamless.

In your opinion, what's likely to be the biggest development in ML in 2021? 

In general, across the enterprise, my hope and belief is that MLOps will start to mature quite substantially and we will see some real examples of excellence in this space (including at my current employer!).

MLOps is the incorporation and application of ideas from DevOps to the world of ML and is so important in my mind for successful products that can run stably for years.

I’m also pretty excited by the prospects of algorithms on graphs. I know this is a huge area of interest across many industries and I can particularly see its relevance to finance (where I work). Applying algorithms to find connections or understand relationships between entities and processes is, to me, really impressive, valuable and also just pretty cool!

Finally, I’m hoping that Reinforcement Learning in real world applications starts to take off a bit more. I’ve built a few prototypes using these techniques in the past but taking them through to production is fraught with challenges, and I hope that the community makes some strides in this direction (and I hope that I can help)!