The Wallscope data scientist on conversational agents, data visualisation and working with designers
Pitch Wallscope to us in 100 words or fewer. Go!
Data management and integration has always been a huge problem with floods of new documents (health records, contracts, spreadsheets with columns titled ‘CY3’ and a Word document naturally stored somewhere else telling you what ‘CY3’ means), financial transactions, social media mentions, staff payroll updates, website traffic tracking data, and the list goes on… This is a critical challenge for businesses of every size.
Wallscope’s platform handles all of this data as it comes in live, assimilating it with existing information, and presenting this ever-increasing wealth of knowledge in an understandable and engaging way.
What tools and languages do you use most often?
I code in several languages, depending on what I’m doing, but I predominantly use Python as the libraries are brilliant for AI projects. For example, Rasa is a fantastic open-source framework for conversational AI, which I use especially to ‘understand’ what a user is saying. For natural language processing (NLP) I tend to use spaCy and AllenNLP, which are also both open-source.
Alongside the above tools, I often use TensorFlow for other machine learning projects, Apache Spark for analytics, and OpenCV for computer vision (e.g. to develop a conversational agent for people with visual impairments).
There are so, so many tools out there for anything you can imagine. Vincent Warmerdam from Rasa even made a challenge for recruiters to identify the Pokémon names within a list of data science tools.
For your PhD, how would you define a conversational agent? How would I talk to it?
A conversational agent is essentially any system that you can ‘talk’ to. For example, Alexa, Siri, Cortana, and Google Assistant are commonplace, and the list just goes on and on! I tend to focus on these ‘voice assistants’ but a conversational agent doesn’t have to process voice. When you go on a website and a little chat window pops up, that is not a person, but a chatbot - a conversational agent that takes textual messages instead of voice. This really expands to any way that you can ‘converse’ with a system: think of gestures, for example.
What's the application for healthcare you're exploring?
Speaking to our home is no longer a vision of the future, it is here now and here to stay. This is incredibly exciting, but the accessibility of conversational AI is now more crucial than ever. I can set a timer without washing my hands while baking, but can an older adult see who is at the door when their arthritic knee plays up? Or can a person with dementia access their music (shown to be a huge benefit) independently?
Natural conversation is very messy, but we subconsciously ‘clean’ our speech when talking to voice assistants. As dementia progresses, however, these natural conversational phenomena become more common and more prominent - it is currently unclear whether this important group of people also ‘clean’ their speech when talking to a voice assistant. I am working to find this out, determine what speech patterns cause current voice assistants to misunderstand this particular group of people, and explore solutions to improve systems - ultimately making them more accessible. I have already worked on the ethics around this topic and on sections of the voice processing pipeline, and we hope to have the data for the above results very soon!
You've studied analytics and visualisations at degree level. What are your favourite visualisations and tools for creating them?
So the tools I use to create my unglamorous charts are seaborn, plotly, matplotlib, and Shiny when building more interactive applications over a dataset.
Can you share any examples of visualisations you've created?
Oh wow, I suppose I can - prepare to be blown away haha.
In a recent article about building an Olympic themed linked data application, I shared my drawings of the proposed interface… Later in the article, you can see my collaborators bring it to life!
To illustrate something that I spent a little more time on actually designing - I created an infographic to illustrate my research:
I’ll save you the excitement of my many line charts, histograms, and confusion matrices...
What problems have you solved using data science?
Great question! One of the first data science projects I worked on (depending on how you define it) was modelling stochastic processes (e.g. weather and climate models). Back then, as I knew the maths but not the tools, I would figure out the algorithms on paper before programming them to iterate as needed.
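As an editorial illustration of what "programming a stochastic process to iterate" can look like, here is a minimal sketch of a two-state Markov chain for daily weather. It is not the interviewee's model: the states, the transition probabilities, and the function names `simulate_weather` and `wet_fraction` are all assumptions chosen for the example.

```python
import random

# Hypothetical transition probabilities, chosen only for illustration.
# TRANSITIONS[today][tomorrow] is the chance of moving between states.
TRANSITIONS = {
    "dry": {"dry": 0.8, "wet": 0.2},
    "wet": {"dry": 0.4, "wet": 0.6},
}


def simulate_weather(days, start="dry", seed=None):
    """Iterate the chain and return the list of daily states."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    state = start
    history = [state]
    for _ in range(days - 1):
        probs = TRANSITIONS[state]
        # Draw tomorrow's state according to today's row of probabilities.
        state = rng.choices(list(probs), weights=list(probs.values()))[0]
        history.append(state)
    return history


def wet_fraction(history):
    """Share of days in the simulated history that were wet."""
    return history.count("wet") / len(history)


if __name__ == "__main__":
    run = simulate_weather(20_000, seed=42)
    print(f"Fraction of wet days: {wet_fraction(run):.3f}")
```

With these made-up probabilities the long-run share of wet days works out on paper to 0.2 / (0.2 + 0.4) = 1/3, so a long simulation should land close to that value, which is one way to check the code against the maths.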
Over the years since then, I have worked on loads of interesting projects using data science! With Wallscope I worked on a system (with the NHS) to manage patients as they are discharged from hospitals, and a similar system (with the Scottish Government) to keep track of critical documents and statistics within their vast datastores.
In the conversational AI realm, I have worked on the detection of offensive sentences, the answering of open-domain questions (e.g. “Who stars in The Queen’s Gambit?”), my core PhD work, and adapting voice assistants for blind and partially sighted people (look out for updates on this soon).
The list of interesting data science projects going on at the moment is endless; I could talk non-stop about the projects in Scotland alone! I recommend joining some meetups if you want to hear about them too.
We're a UX design collective - we research and create apps and websites. Do you work with the designers at Wallscope? How do you collaborate?
I actually sat next to Wallscope’s UI/UX designer (Dorota) for the majority of my time at Wallscope, and I think it was really beneficial for both of us. I couldn’t work further away from the user - working on data structure and information extraction. Dorota, on the other hand, communicated directly with the users, figuring out exactly what features the client needs and how to display that in the best way possible.
As we were able to talk so easily, we learned how to very efficiently exchange ideas. I knew what data Dorota needed, and she knew the limitations of the data we had available - this benefitted us both a huge amount as we only had to build things once. We of course had to tweak our respective ‘products’, but we were not two disjointed entities having to overhaul developed ideas. Even when the data was not yet available from clients, we would work together to ‘mock’ the data as accurately as possible, allowing the design team to continue. I really can’t express how valuable I think this communication is!
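To make the idea of 'mocking' data concrete, here is a small editorial sketch of generating placeholder records against an agreed schema so that design work can continue before real data arrives. Everything in it is hypothetical: the field names, the categories, and the `mock_records` function are assumptions for illustration, not Wallscope's actual schema or tooling.

```python
import random
from datetime import date, timedelta

# Hypothetical categories; a real project would take these from the client.
CATEGORIES = ["report", "statistic", "policy"]


def mock_records(n, seed=0):
    """Return n placeholder records that follow an assumed, agreed schema."""
    rng = random.Random(seed)  # fixed seed keeps the mock data stable between runs
    start = date(2021, 1, 1)
    records = []
    for i in range(n):
        records.append({
            "id": f"doc-{i:04d}",
            "title": f"Sample document {i}",
            "category": rng.choice(CATEGORIES),
            # A date within one year of the assumed start date, as ISO text.
            "updated": (start + timedelta(days=rng.randrange(365))).isoformat(),
            "views": rng.randint(0, 5000),
        })
    return records


if __name__ == "__main__":
    for record in mock_records(3):
        print(record)
```

Seeding the generator means the designer sees the same mock records on every run, so layouts do not shift between sessions; swapping in real data later only requires that it keeps the same field names.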
Dorota was interviewed a few months ago, so please do read the article.
What impact do you think machine learning might have in 2021 as lockdown eases?
That is a very tough question... We are now in this lockdown, more digitally reliant than ever before! As the lockdown eases, however, I can see both the industries that have boomed (online shopping, video calling, etc.) and the industries that have been strained (airlines, hospitality, etc.) turn to AI. The organisations that have thrived will have more data to optimise and automate their processes, preparing for a drop in usage (hopefully people meet in person again), and allowing their staff to focus on more complex tasks. Similarly, people will flood the remaining restaurants and pubs (my favourite is sadly shut forever). I expect minimising food wastage will be one priority here, ensuring every penny spent is put to use - making up for the months of lockdown.
On the research front, funding has been made readily available for healthcare and AI which I expect to remain the case for many years. I hope this prepares us for the next global crisis.
We're interested in how to communicate machine learning insights to office workers. This includes people with no statistical background. What techniques might you recommend? The ones we're aware of include:
- case management - including showing the priority of new sales leads
- automatically tagging data, such as identifying complaint emails
- showing where data exceeds thresholds
- showing recommendations - for example, behaviour which shows a bank account has been hacked
I think it is very important to do this! For example, academic papers are littered with jargon and assume the reader is well versed in that specific topic. Non-technical people (e.g. end users) have valuable insights but would find the paper inaccessible. I write articles on Medium for exactly this reason, opening up the discussion to everyone. Some of the topics I write about are very technical so I have often struggled to convey a point. In these cases I either turn to illustration (here) or write further on the particular topic. I have even written full articles thanks to questions from readers!
Whether in writing, talks, demonstrations, or any other method of communication - I recommend examples, examples, examples. Think about the problems your ‘system’ solves and find one that you know your audience can relate to. When you describe the pains someone faces in their day, and then how your solution will make them go away - that person will listen, understand completely, and hopefully even be excited! Then you can pull out the charts.