Lindsay Lee, data scientist at the University of Sheffield AMRC, explains how experts like her are working side-by-side with engineers at the AMRC’s Factory 2050 to show manufacturers how to get the most out of their smart factory and big data.
‘So, what data do you want?’
When a data scientist enters a new field, this is often the first question they are asked. The most common response is simply: ‘Well, I don’t know.’
The truth is, data scientists do not just enjoy working with data for data’s sake, but rather relish problem solving – the answers to which are usually found in the data. Without knowing what problem is waiting to be solved, how can we understand what data we need?
Safe and ethical AI
I am new to the field of manufacturing and my first six months at the University of Sheffield Advanced Manufacturing Research Centre (AMRC) have mostly involved understanding how data is captured, collected and stored, and what sort of data-related problems engineers are facing.
Working on the part of the Technology Readiness Level (TRL) scale that sits between academia and industry means that everything the AMRC does has to work in production. For a data scientist, this puts a huge emphasis on the emerging field of safe and ethical Artificial Intelligence (AI), especially where high-risk decisions are to be made. We have a basic ethical understanding that everything we do has to be fair and unbiased. This means we consider the data that has not been collected as well as the data that has, and recognise the implications of this. We are all aware that if you ‘collect’ the right data you can manipulate the results as you wish but, as data scientists, we have an ethical obligation to make sure that the results are not simply an artefact of the data collection method.
Data scientists are also trained in a number of analysis techniques and are able to code these up, so that we are not at the mercy of the available software and the limited options this can bring. For me, an interesting part of the safe and ethical movement is the push to move away from ‘black box’ analysis and to ensure that there is a sensible interpretation of the analysis, so that a right answer has not been found for the wrong reason. Ensuring interpretability of results is important for every step of the process. Why have we collected the data we have? What data can’t we collect? What different analyses have been conducted? How can the results be interpreted? What happens when things go wrong and how do you identify it?
As far as I am aware, there is currently no piece of software used in industry that completely covers this, only people. Those people are us, the data scientists. At the AMRC we are trying to tackle this with the Factory+ project.
The Factory+ framework: a connected, smart factory
Factory+ is an open-access digital architecture for manufacturing shop floors that simplifies the way data can be handled across an organisation. Factory+ aims to provide a synthesised way for machinery to capture and use data to solve problems; to make manufacturing more sustainable, efficient and ready for Industry 4.0 – or even 5.0. It is a truly collaborative project of Internet of Things (IoT) engineers, robotic engineers, software engineers and data scientists.
Data scientists are considered the users of the Factory+ architecture and need to be able to pull data for any project. The value of having data scientists involved in this is that, while we don’t have the domain knowledge of an engineer, we do know what should be considered when collecting useful data for an array of problems without simply trying to collect and store all available data; an endeavour quickly curtailed by storage limitations.
Understanding the data
It is likely that high-resolution data will be needed when the project involves monitoring of equipment for a quick response. However, this sort of data does not need to be stored long-term and the analysis may be performed at the edge, requiring little or no recording of historical data. We can also do interim analysis to identify markers of any problems and store only the relevant data for long-term monitoring of equipment and prediction of upcoming issues. For long-term storage, it is very important that there is metadata available to explain the data, its resolution, its units and any extra useful information. Data scientists are embedded within the Factory+ project to help define what that useful information might be.
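As a minimal sketch of the idea above, and not a description of how Factory+ itself works, the Python below shows one way an interim analysis could keep only the relevant stretches of a high-resolution stream and attach metadata to whatever is stored. Every name in it (StreamMetadata, StoredWindow, select_windows, the sensor name, the window size and the threshold) is a hypothetical choice made for illustration, and the readings are made-up numbers.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class StreamMetadata:
    """Context stored with the readings so they stay interpretable later."""
    sensor: str            # hypothetical sensor name
    units: str             # physical units of each reading
    sample_rate_hz: float  # resolution of the raw stream
    notes: str = ""        # any extra useful information


@dataclass
class StoredWindow:
    """A stretch of raw readings judged worth keeping long-term."""
    start_index: int
    readings: List[float]
    metadata: StreamMetadata


def select_windows(readings: List[float], metadata: StreamMetadata,
                   window_size: int, threshold: float) -> List[StoredWindow]:
    """Keep only non-overlapping windows whose mean exceeds the threshold.

    A trailing partial window is ignored. The mean-above-threshold rule is
    an assumed stand-in for whatever marker the engineers actually define.
    """
    kept = []
    for start in range(0, len(readings) - window_size + 1, window_size):
        window = readings[start:start + window_size]
        if mean(window) > threshold:
            kept.append(StoredWindow(start, window, metadata))
    return kept


# Illustrative use with made-up vibration readings.
meta = StreamMetadata(sensor="spindle_vibration", units="mm/s",
                      sample_rate_hz=1000.0)
raw = [0.1, 0.2, 0.1, 0.2, 2.0, 2.2, 1.9, 2.1, 0.1, 0.1, 0.2, 0.1]
kept = select_windows(raw, meta, window_size=4, threshold=1.0)
print(len(kept), kept[0].start_index)
```

Only the middle window is retained here, and it carries its units and sample rate with it, so someone reading the stored data later does not have to guess what the numbers mean.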
Factory+ is much more than a data science project. In fact, data connectivity is the main aim of the first phase. Its longevity lies in the continued collection of useful data; increased connectivity throughout the AMRC and further afield; and, of course, the data science applications and how their results can be fed back into the machinery. Continued collaboration between the IoT experts and data scientists will one day lead to data science tools that both use data and act as a connected data stream themselves.
Factory+ in practice
While the data we collect will be used to reveal features to machine users and allow process monitoring on a continuum, another goal is to use the data to allow machines to update and improve without the need for a costly pause in production and human intervention. With such lofty goals, it is clear that the data scientists need to work closely with the domain experts to understand the implications of reducing costly interventions and to work with the engineers on error prevention and further process improvements. It is always important to remember that data scientists don’t just need data; we need domain experts to have real-life impact.
Our first AI goal with Factory+ is to improve robot cutting by optimising the feed rate using machine learning algorithms with data from the first robots connected to Factory+. The problem identified by the robotic engineers at the AMRC’s Factory 2050 facility is that the robot slows down more than necessary when turning corners, most likely because of the number of steps it is allowed to consider at once. Working with the engineers, we have been able to understand how the robot works, understand the design codes that make the robot move, and even help connect the robot data stream to Factory+. Now the data scientist has a data stream that can be used as the output for a machine learning algorithm; all the design data required as the input to the algorithm; and the knowledge to find an optimisation of the feed rate that the robotics engineer can implement.
More widely, alongside the specific AI projects, we will produce generalised data science workflows and codes so that when Factory+ is up and running across the wider AMRC, there are consistent methods for using the data that is produced and the most common analyses can be easily applied in different settings.
So, what data do I want? I want data that can let me use data science tools to have impact and improve manufacturing processes. By including data scientists in the Factory+ team, we hope to provide the ultimate blueprint for a connected, smart factory as well as the blueprint for a successful collaboration.
About the author
Lindsay Lee is a data scientist at the University of Sheffield AMRC, based at Factory 2050 in Sheffield.
Lindsay completed a PhD in Probability and Statistics at the University of Sheffield in 2010. She then worked for ten years as an applied statistician in the School of Earth and Environment at the University of Leeds, where she became internationally recognised as an expert in applying advanced data science solutions to quantify and characterise uncertainty in climate simulations. She enjoys presenting these technical ideas to many audiences, from the public to industry and academia.
Lindsay joined the University of Sheffield AMRC in 2021 to use her applied statistical, machine learning and AI knowledge to develop interpretable and trustworthy technical solutions for manufacturing. Lindsay is passionate about ensuring manufacturing can become more efficient and sustainable through better use of data, from collection to decision making, using the latest data science/AI solutions and effective communication.