Professional Documents
Culture Documents
ABSTRACT 1 INTRODUCTION
A Digital twin is the virtual replica of a physical system. Digital A digital twin is the virtual representation of a physical system (its
twins are useful because they provide models and data for design, physical twin) [1]. The concept of the digital twin was first used
production, operation, diagnostics, and prognostics of machines by NASA to describe a digital replica of physical systems in space
and products. Traditionally, building a digital twin requires many maintained for diagnosis and prognosis. Digital twin consists of
built-in sensors to monitor various physical phenomena associated large historical context and performance data and utilizes the di-
with cyber-physical systems such as vibration, energy consump- rect (through inbuilt sensors) and indirect (through latent variable
tion, etc. However, many legacy manufacturing systems do not analysis) sensing to provide the near real-time representation of
have multi-physics sensors built-in by default. Moreover, it might the physical system. Moreover, it consists of various models (for
not be feasible to intrusively place sensors in these systems after simulation, monitoring, control, optimization, etc.,) in a hierarchi-
they are manufactured. To bring the advantages of digitalization cal manner (consisting of a representation of the system, process,
to legacy manufacturing systems, this paper contributes with an component, etc.,) which can provide the blueprint of the whole
Internet-of-Things (IoT) based methodology to build digital twins system [2]. Since, the digital twin allows the user to monitor, sim-
using an indirect medium such as side-channels, which can local- ulate, optimize, and control the entire manufacturing system in
ize anomalous faults and infer the quality of the products being the virtual domain it is expected to play an important role in the
manufactured while keeping itself up-to-date. We achieve this by next industrial revolution (Industry 4.0) [3, 4]. Gartner has listed
exploring and utilizing the side-channels (emissions such as acous- the digital twin as one of the top ten technology trends for 2018
tics, power, magnetic, etc.) of the system that unintentionally reveal and the years to come [5]. Moreover, organizations like Siemens
the cyber and physical state of the system. To validate our method- [6], General Electric [7], NASA [8], and the Air Force [9] are cur-
ology, in this paper, we focus on building a digital twin model rently building digital twins of gas turbines, wind turbines, engines,
of a Fused-Deposition Modeling (FDM) based Cartesian additive and airplanes that allow them to manage the assets, optimize the
manufacturing system. The proposed methodology achieves 83.09% system and fleets, and to monitor the system health and provide
accuracy in anomaly localization. To the best of our knowledge, prognostics.
this is the first work demonstrating the possibility of modeling In the context of next generation of manufacturing systems,
and maintaining a living digital twin of a manufacturing system by additive manufacturing (also known as 3D printing) has allowed
extracting information from the side-channels using low-end IoT designers to rapidly prototype 3D objects layer by layer and brought
sensors. about significant disruption in the manufacturing domain [10]. In
fact, there are various types of 3D printing technologies [11] that
CCS CONCEPTS enable designers to create 3D objects from light-sensitive polymers,
metal powders, thermoplastic filaments, etc. However, these 3D
· Computer systems organization → Embedded and Cyber-
printing technologies are still susceptible to defects due to the large
Physical Systems; · Information systems → Information sys-
diversity in structure and properties of printed components [12].
tems applications; Clustering;
Digital twin models, in this situation, could alleviate the cost of
manufacturing by providing tools to simulate and infer quality
KEYWORDS deviation in the virtual domain. In this work, we narrow down the
Digital Twin, Smart Manufacturing, Internet-of-Things, Machine scope towards Fused-Deposition Modeling (FDM) technology based
Learning 3D printers, which print 3D objects using thermoplastic filaments
such as acrylonitrile butadiene styrene (ABS) or polylactic acid
(PLA).
Permission to make digital or hard copies of all or part of this work for personal or A key enabler for creating a digital twin is the availability of a
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation large number of built-in sensors and their historical data. However,
on the first page. Copyrights for components of this work owned by others than ACM current FDM based additive manufacturing printers lack large num-
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
ber of these sensors [13]. It mainly consists of sensor necessary
fee. Request permissions from permissions@acm.org. for basic control (such as a temperature sensor, micro-switch, etc.,).
IoTDI’2019, Montreal, Canada Lack of sensor arrays makes it a difficult task for sensing the current
© 2019 Association for Computing Machinery.
ACM ISBN 123-4567-24-567/02/2019. . . $15.00
system states, which is vital for digital twin models. Moreover, the
https://doi.org/0000000000 task of building digital twins becomes even harder once the system
IoTDI’2019, Montreal, Canada S. Rokka Chhetri et al.
has been manufactured as placement and selection of sensors for commonly used in the state-of-the-art IoT devices. Moreover, re-
direct observation of system states can be challenging [14]. cently more and more sensors (such as current, humidity, tempera-
Previously, due to the lack of cheap sensors and high-speed and ture, etc.) are added in IoT devices. For building the system digital
reliable network, acquiring the data for the digital twin was costly. twin, we propose to capture the interaction between the cyber-
Today, thanks to the availability and affordability of IoT sensors, it is domain data (such as G/M-codes carrying geometry and process
becoming easier and cheaper for system operators to acquire large information), the physical domain input of the system (such as raw
amounts of data on-the-go from their physical systems [15, 16]. materials and energy), and the environment. G/M-codes consists
However, for 3D printers that do not have built-in sensors and lack of G and M code. The digital twin of the product (the 3D object
historical data, building the digital twin with IoT sensors is still a being created) is initially described using a Computer-Aided Design
challenge. (CAD) tools. The CAD tool then produces StereoLithography (STL)
files which consist of geometry description of the object in coor-
1.1 Research challenges and contribution dinate space. Then a Computer-Aided Manufacturing (CAM) tool
This paper is motivated by three important research questions that takes the STL file and slices it into multiple layers and finds a trace
apply to legacy FDM technology based additive manufacturing to be followed to print the object in each layer. The output of the
system without built-in sensors and access to historical data: CAM layer is the G/M-code. In our experiment, we consider that the
digital twin of the product is described using the G/M-code. G-codes
• Is it possible to build a digital twin of a FDM based additive man- are responsible for controlling the motion (XYZ-axes and extrusion
ufacturing system by indirectly monitoring the side-channels? rate of filaments) of the machine for constructing certain geome-
• Can this indirect side-channel based digital twin model faithfully try shape. Whereas, M-codes are responsible for controlling the
capture the interaction between the environmental factors, pro- machine parameters (process parameters such as temperature, ac-
cess parameters of the system, and the design parameters of the celeration of stepper motors, etc.). Recently, researchers in [17, 18]
product to explain the impact of such interaction on products? have demonstrated that various signals (such as acoustic, magnetic,
• Can we retrofit low-end IoT sensors to maintain the digital twin etc.) collected from the 3D printer behave as side-channels and
up-to-date and use it to predict the product quality (localize faults reveal information about the cyber-domain. Motivated by these
and infer tolerance deviation) of the next product being produced? work, we performed a preliminary study to measure the mutual
To address these research challenges, this paper provides a first information between the four types of sensor data (acoustic, mag-
study on the limits of various sensors modalities (such as acoustic, netic, vibration, and power) and the G/M-codes in a 3D printer. A
magnetic, power, vibration, etc.) and their contributions towards variation of angle (0o to 90o with step size of 9o ) and speed (700
building and maintaining a living digital twin. The key insight mm/minute to 3300 mm/minute with step size of 100 mm/minute)
of our work is that manufacturing machines generate unintended for printing line-segments is encoded in G/M-codes (with total of
side-channel emissions that carry valuable information about the 297 G/M-codes) and the principal components various time and
machine itself, the product they are producing, and the environment. frequency domain features extracted from the side-channels were
Our methodology uses IoT sensors to capture these side-channel used to calculate the mutual information.
information and build a living digital twin (see Section 3.6). Our Figure 1 shows that various data collected from different sensors.
main contribution is a novel methodology to monitor production The percentage mutual information in the y-axis represents how
machine degradation, build their living digital twin, and use this much of the total Shannon entropy of the G/M-code (loд2 (297) ≈
living digital twin to provide product quality inference (see Section 8.21 bits) can be explained by the analog emissions. We have as-
3.7) while localizing faults (see Section 3.5). sumed the distribution of G/M-codes to be uniform for calculating
the Shannon entropy. The figure shows that different sensors have
90 varying level of mutual information with the cyber-domain G/M-
Mutual Information (Percent %)
and Zhang in [20] surveyed the state-of-the-art and motivate the The digital twin of the product DTpr oduct starts its life-cyle in the
need for more building blocks to create digital twins of additive design phase, where computer aided design and computer aided
manufacturing systems. Boschert and Rosen in [21] highlight the manufacturing tools are used to represent the product in the cyber-
simulation aspect of the digital twin and its use in the product domain. These product digital twins from design phase (with an
life-cycle. Alam and Saddik in [22] provide the reference model instance represented using X i ) then go through the production,
for the cloud-based cyber-physical system, with an implementa- where the physical twin of the system PTsyst em takes raw materials,
tion of Bayesian belief network for dynamically updating system energy, and the DTpr oduct to create its corresponding physical
based on current contexts. Schroeder and Steinmetz in [23] provide twins (with an instance represented using Yi ). The physical twin
a methodology to model the attributes related to the digital twin of the system PTsyst em consists of actual physical components
for providing easier data exchange mechanism between the digi- that are used for manufacturing. The PTsyst em is influenced by the
tal twins. Cerrone and Hochhalter in [24] present finite element manufacturing environment in a stochastic manner.
models of as-manufactured models to predict the crack path for Let DTpr oduct = {α 1 , α 2 , . . . , αm } : m ∈ Z >0 , α ∈ R repre-
each specimen. Authors in [25] provide a semantic layer which pro- sent the parameters that define the digital twin of the product
vides a mechanism to pass control feedback and evolve the build (such as dimension, surface roughness, mechanical strength, etc.),
parameters on-the-fly for compensating the tolerance. PTsyst em = {β 1 , β 2 , . . . , βn } : n ∈ Z >0 , β ∈ R represent the pa-
In summary, all these work have focused on either building the rameters of the physical twin of the manufacturing system ( such
digital twin using simulation of the first principle based equations as flow rate, acceleration values for motors, nozzle temperature,
[26, 27] or just placing expensive sensors for in-situ [28, 29] process etc.), and let Esyst em = {γ 1 , γ 2 , . . . , γp } : p ∈ Z >0 , γ ∈ R repre-
monitoring. There is work that uses low-end sensors for in-situ sent the environmental factors affecting the manufacturing system
process monitoring [30ś33], however, these work do not consider (such as temperature, humidity, pressure, etc.). Here m, n and p
keeping the model up-to-date, using the indirect side-channels, represent the total product, system and environmental parameters
which is the fundamental requirement for the digital twin. To this that maybe considered for the modeling purpose. We propose to
end, we propose a methodology for building the system digital capture the interaction between these parameters using IoT sen-
twin and keeping it alive using low-cost sensors available in off- sors. Using the data collected from multiple modalities (acoustic,
the-shelf IoT devices. We use the fact that some of the physical vibration, magnetic, power, etc.), we propose to model a stochastic
emissions act as side-channels, revealing information about cyber- function fˆ(.), that performs three tasks: (1) localize the deviation
domain, and that for every control signals in cyber-domain, there in the DTpr oduct parameter from its physical twin PTpr oduct , (2)
is a corresponding physical fingerprint in the physical domain. As make sure that the DTsyst em is up-to-date (alive), and (3) infer the
it uses the side-channels, this methodology is different compared quality deviation for the DTpr oduct before creating the PTpr oduct .
to the existing methods. Using the proposed methodology, we may Moreover, the DTpr oduct may interrogate the DTsyst em to infer
be able to find new emissions (that may not have been considered the quality deviation due to the current status of the PTsyst em .
during design time) that are able to better represent the cyber and
physical states of the system during run-time.
2.2 IoT sensor data as side-channels
Design Production Operation
Manufacturing systems consists of cyber and the physical domain.
PTsystem The computing components in the cyber-domain have processes
Drives
(deterministic) that communicate with the physical domain. A cross-domain signal
Physical
Cyber
Xi Yi
that is passed from the cyber-domain to the physical-domain have
DTproduct PTproduct the possibility of impacting the physical domain characteristics.
This phenomenon is more prominent in manufacturing system
(stochastic)
Influences
Unintended
where the digital twin of the product causes the physical twin of
(deterministic)
emissions
Interrogates
IoT
Sensors due to these characteristics there exists physical emissions (such
Esysten
as acoustic, vibration, magnetic, etc.) which also leak information
about the digital twin of the product. We denote these emissions
as side-channels, as they indirectly reveal the information about
Localizes Fault
(stochastic) the cyber-domain interactions due to the particular physical im-
plementation of the system. For building the digital twin of the
DTsystem Infers Quality
(stochastic) manufacturing system that captures the interaction between the
product physical twin, the environment, and the system’s physi-
Figure 2: Digital twin concept for manufacturing.
cal twin, these side-channels play a crucial role in providing the
necessary information. As show in [34, 35], there are various com-
2 BACKGROUND ponents of the system that reveal information about its internal
2.1 Concept definition states through the side-channels. In this paper, we propose to utilize
As briefly explained earlier, in a manufacturing environment, we those indirect side-channel information for fault localization, qual-
have Digital and physical twin of product and system DTpr oduct , ity inference and for updating the digital twin models. In this work,
DTsyst em , PTpr oduct , and PTsyst em , respectively (see figure 2). we analyze four such analog emissions which potentially behave as
IoTDI’2019, Montreal, Canada S. Rokka Chhetri et al.
side-channels. Let sa (t), sv (t), sp (t), and sm (t) represent acoustic, fault localization, fingerprint generation, and quality inference. For
vibration, power and magnetic emissions from the manufacturing run-time localization, we propose to create and maintain an active
system. Then we define each of these signals as: fingerprint library of the individual IoT sensor data correspond-
sa (t) = δˆa (α i , β j ) + γk : i <= m, j <= n, k <= p (1) ing the DTpr oduct parameters. This fingerprint also captures the
sv (t) = δˆv (α i , β j ) + γk : i <= m, j <= n, k <= p (2) PTsyst em and Esyst em parameters during run-time. Then for lo-
calizing the faults, the deviation of the run-time IoT sensor data
sp (t) = δˆp (α i , β j ) + γk : i <= m, j <= n, k <= p (3)
is compared with the fingerprint. For updating the digital twin,
sm (t) = δˆm (α i , β j ) + γk : i <= m, j <= n, k <= p (4) a voting scheme is used to check if the majority of the finger-
Equations 1-4 represent the analog emissions as a result of the prints are deviating corresponding to few fingerprints. To infer
deterministic function δˆ(.) which is influenced by the digital twin the deviation in quality, we have proposed to estimate a function
parameters of the product, DTpr oduct , and the physical twin pa- Qd = fˆ(α, β, γ , sa (t), sv (t), sm (t)), sp (t)), where the Qd is a func-
rameters of the System PTsyst em , and the non-deterministic envi- tion of DTpr oduct , PTsyst em and Esyst em parameters, and the IoT
ronmental parameters Esyst em . Moreover, for each of the analog sensor data.The propose methodology is shown in figure 3. The
emissions, the total number of parameters (α, β, γ ) may not be same. various components of the proposed methodology is explained as
Traditionally, non-trivial simulation based approach such as finite follows:
element analysis is used to model the deterministic part and ex-
plore relation between the DTpr oduct , PTpr oduct , and PTsyst em . 3.1 DTpr oduct parsing
However, the PTsyst em parameters vary over time, and Esyst em
For generating the fingerprint of the DTpr oduct from the IoT sen-
parameters affect the PTpr oduct in a stochastic manner. Hence, we
sor data, first of all it is parsed to its corresponding parameters
explore the possibility of using IoT sensors to model and maintain
(α 1 , α 2 , . . . , αm ). The parsed values will depend on the type of man-
a live DTsyst em for product quality inference.
ufacturing system. In the experimental section, we will present the
parsing for an additive manufacturing system that uses G/M-codes.
2.3 Metric for quality measurement
These codes are the instruction that carries the process (machine
The digital twin can be used for various purposes. However, one specific parameters, such as temperature, acceleration values for
of the most fundamental uses of digital twin is in predicting the motors, etc.) and product information (for example the geometry
Key Performance Indicators (KPIs). Although the ultimate goal of description). The parsing will break down the individual parameters
the digital twin will be in predicting a variety of KPIs [36], in this from the product digital twin.
paper, we select quality as one of the KPIs. We will demonstrate
that by maintaining a living digital twin we can infer the possible
deviation in quality of the product. One of the quality metrics that 3.2 Feature extraction
is used is the dimension (Qd ) of the product. For generating the fingerprint from the analog emissions, in this
(α1,α2.. ., αm )
paper various time domain features such as Energy, Energy Entropy,
DTproduct Parsing Peak to Peak features (highest peaks, peak widths, peak prominence,
(γ1,γ2,…, γk)
etc.), Root Mean Square values, Skewness, Standard Deviation, Zero
PTSystem Sa(t)
Crossing Rate, Kurtosis (114 features in total) and frequency do-
Synchronize and
IoT Sensors
Extraction
Sv(t)
Features
Segment
Quality pling and testing the various window size (5 ms to 100 ms) for
Clustering
Localization
Inference
Algorithm Model highest model accuracy (50 ms in our case). These features have
been selected by calculating the Gini importance or mean decrease
Fingerprint Anomaly impurity of well-known time and frequency domain features (ż1000
Localization and Fingerprint in total) used for analysis of time-series data [37]. Principal Com-
Digital Twin Update Library
Algorithm ponent Analysis (PCA) is then performed to further reduce the
dimension of these features. Let m be the total number of reduced
Figure 3: Digital twin modeling methodology.
feature set, then all the features are concatenated for n total samples
to create a feature matrix O ∈ IR nxm .
3 BUILDING THE DIGITAL TWIN
As mentioned earlier, we need to build the digital twin from the
IoT sensor data to perform three tasks: run-time localization of
3.3 Synchronize and segment
faults, regularly update of the system digital twin, and infer the Before clustering is performed, the features are synchronized and
quality of the product digital twin. Hence, in this paper, the digi- segmented into subgroups based on the parsed DTpr oduct parame-
tal twin model consists of algorithms and models associated with ters (α 1 , α 2 , . . . , αm ). For instance, the features are segmented based
QUILT: Quality Inference from Living Digital Twins in IoT-Enabled Manufacturing Systems IoTDI’2019, Montreal, Canada
on conditions such as presence or absence of particular compo- used for testing while performing k-fold cross-validation [38] to
nent’s movement (for example, motors responsible for moving the validate the accuracy. For each cluster number, Line 7 estimates the
3D printer nozzle in X , Y , Z -Axes). By segmenting based on the clusters for the training set. Then, Line 8 measures the accuracy
parsed parameters, the features are reduced into smaller groups. of the estimated cluster with a specified silhouette score threshold
This allows for further reducing the complexity in acquiring the for the test set of features. Based on the obtained accuracy in Line
fingerprints. Henceforth, group is used to denote the sub-division 8, Line 9 to 11 select the cluster number, re-estimate the cluster
of the DTpr oduct parameters, which are different than the clusters and store the silhouette scores for all the groups and the channels
estimated in the subsequent sections. (acoustic, magnetic, power, and vibration signals).
The pseudo-code for generating the cluster and saving the finger- Algorithm 2 parses the features of the DTpr oduct either run time
print is presented in algorithm 1. Features with their corresponding or after the product’s physical twin has been created. Then, using
group and channel name are passed as input and the fingerprint in the fingerprint library it estimates the new cluster labels for the
the form of clusters and their corresponding silhouette scores are parsed features in line 5. Using these labels and the features the
given as output. First, Line 1 and 2 initialize the cluster numbers new silhouette coefficient for the parsed features are calculated in
and Silhouette Score Threshold for measuring the accuracy of the line 6 using Equation 5. If the calculated silhouette coefficient is less
cluster estimation. Then Line 3 splits the features into test and than the stored silhouette coefficient ± threshold SCT hr eshold then
train set. Normally 80% of the data is used for training and 20% is the DTpr oduct segment corresponding to the feature is marked as
IoTDI’2019, Montreal, Canada S. Rokka Chhetri et al.
deviating from the previous fingerprint and returned as containing ensemble of decision trees based regression models. This ensem-
a possible anomaly. Moreover, G/M-code adds layers to print the 3D ble generates a new tree against the negative gradient of the loss
object in sequential order. Hence, if a fault is detected at a certain function and combines weak learner to control over-fitting. Hence,
time, it can be correlated to locate its position in the 3D object. they are robust to outliers and outperform many other learning
algorithms as demonstrated in [40]. Since regression trees are used
3.6 Digital twin update algorithm as weak learners, we need to estimate various hyper-parameters
For updating the digital twin model, the library of fingerprint for such as learning rate, number of weak estimators, maximum depth
the DTpr oduct have to be updated. However, before updating the of the weak learners, etc., to improve the capability of the model
library, it should be checked if the anomaly in the fingerprint is to generalize. To do this, the collected feature matrix is divided
temporary or it is due to the degradation of the machine over time. into test and training set. Then, the testing and training accuracy
In order to update the digital twin, line 10 in algorithm 2 keep tracks is used to determine the hyper-parameters that best generalize the
of all the DTpr oduct variables that deviated. Then line 11 checks if function. This estimation function is also updated when the digital
more than f eatureT hr eshold of the DTpr oduct parameters deviated twin update algorithm reaches a consensus that all the fingerprints
from the previous fingerprint. Then line 14 checks if more than are outdated.
дroupT hr eshold of the groups deviated from the previous finger-
print. Finally, line 17 checks if more than channelT hr eshold of all 4 EXPERIMENTAL SETUP
the channels deviated. If these condition are met then in line 18 Acoustic Sensor
the library for the digital twin is updated. This threshold for check-
ing the deviation from the fingerprint can be varied for different
channels and groups based on the amount of information leaked Vibration
by each of the side-channels. Sensors Current
Sensor
vibration, three magnetic, and current sensors. We consider them for maintaining the digital twin of the system. Furthermore, in all
as 11 separate channels. Analog emissions from the additive manu- the training algorithms K-fold cross-validation has been performed
facturing system (or a 3D printer) are automatically collected using to measure the performance of the models and prevent over-fitting.
p19
National Instruments Data Acquisition (NI DAQ) system whenever
a print command is given to it. The analog emission acquisition p
p 3 p12
was carried out in a lab environment with sound pressure level p1 2 p
p10 11 p20 p21 p22
varying between 60-80 dB. The digital twin models are trained and
p6
estimated in a desktop computer with Intel i7-6900K CPU with 3.20 p4
p5 p15 p23 p24 p25
p
GHz clock frequency, 32 GB of DDR3 RAM, and 12 GB of NVIDIA p13 14
p26 p27 p28
Titan X GPU. Moreover, the digital twin models are stored and p9
p8 p18
p7 p17 Back
retrieved using pickle operation in Python. p16
Front
4.2 Digital Twin parameters Figure 6: Experimental setup for sensor position exploration.
The sample G/M-code (DTpr oduct ) consists of maximum six param-
eters, G/M code specifying whether it is machine instruction or 4.3 Sensor position analysis
coordinate geometry information, travel feed rate F of the nozzle One of the challenges in IoT sensor-based information extraction
head, the coordinates in XYZ -Axes each and amount of extrusion E. is figuring out a non-intrusive position of the sensors. This task
Various 3D test objects normally used for calibrating the 3D printer may also be machine specific. In our experiment, the 3D printers’
are downloaded from the open-source website [43] to extract sam- external surface is considered for non-intrusively placing the sen-
ple G/M − codes. The parsing algorithm in this case separates the sors. Total of 28 uniform positions are selected. For each of the
DTpr oduct based on presence or absence of 5 of these parameters, positions, vibration, acoustic, and magnetic sensors are placed and
G/M, X , Y , Z , and E. Hence, DTpr oduct = {α 1 , α 2 , . . . , α 32 } and data is collected for various DTpr oduct parameters. Then a gradient
there are 32 groups. When the manufacturing system is operating, boosted random forest is used to create simple classifier to estimate
the environmental parameters Esyst em = {γ 1 , γ 2 , . . . , γp } affect the accuracy of the model based on various sensor location data.
the physical twin parameters PTsyst em = {β 1 , β 2 , . . . , βn }. This The accuracy of the classifier is given as,
change in β eventually affects the DTpr oduct parameters, which TP +TN
in return affects the quality of the product. For example, environ- Accuracy = (6)
TP + T N + FP + FN
mental parameters such as humidity, temperature, etc., may affect Where TP stands for total true positives, TN stands for total true
the gearbox of the system, which in return may affect the flow-rate negatives, FP stands for total false positives and FN stands for total
of the manufacturing system. For experimental purpose changing false negatives. is taken as a metric for determining the placement of
the environment parameters Esyst em = {γ 1 , γ 2 , . . . , γp } was not the sensors around the 3D printer. DTpr oduct parameters selected
performed, instead we have assumed that this parameters eventu- for estimating the classifier consists of simple G/M-code instruc-
ally affect the β parameter. Hence, analog emissions (sa (t), sv (t), tions (such as presence or absence of stepper motors movement in
sm (t), and sp (t)) for various α parameters have been been collected X, Y, and Z-axes).
for optimal β parameters, and the environmental variability have Accuracy score of IoT sensor data is shown in figure 5, this score
been simulated by varying the β parameters beyond their optimal shows which of the sensors positions are capable of better clas-
values to check if the digital twin model is able to reflect those sifying the stepper motor movements. It may be noticed that for
changes. After collecting the analog emissions, time and frequency different positions the classification accuracy is different. Moreover,
domain features are extracted from each of them. Moreover, the these accuracy results also correlate the mutual information be-
DTpr oduct = {α 1 , α 2 , . . . , α 32 } consists of timestamps to segment tween the various sensor position and the side-channels themselves.
and synchronize the features. For initial training phase, various Based on these values, a single position is selected for each of the
G/M-code of the 3D-objects (cube, pyramid, cylinder, etc.) are given sensors. However, since four acoustic sensors are used, positions
to 3D printer and their corresponding analog emissions are col- with top four classification accuracy are selected for the sensor
lected. From them, we proceed to generate the fingerprint library placement.
Magnetic Sensor Data Accelerometer Sensor Data Acoustic Sensor Data
1
0.8
Classification Score
0.6
0.4
0.2
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
Positions
4.4 Performance of clustering algorithms printing in fused deposition modeling based 3D printers. However,
Mini Batch K Means Spectral Clustering due to sudden slippage, faulty filament, etc., the flow of the filament
may deviate from its nominal value.
Seperation of object into multiple segments
Normal Flowrate
Degraded Flowrate
Ward Agglomerative Clustering
Figure 9: Test DTpr oduct created using CAD tool for checking
anomaly localization capability of the digital twin.
Figure 7: Scatter plots of the clusters (plotted with the first The flow rate, a process specific parameter, is calculated as follows:
Q
two principal components of the features for five clusters for W ∗H =A= (7)
acoustic side-channel). v f eed
Various algorithms are explored for creating clusters to generate Where W is the width and H is the height of the line-segment
the fingerprints. Among them are Mini batch K-means, Spectral being printed on the XY-plane, Q is the constant volumetric flow
Clustering, Ward, Agglomerative Clustering, Birch, and Gaussian rate of the material. Q is estimated based on die swelling ration,
Mixture method. For each of the clustering algorithm, a varying pressure drop value and buckling pressure of the filament. And
number of clusters are initialized, and the corresponding silhouette v f eed = ωr ∗ Rr is the feed velocity of the filament. Where ωr is
coefficient is calculated for measuring the fitness of the features into the angular velocity of the pinch rollers, and Rr is the radius of the
these clusters. The average silhouette coefficients of the clustering pinch rollers. Then, the pressure drop is calculated as follows:
1
algorithms for all groups and channels are shown in figure 8, and Pmot or = △ P ∗ Q (8)
2
the corresponding scatter plots of acoustic side-channel for cluster Where, Pmot or is the pressure applied by the stepper motors, △P
number five is shown in figure 7. It may be noticed that although is the pressure drop. Hence, the pressure applied by the motor
the Agglomerative Clustering has a higher silhouette coefficient needs to be maintained for the constant volumetric flow rate. This
value, from the scatter plot, the clusters are not well distributed pressure needs to be less than buckling pressure which is calculated
in the scatter plot. However, the Birch clustering algorithm has as follows:
relatively higher silhouette coefficient with a better spread of the π 2 ∗ E ∗ d f2
cluster centers. Hence, Birch algorithm is selected for generating Pcr = (9)
the clusters for fingerprinting the DTpr oduct . Furthermore, the 16 ∗ L2f
number of clusters is also estimated based on the accuracy of the Where E is the elastic modulus of the filament, d f is the diameter
Birch algorithm using algorithm 1. of the filament, and L f is the length of the filament from the roller
Spectral Clustering Ward Mini Batch K Means
Agglomerative Clustering Birch Gaussian Mixture to the entrance of the liquefier present in the nozzle. A sudden
1
0.9 change in the pressure can cause the uniform flow of the filament
0.8 to be disrupted. For validating the application of the digital twin
in anomaly localization, the flow rate is varied outside the optimal
Silhouette Coefficient
Figure 10: Average receiver operating characteristic (ROC) curve for anomaly localization for various sensors.
Seperation of object into multiple segments
decision, initially the threshold is varied and the corresponding
4.632 mm
4.152 mm
3.937 mm
accuracy of the detection mechanism is measured.
Vib_z 0.7035
Vib_y 0.7124
Vib_x 0.7209
Flowrate (180%) Flowrate (100%) Flowrate (40%)
Mic_4 0.9337
0.8583
Figure 12: Three of the PTpr oduct created using CAD tool for
Mic_3
testing quality inference and update capability of the digital
Mic_2 0.9063
twin (DTpr oduct thickness is 4 mm).
Mic_1 0.8605
Mag_z 0.6950 not update) for the degraded flowrate (60%). Then based on the
Mag_y 0.6713 result of algorithm 2, the updated (or the old) digital twin is used
Mag_x 0.6991 to predict the class labels again for the same degraded flowrate
Current 0.5486 (60%) to see if the digital twin gets updated again. The result of
0.0 0.2 0.4 0.6 0.8 1.0 degradation analysis is presented in Table 1.
Accuracy
Figure 11: Accuracy of the digital twin’s anomaly localiza- Table 1: Degradation test result for the digital twin.
tion for each channel. Old Clusters New Clusters
Channel TNR FPR Update TPR FNR Update
Based on the highest accuracy acquired for each of the channels, Mic_1 0.97423 0.0257 True 0.6040 0.3960 False
the threshold is set and the corresponding classification accuracy for Mic_2 0.4962 0.5038 False 0.7460 0.2540 False
Mic_3 0.9705 0.0295 True 0.5731 0.4269 False
the segments that have been degraded is calculated. Corresponding Mic_4 0.9867 0.0133 True 0.9798 0.0202 False
to the varying threshold the Receiver Operating Characteristic Current 0.5924 0.4076 True 0.5545 0.4455 False
(ROC) curve for some of the channels is presented in figure 10. Vib_x 0.9324 0.0676 True 0.8681 0.1319 False
Vib_y 0.9695 0.0305 True 0.7224 0.2776 False
The accuracy of each channel in detecting the anomalous flowrate Vib_z 0.4602 0.5398 False 0.4400 0.5600 True
is shown in figure 11. Since the features are time stamped, the Mag_x 0.4791 0.5210 False 0.6718 0.3282 False
corresponding section of the DTpr oduct maybe calculated after the Mag_y 0.4382 0.5618 False 0.4344 0.5656 True
Mag_z 0.6267 0.3733 True 0.3669 0.6331 True
digital twin has marked the features to be anomalous. From figure
11, it can be seen that analog emissions from microphone number Table 1 consists of true negative rate, false positive rate and up-
four are more accurate in detecting the degradation of the flow date decision taken for each channel for the old cluster. When the
rate. This is due to the fact that this emission is collected by the system degrades, we expect the digital twin to find higher negative
contact microphone attached near the extruder’s stepper motor. labels being generated as the silhouette score will be lower than
Moreover, the average accuracy across all the channels in detecting the average silhouette score stored for all the channels and groups.
the anomalous flowrate is 83.09%. It can be seen that out of eleven channels four of them had the
decision of not updating the cluster, and seven of them opting for
4.6 System degradation prediction analysis updating the clusters. Hence, the clusters are updated by algorithm
For detecting the degradation of the System, and hence the need for 2. On the other hand, once the cluster has been updated, the analog
updating the digital twin, the flow rate for the entire DTpr oduct is emissions are labeled as true, hence we expect to see higher true
varied beyond the optimal range. From equation 7 to 9, it is evident positive rate and lower false negative rate. In the table 1, it can be
that various mechanical degradation (such as worn out rollers), seen that only three of the channels gave the decision for updating
stepper motor degradation over time, etc., may cause the flow rate the clusters again, however, eight of them opted for not updating
to be reduced over time. the cluster. This shows that the digital twin is able to update itself
To check if the digital twin model gets updated to reflect the during degradation that causes emissions in multiple side-channels
current status of the system, we perform two experiments. In the to vary.
first experiment, the current digital twin with its fingerprint library It may be noted that the side-channels gave different decisions
is used to predict the class labels (True for update and False for do for updating the digital twin. Out of them, the acoustic sensors
IoTDI’2019, Montreal, Canada S. Rokka Chhetri et al.
Update Decision
True False True True False True True True False True True
0.8
Faulty update decision by the Digital Twin
Mean Absolute Error (mm) 0.7 algorithm
0.6
0.5
0.4
0.3
0.2
0.1
0.0
40 50 60 70 80-120 130 140 150 160 170 180
*
Before 0.75 0.58 0.66 0.65 0.50 0.61 0.57 0.65 0.59 0.75 0.65
* After 0.58 0.72 0.54 0.55 0.51 0.54 0.54 0.57 0.69 0.61 0.63
*Before and after the update decision Flow rate (percentage)
Figure 13: Accuracy of the quality inference model.
and vibration side-channels were mostly able to predict the right in predicting the quality was 0.59 mm (calculated by averaging the
decision, whereas the magnetic sensors were mostly wrong in this mean absolute errors of the inference model after update decision).
decision. This also correlates with the accuracy values presented in 4.8 Comparative analysis
Figure 11. One anomaly to this is the current sensor data. However,
Although this paper presents a novel methodology of building a
it may be noticed that during both the decision model’s true positive
living digital twin by using IoT based sensors, there has been a
and true negative rates are very low compared to the acoustic and
considerable amount of work in quality prediction in additive man-
magnetic sensors. This means that the current side-channel data
ufacturing or manufacturing systems in general. In this section,
is not as reliable for the update decision as the acoustic and the
we provide a qualitative comparative study of the various non-
vibration side-channels.
exhaustive list of methods compared to the proposed methodology.
The result of the comparison is shown in Table 2. It may be ob-
4.7 Quality inference served that there are three general categories of research effort in
For checking the accuracy of the digital twin in inferring the devia- maintaining quality.
tion of quality (Qd ), first of all, gradient boosting based ensemble The first is the first principle-based approach (simulation) [26,
of regressors is used to estimate the function Qd = fˆ(.) for the 45], where quality inference model is based on the process and
optimal flowrate range (80% to 120%). For each flow rate, five test design parameters. These models although are accurate, they do
objects (with the thickness of 4 mm for DTpr oduct ) are 3D printed, not account system degradation over time and requires non-trivial
and for each test object, various segments (see figure 12) are cre- formulation of physics-based equations. The second category in-
ated to measure the thickness using the micrometer. Then all the volves in-situ process monitoring methodologies [44, 47]. These
groups of features lying in these segments are assigned a single methods monitor the process variation using high-end acoustic and
thickness value. Initially function Qd = fˆ(.) is estimated using piezoelectric sensors. Compared to these high-end sensor-based
optimal flowrate. Then it is used to infer the thickness of the 3D methods, our method is able to keep the model updated even using
object for various feature samples with varying flow rates. The low-end sensor data for fault localization and quality inference1 .
accuracy of the DTsyst ems quality inference model is measured The third category involves process monitoring using low-end
using mean absolute error value. sensor placement [30ś32]. They focus either on specific anomaly
The result of the quality inference is shown in figure 13. At first, detection or quality variation detection. However, these methods
the mean absolute error value of the inference model trained with do not consider checking the aliveness (up-to-date model) of the
optimal flowrate range is measured. It can be seen in the figure that model and are mostly limited to anomaly detection. Each of these
for optimal flow rate ranges, the mean absolute error value is around techniques has their own merit, hence, the proposed methodology
0.5 mm. Then, at each consecutive step the flow rate of the Ptsyst em is not intended to function independently but in conjunction with
is varied with a step size of +10% in the positive direction (> 120%) various approaches to fully realize the concept of digital twin.
and at the same time +10% in the negative direction (< 80%). It may
be seen that in both directions when the system ages (degrades with 5 DISCUSSION
an increase or decrease in the flow rate), the DTsyst em has increased Quality inference: To validate the proposed methodology, the
mean absolute error without the update. This is intuitive as the flow rate was used to detect anomalous system behavior and over-
DTsyst em has not been updated to the new fingerprints. However, all system degradation behavior. However, there can be multiple
once it has been updated the mean absolute error is lower. It may PTsyst em parameters that might affect the quality. However, our
also be noticed that when the system degraded with flowrates at methodology can be adjusted over time to consider variation in
160% and 50%, the wrong decision was taken by the algorithm 2 other PTsyst em parameters over time. We have considered only
in not updating the quality inference model. Due to this, a large dimension (thickness of a simple 3D object) as a quality metric.
increase in mean absolute error was observed for quality prediction 1 With high-end sensors as theirs, our methodology may achieve higher accuracy in
other DTpr oduct . However, this faulty decision was recovered in anomaly detection along with the capability of keeping the digital twin most up-to-
the consecutive stages. Moreover, the average mean absolute error date at the cost of more computational and resource requirements, which may not be
feasible for an IoT paradigm.
QUILT: Quality Inference from Living Digital Twins in IoT-Enabled Manufacturing Systems IoTDI’2019, Montreal, Canada
However, for building the full scale DTsyst em , multiple metrics are information by fusing these sensors may further justify the pro-
needed to be considered. We leave this as future work. posed methodology. The machine learning algorithm achieves this
More IoT sensors and placement: In this work, sensors with by carefully selecting the features extracted from the side-channels
low sampling rate and resolution were used. The number of sensors to build a model to get the highest possible accuracy. However, in
were limited as well. To improve the accuracy of the digital twin our future work, we will dwell deeper in performing mutual in-
techniques such as [13] needs to be incorporated for the develop- formation analysis to further clarify the contribution of individual
ment of IoT sensor arrays. We leave this as our future work. side-channels in indirectly building the digital twin.
Implementation using IoT device: For building the DTsyst em Generalizability of the proposed method: As a case study,
using IoT devices, further consideration is required for off-the-shelf we presented applicability of the proposed methodology in fused-
and wireless IoT devices [13]. Hence, further analysis is required deposition modeling based additive manufacturing. However, many
to understand the trade-off between power, time, and performance other technologies also have analog emissions (see Table 3), which
of the DTsyst em in localizing and inferring the quality variation. may be capable of aiding in the indirect method of building digital
More test cases: One of the limitations of the experimental twin models. Hence, we hypothesize that the proposed methodology
section was in using a limited number of test 3D objects for inferring will be able to scale across multiple manufacturing systems.
the quality and localizing the faults. However, these 3D objects
contain structures which provide large possible variation in G/M- 6 CONCLUSION
code for building the digital twin models. Nonetheless, the digital This paper presents a novel methodology to build a living digi-
twin models will be more accurate if large test data are incorporated. tal twin of the fused deposition modeling technology based addi-
We leave this as our future work. tive manufacturing system by utilizing various retrofitted low-end
sensors available in IoT devices to indirectly monitor the system
Table 3: Other additive manufacturing technologies. through various side-channels (such as acoustic, vibration, mag-
Technology
Source of analog emissions netic, and power). Based on these signals, a clustering algorithm
Acoustic Vibration Power is used to generate a fingerprint library, that effectively represents
Build Platform, Motor
SLA
Stepper Motor
Sweeper
Controller
the physical status or the physical twin of the system. The digital
Fabrication Power twin is used for localizing the anomalous physical emissions that
SLS Rollers
Piston supply have the potential of resulting in quality variation. For localizing
Moving head,
Build Tray, Heater, the error, the digital twin achieved an average accuracy of 83.09%.
MJ Blower,
Jetting Head Coil Moreover, we also presented an algorithm for updating the digi-
Position belt
SLM
Retractable Leveling Power tal twin, and inferring the quality deviation. As a case study the
Platform Cylinder Supply digital twin modeling was performed on additive manufacturing
Build Build High Voltage
EBM
Platform Platform Cable system. Compared to the state-of-the-art methods (which do not
SLA: Stereolithography
SLM: Selective Laser Melting
consider model aliveness), our methodology is able to update itself,
SLS: Selective Laser Sintering infer quality deviation and localize anomalous faults in the additive
EBM: Electron Beam Melting
MJ: Material Jetting
manufacturing system.
Sensor fusion analysis: In the motivation section, we provided
mutual information analysis for individual side-channels. We ac- ACKNOWLEDGMENTS
knowledge that a rigorous analysis of the calculation of mutual This work was partially supported by NSF CPS grant CNS-1546993.
IoTDI’2019, Montreal, Canada S. Rokka Chhetri et al.