
Uncertainty Analysis in Ship Performance Monitoring

2014



L. Aldous(a), T. Smith(b), R. Bucknall(c) and P. Thompson(d)

(a, b) Energy Institute, University College London, 14 Upper Woburn Place, London, WC1H 0NN, UK
(c) Department of Mechanical Engineering, University College London, Torrington Place, London, WC1E 7JE, UK
(d) BMT Group Ltd, Goodrich House, 1 Waldegrave Road, Teddington, TW11 8LZ, UK

Corresponding author: L. Aldous, email al.aldous@ucl.ac.uk, phone number +44 (0)7887 441349
Keywords: Uncertainty, Noon report, Ship performance, Measurement, Monte Carlo, Continuous
monitoring

Abstract
There are increasing economic and environmental incentives for ship owners and operators
to develop tools to optimise operational decisions, particularly with the aim of reducing
fuel consumption and/or maximising profit. Examples include real time operational
optimisation (e.g. ship speed and route choice), maintenance triggers and evaluating
technological interventions. Performance monitoring is also relevant to fault analysis,
charter party analysis, vessel benchmarking and to better inform policy decision making.
The ship's onboard systems (propulsion, power generation etc.) and the systems in which it operates are complex, and it is common for data modelling and analysis techniques to be employed to help extract trends. All datasets and modelling procedures have an inherent uncertainty; to aid the decision maker, this uncertainty can be quantified in order to fully understand the economic risk of a decision, and if this risk is deemed unacceptable then it makes sense to re-evaluate investment in data quality and data analysis techniques. This paper details and categorises the relevant sources of uncertainty in performance measurement data, and presents a method to quantify the overall uncertainty in a ship performance indicator based on the framework of the Guide to the Expression of Uncertainty in Measurement, using Monte Carlo methods. The method involves a simulation of a ship's operational profile and performance and is validated using four datasets of performance measurements collected from onboard in-service ships. A sensitivity analysis conducted on the sources of uncertainty highlights the relative importance of each. The two major data acquisition strategies, continuous monitoring (CM) and noon reporting (NR), are compared.

Abbreviations
CM: Continuous Monitoring
NR: Noon Reports


1. Introduction
Ship operational performance is a complex subject, not least because of the various systems, and their interactions, in which a ship operates; the major factors are presented in Figure 1. At the ship level, the ship design, machinery configurations and their efficiencies determine the onboard mechanical, thermal and electrical energy flows and, despite automation built into the configuration mode settings at the ship design phase, there is still an appreciable level of human interaction during day to day operations. The environmental conditions (sea state, wind speed, sea/air temperature etc.) are dynamic, unpredictable and complicated to quantify, due in part to the characteristics of the turbulent flow fields by which they are determined. These environmental conditions exert an influence on the ship's resistance, and therefore on the ship's power requirements, in differing relative quantities.
The rate of deterioration in ship performance (engine, hull and propeller) is dependent on a vast array of variables, including the quality and type of hull coating and the frequency of hull and propeller cleaning, which are in turn dependent on the ocean currents, temperature and salinity in which the ship operates. Further, the shipping industry operates in an economic sphere in which the global consumption of goods, global energy demand and conditions in the various shipping markets determine operating profiles, costs and prices (see for example Lindstad (2013), which also explores environmental effects). In addition, technological investment, fuel efficiency and savings are complicated by the interactions between ship owner, charterer and manager [Agnolucci (2014)].

Figure 1: Ship performance influential factors

Data collection, either through daily noon reporting procedures or high frequency,
automatic data acquisition systems, and data processing techniques such as filtering and/or
modelling have so far proven to be useful tools in capturing and quantifying some of the
intricacies and nuances of these interactions to better understand the consequences of
operational decisions. These datasets and modelling outputs are applied in areas such as
real time operational optimisation including trim adjustments, maintenance triggers,
predicting and evaluating the performance of new technologies or interventions,
particularly for cost benefit analysis, fault analysis, charter party analysis, vessel
benchmarking and to better inform policy decision making.
The need to conduct uncertainty analysis is linked to the amplitude of the noise or scatter
in the data relative to the underlying, longer term trends that are to be extracted. The ship
system interactions give rise to scatter in the data, not only from inherent sensor precision
but also from unobservable and/or unmeasurable variables. According to the central limit
theorem (assuming independent, identical distributions), over time the scatter will tend to a
normal distribution with zero mean. This time period is dependent on the data acquisition
and processing strategy; the temporal resolution of sensors and data collection frequency,
the sensor precisions and human interactions in the collection process and the modelling or
filtering methods all play a part. There are also uncertainties in the data that will introduce
a potentially significant bias in the results and this too needs to be understood and
evaluated. The magnitude of the underlying trends to be identified is a function of the modelling application; for evaluating the performance of new technologies the signal delta, i.e. the improvement in ship performance, may be a step change of the order of 1-3% (as in the case of propeller boss cap fins) or up to 10-15%, as in the case of hull cleaning or new coating applications, where analysis of trends in the time domain is also necessary. Therefore, the required measurement uncertainty depends on the application, and this drives the desired acquisition strategy, which is of course constrained by costs: economic, time and resources.
Acquisition strategies are broadly separated into two dominant dataset types. Noon report
(NR) datasets are coarse but cheap to compile and readily available since they are currently
in widespread use across the global fleet. The frequency of recording is once every 24
hours (time zone changes allowing) and the fields reported are limited, generally included
as a minimum are ship speed and position, fuel consumption, shaft rotational speed, wind
speed derived Beaufort number, date/time and draught. Given the economic and regulatory
climate there has been a shift towards more complete, automatic measurement systems
referred to in this paper as continuous monitoring (CM) systems. The uptake of these has been limited by installation costs, while improved data accuracy, speed of acquisition, high sampling frequency (samples every 5 minutes) and repeatability are cited as the key drivers. All datasets and modelling procedures have an inherent associated uncertainty and, as a prerequisite, this uncertainty must be quantified in order to fully understand the
economic risk of the decision. The benefit of reducing uncertainty is a function of the
uncertainty of the data relative to the change in ship performance, the former depends on
the data acquisition strategy (noon reports vs continuous monitoring, for example) and the
latter depends on the application (the technological / operational decision being made) and
the cost-benefit of both determine the overall economic risk of the decision. If the
economic risk is deemed unacceptable then it makes sense to re-evaluate investment in
data quality and data analysis techniques.
The uncertainty is also important because of the risks and costs associated with the decision derived from the measured ship's performance. The desire to quantify these (in other industries) has led to the field of research into methods for risk-based decision making. The application of these methods to the shipping industry is also important; for example, measurement and verification is cited as a key barrier to market uptake of fuel-efficient technologies and retrofits. In order to secure capital, investment projects must be
expected to yield a return in excess of some pre-defined minimum [Stulgis (2014)].


Weighing the economic risk of capital investment against the certainty of the effectiveness
of a fuel efficient technology is therefore key. A study of the sensitivities of the uncertainty
in the ship performance measurement is pertinent to inform where resources can be
invested most effectively in order to reduce the overall uncertainty to the desired level; is
the cost of obtaining additional information outweighed by the value of the improvement
in the model from which the performance estimate is derived? [Loucks (2005)]. It is of
course not just financial but legislative drivers that are significant in the uptake of fuel
efficient technologies and modelling ship performance in this case is also important in
order to establish if new policies have been effective either from a fleet wide or total global
emissions perspective.
An overview of uncertainty analysis methods and their application to ship performance
measurement uncertainty is described in section 2. This paper is based on a similar
framework but also employs an underlying time domain algorithm to simulate the ship's operational profile and performance trends in order to propagate the errors through the model by Monte Carlo simulation. Section 3 presents a brief overview of ship performance methods and introduces the ship performance indicator used in this study; the sources of uncertainty in this measurement are then detailed and quantified in section 5. The method is validated using data from four ships (one continuous monitoring dataset and three noon report datasets); the validation results are presented in section 6. A sensitivity analysis is
employed in section 7 to examine the effect of sensor characteristics and data acquisition
sampling frequencies on the performance indicator uncertainty given the length of the
performance evaluation period. Different acquisition strategies (based broadly on noon
reporting and continuous monitoring acquisition strategies) are then compared. The type of
data processing has also been considered and while this paper focuses on a basic ship
model using filtered data, there is ongoing work that explores how the uncertainty may be
reduced by using a ship performance model (normalising approach). Sections 8 and 9
present the results, discussion and conclusions.

2. Uncertainty Analysis Methodology


The aim of an uncertainty analysis is to describe the range of potential outputs of the system at some probability level, or to estimate the probability that the output will exceed a specific threshold or performance measure target value [Loucks (2005)]. The main aim in
the uncertainty analysis deployed in the quantification of performance trends is to estimate
the parameters of the output distribution and to conduct a sensitivity analysis to estimate
the relative impact of input uncertainties.
Uncertainty analysis methods have evolved in various ways depending on the specific
nuances of the field in which they are applied. However, a key document in the area of
uncertainty evaluation is the Guide to the expression of uncertainty in measurement
(GUM) [JCGM100:2008 (2008)] which provides a procedure adopted by many bodies
[Cox (2006)]. The methods of each adaptation essentially distil to that of the original GUM framework:
1. Identify each uncertainty source and classify
2. Assign probability distributions and their parameters


3. Propagate the errors through the linearised model (also known as the data reduction
equations (DRE) or transfer function)
4. Formulate the output distribution of the result and report overall uncertainty
The GUM framework is itself derived in part from the work of Coleman (1990). The nomenclature and definitions of Coleman and Steele are consistent with those of the ANSI/ASME standard on measurement uncertainty: precision error is the random component of the total error, sometimes called the repeatability error; it has a different value for each measurement and may arise from unpredictable or stochastic temporal and spatial variations of influence quantities, these being due to limitations in the repeatability of the measurement system and to facility (equipment / laboratory) and environmental effects. The bias error does not contribute to scatter in the data but is the fixed, systematic or constant component of the total error; in a deterministic study it is the same for each measurement. Both types of evaluation are based on probability distributions.
The GUM specifies three methods of propagation of distributions:
a. The GUM uncertainty framework, constituting the application of the law of
propagation of uncertainty
b. Monte Carlo (MC) methods
c. Analytical methods
The analytical method gives accurate results involving no approximations; however, it is only valid in the simplest of cases, while methods a and b involve approximations. The GUM framework is valid if the model is linearised and the input probability distribution functions (pdfs) are Gaussian; this is the framework followed by the AIAA guidelines [AIAA (1999)] and the ITTC guide to uncertainty in hydrodynamic experiments (ITTC 2008), of which relevant examples include applications to water jet propulsion tests (ITTC 2011) and resistance experiments (ITTC 2008) and (Longo 2005). If the assumptions of model linearity and Gaussian input pdfs are violated, or if these conditions are questionable, then the MC method can generally be expected to lead to a valid uncertainty statement that is relatively straightforward to find. The Monte Carlo method was applied in the shipping industry by Insel (2008) in sea trial uncertainty analysis. A further advantage of the MC method is that a more insightful numerical representation of the output is obtained, which is not restricted to a Gaussian pdf.
Taking into account the different methods that can be applied in the case of ship performance analysis, the nature of the data available and the uncertainties associated with that data (both aleatory and epistemic), this paper follows the GUM approach and the errors are propagated through the model using the MC method. Instrument precision is evaluated from repeated random sampling of a representative pdf, and instrument bias / drift is evaluated by elementary interval analysis with deterministic bounds. An output distribution is formed at each time step and the overall precision is based on the distribution of the coefficient of a linear regression of the performance indicator trend over repeated simulations. Further details of the method can be found in section 4.
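To illustrate the distinction between the two evaluations, the sketch below (illustrative only; the model and all parameter values are assumptions, not the authors' code) propagates a Gaussian speed precision through a cubic speed-power model by random sampling, and bounds the effect of a deterministic speed bias by elementary interval analysis:

```python
import numpy as np

rng = np.random.default_rng(42)
n_trials = 10_000

# Assumed reference point and cubic speed-power model (illustrative values)
v_true = 12.0                      # ship speed, knots
p_ref, v_ref = 8000.0, 12.0        # reference power (kW) at reference speed

def model(v):
    return p_ref * (v / v_ref) ** 3

# Precision (aleatory): repeated random sampling from a representative pdf
speed_precision = 0.01             # 1-sigma, as a fraction of speed
v_samples = v_true * (1 + rng.normal(0.0, speed_precision, n_trials))
p_samples = model(v_samples)

# Bias / drift (epistemic): interval analysis with a deterministic bound
speed_bias = 0.03                  # e.g. a +3% speed offset
p_bias_bound = model(v_true * (1 + speed_bias)) - model(v_true)

print(f"power precision (2 sigma): {2 * p_samples.std():.0f} kW")
print(f"power bias bound:          {p_bias_bound:.0f} kW")
```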

3. Ship Performance Methods


One aim of ship performance monitoring is to quantify the principal speed/power/fuel
losses that result from the in-service deterioration (hull, propeller and engine). A
performance indicator may be defined to identify these performance trends; performance is
the useful output per unit of input. For a ship the most relevant unit of input is the fuel
consumption. Sometimes it is useful to isolate the performance of the hull and propeller from that of the engine, in which case it is the shaft power that becomes the input to the performance indicator estimation. The most aggregate unit of output is transport supply, which, for constant deadweight capacity and utilisation, ultimately comes down to the ship's speed. One of the most basic methods to extract this information is to control all other influential variables (weather, draught, trim, water depth etc.) by filtering the dataset. An alternative method is to normalise each influential variable to a baseline by employing a model that quantifies the ship's power/fuel for all environmental / operating conditions.
The main problem of normalising is that the model used for the corrections may introduce uncertainties that arise from an incorrect model functional form (or model parameters, depending on the training / calibration dataset integrity) due to either omitted variables or unknown effects. Instrument uncertainty also becomes important for all the input variables to the correction algorithm. On the other hand, from a richer dataset, trends may be derived that have a more comprehensive physical significance, and a larger dataset also reduces the uncertainty, making the results applicable to both short and long term analysis. The filtering approach is easy to implement and interpret; however, filtering data removes many observations and the analysis must be done over a longer time period in order to collect adequate volumes to infer statistically significant results and to reduce the uncertainty to a level appropriate to the application. There is therefore a trade-off between the uncertainty introduced by the model form (and the additional instrument uncertainty it then requires) and the uncertainty arising from a reduced sample size; this is the focus of ongoing work.
In this study the influential effects are controlled for by applying a filtering algorithm, and the correction model in the simulation is therefore a function of ship speed and draught only. A cubic speed-power relationship is assumed (ideally this would be sought from sea trials) and the admiralty formula is used to correct for draught; a simple linear relationship between displacement and draught is assumed.
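As an illustration of this correction model, the following sketch computes an expected power from speed and draught via a cubic speed-power law and the admiralty relationship (P proportional to displacement^(2/3) times V^3), with displacement taken as linear in draught. The function name and all parameter values are assumptions for illustration; ideally the reference point would be calibrated from sea trial data.

```python
def expected_power(v, t, p_ref, v_ref, t_ref, disp_ref, disp_slope):
    """Expected shaft power (kW) at speed v (knots) and draught t (m).

    Cubic speed-power law plus the admiralty formula, with displacement
    assumed linear in draught (illustrative parameters throughout).
    """
    disp = disp_ref + disp_slope * (t - t_ref)       # linear displacement-draught
    speed_term = (v / v_ref) ** 3                    # cubic speed-power assumption
    draught_term = (disp / disp_ref) ** (2.0 / 3.0)  # admiralty draught correction
    return p_ref * speed_term * draught_term

# Example usage with hypothetical values (p_ref, v_ref ideally from sea trials)
print(expected_power(v=11.5, t=9.0, p_ref=8000.0, v_ref=12.0,
                     t_ref=10.0, disp_ref=50_000.0, disp_slope=4_000.0))
```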

4. Method
The simulation of the performance indicator is based on the difference between the expected power and the measured power. The measured power is equal to the expected power plus some linearly incremented power increase over time to simulate the effect of ship degradation (hull / propeller); the simulation also allows for the inclusion of some instrument error (precision, drift and bias) and model error. A sampling algorithm is able to simulate the effect of the sample averaging frequency and of changes in the number of observations. Further details regarding the nature and magnitude of these data acquisition variables are provided in sections 5, 6 and 7.
The method is summarised by the following steps (Figure 2):
1. Identify sources of uncertainty (instrument / model) and assign pdfs
2. Actual power, Ptrue,i: define the underlying ship performance (the truth) at the maximum temporal resolution;
   a. Operating profile: the ship's loading condition is assumed to be 50% loaded and 50% ballast, and the ship's voyage speed variability is represented by a Weibull distribution.
   b. Environmental effects: the effect of small changes in the weather (within the 0 < BF < 4 filtering criteria) is allowed for by the inclusion of some daily speed variability, assumed to be normally distributed. This also includes other small fluctuations that may not be filtered for (acceleration, for example).
   c. Time dependent degradation: assumed to be linear with time (power increases ~5% per year) and independent of ship speed and draught; the ship's speed decreases over time to simulate constant power operation.
3. Average the actual power / speed / draught according to the averaging frequency, fave
4. Measured power, Pmeas,i: add instrument uncertainties to Ptrue,i as random samples from the pdfs assigned in step 1
5. Expected power, Pexp,i: reverse calculate from the measured power using the same model as in step 2 (assuming no degradation), and add model uncertainty as random samples from the pdfs assigned in step 1
6. Percent change in power: %ΔPi = 100 (Pmeas,i - Pexp,i) / Pexp,i
7. Repeat steps 4 to 6 n1 times at each time step to define the parameters, mean and standard error (μi, si), of the pdf for %ΔPi at each time step
8. Randomly sample from each %ΔPi pdf over the evaluation period length, teval, according to the number of observations, N, and calculate the overall %ΔP by linearly regressing %ΔP on time and evaluating the fit at time = teval,end
9. Repeat step 8 n2 times to find the distribution parameters of the result; precision is twice the standard error (95% CI) and the effect of bias is indicated by a change in the mean.
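To make steps 4 to 9 concrete, the sketch below is a minimal illustration under simplifying assumptions (the per-time-step pdf of step 7 is collapsed into direct sampling, and all magnitudes are placeholders loosely based on the NR column of Table 1); it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

t_eval = 90                 # evaluation period, days (step 8)
n_obs = 24                  # observations remaining after filtering (NR baseline)
n2 = 1000                   # outer Monte Carlo repetitions (step 9)

degradation = 0.05 / 365    # ~5% power increase per year (step 2c)
power_sigma = 0.05          # power / FC sensor precision, 1-sigma (NR)
model_sigma = 0.09          # model precision error, 1-sigma

days = np.sort(rng.choice(t_eval, size=n_obs, replace=False)).astype(float)
true_dpct = 100 * degradation * days          # underlying %dP trend (the truth)

results = np.empty(n2)
for k in range(n2):
    # Steps 4-6: perturb measured and expected power, form %dP per observation
    p_exp = 1.0 + rng.normal(0, model_sigma, n_obs)               # model error
    p_meas = (1 + true_dpct / 100) * (1 + rng.normal(0, power_sigma, n_obs))
    dpct = 100 * (p_meas - p_exp) / p_exp
    # Step 8: linear regression of %dP on time, evaluated at t_eval
    slope, intercept = np.polyfit(days, dpct, 1)
    results[k] = slope * t_eval + intercept

# Step 9: precision reported as twice the spread of the result (95% CI)
print(f"mean %dP at t_eval:  {results.mean():.2f}")
print(f"precision (2 sigma): {2 * results.std():.2f} percentage points")
```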


Figure 2: Diagrammatic presentation of the method

5. Sources of Uncertainty in Ship Performance Monitoring


There are many different classification structures for the sources of uncertainty proposed in the literature, differing both between industries, to incorporate field-specific nuances, and within industries, according to the specific application. Figure 3 outlines the key sources of uncertainty relating to ship performance monitoring.


Figure 3: Sources of uncertainty in ship performance monitoring

The sources of uncertainty presented in Figure 3 are discussed in detail in the sections that follow. Table 1 presents the relevant data acquisition variables (pdf and sampling algorithm parameters) input to the MC simulation, categorised under two different data acquisition strategies: noon reports (NR) and continuous monitoring (CM).
DAQ decision variable | Baseline input, NR | Baseline input, CM
Evaluation period, teval (days) | 90 / 270 | 90 / 270
Number of observations, N | 24 / 72 | 2160 / 6480
Shaft power / FC sensor precision (1σ), % | 5.00 | 0.51
Ship speed (STW) sensor precision (1σ), % | 1.00 | 1.00
Draught sensor precision (1σ), m | 1.00 | 1.00
Averaging frequency, fave (samples/day) | 1.00 | 96
Daily speed variability, % | 1.74 | 1.74
Model precision error, % | 9.00 | 9.00

Table 1: Continuous monitoring and noon report data inputs

5.1 Sampling

Sampling error arises from the fact that a finite sample is taken from an infinite temporal
population and because variable factors are averaged between samples. Both of these
effects are investigated through a sampling algorithm that samples from the underlying
ship performance simulation which is based on a maximum temporal resolution of 4
samples per hour.
The number of observations in Table 1 is based on a sample rate which is, realistically, less than the sample averaging frequency because of periods when the ship is at anchor,
manoeuvring, waiting or in port, and because of the filtering criteria. The number of days spent at sea depends on ship type and size; figures from the latest IMO GHG study (Smith 2014) show that for 2012 the average proportion of days at sea for all sizes of tankers, bulk carriers, containers and general cargo ships was 53%. This is similar to the results found from the 397 day CM dataset analysed in this study (51.5% at-sea days), the results of which are presented in Table 2.

Filter step | Number of observations | % of all
All data, 397 days | 38112 | 100.0
At-sea data | 19618 | 51.5
Wind < BF 4 | 12799 | 33.6
Inliers | 12798 | 33.6
Water depth > min* | 9570 | 25.1

Table 2: Effect of filtering on the number of observations of a CM dataset. *See equation (1)

The proportion of the observations that are removed by filtering depends on the environmental / operational conditions. From the CM dataset used in this study, 25% of the observations remained after filtering for use in the ship performance quantification. The NR datasets used in this study showed a high variation in the amount of data filtered out for environmental conditions, with 50% being the average; this means that 26.5% of the possible number of samples were used in the ship performance quantification (generally NR data are only recorded while at sea).
The averaging frequency is either equal to the underlying temporal resolution of the true ship performance parameters (96/day), as in the CM baseline, or 96 samples are averaged for a daily frequency (NR baseline). To simulate the effect of daily averaging of noon reported variables, a daily speed variability is included at a level of 1.74% of the ship's speed. This was the outcome (median) of a study of the CM dataset covering a 396 day time period after filtering for outliers and wind speed > BF 4 (but not sea water depth, since this is not generally recorded in NR data). The variability is due to the effects of ship accelerations, rudder angle alterations, wind speed fluctuations within the 0 < BF < 4 range, and crew behaviour patterns (slow steaming at night, for example), which aren't included in the ship performance model or can't be filtered out in the NR dataset. The variability is introduced into the simulation through the addition of normally distributed noise to the underlying true ship speed. The effect of averaging daily draught variability (i.e. due to intentional changes in trim or due to influences on trim from fuel consumption) is not included but is assumed to be negligible.
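The sketch below illustrates the averaging effect being simulated: 96 within-day speed samples with normally distributed variability are collapsed to a daily value and, through the cubic speed-power relationship, the mean of the actual powers exceeds the power implied by the mean speed. All values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

samples_per_day = 96      # underlying temporal resolution, 4 samples per hour
daily_speed_cv = 0.0174   # daily speed variability, 1.74% of ship speed
v_voyage = 12.0           # commanded voyage speed, knots (assumed)

# Within-day speed fluctuations that a single daily (noon) value cannot resolve
v = v_voyage * (1 + rng.normal(0, daily_speed_cv, samples_per_day))
p = v ** 3                # cubic speed-power relationship (arbitrary units)

# Mean of actual powers vs power implied by the daily mean speed
bias_pct = 100 * (p.mean() - v.mean() ** 3) / v.mean() ** 3
print(f"power over-estimate from daily averaging: {bias_pct:.3f}%")
```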
The number of observations is also affected by the evaluation period length and the
sensitivities of the overall uncertainty to this are investigated by evaluating both 90 day
and 270 day periods. In some datasets a seasonality effect is seen, whereby weather fluctuations (even within the BF filtering criteria) cause discontinuities in the measured power, and therefore a minimum of one year's data collection is required. The presence of this effect is dependent on the cyclic nature and the global location of the ship's operations and is difficult to generalise; it is therefore not included in this analysis.

5.2 Instrument Uncertainty

Instrument uncertainties arise from variations attributable to the basic properties of the measurement system; the properties widely recognised among practitioners are repeatability, reproducibility, linearity, bias, stability (absence of drift), consistency, and resolution (ASTM 2011). Since the truth is unknown, bias, and by extension drift (change in bias over time), are epistemic uncertainties, and both are assumed to be zero for all measurements in the baseline. Their effect on the overall uncertainty is investigated in the sensitivity analysis by an elementary interval analysis.
Some manufacturers quote sensor precisions and/or accuracies, which are assumed to reflect manufacturing tolerances; however, these are not well documented or publicly available. Manufacturers are also more likely to quote sensor repeatability calculated under laboratory conditions (same measurement system, operator and equipment) or reproducibility (different operator), and are therefore likely to under-estimate the actual scatter in service conditions, where different operators, equipment/facility and measurement system may increase it. Quoted shaft power sensor accuracies range from ±0.25% (Seatechnik, http://www.seatechnik.com/) to ±0.1% (Datum Electronics, http://www.datumelectronics.co.uk/). The IMO resolution Performance Standards for Devices to Indicate Speed and Distance (IMO 1995) stipulates that errors in the indicated speed, when the ship is operating free from shallow water effect and from the effects of wind, current and tide, should not exceed 2% of the speed of the ship, or 0.2 knots, whichever is greater. The shaft power sensor precision in the CM baseline is the addition by quadrature of an RPM sensor error of 0.1% (1σ) and a torque sensor error of 0.5% (1σ). Discussions with industry experts and experience of ship performance datasets suggest that the instrument precisions quoted in Table 1 are appropriate; however, they are estimates, and the sensitivity of the performance indicator to changes in these is explored in section 7. All sensors are assumed to be consistent during the simulation (no change in repeatability over time) and linear (no change in bias over the operating range of the measurement instrument). Sensor resolution is assumed to be reflected in the quoted sensor precision and is not a restricting factor in the uncertainty of ship performance measurement.
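Combining these two contributions in quadrature recovers the CM baseline shaft power precision quoted in Table 1:

uP = √(u_rpm² + u_torque²) = √(0.1² + 0.5²) ≈ 0.51% (1σ)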
In the NR dataset, fuel consumption is generally recorded rather than shaft power; fuel
flow meter accuracy ranges between 0.05 per cent and 3 per cent depending on the type,
the manufacturer, the flow characteristics and the installation. Fuel readings by tank
soundings are estimated to have an accuracy of 2-5% [Faber (2013)]. This uncertainty
estimate does not, however, include additional uncertainties associated with the fuel consumption, for example fuel oil quality variation (grade and calorific value), pulsations in the circulation system, back-flushing filters, the effect of cat-fines, waste sludge from onboard processing and inaccuracies in ullage measurements due to weather, fuel temperature etc. The use of fuel consumption as a proxy for shaft power is accounted for by assuming a power sensor precision of 5% for the NR strategy.
The draught uncertainty associated with draught gauges is of the order of 0.1 m; in noon report entries the draught marks at the perpendiculars may be read by eye and, depending on sea conditions, a reading error of 2 cm can be assumed (Insel 2008). However, noon report entries are often not altered during the voyage to record variability due to trim
and/or the reduction in draught as fuel is consumed, and therefore a greater 1.0 m uncertainty is assumed. The draught field in CM datasets is often manually input, and therefore the same uncertainty applies.
5.3 Model Uncertainty

The model parameter uncertainty depends on whether a filtered or normalised (corrected by modelling interactions between variables) dataset is used to calculate the performance indicator.
For a normalised dataset, model parameters and model form may be based on theoretical or statistical models, both of which have associated uncertainties. For example, the speed-power relationship is often approximated as a cubic; in reality the speed exponent may be between 3.2, for low speed ships such as tankers and bulk carriers, and 4.0 for high speed ships such as container vessels (MAN 2011).
performance model, the model parameter uncertainty is a result of the sampling error that
occurs during the calibration/reference period. From this period a training dataset captures
the ship performance in terms of a larger array of influential environmental/operational
factors and the correction model is defined. Therefore some sampling error exists because
the training dataset is a finite sample taken from an infinite temporal population, the
instrument uncertainty discussed in the previous section will also be present in the training
dataset. Model form uncertainty may arise because the variation in the data during the
reference period is insufficient to capture the interaction between variables in the baseline
performance, for example, if the ship is operated at only one draught then there will be no
way to define the general relationship between draught, shaft power and other variables.
For the filtered approach applied in this paper, the model parameter uncertainty is from the
sea trial dataset. This has been investigated by Insel (2008) who found that from sea trial
data of 12 sister ships that represented a wide variety of sea trial conditions a precision
limit of about 7-9% of shaft power can be achieved. This uncertainty represents both
model uncertainties (due to corrections applied to trial measurements) and instrument
uncertainties for the measurement of torque, shaft rpm, speed, draught and environmental
measurements etc. The author identifies the Beaufort scale estimation error as the key
measurement error affecting the overall sea trial data uncertainty. The magnitude of the
uncertainty induced in this method is assumed to be the same for both CM and NR
simulations because the sea trial data is from the same source in both cases. It is notable
however that because reliable methods do not currently exist, the influence of currents,
steering and drift are not corrected for, for more details see Insel (2008).
Model form uncertainty is notoriously difficult to quantify. In this paper the expected power and the measured power are based on the same model; the model form is therefore assumed to be correct and no model form error is accounted for. Model form error in reality may arise from, for example, unobservable or unmeasurable variables that are not filtered for; the effect of these is exacerbated in NR datasets because fewer variables are recorded (such as acceleration and water depth).

5.4 Human Error

Human error (which is often categorised as instrument uncertainty) may occur in any measurement when operating, reading or recording sensor values, if the report completion is not automated. For example, the noon data entry may not occur at exactly the same time each day, the recording of time spent steaming may not be adjusted to compensate for crossing time zones, and it is possible that different sensors are used to populate the same field: some crew may report speed derived from the propeller RPM while others report speed through water. The measurement of wind speed through crew observations may also be subject to uncertainties. Human error is not included in the current analysis.

6. Experimental Precision Comparison


The uncertainty of the performance indicator as calculated from the propagation of elemental uncertainties via the simulation method (NR baseline) was validated against the precision calculated from experimentally collected data with the same number of observations over the same length of evaluation period. The overall uncertainty in both methods includes uncertainties from the measurement sensor repeatabilities, the sampling uncertainty (frequency and averaging effects) and model parameter uncertainty, while the experimental method additionally includes uncertainty from human error, model form, missing model parameters not filtered out (particularly in the NR dataset) and any possible measurement sensor non-linearity, bias or drift, the magnitudes of which are unknown. The data is filtered according to the following criteria where data is available (outliers are removed manually):

- Mean draught is between ballast and design
- True wind speed is between BF 0 and BF 4
- Ship speed is greater than 3 knots
- Water depth is greater than the larger of the values obtained from the two formulae:

h = 3√(B·TM)    and    h = 2.75·Vs²/g        (1)

where
h: water depth [m]
B: ship breadth [m]
TM: draught at midship or mean draught [m]
Vs: ship speed [m/s]
g: gravitational acceleration (9.81 m/s²)
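A sketch of this filter applied to a tabular dataset might look as follows (the column names are hypothetical; outlier removal, done manually in this study, is omitted):

```python
import numpy as np
import pandas as pd

G = 9.81  # gravitational acceleration, m/s^2

def filter_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the filtering criteria above; column names are illustrative."""
    h_min = np.maximum(3 * np.sqrt(df["breadth_m"] * df["draught_m"]),
                       2.75 * df["speed_ms"] ** 2 / G)   # equation (1)
    keep = (
        df["draught_m"].between(df["ballast_draught_m"], df["design_draught_m"])
        & df["wind_bf"].between(0, 4)       # true wind speed between BF 0 and 4
        & (df["speed_knots"] > 3)           # ship is under way
        & (df["water_depth_m"] > h_min)     # deep water only
    )
    return df[keep]
```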

Table 3 shows the comparison of the methods; the simulation method, through error propagation of elemental uncertainties, tends to over-estimate the uncertainty.

Dataset | Evaluation period length (days) | Number of observations | Error propagation of elemental uncertainties, % | Statistical uncertainty calculation*, %
CM 1 | 200 | 1392 | 2.16 | 1.40
NR 1 | 265 | 70 | 9.42 | 8.78
NR 2 | 377 | 81 | 8.67 | 6.00
NR 3 | 373 | 62 | 10.22 | 4.95

Table 3: Statistical uncertainty calculation vs elemental uncertainty propagation; error reported as the confidence interval of ΔP (95% level) as a percentage of mean shaft power (CM) or mean fuel consumption (NR)

Figure 4 shows the experimental precision from the continuous monitoring dataset and the simulation results. The y-axis is the change in the number of samples for the same evaluation time period. The performance indicator uncertainties (plotted as a percentage of mean shaft power over the evaluation period) are similar, although again slightly over-estimated in the simulation. However, the difference is minimal and the comparison provides evidence in support of the use of the simulation and propagation of elemental uncertainty in this way.

[Figure 4 chart: simulation and experimental precision comparison for varying sampling sizes; x-axis: confidence interval, % of mean power; y-axis: % change in number of samples from baseline; series: experimental precision and simulation]

Figure 4: Simulation and experimental precision comparison for varying evaluation periods and
sampling frequencies


7. Sensitivity Analysis
The sensitivity indices relate to the precision component of the uncertainty in the overall performance indicator. The bias in the result is also investigated and graphically represented.
7.1 Relative Effects of Data Acquisition Factors

A local sensitivity analysis was conducted by computing partial derivatives of the output function (y) with respect to the input factors (xi). The output function, y, is the precision of the measurand: twice the standard error of the ΔP at time = teval (see section 4). The input factor is one of the data acquisition parameters of Table 4. Each factor is altered one-factor-at-a-time (OAT) and a dimensionless expression of sensitivity, the sensitivity index SIi, is used to measure changes relative to the baseline:

SIi = (∂y/∂xi) · (xi/y)

The input parameter interval bounds reflect results from studies of actual data where possible or, in the absence of data, estimates of realistic values given the sources of the uncertainties detailed in section 5. Sensor precisions are induced by the addition of a normally distributed error; in the baseline this is assumed to have a mean of zero and a standard deviation according to Table 4. Sensor bias is introduced as a deterministic offset to the mean of the normal distribution of the error.
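For illustration, the OAT index can be estimated by finite differences around the baseline; in the sketch below, `precision_fn` is a hypothetical stand-in for a full Monte Carlo run returning the 2σ precision of the performance indicator.

```python
def sensitivity_index(precision_fn, baseline: dict,
                      factor: str, sa_value: float) -> float:
    """OAT sensitivity index SI_i = (dy/dx_i) * (x_i / y), by finite difference."""
    y0 = precision_fn(**baseline)            # precision at the baseline
    perturbed = {**baseline, factor: sa_value}
    y1 = precision_fn(**perturbed)           # precision at the SA input
    dy_dx = (y1 - y0) / (sa_value - baseline[factor])
    return dy_dx * baseline[factor] / y0

# Hypothetical usage:
# si = sensitivity_index(run_mc, base_params, "speed_precision", 5.0)
```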
DAQ decision variable | Baseline input | SA input
Evaluation period, teval (days) | 90 / 270 | 90 / 270
Number of observations, N | 24 / 72 | 45 / 135
Power sensor precision (1σ), % | 1 | 5
Power sensor bias, % | 0 | 3
Draught sensor precision (1σ), m | 0.1 | 1.5
Draught sensor bias, m | 0 | 1
Ship speed sensor precision (1σ), % | 1 | 5
Ship speed sensor bias, % | 0 | 3
Ship speed sensor drift, %/270 days (linear increase) | 0 | 1
Averaging frequency, fave (samples/day) | 1 | 96
Daily speed variability, % | 1.74 | 3.76
Model precision error (1σ), % | 1.5 | 9

Table 4: Adjustments to the baseline for the sensitivity analysis (SA)

The increase in the sample size from 25.0% or 26.5% to 50% of all possible observations (presented in the table as the increase in the number of observations, N) is to investigate the effect of a dataset that is normalised rather than filtered, whereby the number of observations approaches the total available before filtering. The daily speed variability SA input represents the upper interquartile bound of the results of an investigation conducted on a CM dataset covering a 396 day time period. The model precision error represents the results of
the study by Insel (2008), who found 7-9% of shaft power to be the precision error in the sea trial data of the 12 sister ships studied. If the ship reports speed over ground (SOG) rather than speed through the water (STW) then this will increase the uncertainty; a study of the difference between STW and SOG for 20 ships over 3 to 5 years indicates a standard deviation of 0.95 knots (Smith 2014). The effect of using SOG as a proxy for STW is therefore investigated by increasing the speed sensor precision to 5%. Likewise, the increased power sensor precision reflects the effect of using fuel consumption as a proxy for shaft power.
7.2 Comparison of Data Acquisition Strategies

The second study is to compare the uncertainty for the data acquisition parameters of a
standard CM dataset with the data acquisition parameters of a standard NR dataset. The
measurand is the uncertainty (SE at 95% confidence level) reported as a percentage of the
parameter of interest, for the purposes of this investigation, the parameter of interest is the
change in ship performance over time, i.e. increase in shaft power for a given speed and
draught (e.g. due to hull fouling), this was set to linearly increase by 18% over 6months.
This is a practical metric since it puts the uncertainty in a useful context and is of greater
interest when comparing the data acquisition strategies however it is worth highlighting
that the uncertainty measured in this way is not only representative of the impact of all the
elemental uncertainties but it is also a function of the magnitude of the performance
indicator itself and consequently of the time period of evaluation. For example, if the rate
of deterioration was actually 10%/year and not 5% or if the evaluation time period was set
to increase from 90 days to 270 days then in both cases the uncertainty would naturally
decrease even if the elemental uncertainties were unchanged
This was carried out for each of the two baselines (CM and NR, Table 1) and for two time
periods of evaluation; 90 days and 270 days. This experiment is repeated once with an
increase in number of observations (N) for both, and once with a reduction in model
precision error Table 5.
DAQ decision variable | Alternative 1 NR | Alternative 1 CM | Alternative 2 NR | Alternative 2 CM
Evaluation period (days) | 90 / 270 | 90 / 270 | 90 / 270 | 90 / 270
Number of observations, N | 48 / 144 | 4320 / 12960 | 24 / 72 | 2160 / 6480
Shaft power sensor precision (1σ), % | 5.00 | 0.51 | 5.00 | 0.51
Ship speed sensor precision (1σ), % | 1.00 | 1.00 | 1.00 | 1.00
Ship draught sensor precision (1σ), m | 1.00 | 1.00 | 1.00 | 1.00
Averaging frequency (samples/day) | 1 | 96 | 1 | 96
Model precision error (1σ), % | 9 | 9 | 1.5 | 1.5

Table 5: Input parameters for comparison of data acquisition strategies


8. Results and Discussion


8.1 Relative Effects of Data Acquisition Factors: Precision

This section looks at the sensitivities of the performance indicator uncertainty relative to the variations in the data acquisition decision parameter uncertainties (see section 7.1). The sensitivity index indicates the change in the parameter of interest (the precision of the overall uncertainty, quantified by 2σ) relative to the percent change in the input quantity examined (the data acquisition variable), weighted according to the relative size of each. For example, an SI near unity, as in the case of the speed sensor precision, SI = 0.96 (Figure 5), indicates that the change in overall uncertainty due to a change in the precision of the speed sensor is almost exactly offset by the magnitude of the ratio of its absolute precision (5%) to the resultant overall uncertainty.

[Figure 5 bar chart: sensitivity index (x-axis, 0 to 1) for the 90 day and 270 day evaluation periods; factors shown: speed drift, speed bias, daily speed variability, power bias, draught bias, daily averaging, power precision, sample frequency, draught precision, model precision, speed precision]

Figure 5: Sensitivities of the performance indicator uncertainty to model, instrument and sampling input uncertainties (evaluation time period: 90 days)

The exponent in the relationship between speed and power means that the performance indicator uncertainty is more sensitive to the speed sensor precision than to either the draught or the shaft power sensor precision. This highlights the importance of investment in a high precision speed sensor, and the criticality of using a speed through water sensor rather than speed over ground (which may degrade precision to 5%, see section 7.1), which would have dramatic consequences for obtaining meaningful information from the data. CM systems that augment core data acquisition with additional measurement of GPS, tides and currents would provide a means to independently calculate and verify the speed through water measurement from speed over ground. The results also emphasise the importance of a precise draught measurement, which may be manually input even in continuous monitoring systems; the overall uncertainty would benefit from investment in a draught gauge which records potentially significant variations during the voyage (including in trim) rather than recording the draught at the last port call. If fuel consumption is used as a proxy for shaft power, which may cause an inaccuracy of 5% (due to uncertainties in the fuel
consumption measurement), then there could also be a significant effect on the overall uncertainty. The effect of sensor precision reduces over a longer evaluation period (270 days relative to 90 days) because over time the variability cancels and stabilises towards the mean.
Sensor bias has a lesser relative effect on the overall precision; the order of significance is the same as for precision: speed, draught, then power. For the speed sensor, the bias and drift cause the precision of the overall PI to degrade because the increased Vmeas is compounded by the cubic relationship, which causes ΔP to drift (see the next section, Figure 6) and therefore decreases the precision of the linear trend line of the PI. The drift in ΔP means that the effect of bias and drift in Vmeas increases for a longer evaluation period. It is worth highlighting, however, that the presented magnitude of the effect is influenced by the input parameters of the ship's operational profile and the rate of speed reduction over the time period. There is a similar but reduced effect due to the draught and power biases also causing ΔP to drift as the evaluation period increases, but this is to a much lesser extent and is small relative to the counteracting effect of the increasing sample size, which improves the overall precision as the evaluation period increases.
The uncertainty in the performance indicator is second most sensitive to changes in the model precision error; the model is the calibration or correction factor applied to correct for the ship's speed, as established from sea trial data, which will have some level of uncertainty. The effect on the overall uncertainty is significant because of how this translates to Pmeas through a cubic relationship. Instead of sea trial data, a continuous monitoring
calibration/reference dataset representing a short period of operation may be used to derive
the ship speed/power performance curve (the ship performance should be as close to
stationary as practically possible during the time period). The advantage of the latter
method is the increased number of samples in the dataset, even after filtering for
environmental / operational conditions, which reduces the uncertainty. The CM dataset
will also include the effect of technological enhancements made since the sea trial dataset
was compiled; the effects of these more recent interventions may not vary linearly with the
model input parameters and may therefore be incorrectly attributed to ship performance
changes as measured during the evaluation period. The effect of model precision reduces
over a longer evaluation period because over time the variability cancels and stabilises
towards the mean.
Increasing the sampling frequency is also significant because the overall standard error is inversely proportional to the square root of the sample size; when the absolute values are increased on a samples-per-day basis, the effect over 270 days is more significant than over 90 days. An increased sample size may be achieved by addressing root causes of outliers or missing data in the datasets (i.e. due to stuck sensors or human reporting errors), although this is also limited by the ship's at-sea days and the days the ship operates in environmental/operational conditions outside the bounds of the filtering algorithm. There is therefore a considered trade-off to be made between the sample size and the model error, which reflects the advantages of filtering over normalising (see the follow-up paper).
The other sampling effect is related to the daily averaging frequency; this impacts the overall uncertainty because daily environmental fluctuations are captured in the average: averaging over a range of speeds will cause power due to the higher speeds to be mistakenly attributed to a deterioration in ship performance that in reality is not present. The actual
influence on uncertainty is of course a function of the daily speed variability, and this interaction is not explicitly studied here; however, the 1.78% found from the data is realistic and the SA provides the relative significance under the assumption of linearity. The daily averaging effect is independent of the evaluation period length because the daily speed variability is assumed to be constant over time.
8.2 Relative Effects of Data Acquisition Factors: Bias

Generally, the data acquisition factors studied do not affect the absolute mean of the performance indicator (the bias component of the overall uncertainty). Instrument bias, however, not only affects the precision component (as demonstrated in the previous section) but also biases the result; this is observable in Figure 6.
[Figure 6 chart: performance indicator bias, 90 day evaluation period; y-axis: Monte Carlo simulation average performance indicator, kW (-2000 to 4000)]

Figure 6: Performance indicator bias with error bars indicating the confidence interval (95% level).
The baseline performance indicator is highlighted by the red line

The effect of increasing the STW sensor bias from 0 to 5% is to increase the average performance indicator value from 790 kW to 1196 kW. The graphic is indicative only, since the exact magnitude of the effect is to some extent dependent on the operational profile, but it is clear that in cases of sensor bias and drift the actual underlying deterioration trend is difficult to identify.
The error bars in Figure 6 show how results based on daily averaging over a short 90 day time period may be inconclusive; if, for example, the speed sensor precision is 5% then it might
not be possible to conclude (with 95% confidence) whether the ship's performance has improved or deteriorated.
8.3 Comparison of Data Acquisition Strategies

This sensitivity index is detailed in section 7.2. Figure 7 shows the results from the MC method's deployment to calculate uncertainty for a range of changes to the baseline data acquisition variables. As described previously, the uncertainty measured in this way is not only representative of the impact of all the elemental uncertainties but is also a function of the magnitude of the performance indicator itself, and consequently of the time period of evaluation.

[Figure 7 bar chart: effect of input uncertainties on the CM and NR baselines for different evaluation periods; x-axis: uncertainty as a percentage of the change in ship performance (0 to 250); groups: baseline, Alternative 1 and Alternative 2; series: CM 270 days, CM 90 days, NR 270 days, NR 90 days]

Figure 7: Simulation uncertainty sensitivity analysis results

In all cases the simulation based on the CM baseline demonstrates a significant improvement in uncertainty compared to the NR baseline for the same evaluation periods. For noon report data, a short evaluation period (3 months) does not give useful results, since the uncertainty is greater than 100% of the parameter of interest. This is to be expected, since a low sample frequency, and a reduced power sensor precision due to the use of fuel consumption as a proxy for shaft power, mean that a longer time series is required to achieve the same uncertainty. In fact, the uncertainty of the 90 day CM baseline is similar to the uncertainty achievable from a 270 day NR dataset. Both these findings demonstrate a significant uncertainty benefit of CM data over NR data, of the order of a 90% decrease in uncertainty.
In many instances, shortcomings in the availability or precision of the measurements can be addressed through the deployment of an algorithm/model to normalise CM data. This should generally increase the sample size of data that can be deployed in the calculation of performance. The effect of increasing the sample size, as in Alternative 1, is to further reduce the baseline uncertainty in the CM dataset to 2% for 270 days of data; this is as
effective as reducing the model uncertainty from the sea trial data from 9% to 1.5% as in
alternative 2, but arguably more achievable given resources.
The extent to which the uncertainty is improved depends on the quality of the model used. It should be emphasised that every algorithm/model introduced will also increase the model form uncertainty, as well as the number of data fields, with a consequent possible increase in instrument uncertainty; this needs to be carefully considered to ensure that the addition of the algorithm produces an overall reduction in uncertainty. This analysis also assumes there is no bias, so if the sensors, in particular the speed sensor, are not calibrated or maintained then the positive effect of the increased time period on the overall uncertainty will be reduced (see section 8.1).
No uncertainty is included to model the influence of human error, so the results for the noon report model are perhaps optimistic, although the over-estimation from the simulation, even for noon report data, in the experimental precision comparison of section 6 indicates that this might not always be an issue if procedures are in place to ensure careful reporting.

9. Conclusions
This paper proposes and describes the development of a rigorous and robust method for
assessing the uncertainty in ship performance quantifications. The method has been
deployed in order to understand the uncertainty in estimated trends of ship performance
resulting in the use of different types of data (noon report and continuous monitoring) and
different ways in which that data is collected and processed. This is of high relevance to
the shipping industry, which regularly uses performance quantifications for operational
decision making, and the method and results in this paper can both inform the decision
making process and provide insight into how lower uncertainty in some of the key decision
variables could be achieved. The desired level should be appropriate to the particular
application (the technological / operational decision being made); this informs the
appropriate data acquisition strategy as does an analysis of the cost-benefit of reducing
uncertainty which should also be weighed against the economic, environmental or social
risk of an incorrect decision.
The results indicate the significant uncertainty benefit of CM data over NR data; this is of the order of a 90% decrease in uncertainty, and is especially relevant to shorter term
analysis. It has been shown in this analysis that the uncertainty of the 90 day CM baseline
is similar to the uncertainty achievable from a 270 day NR dataset.
The precision of the speed sensor is of fundamental importance when it comes to achieving
low uncertainty and using SOG rather than STW is likely to have a dramatic effect on the
overall uncertainty. The sensor precision however only affects the aleatory uncertainty of
the performance indicator and the effect decreases over longer evaluation periods; the
additional confounding factor in the uncertainty analysis is epistemic uncertainty which
may be introduced through sensor bias and drift. Speed sensor bias and drift cause significant changes in the performance indicator, and it is difficult to extract the underlying performance trend when bias or drift is present. After 90 days the effect on the precision of
the overall performance indicator is negligible; however, it becomes significant as the evaluation period increases. This highlights the importance of proper maintenance and calibration procedures. Routine sensor checks should be incorporated into the data acquisition strategy and onboard procedures. CM systems that augment core data
calibration procedures. Routine sensor checks should be incorporated into the data
acquisition strategy and onboard procedures. CM systems that augment core data
acquisition with additional measurement of GPS, tides and currents also provide a means
to independently calculate and verify the speed through water measurement from speed
over ground.
The number of observations in the dataset also has a significant effect; an increase can be achieved either through data representing a longer time series (which is less desirable for real time decision support tools), through a higher frequency of data collection (highlighting further the positive contribution of continuous monitoring systems) or through the use of data processing algorithms rather than filtering techniques, whereby applying ship performance models enables a more comprehensive analysis. In the latter case, the resultant effect of the
increased sample size (without considering additional model error or instrument
uncertainties) is as effective as reducing the model uncertainty from the sea trial data from
9% to 1.5%, but arguably more achievable given resources.

Acknowledgements
The research presented here was carried out as part of a UCL Impact studentship with the
industrial partner BMT group. The authors gratefully acknowledge advice and support
from colleagues within the UCL Mechanical Engineering department and within UCL
Energy Institute, in particular Dr David Shipworth, Energy & the Built Environment.

References
Agnolucci, P., Smith, T. W. P., Rehmatulla, N. (2014). "Energy efficiency and time charter
rates: Energy efficiency savings recovered by ship owners in the Panamax market."
Transportation Research Part A: Policy and Practice 66: 173-184.
AIAA (1999). S-071A-1999 Assessment of Experimental Uncertainty With Application to
Wind Tunnel Testing, American Institute of Astronautics and Aeronautics.
ASTM (2011). American National Standard Standard E2782 - 11: Guide for Measurement
Systems Analysis (MSA). ASTM Committee on Quality and Statistics, ASTM. E2782.
Coleman, H. W., Steele, W. G. (1990). Experimentation and Uncertainty Analysis for
Engineers. US, John Wiley & Sons.
Cox, M. G., Harris, P. M. (2006). Software Support for Metrology Best Practice Guide No.
6: Uncertainty Evaluation. UK, National Physics Laboratory.
Faber, J., Nelissen, D., Smit, M. (2013). Monitoring of bunker fuel consumption, Delft, CE
Delft.
IMO, A. R. (1995). Performance Standards for Devices to Indicate Speed and Distance.

Insel, M. (2008). "Uncertainty in the analysis of speed and powering trials." Ocean
Engineering 35(11-12): 1183-1193.
ITTC (2008). Guide to the Expression of Uncertainty in Experimental Hydrodynamics.
ITTC - Recommended Procedures and Guidelines. 7.5-02-01-01
ITTC (2008). Testing and Extrapolation Methods, General Guidelines for Uncertainty
Analysis in Resistance Towing Tank Tests. ITTC - Recommended Procedures and
Guidelines. 7.5-02-02-02.
ITTC (2011). Uncertainty Analysis - Example for Waterjet Propulsion Test. ITTC - Recommended Procedures and Guidelines. 7.5-02-05-03.3.
JCGM100:2008 (2008). "Evaluation of measurement data - Guide to the expression of
uncertainty in measurement (GUM 1995 with minor corrections)."
Lindstad, H., Asbjørnslett, B. E., Jullumstrø, E. (2013). "Assessment of profit, cost and emissions by varying speed as a function of sea conditions and freight market." Transportation Research Part D: Transport and Environment 19: 5.
Longo, J., Stern, F. (2005). "Uncertainty Assessment for Towing Tank Tests With Example for Surface Combatant DTMB Model 5415." Journal of Ship Research 49(1): 55-68.
Loucks, D. P., Van Beek, E., Stedinger, J. R., Dijkman, J. P.M., Villars, M. T. (2005).
Water Resources Systems Planning and Management: CH 9 Model Sensitivity and
Uncertainty Analysis. Paris, UNESCO.
MAN (2011). Basic Principles of Ship Design. Denmark.
Smith, T. W. P., Jalkanen, J. P., Anderson, B. A., Corbett, J. J., Faber, J., Hanayama, S., O'Keeffe, E., Parker, S., Johansson, L., Aldous, L., Raucci, C., Traut, M., Ettinger, S., Nelissen, D., Lee, D. S., Ng, S., Agrawal, A., Winebrake, J. J., Hoen, M., Chesworth, S., Pandey, A. (2014). Reduction of GHG Emissions from Ships: Third IMO GHG Study 2014. International Maritime Organization (IMO), London, UK.
Stulgis, V., Smith, T. W. P., Rehmatulla, N., Powers, J., Hoppe, J. (2014). Hidden
Treasure: Financial Models for Retrofits. T. L. H. MCMAHON.
