
Y. Lechevallier, G. Saporta
Editors

Proceedings of COMPSTAT’2010
19th International Conference on
Computational Statistics

Paris - France, August 22-27, 2010

Keynote, Invited and Contributed Papers

Physica-Verlag
A Springer Company
Nonlinear Regression Model of Copper
Bromide Laser Generation

Snezhana Georgieva Gocheva-Ilieva (1) and Iliycho Petkov Iliev (2)

(1) Department of Applied Mathematics and Modelling, University of Plovdiv,
    24 Tzar Assen Str., 4000 Plovdiv, Bulgaria, snow@uni-plovdiv.bg
(2) Department of Physics, Technical University of Sofia, branch Plovdiv,
    25 Tzanko Djusstabanov Str., 4000 Plovdiv, Bulgaria, iliev55@abv.bg

Abstract. This study examines the relationship between the output laser
power and the basic input variables of a copper bromide vapour laser with
wavelengths of 510.6 and 578.2 nm. Based on experimental data, a nonlinear
regression model has been constructed. To deal with multicollinearity, the
initial predictors were grouped into three PCA factors, which were then
transformed by the Yeo-Johnson transformation. The model has been validated
using independent evaluation data sets. The results obtained via the model
allow for a more thorough analysis of the relationship between the most
important laser parameters in order to improve the planning of further
experiments and laser production technology.

Keywords: Yeo-Johnson transformation, PCA factors, nonlinear regression,
output laser power

1 Introduction

The use of multidimensional statistical methods to study the behavior of
the output characteristics of gas vapor lasers, and in particular those of a
copper bromide vapor laser, was introduced in the last few years (see Iliev
et al. (2008a, 2008b, 2007, 2009) and Gocheva-Ilieva and Iliev (2010)). In
Iliev et al. (2008a, 2008b) we used factor analysis to study 10 independent
laser variables, showing that only 6 of them are statistically significant.
These variables were grouped into three factors derived by means of multiple
factor analysis with Principal Component Analysis (PCA). Using the factors,
we then constructed linear regression models for the estimation of the
response variable, the output laser power P_out. In Iliev et al. (2007, 2009)
the same data population was examined through cluster analysis. The relevance
of the observed data was confirmed both for their grouping and their level
of influence on the dependent variable. Recently, in Gocheva-Ilieva and Iliev
(2010), these models were compared to nonparametric models constructed
using the multivariate adaptive regression splines (MARS) technique developed
in Friedman (1991). It was established that nonparametric methods
1064 Gocheva-Ilieva, S. and Iliev, I.

provide better estimates and better predictions than standard methods for
multidimensional linear regression (MLR).
Within this study, on the basis of an experimental data sample and the
obtained factor variables, a nonlinear model is constructed using the least
squares method. Nonlinear regression (NLR) has been applied to factor
variables subjected to the Yeo-Johnson transformation. The model was tested
using a cross-validation technique. It has been established that the nonlinear
model provides better estimates of the output laser power than the respective
parametric MLR. The model can be used to describe the relationship between
the independent input laser characteristics and the output laser generation
in order to improve the experiment.
Modeling was carried out based on experimental data obtained at the
Laboratory of Metal Vapour Lasers of the Georgi Nadjakov Institute of
Solid State Physics, Bulgarian Academy of Sciences. The models have been
calculated using the statistical package SPSS and the Mathematica software.

2 Problem setup

We study a copper bromide vapor laser with wavelengths of 510.6 nm and
578.2 nm. This is a metal vapor laser which operates under medium and low
pressures. It is notably the laser with the highest efficiency in the visible
spectrum and is easy to maintain due to its low gas temperature (around
600 °C), while at the same time it is capable of self-heating. The copper
bromide vapor laser has a wide range of applications (see for instance
Sabotinov (2006) and Foster (2005)).
In order to construct the nonlinear model and to carry out the statistical
analysis we use the following independent laser variables: D (mm), the inside
diameter of the laser tube; d_r (mm), the inside diameter of the rings; L (cm),
the length of the active area (electrode separation); P_in (kW), the input
electrical power; P_L (kW m^-1), the input electrical power per unit length;
P_H2 (Torr), the hydrogen gas pressure.
The response variable is the laser generation, or output laser power, P_out (W).
All experimental data used here have been published in Astadjov et al.
(1985, 1994, 1997a, 1997b), Dimitrov and Sabotinov (1996), NATO (2000),
and Stoilov et al. (2000). They cover different CuBr lasers, which can be
divided into three main groups according to their geometry: small-bore lasers
with inside diameter D < 20 mm, medium-bore lasers with D = 20-40 mm and
large-bore lasers with D > 40 mm. From the available data for about 300
experiments with the three general types of lasers, a random sample of
size n = 109 has been used. Since over 60% of all data concern small-bore
lasers, the sample is partially stratified in order to offset this imbalance.
The data are of historical type; each experiment was complex, lengthy and
costly to conduct.

Typically, the studied data do not meet the requirement for a multivariate
normal distribution, although this can be assumed for the global population.
Furthermore, as already shown in Iliev et al. (2008a, 2008b, 2007, 2009),
the abovementioned six independent variables exhibit strong
multicollinearity. For this reason we first apply a multivariate factor
analysis in order to determine the PCA factors, which later act as predictors.
The models constructed so far with the aid of MLR are not completely
satisfactory and can be considered a first approximation to the dependencies
we are interested in.
In this study, our goal is to construct a nonlinear regression model which
would provide a more accurate description of the relationship between the
data and to study the predictive power of the model.

3 Application of Principal Components Factor Analysis

In order to avoid multicollinearity we group the six independent variables
using classical multidimensional factor analysis. Normally this method can
be applied without making any distributional assumption (e.g., Gaussian)
(see for instance Izenman (2008, page 583)). Using the SPSS software for our
data sample we obtained the Kaiser-Meyer-Olkin measure of sampling adequacy
KMO = 0.660 and Bartlett's test of sphericity with significance reported as
0.000 (i.e., p < 0.001). The respective measures of sampling adequacy (MSA)
are also significant for each variable. This indicates that factor analysis
of the sample is adequate and can be carried out. The factors have been
extracted using PCA. Usually the number of factors is chosen equal to the
number of eigenvalues of the correlation matrix greater than 1. However, as
shown for instance in Jolliffe (1982) and Izenman (2008), the low-variance
principal components may also be important. In our case, although there is
only one eigenvalue greater than one, we have chosen the number of factors
to be three. When the variables are grouped into three factors, the
subsequent rotation using the Varimax method with Kaiser normalization
clearly reveals the following orthogonal factors: F1 (including P_in, d_r,
L, D), F2 (including P_L) and F3 (including P_H2). They account for 95.41%
of the total variability of the data sample. The choice of three factors is
justified as follows. Adding hydrogen leads to a two-fold increase of P_out,
which is an indisputable fact proven by experimental results (see Astadjov
et al. (1985, 1997b)), and so the factor F3 must not be overlooked. The P_L
variable (factor F2) also plays a special role: experiments have shown that
it noticeably affects laser generation. Omitting this variable leads to
regression models which do not provide sufficiently good estimates.
For a sample of size n = 109 at significance level α = 0.05, the
statistically significant factor loadings are those with absolute value over
0.5 (see SOLO (1993)). The factor loadings of the observed six input variables
are respectively: in factor F1 - P_in (0.913), d_r (0.887), D (0.807), L (0.769); in
factor F2 - P_L (−0.914); in factor F3 - P_H2 (0.929). The good quality of the
factor model is confirmed by the calculated reproduced correlation matrix,
for which there is only one non-redundant residual with absolute value greater
than 0.05 (it is equal to −0.059).
The factor scores which are used in this study have also been calculated
at this stage of the statistical analysis.
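The eigenvalue criterion and the share of explained variance discussed in this section can be illustrated with a short sketch. It is written in Python with NumPy rather than SPSS, covers only the unrotated principal components (no Varimax rotation, KMO or Bartlett test), and the function name `pca_summary` and the input matrix `X` (rows are experiments, columns are the six laser variables) are assumptions introduced for illustration only.

```python
import numpy as np

def pca_summary(X):
    """Eigenvalues of the correlation matrix, cumulative explained variance,
    and the number of components retained by the eigenvalue > 1 rule."""
    R = np.corrcoef(X, rowvar=False)           # correlation matrix of the columns
    eigvals = np.linalg.eigvalsh(R)[::-1]      # eigenvalues in descending order
    explained = np.cumsum(eigvals) / eigvals.sum()
    n_kaiser = int(np.sum(eigvals > 1.0))      # eigenvalue-greater-than-one rule
    return eigvals, explained, n_kaiser
```

With strongly collinear columns, the eigenvalue rule tends to return a single component, which is the situation described above; the cumulative explained variance then helps judge how many further components are worth keeping.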

4 Nonlinear regression model


4.1 Yeo-Johnson transformation of PCA factors
A careful examination of the generated PCA factor variables shows that their
relationships with the output laser power P_out are partially polynomial
rather than linear.
Various transformations are used in statistics to improve the distributional
properties of the data. In our case the standardized factor scores take both
positive and negative values. For this reason, we use the following
Yeo-Johnson transformation (Yeo and Johnson (2000)):

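\[
\psi_{Y-J}(\lambda, x) =
\begin{cases}
\{(x + 1)^{\lambda} - 1\}/\lambda, & x \ge 0,\ \lambda \ne 0, \\
\log(x + 1), & x \ge 0,\ \lambda = 0, \\
-\{(-x + 1)^{2-\lambda} - 1\}/(2 - \lambda), & x < 0,\ \lambda \ne 2, \\
-\log(-x + 1), & x < 0,\ \lambda = 2.
\end{cases}
\]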

Here x is the variable to be transformed and λ is a parameter. The
Yeo-Johnson transformation has a number of good properties, including
continuous first and second derivatives with respect to λ; usually
λ ∈ [−2, 2].
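The four branches of the transformation can be written down directly. The following is a minimal sketch in Python for a scalar argument; the function name `yeo_johnson` and the tolerance `eps` used to detect the special cases λ = 0 and λ = 2 are assumptions made for this illustration and are not part of the original Mathematica code.

```python
import math

def yeo_johnson(lam: float, x: float, eps: float = 1e-12) -> float:
    """Yeo-Johnson transformation psi_{Y-J}(lambda, x) of a scalar x."""
    if x >= 0:
        if abs(lam) < eps:                      # branch x >= 0, lambda = 0
            return math.log(x + 1.0)
        return ((x + 1.0) ** lam - 1.0) / lam   # branch x >= 0, lambda != 0
    if abs(lam - 2.0) < eps:                    # branch x < 0, lambda = 2
        return -math.log(-x + 1.0)
    # branch x < 0, lambda != 2
    return -((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)
```

For λ = 1 the transformation reduces to the identity on both half-lines, so a linear dependence on a factor is a special case of the transformed one.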

4.2 Construction and estimation of the nonlinear model


We seek a nonlinear model for estimating P_out of the following form:

\[
\hat{P}_{out}(\theta, \lambda) = \theta_0 + \theta_1 \psi_{Y-J}(\lambda_1, F_1)
  + \theta_2 \psi_{Y-J}(\lambda_2, F_2) + \theta_3 \psi_{Y-J}(\lambda_3, F_3), \qquad (1)
\]

where the parameters θi (i = 0, ..., 3) and λj (j = 1, ..., 3) are determined
using the least squares method.
In order to carry out the calculations we have written the compact Mathematica
code shown in Figure 1.
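The resulting parameters for the seven-parameter model (1) are

\[
\begin{aligned}
&\theta_0 = 39.735372, \quad \theta_1 = 27.167573, \quad \theta_2 = 4.456846, \quad \theta_3 = 11.777153, \\
&\lambda_1 = 1.290534, \quad \lambda_2 = 0.381756, \quad \lambda_3 = 0.767572.
\end{aligned}
\qquad (2)
\]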



Fig. 1. The Mathematica code for calculating the nonlinear model (1)-(2).

Figure 2 compares the experimental data for the laser output power P_out
with the values estimated by model (1)-(2). In particular, for the highest
experimental value P_out = 120 W, the model (1)-(2) gives an estimate of
122.639 W.
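To show how an estimate is obtained from model (1) with the parameters (2), the evaluation can be sketched in Python as below. The parameter values are those reported in (2); the function names and the factor scores passed in are hypothetical, since the factor scores of the individual experiments are not listed in the paper.

```python
import math

# Parameters (2) of model (1), as reported in the text.
THETA = (39.735372, 27.167573, 4.456846, 11.777153)
LAMBDA = (1.290534, 0.381756, 0.767572)

def yeo_johnson(lam, x, eps=1e-12):
    """Yeo-Johnson transformation of a scalar x (all four branches)."""
    if x >= 0:
        return math.log(x + 1.0) if abs(lam) < eps else ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < eps:
        return -math.log(-x + 1.0)
    return -((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam)

def predict_p_out(f1, f2, f3, theta=THETA, lam=LAMBDA):
    """Estimate P_out (W) from the three factor scores using model (1)."""
    factors = (f1, f2, f3)
    return theta[0] + sum(
        t * yeo_johnson(l, f) for t, l, f in zip(theta[1:], lam, factors)
    )
```

Because the transformation maps 0 to 0, a case with all three standardized factor scores equal to zero is estimated at θ0 = 39.735372 W, which gives a quick sanity check of the implementation.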
The main results from ANOVA are shown in Table 1. Further diagnostics
of the model give a maximum parameter-effects curvature twice as large as the
critical value of the 95% confidence region of the fit curvature. This
corroborates that a nonlinear regression model is more appropriate for our
data than a linear one. Finally, no high asymptotic correlation between
parameters is observed for any pair, which supports the correctness of model
(1)-(2).
The calculations were carried out using double precision arithmetic on a
dual core personal computer and took approximately 30 minutes.
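The least squares estimation of the seven parameters of model (1) can be sketched as follows. This is not the authors' Mathematica code from Figure 1 but an illustrative Python version that assumes NumPy and SciPy are available; the function names, the starting values (all λ equal to 1, i.e. a linear model) and the bounds λ ∈ [−2, 2] are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import least_squares

def yeo_johnson(lam, x, eps=1e-8):
    """Vectorised Yeo-Johnson transformation of an array x for a scalar lambda."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    if abs(lam) < eps:
        out[pos] = np.log1p(x[pos])
    else:
        out[pos] = ((x[pos] + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < eps:
        out[~pos] = -np.log1p(-x[~pos])
    else:
        out[~pos] = -((1.0 - x[~pos]) ** (2.0 - lam) - 1.0) / (2.0 - lam)
    return out

def model(params, F):
    """Model (1): params = (theta0..theta3, lambda1..lambda3), F has 3 columns."""
    theta, lam = params[:4], params[4:]
    return theta[0] + sum(theta[j + 1] * yeo_johnson(lam[j], F[:, j]) for j in range(3))

def residuals(params, F, y):
    return model(params, F) - y

def initial_guess(y):
    """Assumed start: intercept at the mean of y, unit slopes, all lambda = 1."""
    return np.array([np.mean(y), 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])

def fit_model(F, y):
    """Minimise the sum of squared residuals with lambda restricted to [-2, 2]."""
    lower = np.array([-np.inf] * 4 + [-2.0] * 3)
    upper = np.array([np.inf] * 4 + [2.0] * 3)
    return least_squares(residuals, initial_guess(y), bounds=(lower, upper), args=(F, y))
```

Here `F` would be the n × 3 array of factor scores and `y` the measured output power; `fit_model(F, y).x` then holds the estimates in the order θ0, θ1, θ2, θ3, λ1, λ2, λ3. The result depends on the starting point, so in practice several starts may be worth trying.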

5 Assessment of the model predictive ability

In order to obtain a reliable estimate of the predictive ability of the
nonlinear model (1), we follow the common practice of using a data set
independent of that used to fit the model. The initial data sample was split
randomly into one training and one evaluation data set, containing
approximately 70% and 30% of the total cases, respectively. The training data
set was used to generate the model, which was then tested with the independent
evaluation data set.
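The random split described above can be sketched in a few lines of Python. The function name, the fixed seed and the rounding rule for the training-set size are assumptions for illustration; the paper states only that the split was random and approximately 70% to 30%.

```python
import random

def split_indices(n_cases, train_fraction=0.7, seed=0):
    """Randomly partition case indices 0..n_cases-1 into training and evaluation sets."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    n_train = round(train_fraction * n_cases)   # about 70% of the cases
    return sorted(indices[:n_train]), sorted(indices[n_train:])
```

For the n = 109 cases used here, this rounding rule gives 76 training and 33 evaluation cases.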
The following are the parameters of the nonlinear regression model of
type (1) for the randomly chosen 70% training data set from all 109 cases:

\[
\begin{aligned}
&\theta_0 = 40.063269, \quad \theta_1 = 26.973166, \quad \theta_2 = 4.283957, \quad \theta_3 = 11.859342, \\
&\lambda_1 = 1.257025, \quad \lambda_2 = 0.389093, \quad \lambda_3 = 0.767572.
\end{aligned}
\qquad (3)
\]



Fig. 2. The observed vs estimated values of laser generation P_out.

The predicted values for the 30% evaluation data set are compared with the
known values of P_out in Figure 3.
The basic statistics for the considered cases are given in Table 1.

Model                             R^2     R^2 adj.   Std. Err. of the estimates
Model (1), (2)                    0.952   0.950      7.58979
Model (1), (3) for a 70% subset   0.950   0.948      7.94853
Model (1), (3) for a 30% subset   0.955   0.954      6.96404

Table 1. Basic statistics of the nonlinear regression model (1) and cross
evaluation on the randomly extracted 70% and 30% sets.
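The statistics reported in Table 1 are of the standard kind and can be computed from observed and estimated values as sketched below in Python. The function name is hypothetical, and the degrees-of-freedom convention (n minus the number of fitted parameters, intercept included) is an assumption, since the paper does not state which convention the software used.

```python
import math

def fit_statistics(y, y_hat, n_params):
    """Return (R^2, adjusted R^2, standard error of the estimate).

    n_params counts all fitted parameters including the intercept (assumed convention).
    """
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((obs - est) ** 2 for obs, est in zip(y, y_hat))   # residual sum of squares
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)                 # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    std_err = math.sqrt(ss_res / (n - n_params))
    return r2, r2_adj, std_err
```

For model (1), `n_params` would be 7 under this convention; a different convention changes the adjusted R^2 and the standard error slightly but not R^2 itself.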

6 Discussion and conclusions

The constructed models are compared on the basis of the quality of the
calculated estimates of the laser output power and the results from the cross
evaluation of the model. The results in Table 1 show that the nonlinear model
(1), (2) fits the data very well. The statistics of model (1), (3) are also
relatively good and fall only a little behind those of (1), (2). Substituting
into (1), (3) the values from the 30% evaluation data set, which were not used
in estimating the parameters (3), confirms the good quality of the constructed
models. We can conclude that

Fig. 3. Predicted values of P_out compared to the observed values for the 30%
evaluation data set.

nonlinear models of the suggested type are stable and fit the data well. The
statistics of these estimates exceed the corresponding statistics obtained for
the same data set using MLR. They are almost equal to the statistics of the
second-degree polynomial regression and fall behind the accuracy of the
third-degree polynomial regression and of the MARS models based on linear
regression splines and splines with first and second order interactions (see
Gocheva-Ilieva and Iliev (2010)).
One can conclude that the obtained nonlinear regression model is fully
applicable for estimation and prediction of the output laser power.

Acknowledgements
This study was conducted with the financial support of the Scientific
National Fund of the Bulgarian Ministry of Education, Youth and Science,
project number VU-MI-205/2006, and the Scientific Fund of Plovdiv University
Paisii Hilendarski - NPD, projects RS2009-M-13 and IS-M-4.

References
ASTADJOV, D. N., DIMITROV, K. D., JONES, D. R., KIRKOV, V. K., LITTLE,
C. E., LITTLE, N. and SABOTINOV, N. V. (1997a): Influence on operating
characteristics of scaling sealed-off CuBr lasers in active length. Optics
Communications 135, 289-294.
ASTADJOV, D. N., DIMITROV, K. D., JONES, D. R., KIRKOV, V. K., LITTLE,
C. E., LITTLE, N. and SABOTINOV, N. V. (1997b): Copper bromide laser
of 120-W average output power. IEEE Journal of Quantum Electronics 33,
705-709.
ASTADJOV, D. N., DIMITROV, K. D., LITTLE, C. E. and SABOTINOV, N. V.
(1994): A CuBr laser with 1.4 W/cm^3 average output power. IEEE Journal of
Quantum Electronics 30, 1358-1360.
ASTADJOV, D. N., SABOTINOV, N. V. and VUCHKOV, N. K. (1985): Effect of
hydrogen on CuBr laser power and efficiency. Optics Communications 56,
279-282.
DIMITROV, K. D. and SABOTINOV, N. V. (1996): High-power and high-efficiency
copper bromide vapor laser. SPIE 3052, 126-130.
FOSTER, P. G. (2005): Industrial applications of copper bromide laser technology.
Ph.D. Thesis, University of Adelaide, School of Chemistry and Physics, Dept.
of Physics and Mathematical Physics, Australia.
FRIEDMAN, J. H. (1991): Multivariate adaptive regression splines (with
discussion). The Annals of Statistics 19 (1), 1-141.
GOCHEVA-ILIEVA, S. G. and ILIEV, I. P. (2010): Parametric and nonparametric
empirical regression models: case study of copper bromide laser generation.
Mathematical Problems in Engineering, Article ID 582732, 16 p.
ILIEV, I. P. and GOCHEVA-ILIEVA, S. G. (2007): Statistical techniques for
examining copper bromide laser parameters. In: T. E. Simos, G. Psihoyios and
Ch. Tsitouras (Eds.): Proceedings of International Conf. of Numerical Analysis
and Applied Mathematics, ICNAAM 2007, Corfu - Greece, Proc. AIP CP936.
Springer, New York, 267-270.
ILIEV, I. P., GOCHEVA-ILIEVA, S. G. and SABOTINOV, N. V. (2008a):
Statistical approach in planning experiments with a copper bromide vapor laser.
Quantum Electronics 38 (5), 436-440.
ILIEV, I. P., GOCHEVA-ILIEVA, S. G., ASTADJOV, D. N., DENEV, N. P. and
SABOTINOV, N. V. (2008b): Statistical analysis of the CuBr laser efficiency
improvement. Optics and Laser Technology 40 (4), 641-646.
ILIEV, I. P., GOCHEVA-ILIEVA, S. G. and SABOTINOV, N. V. (2009):
Classification analysis of CuBr laser parameters. Quantum Electronics 39 (2),
143-146.
IZENMAN, A. J. (2008): Modern Multivariate Statistical Techniques: Regression,
Classification, and Manifold Learning. Springer, New York.
JOLLIFFE, I. T. (1982): A note on the use of principal components in regression.
Journal of the Royal Statistical Society, Series C (Applied Statistics) 31,
300-303.
NATO CONTRACT SfP (2000): 97 2685, 50W Copper Bromide laser.
SABOTINOV, N. V. (2006): Metal vapor lasers. In: M. Endo and R. F. Walter
(Eds.): Gas Lasers. CRC Press, Boca Raton, 449-494.
SOLO (1993): Computation with solo power analysis. BMDP Statistical Software
Inc., LA.
STOILOV, V. M., ASTADJOV, D. N., VUCHKOV, N. K. and SABOTINOV, N. V.
(2000): High spatial intensity 10 W-CuBr laser with hydrogen additives.
Optics and Quantum Electronics 32, 1209-1217.
YEO, I. K. and JOHNSON, R. A. (2000): A new family of power transformations
to improve normality or symmetry. Biometrika, Oxford Press 87 (4), 954-959.
