Optimizing Health Care Assignments and Procedures With AI

OPTIMIZING HEALTH CARE IN HOSPITALS, EMERGENCY ROOMS AND MEDICAL CENTERS
07/08/2016
An approach based on AI tools and Bayesian decision
An alternative based on AI adaptive
software, designed to tackle the challenge
of providing effective medical care,
better defining the delicate borders
between emergency care, community
medical center, and hospitalization.
I – The context of the health care assistance in developing countries

Persistent shortcomings and inadequacies of the public health service have led, over almost a century, to
the creation of various levels and complementary networks of medical service. In addition to general
hospitals, units specialized in trauma and orthopedics were created, as well as health posts and, more
recently, in Brazil, the UPAs (“Unidades de Pronto Atendimento” – small community medical centers that
provide first-care assistance). These small units were designed to extend the reach into poorer zones through
wider and deeper capillarity, combining screening with outpatient care: they filter and absorb cases more
severe than those normally served by health posts but which do not require the rigid structures (or the long,
bureaucratic processes) of a conventional hospital, and redirect only the cases that require further and
more complex assistance.
However, chronic deficiency in the hospitals’ service structure prevents the most severe and complex cases
from being forwarded to them, so many patients who require a hospital structure end up staying in UPAs in a
non-declared status of admission, without the possibility of receiving adequate hospital care. By overcrowding
beds and seats, this state of affairs prevents the UPAs from properly providing outpatient treatment, thereby
undermining the very purpose for which such small units were designed.
Once technical capacity for an "internment by default" was created, the overcrowding of UPAs became
inevitable: patients, hospitals and bigger medical centers would predictably use them as an overflow
recipient for their own overwhelmed capacity, undermining the UPA’s original purpose and simply extending
the hospitals' pre-existing lack of structure to the networks of community health centers. This further
complicates the solution, as a result of increased management complexity and reduced scale.
The proposal that follows offers a way to manage the logistics of care in an integrated network (hospitals,
UPAs, health posts, bigger community medical centers) in order to minimize or even eliminate the
aforementioned problems.
II – Objectives of the proposed solution

Our proposed solution is a combination of adaptive AI tools (mainly neural networks optimized by
genetic algorithms) that feed a Bayesian decision engine, as described in this text. Armed with this
combination, we aim to tackle the aforementioned problems through:
➢ Streamlining and increasing the precision of the screening process and of the patient's diagnosis;
➢ Allowing a faster decision on whether to treat the patient locally at the UPA or to forward the patient
to a bigger unit;
➢ Forecasting the evolution of the case and the length and cost of the patient's stay.
Copyright IntelliSearch – Experteam Consultoria Ltda. (text) and Ward Systems (screen shots) - Page 1
III – Application outline

The solution consists of two main layers, as illustrated below. The most fundamental one absorbs all the
“adaptive” demands, as it handles all the aspects inherent to the evolution of a pathology (the collective
view) with time. In this layer, we concentrated the AI tools. The top layer, consisting of a Bayesian engine,
receives various inputs, including those from the adaptive layer, and elaborates suggestions following a
methodological approach of decision analysis. It also serves as an interface between users (doctors, nurses
and hospital administrators) and the components of the adaptive layer.
Fig. 1: Two-layer architecture. Decision layer: mobile interfaces; decision making (Bayesian engine); health
care integrated DB (available beds and resources). Adaptive layer: classification model (pathology
identification); predictive model (evolution of the patient case); prediction model (patient stay and cost).
Other important inputs to the decision layer come from already existing external databases containing
information on availability of resources and statistics involving the public health care system.
We proceed with the details of each component of the solution, in a bottom-up description, which we
consider the most appropriate for the correct understanding.
IV – Solution components
IV.1 – Classification model for pathology identification

The most fundamental module in the adaptive layer, the pathology categorizer, is essential to the process
that culminates with appropriate decisions in each case.
The challenge of categorizing a disease from an often incomplete anamnesis (limited to an account of
the symptoms that afflict the patient at the time of consultation) and from a clinical history that is often
fragmented or completely absent / unavailable is quite intimidating.
Approaches such as checklists, flowcharts or decision trees with rigid and fixed alternatives often lead to a
wrong pre-diagnosis and therefore to an improper direction of the patient. As a simplified, hypothetical
example, for didactic purposes only, imagine the following flowchart as a pre-diagnostic tool:
Fig. 2: Hypothetical pre-diagnosis flowchart. Yes/No tests: Fever?, Rash?, Muscle pain?, Respiratory
symptoms?, Gastric symptoms?, Abdominal pain?. Outcomes: Dengue, Flu, Common Cold.
If the health care unit (UPA or medical post) serves a population with darker skin (because of regular sun
exposure or as an outcome of natural pigmentation), or if that specific community is being affected by a
virus mutation, rash may not be a valid symptom, or may not be detectable at all. This would lead to a wrong
pre-diagnosis and consequently to a misdirection. The connection between the negative answer to “fever” and
the test on gastric symptoms can also be misleading, particularly if a patient with dengue is still on the
upward curve of temperature, or has taken a medicine with antipyretic properties without being aware of
the fact.
The limitations of these methods stem from three basic weaknesses:
➢ They are fully deterministic and non-adaptive, so even small changes (in frequency) or gaps in the
typical symptoms list will likely lead to a completely wrong logical path;
➢ The typical flowchart branching prevents multiple aspects (located in different branches of the tree)
from being analyzed together, as a whole. In the above example, if the patient has fever and respiratory
symptoms, the test of gastric symptoms is skipped altogether, although it could be important to
differentiate, for example, dengue from zika;
➢ The hierarchy / priority inevitably imposed by the tree-like structure can in many cases completely
reverse what would be the proper sequence of tests. Ex.: In some cases, the test of (or inquiry about)
gastric symptoms should precede that of respiratory symptoms.
Solutions based on adaptive methods, such as artificial neural network systems (ANNs), are not fooled by
changes or premature deviations caused by the branching of a typical flowchart or decision tree, and the
underlying analysis always takes into account all relevant factors that have been collected during the
anamnesis procedure, thereby avoiding the pitfalls listed above.
We use, as the core of our solution for categorizing the pathology, the NeuroShell Classifier, from
Ward Systems, for its flexibility, adaptive capacity and accuracy of results, thanks to its architecture of
an ANN supervised by a Genetic Algorithm (GA). It is a tool that has been on the market for over 15 years,
constantly updated and improved, with thousands of customers, including users in health care applications.
In some cases, during the training phase of the neural network, we supplement the NS Classifier with Chaos
Hunter, also from Ward Systems and equally time-tested and robust.
In the following text we describe the complete process, illustrating it with examples of actual screens taken
from the system. The first one (Fig. 3) shows the variables used in training the neural network. On this
screen we pick the input variables of the model as well as its “output” (the value to be predicted in a
production regime). In modeling jargon, we are choosing the independent variables and the dependent variable
(the one whose value will be predicted by the model), though, a priori, we cannot say that the input variables
are independent from each other. But as we are dealing with models based on neural networks, inherently
nonlinear and non-regressive, the fact that the input variables may bear some form of interdependence
(provided they do not completely determine each other) does not affect the end quality
(discrimination capability) of the model.
Note that all variables display numeric values, which is an absolute need when it comes to ANNs. Some
variables have their domains within the set of Real (continuous) numbers, others are pure integers (discrete
or categorical) and others yet are Boolean (binary). Nevertheless, many inputs that come out of a regular
anamnesis procedure are alphanumeric, and one of the interface functions mentioned in topic III is precisely
to translate alphabetic / alphanumeric data into numerical data (and vice versa, when presenting results to
the user). Moreover, even for originally numerical data, the interface normalizes the respective
domains. Thus, for example, systolic and diastolic blood pressures must be converted to numbers in a
continuous range from 0 to 1 rather than entered at their face value in mmHg. One of the values
IntelliSearch adds to the solution is the set of data transformation and normalization methodologies
embedded in this interface.
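The translation and normalization step described above can be sketched as follows. This is a minimal
illustration only: the code table for rash, the blood pressure bounds and all function names are assumptions
made for the example, not the IntelliSearch methodologies, which the text does not detail.

```python
# Hypothetical physiological bounds; the paper does not state the ranges it uses.
SYSTOLIC_RANGE_MMHG = (70.0, 250.0)
DIASTOLIC_RANGE_MMHG = (40.0, 150.0)

# Hypothetical code table for an alphanumeric anamnesis answer.
RASH_CODES = {"none": 0, "localized": 1, "generalized": 2}


def min_max(value, low, high):
    """Map value from [low, high] to [0, 1], clipping out-of-range readings."""
    clipped = min(max(value, low), high)
    return (clipped - low) / (high - low)


def encode_record(record):
    """Translate one anamnesis record into purely numeric model inputs."""
    return {
        "fever": 1 if record["fever"] == "Y" else 0,
        "rash": RASH_CODES[record["rash"]],
        "systolic": min_max(record["systolic_mmhg"], *SYSTOLIC_RANGE_MMHG),
        "diastolic": min_max(record["diastolic_mmhg"], *DIASTOLIC_RANGE_MMHG),
    }


def decode_rash(code):
    """Reverse translation, used when presenting results to the user."""
    return {number: label for label, number in RASH_CODES.items()}[code]


example = encode_record(
    {"fever": "Y", "rash": "localized", "systolic_mmhg": 160, "diastolic_mmhg": 95}
)
```

With the assumed bounds, both pressures in the example map to 0.5. The same bounds must be reused in
production, otherwise the network would receive inputs on a different scale from the one it was trained on.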
In the same screen we also selected training of the neural network with genetic algorithm optimization, an
important feature of Ward Systems’ software (the GA "supervises" the ANN). This reduces the tendency to
"overfit" the data set used in a specific training cycle (typical behavior of neural networks trained without
such supervision) and, as a consequence, improves the generalization ability of the predictive model. The
genetic training method in NeuroShell Classifier is a Probabilistic Neural Net (PNN), which uses a GA to find
the weights for each input.
Fig. 3
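To make the idea of a PNN whose input weights are found by a GA more concrete, here is a toy sketch. It is
not NeuroShell Classifier's implementation, which is proprietary: the Gaussian kernel, the smoothing
parameter `sigma`, the equal class priors and the GA settings (elitist selection, uniform crossover, Gaussian
mutation) are all assumptions chosen for brevity.

```python
import math
import random


def pnn_predict(train_x, train_y, query, weights, sigma=0.3):
    """Classify `query`: the class with the largest mean Gaussian kernel wins."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        # Each input difference is scaled by its weight; a weight of 0 ignores the input.
        dist2 = sum((w * (q - r)) ** 2 for w, q, r in zip(weights, query, row))
        sums[label] = sums.get(label, 0.0) + math.exp(-dist2 / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    return max(sums, key=lambda c: sums[c] / counts[c])


def accuracy(weights, train_x, train_y, val_x, val_y):
    """Fraction of validation rows classified correctly with the given weights."""
    hits = sum(pnn_predict(train_x, train_y, q, weights) == y for q, y in zip(val_x, val_y))
    return hits / len(val_y)


def evolve_weights(train_x, train_y, val_x, val_y, generations=15, pop_size=12, seed=0):
    """Toy GA searching input weights in [0, 1] that maximize validation accuracy."""
    rng = random.Random(seed)
    n_inputs = len(train_x[0])

    def score(candidate):
        return accuracy(candidate, train_x, train_y, val_x, val_y)

    population = [[rng.random() for _ in range(n_inputs)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # elitist selection: best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Gaussian mutation, clipped back into [0, 1].
            children.append([min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in child])
        population = parents + children
    return max(population, key=score)
```

The evolved weights play the role of the "relative importance of inputs" discussed in this section: an input
whose weight drifts toward zero contributes little to the distance and hence to the classification.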
The data on the next screen illustrate the types of variables and their domains. The dependent variable
(the right-most column), the category of the pathology (patology-group), is a categorical variable: it takes
discrete integer values from 1 to 6, each representing a category. Ex.: 1 means cardiovascular disease, 2 a
respiratory illness, 3 an endocrine illness and so on.
The independent input variables fill all other columns except the leftmost one (just the patient ID). We
illustrate some examples, more specifically:
• Categorical discrete variables, such as cardiovascular history (Hist-card) and the type (localized or not
and where) of rash detected (rash);
• Discrete non-categorical variables, typically counts, fed at their face value, such as the number of the
patient's previous visits (prev-visit) to the health care unit, and the number of respiratory symptoms
reported by the patient (resp-signs);
• Continuous variables input at their face value, such as age (age) and number of days already under
observation in the unit, formally admitted or not (in days);
• Boolean variables (binary), telling whether the patient has fever (fever) or abdominal pain (ab-pain), or
has had a previous visit with the same symptoms without getting a successful diagnosis (prev-hist);
• Continuous variables that have gone through normalization of their domains, such as the clinical
score (clin-score) and red and white blood counts (ind4 and ind5).
Fig. 4
Note that this is an extremely simplified example for illustrative and didactic purposes only, with fewer
variables than normally used. There is no limit to the number of independent variables used as input,
although there should be, for each network, only one dependent variable. When modeling additional
dependent variables, one can use different networks in parallel or in sequence (when an output from one
network is used as an independent input variable to another). Indeed, in most applications for disease
classification, the most common approach is to use more than one network, the first classifying into large
categories, as illustrated herein. The output of this network (one of the six categories) serves as an
independent input variable to the following networks (one for each category), which define the pathology’s
subcategory. Ex.: The first network sets the category as endocrine illness and the second (using the result
of the first as input) pins down the definition as thyroid disorder.
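The sequential arrangement just described can be sketched as a small routing function. The networks below
are hard-coded stand-ins, not trained models, and every name and rule is an assumption for illustration; in
the real application each callable would wrap a run-time module saved from the training tool.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A "network" is anything that maps numeric inputs to an integer class code.
Network = Callable[[Sequence[float]], int]


def classify_in_cascade(
    inputs: Sequence[float],
    category_net: Network,
    subcategory_nets: Dict[int, Network],
) -> Tuple[int, int]:
    """Run the first-level network, then the second-level network of that category."""
    category = category_net(inputs)
    # The first network's output becomes an extra input of the second network.
    extended_inputs: List[float] = list(inputs) + [float(category)]
    subcategory = subcategory_nets[category](extended_inputs)
    return category, subcategory


# Stand-in networks with hypothetical rules (2 = respiratory, 3 = endocrine).
def toy_category_net(inputs: Sequence[float]) -> int:
    return 3 if inputs[0] > 0.5 else 2


def toy_endocrine_net(inputs: Sequence[float]) -> int:
    return 1  # hypothetical code for "thyroid disorder"


def toy_respiratory_net(inputs: Sequence[float]) -> int:
    return 4  # hypothetical respiratory subcategory code


result = classify_in_cascade(
    [0.7, 0.2], toy_category_net, {2: toy_respiratory_net, 3: toy_endocrine_net}
)
```

Keeping one second-level network per category means each of them can be retrained independently when new
data arrives for that category only.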
It is also important to note that the data fed to the network for its training are actually observed data,
including the values of the dependent variable (to be output in production runs), the pathology category.
That is, the patient with id 828 was actually diagnosed with a disease belonging to category 5
(gastroenterological).
After uploading the historical data, the next step is to train the neural network with such sample data. Given that we have
previously selected (Fig. 3) optimization (supervision) by genetic algorithms, one of the charts available to monitor the
training process is the vertical histogram showing the relative importance of the (independent) input variables, as depicted in
Fig. 5. This chart is very important because it provides essential feedback to the model designers: it may indicate
variables that are not significant for predicting the output dependent variable (the pathology categorization
in this case), potentially suggesting that they be replaced by others that may be more influential.
Fig. 5
Fig. 5 and Fig. 6 illustrate the adaptive capacity of the neural network. After adding more data to the network,
usually more recent cases, the “rash” variable shows (Fig. 6) a substantial drop of importance in determining
the category of pathology. The relative weights of the other variables also changed. This happens simply, and
automatically, just by adding data to the network training cycle, without having to reflect the changes in any
program code.
Fig. 6
The larger the sample data (rows and columns), the more time-consuming and demanding of computer resources
the network training process becomes. To speed up training, we can use strategies to distribute
the processing among many computers (automatically or manually).
After training the network, we test its performance in classifying the patient's pathology. The scoring
matrix (for illustrative purposes only, like this whole paper) is shown below. The better the predictive
ability of the network, the more counts concentrate on the diagonal with the pink background.
Fig. 7
Also shown on the same screen are the percentages of false positives and false negatives, their counterparts
(true positives and true negatives), and the sensitivity and specificity coefficients. Another important
chart is the ROC curve shown below. The better the predictive / classification ability of the model, the
closer the apex of the bend gets to the upper left corner of the graph.
Fig. 8
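For readers who want to reproduce the scoring matrix and the sensitivity and specificity figures outside the
tool, the sketch below shows the standard one-versus-rest computation. The function names and the sample
labels are illustrative assumptions, not output from NeuroShell Classifier.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are observed classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def sensitivity_specificity(matrix, k):
    """One-versus-rest sensitivity and specificity for the class at position k."""
    total = sum(sum(row) for row in matrix)
    true_pos = matrix[k][k]
    false_neg = sum(matrix[k]) - true_pos
    false_pos = sum(row[k] for row in matrix) - true_pos
    true_neg = total - true_pos - false_neg - false_pos
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical observed and predicted pathology categories for six patients.
observed = [1, 1, 1, 2, 2, 3]
predicted = [1, 1, 2, 2, 2, 3]
matrix = confusion_matrix(observed, predicted, classes=[1, 2, 3])
sens_cat1, spec_cat1 = sensitivity_specificity(matrix, 0)
```

In this toy sample one category-1 patient is misclassified as category 2, so the off-diagonal cell
`matrix[0][1]` is 1, sensitivity for category 1 is 2/3 and its specificity is 1.0.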
Network training should be re-run periodically, on a monthly basis or so, always adding new data and, when appropriate,
discarding variables that start showing low influence. It is the repetition of the training process, exposing the supervised
neural network to new data, that drives the adaptive feature of neural networks, making it "learn from experience",
thereby increasing its predictive / classification capability.
After successfully concluding the training process, the developer can save the classification matrix and the ROC curve for
documentation purposes of the training cycles, as well as the final statistics, illustrated in the following report, which can be
printed, saved to PDF or exported to Excel, for example.
Fig. 9
Finally, the developer can save the network (run-time format) for use within the application.
Fig. 10
The run-time module (.NET extension) can be used as a called routine in virtually any application (which
must pass it the parameters, i.e., the independent variables), but it can also be run directly by Ward Systems’
run-time launcher (Fire NeuroShell), as shown below, which may be useful when running additional tests
before embedding it into the final app.
Fig. 11
IV.2 – Prediction model of patient’s length and cost of stay

For didactic purposes, we will now discuss the module that predicts the patient's length and cost of stay in
the health care system.
Unlike the pathology classifier, where the dependent variable (to be the prediction output) is categorical (6
different classifications), so with a discrete domain, here we are dealing with a continuous dependent
variable (domain contained in the set of Real numbers). Hence the proper tools here are NeuroShell
Predictor and / or Chaos Hunter. We follow on with an illustrative example, aimed to infer the patient's
length of stay under observation, which is also simplified for didactic purposes.
As in the case of the pathology categorization module, before we start the training process of the neural
network, we must import the data and select which variables are independent and which is dependent (to be
predicted), as well as the training method (pure, unsupervised neural network or optimized by genetic
algorithms). All this information is entered in the fields of a configuration window identical to that
depicted in Fig. 3.
In this specific case, the 13 independent input variables, displayed in the 13 rightmost columns of the
spreadsheet (2-14) below (fig. 12), are:
• Categorical discrete variables, such as household income range (seg-renda), patient’s profession (nature),
and whether she/he has any kind of health insurance and what its coverage is (indplan1);
• Non-categorical, discrete variables, typically counts, fed at their face value, such as the number of
visits to the unit by the same patient that ended up in a stay longer than expected (ind2);
• Continuous variables fed at their face value, such as age range (seg-idade) and number of days
already under observation in the health care unit (days-in);
• Boolean variables (binary), the first one (empregado) about the patient’s employment status and the
other one (precedentes) telling if the patient has been previously admitted to the system (followed
by hospitalization) reporting the very same symptoms;
• Continuous variables submitted to normalization of their domains, such as the clinical score (ind3)
and diastolic and systolic blood pressure (ind4 and ind5 respectively);
• The variable "Gravidade" (severity) summarizes the expectations regarding the evolution of the
patient’s case. In the production system, this variable will be provided as output (predicted value) by
the module "Predictive model for the evolution (severity) of the patient’s case", presented in the
next section. Here, however, in the neural network training phase, the value supplied in this column
should be the severity degree actually observed.
• Similarly, the variable "pathology-group", at this stage an independent input variable, will, in a
production run, receive the classification output from the pathology classification module
described in the previous section. Here, in the neural network training phase, the value supplied in
this column should be the pathology category actually observed for that specific patient (row in the
input spreadsheet).
“Permanencia”, the dependent output variable of this model in future production runs, will tell the
predicted length of stay of the patient in the health care unit (or in the health care system as a whole). Here,
for network training purposes, we must load it as input, with the actually observed values.
Fig. 12
Once we have completed the input of historical (actual) data set for training, we can choose which
performance parameter we will use to assess the predictive ability of the model. In this case, we chose to
minimize the mean square error (MSE). These same options shown below (fig. 13) are also available in
NeuroShell Classifier (used in pathology classifier module) and Chaos Hunter.
Fig. 13
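The performance parameters mentioned in this document (MSE, and elsewhere RMSE and R2) follow their standard
definitions, sketched below in plain Python. The function names and the sample values are assumptions for
illustration; the tools compute these figures internally.

```python
import math


def mse(observed, predicted):
    """Mean square error: average of the squared prediction errors."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, expressed in the same unit as the output (e.g. days)."""
    return math.sqrt(mse(observed, predicted))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed vs. predicted lengths of stay, in days.
stay_observed = [2.0, 4.0, 6.0, 8.0]
stay_predicted = [3.0, 4.0, 5.0, 8.0]
```

For these sample values the MSE is 0.5 and R2 is 0.9. Minimizing MSE penalizes large individual errors more
heavily than small ones, which is usually desirable when a badly underestimated stay is costly.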
The following step is to run the neural network training cycle, which makes available, among other
information, the chart depicting the relative importance of inputs. This chart is very useful for model
management, allowing the model developer to add new variables and/or delete old ones that fail to
contribute. The graph is structurally identical to that of Fig. 6.
Fig. 14
An equally important chart to monitor the training cycle is the "Learning Level" line graph that shows the
evolution of the parameter chosen to measure the predictive ability of the model (in our case, the mean
square error). When the curve becomes asymptotic after several generations of neural networks created by
the genetic algorithm, the developer may stop the training, not only to save time and processing power, but
also to reduce the tendency to "overfit" the resulting neural network to the set of data used for training,
thereby increasing its generalizing capability.
Fig. 15
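The decision to stop once the learning curve flattens can also be automated. The sketch below assumes a
simple "patience" rule (stop after a given number of generations without improvement of the best error); the
function name, the rule and its parameters are assumptions for illustration, not a description of how the
Ward Systems tools decide internally.

```python
def should_stop(error_history, patience, min_delta=0.0):
    """Return True when the last `patience` generations brought no improvement.

    error_history holds one error value (e.g. MSE) per generation, oldest first.
    An improvement must beat the earlier best error by more than min_delta.
    """
    if len(error_history) <= patience:
        return False  # not enough generations yet to judge
    best_before = min(error_history[:-patience])
    best_recent = min(error_history[-patience:])
    return best_recent >= best_before - min_delta


# Hypothetical MSE per generation: the curve flattens at 3.5.
flat_curve = [10.0, 6.0, 4.0, 3.5, 3.5, 3.5, 3.5]
still_improving = [10.0, 6.0, 4.0, 3.0]
```

With a patience of 3, `flat_curve` triggers the stop while `still_improving` does not. A larger patience
costs processing time but lowers the risk of stopping on a temporary plateau.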
Upon completion of the training cycle, the network can be tested by applying it not only to the data used for
training (“in-sample”), but to a larger set, preferably disjoint from that used during training ("out-of-sample").
The chart below compares the predicted value against the one actually observed in each line of the sample
tested (in-sample, out-of-sample or both samples combined).
Fig. 16
An additional chart that helps to visualize the predictive ability of the model is the following scatter
plot. For models whose dependent “output” is a continuous variable, it is the analogue of the classification
matrix, which is used for models where the output (dependent variable) is a discrete (categorical) variable,
such as the pathology categorizer discussed in the previous section (see Fig. 7).
Fig. 17
The two graphs above (Figures 16 and 17) show the test results of a neural network whose training cycle
was interrupted after only three generations created by the genetic algorithm. To illustrate the effect of a
longer training cycle for the same data, we show below similar charts resulting from the test of a network
whose training cycle went through 32 generations (the cycle shown in Fig. 15), using the same data, thereby
demonstrating the significant gain in predictive ability ("fitting").
Fig. 18
Fig. 19
When the model reaches a level of performance (predictive power) considered good enough by the
developer, she/he can save the network as a run-time module as shown in Figure 10 (page 8).
Here we have shown a predictive model for the time of the patient's stay. For a model intended to predict
the cost, a second network is needed, since the output variable (dependent) is a different one. Certainly the
model will also have a different set of input variables, but the structure and operation are similar to those
exemplified herein.
IV.3 – Predictive model for the evolution of the patient’s case

The purpose of this model is to estimate the evolution of the patient’s case, having obtained the category
and subcategory of the pathology that affects her/him, and other clinical data and medical history, which will
serve as input (independent variables). As the dependent variable - indicator of the maximum expected
severity - is a continuous one, the proper Ward Systems tools to be used here are NeuroShell Predictor,
already shown, and/or Chaos Hunter.
Since the operation differs very little from that of the module described in the previous section, and there
is nothing new from the conceptual point of view, we will use the following example to demonstrate the
features of Chaos Hunter, not yet shown in this document.
The selection of variables (independent and dependent) is identical to the same process in
NeuroShell Classifier and NeuroShell Predictor, as shown in the following figure.
The only difference is that Chaos Hunter allows a simple method of data normalization by subtracting the
average, which improves the predictive capability and overall performance for very large domains. However,
our experience recommends additional specific normalization processes for each variable, and IntelliSearch
offers a range of data normalization functions that have proven highly productive, not only in improving the
prediction (reduced MSE/RMSE) but also in the training cycle performance (making it faster).
Fig. 20
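Normalization by subtracting the average, as mentioned above, amounts to mean-centering each variable. A
minimal sketch follows; the function name and the sample values are assumptions, and the additional
IntelliSearch normalization functions are not reproduced because the text does not describe them.

```python
def center(values):
    """Subtract the average from every value; also return the average for reuse."""
    mean = sum(values) / len(values)
    return [v - mean for v in values], mean


# Hypothetical raw readings of one input variable in the training sample.
centered, training_mean = center([100.0, 110.0, 120.0])

# In production, a new reading is shifted by the mean learned during training.
new_reading_centered = 125.0 - training_mean
```

The average should be computed on the training sample only and stored with the model, so that production
inputs are shifted by exactly the same amount.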
The sixteen input variables used in the training cycle follow, with short descriptions:
1. age: Patient age (continuous variable, fed at its face value)
2. stable: Tells if the patient is in a stable condition (Boolean Y/N)
3. active: Is the patient conscious and active? (Boolean Y/N)
4. bilirubin: Bilirubin level (continuous variable, normalized)
5. glicemic-status: Whether the patient is a chronic hypoglycemic, circumstantial hypoglycemic, controlled
pre-diabetic, uncontrolled pre-diabetic, controlled type 1 diabetic, controlled type 2 diabetic, uncontrolled
type 1 diabetic or uncontrolled type 2 diabetic (discrete variable / categorical)
6. cardiac-status: Patient’s cardiovascular status (discrete variable / categorical)
7. resp-status: The patient's respiratory status (discrete variable / categorical)
8. gastric-status: Gastric status of the patient (discrete variable / categorical)
9. nephr-status: Renal status of the patient (discrete variable / categorical)
10. histopath-status: Patient’s histopathological status (discrete variable / categorical)
11. average fever: Normalized continuous variable
12. avrg-score: Average value of the patient's clinical score (normalized continuous variable)
13. diastolic: Average diastolic pressure of the patient (normalized continuous variable)
14. systolic: Average systolic pressure of the patient (normalized continuous variable)
15. patology-group: Macro (upper level) classification of the patient's pathology (categorical variable). In
a production regime, this will be the outcome of the classification provided by the pathology categorization
module. Here, for the training cycle, we input the value actually observed in the patient’s specific case.
16. pathology-subgroup: Second level detailed classification of the patient's pathology (categorical
variable). In a production regime, this will be the result of the classification provided by the pathology
categorization module. Here, for the training cycle, we load the value actually observed in the patient’s
specific case.
The output dependent variable "Gravidade" (severity) is a continuous variable that summarizes the expected
evolution of the patient’s health status. In a production regime, it will be input to the “length of stay and
cost” inference module (IV-2) and to the Bayesian decision engine in the top application layer, presented in
the next section.
After loading the data, we select which subset of it will be used for training ("in-sample" or "optimization
set") and which one ("out-of-sample") will be used to test the model.
Fig. 21
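The partition into in-sample and out-of-sample rows can be sketched as a shuffled, disjoint split. The
function name, the 25% test fraction and the fixed random seed are assumptions for illustration; Chaos Hunter
offers its own selection options on the screen above.

```python
import random


def split_rows(rows, out_fraction=0.25, seed=42):
    """Shuffle the rows and return disjoint (in_sample, out_of_sample) lists."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_out = max(1, round(len(shuffled) * out_fraction))
    return shuffled[n_out:], shuffled[:n_out]


# Hypothetical patient row numbers standing in for the spreadsheet lines.
in_sample, out_of_sample = split_rows(list(range(20)))
```

Keeping the two sets disjoint is what makes the out-of-sample test a fair measure of the model's
generalizing capacity: no row used to fit the model is reused to judge it.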
Then we choose the optimization method. Chaos Hunter provides, in addition to the alternative of genetic
algorithms ("evolution strategy"), the Particle Swarm Optimization (PSO) approach. The choice of one or
the other depends on the extent to which the solution space contains "local optima" that can attract the PSO
algorithm and divert it from (preventing it from achieving) a "global optimum". Genetic algorithms are less
vulnerable (but not immune) to this effect, although the PSO strategy tends to converge ("swarm") faster to a
"global optimum" if we choose the right algorithm seeds. The IntelliSearch team developed a seed generation
algorithm that maximizes the dispersion over the solution space, thereby avoiding the pitfall of being
trapped in a "local optimum", for both the PSO and the genetic algorithm approaches.
Fig. 22
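The IntelliSearch seed generation algorithm itself is not described in this document. Purely to illustrate
what "maximizing dispersion over the solution space" can mean, the sketch below uses a generic greedy
farthest-point selection among random candidates; the technique, the function name and all parameters are
assumptions and should not be read as the proprietary method.

```python
import math
import random


def dispersed_seeds(n_seeds, bounds, n_candidates=200, seed=0):
    """Pick n_seeds starting points spread out inside the given per-variable bounds.

    Greedy farthest-point selection: each new seed is the random candidate whose
    nearest already-chosen seed is as far away as possible.
    Requires n_candidates >= n_seeds.
    """
    rng = random.Random(seed)
    candidates = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_candidates)]
    seeds = [candidates.pop()]
    while len(seeds) < n_seeds:
        best = max(candidates, key=lambda c: min(math.dist(c, s) for s in seeds))
        candidates.remove(best)
        seeds.append(best)
    return seeds


# Five starting points in a hypothetical two-parameter search space.
starting_points = dispersed_seeds(5, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Starting the swarm or the initial population from well-separated points makes it less likely that every
individual begins inside the basin of the same local optimum.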
The next step is to delimit the scope of mathematical functions that the training cycle of the model will use,
the so-called building blocks that will be included in the final formula - the model itself. This is an important
difference between Chaos Hunter and NeuroShell Predictor: Both derive a model for a continuous dependent
variable, from a set of input variables (independent). However, when using NeuroShell Predictor, the engine
is a General Regression Neural Network (GRNN). The genetic algorithm finds the correct weightings for each
of the inputs. These "mutations" of the neural network are refined until, through the "evolutionary" strategy,
they reach an ANN with maximum predictive capacity, according to the criteria predetermined by the
developer (e.g., higher R2 or lower MSE).
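At prediction time, a GRNN reduces to a kernel-weighted average of the observed training outputs. The sketch
below shows that core computation with one weight per input, as described above; the Gaussian kernel, the
smoothing parameter `sigma` and the function name are assumptions, and NeuroShell Predictor's actual
internals may differ.

```python
import math


def grnn_predict(train_x, train_y, query, weights, sigma=0.2):
    """Predict a continuous output as the kernel-weighted mean of training outputs."""
    numerator = 0.0
    denominator = 0.0
    for row, target in zip(train_x, train_y):
        # Weighted squared distance: inputs with larger weights influence the match more.
        dist2 = sum((w * (q - r)) ** 2 for w, q, r in zip(weights, query, row))
        kernel = math.exp(-dist2 / (2 * sigma ** 2))
        numerator += kernel * target
        denominator += kernel
    # Note: if the query is extremely far from every row, the kernels can underflow to 0.
    return numerator / denominator


# Two hypothetical training rows with one normalized input and the observed stay in days.
rows = [[0.0], [1.0]]
stays = [2.0, 4.0]
midpoint_prediction = grnn_predict(rows, stays, [0.5], weights=[1.0])
```

A query halfway between the two rows yields 3.0, the plain average. Setting an input's weight to zero makes
the prediction ignore that input altogether, which is why the weights found by the genetic algorithm can be
read as a measure of input importance.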
On the other hand, Chaos Hunter, although capable of generating models based on neural networks, also
allows the development of models involving more conventional mathematical functions, becoming the most
appropriate tool when the developer prefers to model the desired variable using non-linear regression
methods (logistic, polynomial, exponential / hypergeometric, logarithmic or trigonometric, for example).
Illustrating with a practical example, in the screen of the following figure, we choose, as mathematical
"building blocks", the logarithmic, exponential, Boolean and polynomial functions. We have not activated
neural networks to better demonstrate the difference. However, we activated the sigmoid function within
the set of neural networks to allow the genetic algorithm to develop a formula that may use some variation
of logistic regression.
In Chaos Hunter, the genetic algorithm will create several generations of formulas, and the differences from
one generation to another, the so-called "mutations" of the formulas, are obtained by changing exponents,
function arguments and coefficients, adding and excluding terms, including/excluding functions (delimited
by the pre-determined building blocks), and including and excluding input variables (among the 16 supplied to
the model) and constants. The greater the number of variables, data lines and building blocks selected, the
greater the precision, at the cost of a longer processing time and a greater demand on the computational
capacity required by the training cycle. That's why Chaos Hunter is often presented as an AI tool for
"Formula optimization".
Fig. 23
The following figure illustrates the screen of the model training process in its final stage. The larger, upper-
left window shows the evolution of the formula, adding a new formula each time a new mutation delivers a
higher R2 value than the one in the previous "best formula." The generation number in which the formula
was created is depicted in the first column of the window, in descending order. The next column displays the
value of R2, also in descending order, obviously (the Appendix to this document shows the same content in a
larger table).
Fig. 24
The third column of the same window presents the best formula obtained by the model for deriving the
severity of the patient's condition, using only six of the independent input variables (the model disregarded
the others because they did not add predictive capacity). Therefore, the formula that best predicts the
severity (“Gravidade”) of the patient’s case is the following:
Gravidade = sqrt(abs(cube(log((-2.677042)^2 + abs(histopath-status)^2)) * nephr-status + abs(bilirubin) * log(diastolic) + abs(1 / ((-2.677042) + active)) * exp(average fever)))
The independent variables used are: active, bilirubin, nephr-status, histopath-status, average fever and diastolic.
The window at the top right of figure 24 shows the same formula in a tree format.
The graph immediately below the two windows shows the evolution of R² over the several generations
created during the training cycle. The curve becomes asymptotic, and training is interrupted after 1500
generations without improvement of the parameter (a criterion defined by the developer - see fig. 22).
Taking the formula derived by Chaos Hunter, we can apply it to the "out-of-sample" data, in order to test
the predictive and generalizing capacity of the developed model.
The following graph, equivalent to that illustrated in Figures 16 and 18 (generated by NeuroShell Predictor),
compares the predicted versus observed values of the output variable "Gravidade" for each line of the
out-of-sample data. The graph under it, in green, shows the differences between the two sets.
Fig. 25
Like NS Predictor, Chaos Hunter can also present the test results in the more concise and expressive form
of a scatter plot, shown below for the "out-of-sample" and "in-sample" data (Figures 26 and 27,
respectively). The first graph shows that the model is slightly "optimistic" (which may be a desired
outcome) with respect to the higher severity levels. This is because we selected mostly polynomial
functions, but it can be corrected or balanced (if the developer so wishes) by adding trigonometric functions
to the formula “building blocks”.
Fig. 26
Fig. 27
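The out-of-sample check described above comes down to comparing predicted and observed values. The sketch below computes R² and the mean residual (predicted minus observed) restricted to the higher severity levels, whose sign shows in which direction the model deviates there. The two short lists are placeholder numbers chosen only to make the example runnable; they are not results from the paper, and the function names and the threshold of 7.0 are assumptions.

```python
def r_squared(observed, predicted):
    # Coefficient of determination between observed and predicted values.
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot

def mean_residual(observed, predicted, min_observed):
    # Average of (predicted - observed) over lines whose observed value
    # is at least `min_observed`.
    residuals = [p - o for o, p in zip(observed, predicted) if o >= min_observed]
    return sum(residuals) / len(residuals)

# Placeholder out-of-sample data (illustrative only).
observed = [2.0, 4.0, 6.0, 8.0, 9.0]
predicted = [2.5, 3.5, 6.0, 7.0, 8.0]

r2 = r_squared(observed, predicted)
high_bias = mean_residual(observed, predicted, min_observed=7.0)
print(f"R2 = {r2:.3f}, mean residual at high severity = {high_bias:+.2f}")
```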
Once the developer is satisfied with the model's performance, she/he can simply save it by using the option
shown in the drop-down list of the "file" menu, in fig. 26 above. In production, the executable can be called
as a run-time module by any application program, which passes the values of the six input variables and
receives in response the prediction of the evolution of the patient's severity. Alternatively, since the
model is a formula, it can simply be implemented directly in the application code. Finally, the model training
report can be viewed, printed or exported, as shown below.
Fig. 28
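As a minimal sketch of the second option, the best formula (generation 1.693) can be transcribed directly into Python. Several points are assumptions: the hyphenated variable names are rewritten with underscores, the Boolean input `active` is taken as 0 or 1, `cube(x)` is read as x³, and `log` is taken as the base-10 logarithm because the Appendix also uses a separate `ln` function. The base is exposed as a parameter so that it can be changed if that reading is wrong.

```python
import math

def cube(x):
    return x ** 3

def gravidade(active, bilirubin, nephr_status, histopath_status,
              average_fever, diastolic, log=math.log10):
    """Severity formula of generation 1.693, transcribed term by term.

    `log` defaults to base 10 (an assumption); pass math.log for natural log.
    """
    k = -2.677042  # constant found by the optimizer
    term_histopath = cube(log(k ** 2 + abs(histopath_status) ** 2)) * nephr_status
    term_bilirubin = abs(bilirubin) * log(diastolic)
    term_fever = abs(1.0 / (k + active)) * math.exp(average_fever)
    return math.sqrt(abs(term_histopath + term_bilirubin + term_fever))

# Example call with arbitrary, purely illustrative input values.
severity = gravidade(active=1, bilirubin=0.4, nephr_status=2,
                     histopath_status=3, average_fever=0.6, diastolic=80)
```

The inputs must be normalized or encoded exactly as they were during training; the sketch does not perform that step.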
IV.4 – Bayesian decision-making engine

The outputs of the adaptive layer modules, namely the category (and possibly the subcategory) of the
pathology, the prediction of the evolution of the severity of the patient's state, and the predictive estimate of
the patient stay (resulting from the first two) are the main inputs to the decision layer, which also uses data
accumulated in external databases of the health system for which the application has been developed.
For the development of applications using decision analysis methodologies based on Bayesian mechanisms,
we frequently employ freely available software such as BAT, BEAST and WinBUGS (the latter developed by the
MRC Biostatistics Unit in Cambridge in partnership with Imperial College School of Medicine, London). For
some very specific applications, we can also use JAGS and/or Stan.
Perhaps the most instructive way to illustrate a Bayesian decision mechanism, without resorting to software
screens or topology alternatives for the decision network to be constructed, is to use simple “naked”
examples, as we will do next.
Let us assume that the adaptive layer gave the decision layer the following data about hypothetical patients
Juliana and Oswaldo:
                            Juliana                        Oswaldo
Age                         38,7 y.o.                      56,3 y.o.
Pathology                   Cat.: Systemic                 Cat.: Gastroenterological
                            Subcat.: Dengue                Subcat.: Hepatitis B
                            Probability: 83% (*)           Probability: 79% (*)
Estimated length of stay    12,7 days                      16,2 days
                            Probability: 77% (**)          Probability: 81% (**)
Severity                    7,25 (high)                    8,72 (high)
                            Probability: 85% (**)          Probability: 77% (**)
(*) Calculated as p = 1 - Fp, where Fp is the percentage of false positives for this pathology category, provided by the NS Classifier training data.
(**) Calculated as 1 - DL, where DL (local dispersion) is the root mean square error (RMSE, an output of NS Predictor and/or Chaos Hunter)
divided by the supplied value (severity or length of stay).
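The two footnote rules can be written as one-line functions. In the sketch below the function names are hypothetical, and the false-positive rate (17%) and the RMSE (1,0875) are not reported in this paper: they are back-computed from Juliana's 83% and 85% only to show how the rules reproduce the table.

```python
def category_confidence(false_positive_rate):
    # Footnote (*): p = 1 - Fp
    return 1.0 - false_positive_rate

def prediction_confidence(rmse, predicted_value):
    # Footnote (**): 1 - DL, with DL = RMSE / predicted value
    return 1.0 - rmse / predicted_value

# Illustrative inputs, back-computed from the table (assumption).
juliana_category = category_confidence(0.17)
juliana_severity = prediction_confidence(rmse=1.0875, predicted_value=7.25)
```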
In summary, these are two relatively serious cases of different pathologies, and both are expected to require
a long stay, so each should be redirected to a hospital as soon as possible. Both may, for example, require
hepatic dialysis support, and thus equipment that is usually unavailable in small community medical centers
and in high demand even in the conventional hospital network.
Now suppose that the application obtains from the external database (see figure 1) the following availability
data for hospitals X and Y, both belonging to the hospital network of the health system in question.
                                                 Hospital X    Hospital Y
Number of installed dialysis equipment           8             5
Number of dialysis equipment currently in use    8             5
Number of patients in the waiting list           5             4
Average turnover of the waiting list (days)      4             2
Average turnover of the equipment set (days)     3,5           2,1
Also from the same external database, the application retrieves the following statistics for the health system
as a whole (probabilities obtained from the frequency of records):
✓ Probability that a hospitalized user of hepatic dialysis (H) is a patient with a severe case of dengue (d): P(d│H) = 25%
✓ Probability that a hospitalized user of hepatic dialysis (H) is a patient with a severe case of hepatitis B (b): P(b│H) = 36%
✓ Probability that a patient in a serious condition (regardless of the pathology) makes use of hepatic dialysis: P(H) = 15%
Let us now assume, as a didactic simplification, that the severity of the patient's health status is independent
of the category and subcategory of the pathology (correlation coefficient = 0). The formula found by Chaos
Hunter supports this hypothesis, since none of its six input variables describes the pathology. We can then
assume that the probability that Juliana is suffering from a severe case of dengue is:
P(d) = 83% * 85% ≈ 70,6%
In the case of Oswaldo, the probability that he is suffering from a severe case of hepatitis B is:
P(b) = 79% * 77% ≈ 61%
Thus, according to Bayes' theorem, the probability that Juliana will require hepatic dialysis support,
P(H│d), given that she is very likely to suffer from a severe case of dengue, is:
P(H│d) = P(d│H) * P(H) / P(d) => P(H│d) = 25% * 15% / 70,6% => P(H│d) ≈ 5,31%
In Oswaldo's case, the probability that he will require hepatic dialysis support, P(H│b), given that he is
very likely to have a severe case of hepatitis B, is:
P(H│b) = P(b│H) * P(H) / P(b) => P(H│b) = 36% * 15% / 61% => P(H│b) ≈ 8,85%
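The arithmetic above can be reproduced with a short Python sketch. The function and variable names are hypothetical. Because the sketch keeps the unrounded products 83% * 85% and 79% * 77%, its results may differ from the rounded figures in the text in the second decimal place.

```python
def joint_independent(p_category, p_severity):
    # Didactic simplification from the text: severity is assumed to be
    # independent of the pathology, so the probabilities multiply.
    return p_category * p_severity

def bayes_posterior(likelihood, prior, evidence):
    # Bayes' theorem: P(H|x) = P(x|H) * P(H) / P(x)
    return likelihood * prior / evidence

P_H = 0.15           # P(H): seriously ill patient uses hepatic dialysis
P_D_GIVEN_H = 0.25   # P(d|H)
P_B_GIVEN_H = 0.36   # P(b|H)

p_d = joint_independent(0.83, 0.85)   # Juliana: severe dengue
p_b = joint_independent(0.79, 0.77)   # Oswaldo: severe hepatitis B

p_H_given_d = bayes_posterior(P_D_GIVEN_H, P_H, p_d)
p_H_given_b = bayes_posterior(P_B_GIVEN_H, P_H, p_b)
print(f"P(H|d) = {p_H_given_d:.2%}, P(H|b) = {p_H_given_b:.2%}")
```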
Based on the availability data of Hospitals X and Y, the decisions suggested by the application would be:
➢ Both patients should be forwarded to the conventional hospital network of the health system in
question, given that they will require care, technical resources and a length of stay that cannot be
provided by small community health care units.
➢ Oswaldo should be forwarded to Hospital Y, whose waiting list and equipment turn over faster (2 and
2,1 days, against 4 and 3,5 days at Hospital X), and added to the waiting list for that hospital's
hepatic dialysis resource, since he is more likely to require such support.
➢ Juliana should be forwarded to Hospital X and added to the queue for the dialysis resource of that
hospital, but with a review of the whole situation within the next 48 hours (both the evolution of the
patient's status and the availability of hospital resources).
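The pairing in this example can be sketched as a simple ranking rule: the patient most likely to need dialysis goes to the hospital whose waiting list turns over fastest. This rule is only one reading of the example's outcome, not the application's actual decision engine, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    name: str
    p_dialysis: float  # posterior probability of needing hepatic dialysis

@dataclass
class Hospital:
    name: str
    waiting_list_turnover_days: float

def assign(patients, hospitals):
    # Highest need first, fastest waiting-list turnover first, then pair them up.
    by_need = sorted(patients, key=lambda p: p.p_dialysis, reverse=True)
    by_speed = sorted(hospitals, key=lambda h: h.waiting_list_turnover_days)
    return {p.name: h.name for p, h in zip(by_need, by_speed)}

patients = [Patient("Juliana", 0.0531), Patient("Oswaldo", 0.0885)]
hospitals = [Hospital("Hospital X", 4.0), Hospital("Hospital Y", 2.0)]
assignment = assign(patients, hospitals)
```

A real decision layer would weigh more factors than this rule does, such as queue length and the variability of turnover.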
Obviously, this is a very simplified example, used just to illustrate, in a didactic way, how the four main
components of the proposed solution work together. More complex statistical analysis should be considered
in a real situation. For example, before deciding which patient will be transferred to which hospital, the
decision process should take into account the standard deviation of the dialysis equipment's turnover (not
just the average), as well as the availability of beds for admission to the hospital, compared with the
patient's expected stay.
To incorporate such analysis, it becomes necessary to define and construct a Bayesian decision network, with
a topology that may take many shapes, although it usually follows the methodological guidelines of influence
diagrams and/or causal diagrams. Even taking into account that the adaptive layer absorbs most of the
evolutionary dynamics of diseases and clinical conditions, decision trees and flowcharts with rigid rules still
have to be avoided in the decision layer. Instead, probabilistic trees (and equivalent flowcharts) should be
used as partial "branching" in the final stages of the decision-making process, and only in the rare cases
that call for it.
V – About Ward Systems and IntelliSearch

Ward Systems is a world-class reference in AI applications designed to support technical decisions (in
medicine, biology, engineering, etc.) as well as financial, commercial and administrative decisions. The
software products mentioned in this paper have gone through many years of development, have been exhaustively
tested in a variety of applications, and are constantly being improved to integrate new trends in AI (e.g., the
inclusion of Particle Swarm Optimization as an optimization method in Chaos Hunter, as an alternative to
Genetic Algorithms).
For health care applications, there are many references on Ward Systems' own website:
( http://www.wardsystems.com/apptalk.asp )
Among the most illustrative (there are others at the link above), the following are especially noteworthy:
➢ Forecasting Treatment Costs - Cleveland Clinic;
➢ Forecasting Length of Patient Stay - Johns Hopkins University School of Medicine;
➢ A Neural Network/Genetic Algorithm Model to Predict Caesarean Section in a Busy Labor Ward -
University Medical Center of UMDNJ
➢ Diagnosing Prostate Cancer - Kaman Sciences Corporation (Colorado Springs, CO)
IntelliSearch, for its part, has more than 12 years of continuous experience in the use of Ward Systems tools,
and over this period has developed additional software layers and methodologies (interfaces, input and
output data normalizers for neural networks, seeding rules and tools).
Our long-standing cooperation with Ward Systems also gives us the advantage of knowing the best way to
configure optimization parameters (used in the training process of neural networks), as well as the methods
and tools used when selecting training data.
Finally, as the only active partners in Brazil, we enjoy the full credibility and support of Ward Systems to
provide local consulting services for its software tools. Thanks to this same long-standing partnership,
we are authorized to offer local customers special discounts on the standard license price list.
Appendix
Evolution of the formula optimization by Chaos Hunter for the variable "Gravidade" (severity) of the
patient's overall condition. The last generation that showed an improvement in the formula (No. 1.693) is at
the top of the list. Previous improvements are listed in descending order of generation number.
Gen. | R-Squared | Formula deduced by Chaos Hunter for the dependent variable “Gravidade” (severity) | Input used
1.693 | 0,89973 | sqrt(abs((cube(log(((-2,677042)^2 + abs(histopath-status)^2)))*nephr-status+abs(bilirubin)*log(diastolic)+abs((1/((-2,677042) + active)))*exp(average fever)))) | active, bilirubin, nephr-status, histopath-status, average fever, diastolic
1.544 | 0,899029 | sqrt(abs((cube(abs(log(((-2,677042)^2 + histopath-status^2))))*nephr-status+bilirubin*log(diastolic)+abs(((1/(-2,677042)) + active))*abs(exp(average fever))))) | active, bilirubin, nephr-status, histopath-status, average fever, diastolic
1.297 | 0,892387 | sqrt(abs((cube(log(abs(((-2,322496)^2 + histopath-status^2))))*nephr-status+bilirubin*log(diastolic)+abs(((1/(-2,322496)) + active))*exp(average […] | active, bilirubin, nephr-status, histopath-status, […], diastolic
765 | 0,889701 | sqrt(abs((nephr-status*cube(log(((-2,322496)^2 + abs(histopath-status)^2)))+log(diastolic)*bilirubin+abs((1/((-2,322496) + active)))*exp(average […] | active, bilirubin, nephr-status, histopath-status, […], diastolic
762 | 0,887835 | […] status)^2)))+(diastolic + active)*log(bilirubin)+abs((1/(-2,322496)))*exp(average […] | active, bilirubin, […], histopath-status, […], diastolic
755 | 0,886372 | […] status)^2)))+diastolic*active+(1/abs((log(bilirubin) + (-2,322496))))*exp(average […] | active, bilirubin, […], histopath-status, […], diastolic
715 | 0,879897 | sqrt(abs(((nephr-status*cube(log(((-2,322496)^2 + abs(histopath-status)^2)))+active*bilirubin+abs((1/(diastolic + (-2,322496))))*exp(average fever)) + log(stable)))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
552 | 0,87943 | sqrt(abs(((nephr-status*cube(log(((-2,322496)^2 + abs(histopath-status)^2)))+diastolic*bilirubin+abs((1/((-2,322496) + active)))*exp(average fever)) + log(stable)))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
505 | 0,868679 | sqrt(abs((nephr-status*cube(log((histopath-status^2 + (-2,322496)^2)))+diastolic*abs(((-2,322496) + abs((1/active))))+exp(bilirubin)*average fever))) | active, bilirubin, nephr-status, histopath-status, average fever, diastolic
415 | 0,868251 | abs(sqrt(abs((nephr-status*cube(log((histopath-status^2 + (-2,322496)^2)))+(diastolic + (1/(-2,322496)))*stable+(active + exp(bilirubin))*average fever)))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
415 | 0,865713 | abs(sqrt(abs((nephr-status*cube(log((histopath-status^2 + (-2,322496)^2)))+diastolic*(stable + (-2,322496))+(active + exp(bilirubin))*average fever)))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
376 | 0,861564 | sqrt(abs((nephr-status*(cube(log((histopath-status^2 + 0,5174103^2))) + (diastolic + abs(0,5174103)))+abs(active)*stable+exp(bilirubin)*average fever))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
358 | 0,861379 | sqrt(abs(((nephr-status*cube(log(((-2,322496)^2 + histopath-status^2)))+diastolic*abs((abs((-2,322496)) + active))+exp(bilirubin)*average fever) + stable))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
301 | 0,846962 | sqrt(abs(((nephr-status*cube(log((histopath-status^2 + (-2,322496)^2)))+diastolic*abs(((-2,322496) + abs(active)))+bilirubin*exp(stable)) + average fever))) | stable, active, bilirubin, nephr-status, histopath-status, average fever, diastolic
242 | 0,815608 | sqrt(abs((((nephr-status*cube(abs(log((histopath-status^2 + (-2,322496)^2))))+diastolic*abs((-2,322496))+active*bilirubin) + (1/patology-group)) + avrg-score))) | active, bilirubin, nephr-status, histopath-status, avrg-score, diastolic, patology-group
226 | 0,814172 | sqrt(abs((abs((nephr-status*cube(log((histopath-status^2 + (-2,322496)^2)))+diastolic*abs((-2,322496))+active*bilirubin)) + (1/patology-group)))) | active, bilirubin, nephr-status, histopath-status, diastolic, patology-group
181 | 0,81123 | sqrt(abs(((nephr-status*cube(abs(log((histopath-status^2 + (-2,322496)^2))))+diastolic*(abs((-2,322496)) + active)+bilirubin*stable) + (1/patology-group)))) | stable, active, bilirubin, nephr-status, histopath-status, diastolic, patology-group
170 | 0,805314 | sqrt(abs((nephr-status*abs(abs((cube(log((histopath-status^2 + (-2,322496)^2))) + diastolic)))+active*bilirubin+(1/stable)*(patology-group + (-2,322496))))) | stable, active, bilirubin, nephr-status, histopath-status, diastolic, patology-group
160 | 0,78736 | sqrt(abs((nephr-status*cube(abs(log((histopath-status^2 + (-2,322496)^2))))+abs(diastolic)*(active + (-2,322496))+bilirubin*exp((stable + (1/patology-group)))))) | stable, active, bilirubin, nephr-status, histopath-status, diastolic, patology-group
145 | 0,650744 | sqrt(abs((nephr-status*cube(log(((-2,322496)^2 + histopath-status^2)))+(abs(diastolic) + abs(active))*bilirubin+stable*exp((patology-group + (-2,322496)))))) | stable, active, bilirubin, nephr-status, histopath-status, diastolic, patology-group
117 | 0,607214 | abs(exp(ln((stable*active+histopath-status*log(nephr-status)+avrg-score*abs((1/bilirubin)))))) | stable, active, bilirubin, nephr-status, histopath-status, avrg-score
101 | 0,537564 | cube(abs(log((nephr-status^2 + (histopath-status + (-1,154784))^2)))) | nephr-status, histopath-status
71 | 0,537546 | cube(log((nephr-status^2 + (histopath-status + (-1,154784))^2))) | nephr-status, histopath-status
64 | 0,449154 | log((3,054628^2 + (cube(histopath-status) + nephr-status)^2)) | nephr-status, histopath-status
56 | 0,448525 | log(((-1,988753)^2 + (cube(histopath-status) + nephr-status)^2)) | nephr-status, histopath-status
41 | 0,447577 | log((4,5419^2 + (cube(histopath-status) + nephr-status)^2)) | nephr-status, histopath-status
28 | 0,423851 | log((nephr-status^2 + cube(histopath-status)^2)) | nephr-status, histopath-status
22 | 0,340846 | log((histopath-status^2 + cube(nephr-status)^2)) | nephr-status, histopath-status
20 | 0,323247 | log(((-14,07932)^2 + cube(nephr-status)^2)) | nephr-status
18 | 0,282241 | log((gastric-status^2 + cube(abs(nephr-status))^2)) | gastric-status, nephr-status
17 | 0,250548 | log(((bilirubin + pow(glicemic-status, diastolic))^2 + cube(nephr-status)^2)) | bilirubin, glicemic-status, nephr-status, diastolic
15 | 0,245556 | log(cube((pow(bilirubin, diastolic)^2 + nephr-status^2))) | bilirubin, nephr-status, diastolic
10 | 0,23731 | log(((cardiac-status + pow(bilirubin, diastolic))^2 + cube(nephr-status)^2)) | bilirubin, cardiac-status, nephr-status, diastolic
5 | 0,197058 | log(((bilirubin + pow(sqrt(abs(cardiac-status)), diastolic))^2 + abs(cube(nephr-status))^2)) | bilirubin, cardiac-status, nephr-status, diastolic
Rights of use and intellectual property of this paper

NeuroShell Classifier (also referred to above as NS Classifier), NeuroShell Predictor (or NS Predictor),
NeuroShell Fire, and Chaos Hunter are trademarks of Ward Systems. The same applies to all figures shown in
this material, except for numbers 1 and 2 (pages 2 and 3, respectively) and the illustration on the cover.
All the remaining text and figures (except the free illustration on the cover) are the intellectual property
of IntelliSearch (business name Experteam Consultoria Ltda.).
This content is authorized for use only as a consultation paper for your organization and must not be
distributed to any other natural person or organization, even if closely related to you or your company.
