Up to now, we have been concerned with methods that: Display complex information. Detect patterns or trends. Now we will introduce methods that can be used to classify samples based on models that we develop.
Classification problems
Level I: Simple classification into predefined categories. Level II: Level I + detection of outliers. Level III: Level II + prediction of an external property. Level IV: Level III + prediction of more than one property.
Classification Methods
Many methods have been developed, with new ones being published all of the time. We'll look at some representative approaches. Linear Learning Machine
Supported by XLStat
Classification Methods All of these methods are considered supervised learning. Initial assumptions regarding membership or properties are made when developing a model. An initial evaluation of the data using exploratory data analysis is useful.
The available methods and approaches may vary based on the package used.
Data sets
Needed to develop and evaluate a classification model. Training set: representative samples used to build the model; the modeling software uses the class information. Evaluation set: samples of known class, used to test the model; the modeling software does not know the classes. Test set: true unknowns.
Data pre-processing
With any of these methods, you may choose to do some sort of data preprocessing. Raw: fastest. Scaled: gives equal weight to the variables. PCA: can be used to reduce noise and insignificant variables.
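Scaling (autoscaling: mean-center each variable and divide by its standard deviation) is easy to sketch; a minimal NumPy version, with made-up values, might look like this:

```python
import numpy as np

def autoscale(X):
    """Give each variable equal weight: subtract the column mean and
    divide by the column standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Two variables on very different scales (hypothetical values).
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])
Xs = autoscale(X)
# After scaling, every column has mean 0 and standard deviation 1,
# so neither variable dominates a distance-based classifier.
```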
Data pre-processing
With some data sets, you may also want to apply some other types of pre-processing. Example: spectral or chromatographic traces. Options may include smoothing, baseline correction, signal averaging, and using the first or second derivative.
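For trace data, smoothing and derivatives are one-liners with SciPy's Savitzky-Golay filter; the synthetic noisy peak below is purely illustrative, and the baseline correction shown is the crudest possible (constant offset):

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 201)
trace = np.exp(-((x - 5.0) ** 2)) + rng.normal(0.0, 0.02, x.size)  # noisy peak

smoothed = savgol_filter(trace, window_length=11, polyorder=3)            # smoothing
first_deriv = savgol_filter(trace, window_length=11, polyorder=3, deriv=1)  # 1st derivative
corrected = trace - trace.min()   # crude constant-offset baseline correction
```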
Creating an evaluation set The evaluation set is typically a sub-set of the training set that was omitted when building the model. Randomly pick a subset of the data. Randomly pick members from each class. Any approach that selectively removes a portion of the data could cause bias.
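One way to avoid bias is to draw the same fraction from each class at random (a stratified split). A sketch, with a hypothetical helper function and made-up labels:

```python
import numpy as np

def stratified_split(y, frac=0.25, seed=0):
    """Randomly pick `frac` of the samples from EACH class for the
    evaluation set, so no class is selectively over- or under-sampled."""
    rng = np.random.default_rng(seed)
    eval_idx = []
    for cls in np.unique(y):
        members = np.flatnonzero(y == cls)
        n_eval = max(1, int(round(frac * members.size)))
        eval_idx.extend(rng.choice(members, size=n_eval, replace=False))
    eval_idx = np.array(sorted(eval_idx))
    train_idx = np.setdiff1d(np.arange(len(y)), eval_idx)
    return train_idx, eval_idx

y = np.array(["A"] * 8 + ["B"] * 8)
train, evl = stratified_split(y, frac=0.25)
# 2 samples from each class end up in the evaluation set.
```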
Leave-one-out validation A standardized approach for validation of a model where each sample serves as an evaluation set. 1. Omit a single sample from the set 2. Build the model 3. Test the omitted sample 4. Repeat the above steps until each sample has been omitted and tested once.
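The four steps above map directly onto scikit-learn's LeaveOneOut splitter; here is a sketch with a KNN classifier standing in for the model (any classifier would do) and the iris data standing in for your training set:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
correct = 0
for train_idx, test_idx in LeaveOneOut().split(X):
    model = KNeighborsClassifier(n_neighbors=3)   # build the model...
    model.fit(X[train_idx], y[train_idx])         # ...with one sample omitted
    correct += int(model.predict(X[test_idx])[0] == y[test_idx][0])

loo_accuracy = correct / len(y)   # fraction of omitted samples classified correctly
```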
Your data While leave-one-out testing is the best approach, it can be slow for large sets. An alternative is to leave two or more samples out with each pass. Samples should be randomly ordered in the matrix, and the same two (or more) samples should never be omitted together more than once.
Iris example
We'll return to the iris example dataset, using XLStat's built-in DA function. We're going to use autoscaled data.
DA with XLStat.
[DA score/loading plot: F1 (99.01 %) vs. F2 (0.99 %); variables include sepal width and petal length.]
Coffee example
[Score plot with samples labeled by class (1-3).]
This consisted of 6 types of coffee, identified based on MS data. To avoid collinearity and null-variable problems, PCA scores (first 5 components) were used.
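Feeding PCA scores into DA is a standard pipeline; a scikit-learn sketch of the same idea, with the iris data standing in for the coffee MS data and the component count chosen arbitrarily:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
model = make_pipeline(
    StandardScaler(),                # autoscale
    PCA(n_components=3),             # scores replace the correlated raw variables
    LinearDiscriminantAnalysis(),    # DA runs on the scores
)
model.fit(X, y)
train_accuracy = model.score(X, y)
```

Because the PCA scores are orthogonal, the DA step never sees the collinear raw variables.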
[DA score plot: F1 (56.19 %) vs. F2 (22.79 %); the six coffee types are labeled K, E, S, U, R, C.]
Classification trees
Predicts class membership by sequential application of rules based on predictor variables. With DA and LLM, you create a set of mathematical models that are all applied at once. With classification trees, the predictor variables are evaluated as ordinal rules, one at a time.
Classification trees
[Example tree: a density > 1 rule first splits solid from liquid; a later rule splits red from green.]
Iris example (yet again!) XLStat supports the use of classification and regression trees. Classification if the Y variable (class) is qualitative; regression if the Y variable is quantitative. The iris example is a classification example.
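The same kind of tree can be sketched with scikit-learn (standing in for XLStat here); a shallow tree on the iris data makes the one-variable-at-a-time ordinal rules visible:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Print the fitted rules: each split tests a single variable against a cutoff.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```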
Iris example
[Classification tree for the iris data, showing the class counts at each node, compared with the DA results.]
Wine example
Riesling vs. Chardonnay. Ohio vs. California. Assayed 5 organic and 4 trace-metal components. Yes, you'll do the same with your homework.
Rules:
If Ca in [17.5, 60.75[ then Class = CaC in 58.6% of cases.
If Ca in [60.75, 94.75[ then Class = CaR in 58.3% of cases.
If 2,3-butanediol in [0, 0.065[ and Ca in [17.5, 60.75[ then Class = CaR in 60% of cases.
If 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = CaC in 70.8% of cases.
If Mn in [0.82, 1.625[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = CaC in 100% of cases.
If Mn in [1.625, 3.51[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = OhC in 70% of cases.
If K in [735.5, 881.75[ and Mn in [1.625, 3.51[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = CaC in 60% of cases.
If K in [881.75, 1147.5[ and Mn in [1.625, 3.51[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = OhC in 100% of cases.
If 1-hexanol in [0.638, 0.723[ and K in [735.5, 881.75[ and Mn in [1.625, 3.51[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = OhC in 100% of cases.
If 1-hexanol in [0.723, 1.056[ and K in [735.5, 881.75[ and Mn in [1.625, 3.51[ and 2,3-butanediol in [0.065, 0.514[ and Ca in [17.5, 60.75[ then Class = CaC in 100% of cases.
If 1-hexanol in [0.409, 0.673[ and Ca in [60.75, 94.75[ then Class = OhR in 83.3% of cases.
If 1-hexanol in [0.673, 1.218[ and Ca in [60.75, 94.75[ then Class = CaR in 100% of cases.

[Tree diagram with per-node class counts (CaC, CaR, OhC, OhR) omitted.]
KNN attempts to assign categories to unknown samples based on multivariate proximity to other samples. It works best with discrete classification types and is tolerant of poor data sets. K = the number of closest neighbors being compared. Consider this the supervised version of HCA.
K nearest neighbor classification In its simplest form, KNN is conducted by: First, a training set is collected that contains examples of each class. Intersample distances are then calculated.
KNN
The distance matrix is sorted and the distance of the unknown sample can be compared to: 1. The K nearest neighbors 2. The nearest class cluster. Option 2 requires that K = 1.
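Option 1 (compare the K nearest neighbors) takes only a few lines of NumPy; the two clusters below are made up for illustration:

```python
import numpy as np

def knn_classify(X_train, y_train, x_unknown, k=3):
    """Classify one unknown by majority vote among its k nearest
    training samples (Euclidean distance)."""
    d = np.sqrt(((X_train - x_unknown) ** 2).sum(axis=1))  # distance to each sample
    nearest = np.argsort(d)[:k]                            # indices of the k closest
    classes, votes = np.unique(y_train[nearest], return_counts=True)
    return classes[np.argmax(votes)]

X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],    # class "A" cluster
              [2.0, 2.0], [2.1, 1.9], [1.9, 2.1]])   # class "B" cluster
y = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_classify(X, y, np.array([0.15, 0.1])))  # all 3 neighbors are "A"
```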
d_ab = sqrt( Σ_j (x_aj − x_bj)² ), summed over all N variables j.
KNN
When using the distance to a class, you can use the same link options that were discussed earlier. The distance can be based on: Single link - closest member of class. Complete link - farthest member of class. Centroid - center of class cluster.
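The three link options can be sketched in one small function; the class members and test point below are hypothetical:

```python
import numpy as np

def class_distance(X_class, x, link="centroid"):
    """Distance from sample x to a class, using the link options above."""
    d = np.sqrt(((X_class - x) ** 2).sum(axis=1))   # distance to each member
    if link == "single":        # closest member of the class
        return d.min()
    if link == "complete":      # farthest member of the class
        return d.max()
    # centroid: distance to the center of the class cluster
    return float(np.sqrt(((X_class.mean(axis=0) - x) ** 2).sum()))

A = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # hypothetical class members
x = np.array([2.0, 0.0])
d_single = class_distance(A, x, "single")      # to the nearest member [1, 0]
d_complete = class_distance(A, x, "complete")  # to the farthest member [0, 1]
d_centroid = class_distance(A, x, "centroid")  # to the class mean [1/3, 1/3]
```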
K=3
KNN
Ideally, if a test sample falls well within a known class, its closest neighbors should all be of one class.
With this approach, the distance to the center of a class cluster is determined and compared.
Here, all of the blue samples would be closer to the unknown than any of the green.
Mycobacteria - HCA
[Dendrogram of the mycobacteria samples, labeled by class (42-49).]
Mycobacteria - k means
A quick review of ALL of the ways that this data set was difficult to get useful information from.
Mycobacteria - PCA
[PCA score plot; legend: classes 42, 43, 44, 45, 46, 47, 49.]
Mycobacteria - DA
[DA score plot: F1 (56.45 %) vs. F2 (29.63 %), samples labeled by class.]
What if a sample's distances are such that it could be in more than one class? When there is more than one possible class, we can take a vote. The class with the most votes wins.
K=5
Here you would end up with 3 votes for B and 2 for A. B would win.
Here you would end up with 2 votes for A and one for B; A would win, and the distances would be smaller.
KNN validation
The optimum number for K can be found by trial and error, but for a close match it should make no difference. The classifying power of your data can be evaluated by leave-one-out validation of your training set. This should be done before any sort of real classification begins.
Here, A and B would tie. The tie-breaker would be that A averages a smaller distance, so it would be declared the winner.
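The voting rule, including the average-distance tie-breaker, can be sketched directly (hypothetical helper and distances):

```python
import numpy as np

def knn_vote(dists, labels, k):
    """Majority vote among the k nearest neighbors; ties go to the
    class with the smaller average distance."""
    order = np.argsort(dists)[:k]
    near_d, near_y = dists[order], labels[order]
    best_cls, best_key = None, None
    for cls in np.unique(near_y):
        mask = near_y == cls
        key = (-int(mask.sum()), float(near_d[mask].mean()))  # votes first, then distance
        if best_key is None or key < best_key:
            best_cls, best_key = cls, key
    return best_cls

d = np.array([0.5, 0.6, 1.0, 1.1])    # hypothetical neighbor distances
y = np.array(["A", "A", "B", "B"])
winner = knn_vote(d, y, k=4)          # 2-2 tie: A averages 0.55 vs. B's 1.05
```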
KNN validation
Validation You can sequentially leave out each of your samples and test it for votes at several K values. You end up with a vote matrix that will tell you the optimum K value for each class. You will also get a misclassification matrix; this tells you how often one of your knowns is incorrectly classified.
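With scikit-learn, that whole procedure (leave-one-out predictions at several K values plus a misclassification matrix) is a short loop; the iris data again stands in for your training set:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
accuracies = {}
for k in (1, 3, 5, 7):
    pred = cross_val_predict(KNeighborsClassifier(n_neighbors=k),
                             X, y, cv=LeaveOneOut())
    accuracies[k] = float(np.mean(pred == y))

# Off-diagonal entries count knowns that were incorrectly classified (for K=7).
cm = confusion_matrix(y, pred)
```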
Iris example
Cola example
What? NOT the iris data set? Headspace MS of 4 cola classes: two cola brands, diet and regular; m/e 44 - 149. May need to preprocess to eliminate any nonvariant data.
Class 1: Brand 1
Class 2: Diet brand 1
Class 3: Brand 2
Class 4: Diet brand 2
PCA scores
PCA scores
PCA loadings
KNN classification
Not a bad job!
KNN classifications
SIMCA
Soft Independent Modeling of Class Analogy
A method of classification that provides: Detection of outliers. Estimates of confidence for a classification. Determination of potential membership in more than a single class.
SIMCA
Basic approach. For each class of samples, a PCA model is constructed. This model is based on the optimum number of components that best clusters an individual class. The optimum number of components can vary from class to class and can be determined by cross-validation.
SIMCA models
Since the number of components used can vary, each class will be best described by its own hypervolume.
SIMCA models
Limiting a class hypervolume. You can limit the size of a hypervolume by setting a standard deviation cutoff. This results in better defined classes.
SD = 3
SD = 2
SIMCA models
Once a model has been created for each class, you are ready to classify unknowns. For each model/sample combination: + The sample is transformed into PC space and compared to see if it is a likely class member. + If it is within the hypervolume of a single class, you have a match.
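A bare-bones sketch of the idea (not Pirouette's implementation): fit one PCA model per class, derive a residual cutoff from the training residuals as a stand-in for the SD-based hypervolume limit, and admit an unknown to every class whose cutoff it falls inside. All names, data, and the mean + 2 SD cutoff here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_simca(X_by_class, n_components):
    """One PCA model per class; residual cutoff = mean + 2 SD of the
    training residuals (a stand-in for the SD hypervolume limit)."""
    models = {}
    for cls, X in X_by_class.items():
        pca = PCA(n_components=n_components[cls]).fit(X)
        resid = np.linalg.norm(X - pca.inverse_transform(pca.transform(X)), axis=1)
        models[cls] = (pca, resid.mean() + 2.0 * resid.std())
    return models

def simca_classify(models, x):
    """Return every class whose cutoff admits x: may be empty (outlier)
    or contain more than one class."""
    hits = []
    for cls, (pca, cutoff) in models.items():
        recon = pca.inverse_transform(pca.transform(x[None, :]))[0]
        if np.linalg.norm(x - recon) <= cutoff:
            hits.append(cls)
    return hits

rng = np.random.default_rng(1)
t = rng.normal(size=(200, 1))
A = t * np.array([1.0, 1.0, 0.0]) + rng.normal(0, 0.05, (200, 3))  # class A cluster
B = t * np.array([0.0, 1.0, 1.0]) + rng.normal(0, 0.05, (200, 3))  # class B cluster
models = fit_simca({"A": A, "B": B}, {"A": 1, "B": 1})
result = simca_classify(models, np.array([1.0, 1.0, 0.0]))  # lies along A's cluster
```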
SIMCA classification
The potential still exists for a sample to be classified as a member of more than one class.
SIMCA classification
SIMCA will give you an estimate of the probability of class membership. Example, two possible classes:
Class A: probability 0.90
Class B: probability 0.45
Here, the sample is more likely to be a member of Class A.
SIMCA summary
Of the methods covered, SIMCA offers the most options for developing a classification model when the classes are well known. It also requires the most development time as you must determine the optimum model conditions for each class. If used, plan on spending quite a bit of time working with all of the available options.
Note: We have a separate model for each class in the data set - in this case three.
Pirouette will provide an estimate of the class hypervolumes based on the first three PCs.
Cola example
With the cola example (two brands, diet and regular), we have 4 classes. Here you can see that the classes are pretty well resolved.
Cola example
Mycobacteria again
This data set is included with the Pirouette demo. File = Mycosing.wks It is a subset of the version I've been using (only 72 samples).
Mycobacteria SIMCA
Perfect classifications - a first for this dataset.
Mycobacteria SIMCA
Mycobacteria SIMCA
Example shows that a different number of components were used in developing the individual SIMCA hypervolumes.
Discriminating Power is a measure of which variables show the biggest class differences.
Mycobacteria SIMCA
Modeling power indicates the relative importance of each variable for classification.
Mycobacteria SIMCA
PC plots are pretty boring since you only have one class. However, they can be used to see if you have any sub-classes.
Loadings, as always, show the relative significance of each variable in constructing each PC. Here they are relatively unimportant.
Outliers are tested for by plotting sample residuals (the difference between a sample and the center of the hypervolume) vs. their Mahalanobis distance from the center of the cluster. The Mahalanobis distance is similar to a Euclidean distance but takes into account correlations in the data and is scale invariant.
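The Mahalanobis distance itself is simple to compute; a NumPy sketch on a made-up correlated cluster:

```python
import numpy as np

def mahalanobis(x, X_class):
    """Distance from x to the class center, scaled by the class covariance:
    scale-invariant and aware of variable correlations."""
    mu = X_class.mean(axis=0)
    diff = x - mu
    cov_inv = np.linalg.inv(np.cov(X_class, rowvar=False))
    return float(np.sqrt(diff @ cov_inv @ diff))

rng = np.random.default_rng(0)
# A correlated two-variable class cluster (illustrative data).
X = rng.normal(size=(200, 2)) @ np.array([[1.0, 0.8], [0.0, 0.6]])
d_center = mahalanobis(np.array([0.0, 0.0]), X)   # near the center: small
d_far = mahalanobis(np.array([10.0, 10.0]), X)    # far outside: large
```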
Mycobacteria SIMCA