
Bernard, O., Lalande, A., Zotti, C., Cervenansky, F., Yang, X., Heng, P. A., ... & Sanroma, G. (2018). Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved? IEEE Transactions on Medical Imaging, 37(11), 2514-2525.

Agenda
- What is CMRI?
- Challenges related to CMRI
- The purpose of the paper
- Machine learning techniques for CMRI from 2009-2015
  - Challenges
  - Non deep learning approaches
  - Deep learning approaches
- Evaluation framework
- Evaluated architectures
- Results
- Clinical applications

What is CMRI?

Cardiac MRI (CMRI) enables cardiac function analysis through the assessment (manual or semi-automatic segmentation) of the left and right ventricular:

- Ejection fractions (EF)
- Stroke volumes (SV)
- Left ventricle mass
- Myocardium thickness

The analysis requires accurate delineation of the left and right ventricular endocardium and epicardium, for both the end-diastolic (ED) and end-systolic (ES) phases.
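
As a minimal sketch of how these indices follow from the segmented volumes, the snippet below assumes the standard definitions SV = EDV - ESV and EF = SV / EDV, and a myocardial tissue density of 1.05 g/mL. The density value and all function names are assumptions for illustration; they are not stated in the slides.

```python
def stroke_volume(edv_ml, esv_ml):
    """Stroke volume (mL): blood ejected per beat, EDV minus ESV."""
    return edv_ml - esv_ml


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction (%): stroke volume relative to the ED volume."""
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def myocardial_mass(myo_volume_ml, density_g_per_ml=1.05):
    """Myocardial mass (g) from the segmented myocardium volume.

    The default density is an assumed, commonly used value.
    """
    return myo_volume_ml * density_g_per_ml


# Illustrative values only (not taken from the paper).
sv = stroke_volume(120.0, 50.0)
ef = ejection_fraction(120.0, 50.0)
mass = myocardial_mass(100.0)
```

Both ventricular volumes come from the ED and ES segmentations, which is why an accurate delineation at both phases matters for every index in the list.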

Challenges related to CMRI Segmentation

1. Brightness heterogeneities in the left ventricular/right ventricular cavities due to blood flow;
2. Presence of trabeculae and papillary muscles with intensities similar to the myocardium;
3. Non-homogeneous partial volume effects due to the limited CMR resolution along the long axis;
4. Inherent noise due to motion artifacts and heart dynamics;
5. Shape and intensity variability of the heart structures across patients and pathologies;
6. Presence of banding artifacts (inaccurate colors).

Purpose of the paper

1. How accurate are recently proposed segmentation methods at delineating the LV, RV and myocardium given clinical MR images?
2. How accurate are recently proposed classification methods at predicting the pathology of a patient given clinical MR images?
3. When methods fail, where do they fail?
4. How far are we from "solving" the problem of automatic CMRI analysis?

Machine learning techniques for CMRI from 2009-2015: challenges

2009: MICCAI LV Segmentation Challenge
● 45 CMRI
● 4 patient classes
● 2 manually drawn contours
● Dice scores between 0.90 and 0.94

2011: MICCAI STACOM LV Segmentation Challenge
● 200 CMRI
● 2 patient classes
● Semi-automatic method
● Best method: guide-point modeling technique
● Average Jaccard score of 0.84

2012: Right Ventricle Segmentation (MICCAI)
● 48 CMRI
● Manually drawn by one cardiologist
● Best result from a graph-cut method: Dice score of 0.78
● A fully convolutional CNN gave better results: Dice score of 0.85

2015: Kaggle Second Annual Data Science Bowl
● 700 patients (500 training, 200 testing)
● Data contains ED and ES reference volumes
● Best results were provided by convolutional neural networks (CNN, U-Net)

U-Net CNN

Machine learning techniques for CMRI from 2009-2015: non deep learning techniques

Weak prior methods (assumptions such as spatial, intensity or anatomical information):
● image-based techniques (threshold, dynamic programming)
● pixel classification methods
● deformable models (active contour, level-set)
● graph-based approaches (graph-cut)

Strong prior methods, which require sets with manual annotations:
● shape prior based deformable models
● active shape and appearance models
● atlas based methods

Machine learning techniques for CMRI from 2009-2015: deep learning techniques

Used to extract relevant features, segment cardiac structures or detect missing slices.

Not used before 2013; dozens of papers published on the subject.

● patch-based CNN
● 2D convolutional neural networks (CNNs)
● deep learning frameworks for LV segmentation
● combined approach between CNN and multi-atlas
● automatic combined deep-learning and deformable model approach
● Recurrent Fully-Convolutional Network (RFCN) for both LV and RV

Temporal regression framework: a 2D CNN encodes the spatial information and a recurrent neural network (RNN) decodes the temporal information.

The recurrent fully-convolutional network (RFCN) learns image representations from the full stack of 2D slices by transforming images from 2D to 3D.

Evaluation Framework dataset: Patients selection

Origin: real clinical exams acquired at the University Hospital of Dijon (France).

150 patients split into 5 classes:
● Dilated cardiomyopathy (DCM)
● Hypertrophic cardiomyopathy (HCM)
● Myocardial infarction with altered left ventricular ejection fraction (MINF)
● Abnormal right ventricle (ARV)
● Patients without cardiac disease (NOR)

Exclusion criteria: patients with ambiguous clinical indices were excluded from this study.

Evaluation Framework dataset: Acquisition protocol

● Acquired over 6 years
● 2 MRI scanners: Siemens Area and Siemens Trio Tim
● Depending on the patient, 28 to 40 volumes were acquired
● The dataset contains natural variability in the image quality
● Long axis slices were not provided, for consistency with the previous challenges

Evaluation Framework dataset: Training and testing dataset

The data for each subject was converted to a general 4D image representation format (NIfTI) without loss of resolution.

- Balanced training dataset: 100 patients, i.e. 20 patients per group.
- Balanced testing dataset: 50 patients, i.e. 10 patients per group.
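
A balanced split of this kind can be sketched as follows. The slides do not say how patients were assigned to the two sets, so the random per-group assignment, the function name `balanced_split` and the patient and group identifiers are all assumptions for illustration.

```python
import random
from collections import defaultdict


def balanced_split(patient_groups, n_train_per_group, seed=0):
    """Split patients so each group contributes the same number to training.

    patient_groups maps a patient id to its group label; the remaining
    patients of each group go to the testing set.
    """
    by_group = defaultdict(list)
    for patient_id, group in sorted(patient_groups.items()):
        by_group[group].append(patient_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)
        train.extend(ids[:n_train_per_group])
        test.extend(ids[n_train_per_group:])
    return train, test


# Hypothetical cohort: 150 patients, 30 per group, 5 groups.
cohort = {f"patient{i:03d}": f"group{i % 5}" for i in range(150)}
train_ids, test_ids = balanced_split(cohort, n_train_per_group=20)
```

With 30 patients per group and 20 kept for training, this reproduces the 100/50 proportions given above.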

Evaluation Framework: Reference segmentation and contouring protocol

● References are manually-drawn 3D volumes
● They cover the LV and RV cavities as well as the myocardium, at both the ED and ES gates.
● The contours were drawn and double-checked by two independent experts
● Conventional boundaries were defined for each region included in the dataset
● The ground truth label images were stored in NIfTI (Neuroimaging Informatics Technology Initiative) format.
● Label values range from 0 to 3 and mark voxels (the 3D equivalent of pixels) belonging to the background (0), the RV cavity (1), the myocardium (2) and the LV cavity (3).
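
To illustrate how such a label image turns into per-structure volumes, here is a minimal sketch. It assumes the label volume has already been loaded into nested Python lists indexed as [slice][row][column] and that the voxel volume in mm^3 is known from the image header; the function name and the toy data are hypothetical.

```python
from collections import Counter

# Label convention described above.
LABELS = {0: "background", 1: "RV cavity", 2: "myocardium", 3: "LV cavity"}


def structure_volumes_ml(label_volume, voxel_volume_mm3):
    """Return the volume in mL of each labelled structure (background excluded)."""
    counts = Counter(
        value for slice_ in label_volume for row in slice_ for value in row
    )
    return {
        name: counts.get(label, 0) * voxel_volume_mm3 / 1000.0
        for label, name in LABELS.items()
        if label != 0
    }


# Toy 2 x 2 x 2 label volume with an assumed voxel volume of 500 mm^3.
toy = [[[0, 1], [2, 3]],
       [[3, 3], [2, 0]]]
volumes = structure_volumes_ml(toy, voxel_volume_mm3=500.0)
# volumes == {"RV cavity": 0.5, "myocardium": 1.0, "LV cavity": 1.5}
```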

Evaluation Framework: Evaluation metrics (1)

Geometrical metrics:

3D Hausdorff distance: the greatest of all the distances from a point in one set to the closest point in the other set. It allows an intrinsic management of the missing segmentation problem on the end slices.

Dice similarity index:

$$D = \frac{2\,|V_{user} \cap V_{ref}|}{|V_{user}| + |V_{ref}|}$$

A measure of overlap between the segmented volume $V_{user}$ extracted from a method and the corresponding reference volume $V_{ref}$. It gives a value between 0 (no overlap) and 1 (full overlap).
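
Both geometrical metrics can be sketched in a few lines. This is an illustrative version, not the challenge's evaluation code: it assumes each structure is given as a set of voxel coordinates already expressed in millimetres, and it computes the Hausdorff distance by brute force over all point pairs, which is only practical for small sets.

```python
import math


def dice(v_user, v_ref):
    """Dice index between two sets of voxel coordinates."""
    if not v_user and not v_ref:
        return 1.0  # convention chosen here: two empty volumes agree fully
    return 2.0 * len(v_user & v_ref) / (len(v_user) + len(v_ref))


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its closest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy example: two 3-voxel segments shifted by one voxel along x.
user = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
ref = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}
overlap = dice(user, ref)        # 2 * 2 / (3 + 3)
distance = hausdorff(user, ref)  # 1.0
```

The example shows why the two metrics complement each other: Dice summarizes the overall overlap, while the Hausdorff distance reports the single worst local disagreement.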

Evaluation Framework: Evaluation metrics (2)

Clinical performance:

Correlation (corr), bias and standard deviation (std) are computed for:

i) the end-diastolic (ED) volumes

ii) the ejection fractions (EF)

iii) the myocardium mass

Bias + std: provides useful information on the corresponding limits of agreement.
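
A minimal sketch of these clinical statistics, assuming paired automatic and reference measurements for the same patients. The 1.96 factor for the limits of agreement follows the usual Bland-Altman convention and the sample (n - 1) standard deviation is used; both choices, like the function name, are assumptions rather than details taken from the slides.

```python
import math
from statistics import mean, stdev


def agreement_stats(auto, ref):
    """Correlation, bias, std and limits of agreement of paired measurements."""
    diffs = [a - r for a, r in zip(auto, ref)]
    bias = mean(diffs)   # mean difference (automatic minus reference)
    std = stdev(diffs)   # sample standard deviation of the differences

    # Pearson correlation coefficient, written out explicitly.
    mean_a, mean_r = mean(auto), mean(ref)
    cov = sum((a - mean_a) * (r - mean_r) for a, r in zip(auto, ref))
    var_a = sum((a - mean_a) ** 2 for a in auto)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    corr = cov / math.sqrt(var_a * var_r)

    return {
        "corr": corr,
        "bias": bias,
        "std": std,
        "limits_of_agreement": (bias - 1.96 * std, bias + 1.96 * std),
    }


# Illustrative ED volumes in mL (made-up numbers, not results from the paper).
stats = agreement_stats([102, 148, 201, 250], [100, 150, 200, 248])
```

A high correlation alone can hide a systematic offset, which is why the bias and its spread are reported alongside it.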

Evaluation Framework: Evaluation metrics (3)

Classification performance:

● Accuracy: over all examinations of the testing database and per disease
● Confusion matrix (precision, recall, F-score)
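
The classification metrics can be derived from the confusion matrix as sketched below. The function names and the toy predictions are illustrative assumptions; only the class abbreviations come from the dataset description.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of examinations on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_scores(matrix, classes):
    """Precision, recall and F-score for each class."""
    scores = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted c, actually another class
        fn = sum(matrix[i]) - tp                 # actually c, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        scores[c] = {"precision": precision, "recall": recall, "f_score": f_score}
    return scores


# Toy example with three of the classes (made-up predictions).
classes = ["NOR", "DCM", "HCM"]
y_true = ["NOR", "NOR", "DCM", "DCM", "HCM", "HCM"]
y_pred = ["NOR", "DCM", "DCM", "DCM", "HCM", "NOR"]
cm = confusion_matrix(y_true, y_pred, classes)
overall = accuracy(cm)
details = per_class_scores(cm, classes)
```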

For segmentation:
● 2D U-Net
● Dense U-Net 2D
● 3D U-Net
● 2D+3D U-Net
● 2D M-Net
● 2D Grid-Net
● SVF-Net
● Levelset+MRF
● Dilated CNN

For classification:
● SVM
● Random forest

EVALUATED ARCHITECTURES

A. Architectures for cardiac multi-structure segmentation:

● The ten architectures involved in this study are listed in the table.
● Nine methods implemented a deep convolutional architecture, most of which are U-Net-like networks analyzing the 3D data slice by slice.
● Wolterink et al. is the only team that implemented a CNN without an encoder-decoder architecture.

B. Solutions for automatic cardiac diagnosis:

Three participants of the segmentation challenge used their segmentation results to extract features for cardiac diagnosis.

● Isensee et al. extracted a series of instant and dynamic features from the segmentation maps and used an ensemble of 50 multilayer perceptrons (MLP) and a random forest to perform classification.
● Khened et al. used 11 features: 9 derived from their segmentation maps, plus the patient weight and height. From those features, they trained a random forest classifier with 100 trees.
● Wolterink et al. extracted 14 features (12 from the segmentation maps + patient weight and height) and used a five-class random forest classifier with 1,000 decision trees.
● Cetin et al. were the only ones to use a semi-automatic segmentation method to extract the contours of the cardiac structures.

The best performing method

Cetin et al. computed 567 features, including physiological features (e.g. height and weight) and radiomic features such as shape-based features, intensity statistics, and various texture features.

To prevent their method from overfitting, they selected the most discriminative features and used an SVM for classification.

From these results, one can see that the 2D-3D U-Net ensemble model proposed by Isensee et al. is overall the top performing method for segmentation.

Of the 50 test patients, Khened et al. correctly classified 48 (96% accuracy) with a random forest algorithm, a nearly perfect result.

Discussion
Where do methods fail?

● One hypothesis is that hearts suffering from a pathology may be more difficult to segment.
● However, images from healthy subjects (NOR) are not easier to segment than those from pathological cases: this group has the largest Hausdorff distances for the LV-ED and LV-ES.
● Another hypothesis would be that 1.5T images are more difficult to segment than 3T CMR images.

● Another hypothesis commonly accepted in the community is that slices next to the valves and/or the apex of the ventricle are more difficult to segment due to partial volume effects with surrounding structures.
● Finally, it is worth pointing out that a larger database than the one used in this project might help resolve the remaining issues listed above.

CONCLUSIONS

State-of-the-art machine learning methods can successfully classify patient data and produce highly accurate segmentation results. Results also reveal that the best convolutional neural networks achieve high correlation scores on clinical metrics. However, methods are still failing at the base and the apex, especially when considering the Hausdorff distance.

New metrics are needed: the clinical and geometrical metrics used to assess results have important limits, and methods that fall within the inter-observer variability may still be error-prone.

We believe that some other pathologies, such as inflammatory cardiomyopathy, could be successfully diagnosed with the proposed machine learning methods, whereas other (more complex) diseases, such as congenital heart diseases or heart defects, would need dedicated studies.
