Student Name
Matric No
Supervisor Name
Semester
Educational Video
Caption Text
Scene Text
Basic Architecture
Method
Weakness
(Huang, 2012)
(Bai et al., 2012)
(Liu and Wang, 2012)
(Yang et al., 2011)
Text Detection
Often performed together with text localization.
Usually ignored by researchers.
Current methods are too complicated, which causes performance issues in the video domain.
Text Detection
Reference
Method
Weakness
(Jung et al., 2009)
(Li et al., 2011; Pan et al., 2011)
Text Localization
Text Localization
Reference
Method
Weakness
(Carnicer et al., 2011)
(Liu and Wang, 2012)
Text Extraction
Aims to produce a binary text image.
Distinguishes text pixels from background pixels.
Noise and missing strokes make the binary text image imperfect.
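As a rough illustration of separating text pixels from background pixels, the sketch below binarizes a small grayscale image with a global Otsu threshold. Otsu's method is a stand-in chosen for brevity, not the method proposed in these slides, and the function names `otsu_threshold` and `binarize` are hypothetical. It also assumes bright text on a dark background.

```python
def otsu_threshold(pixels):
    """Return the threshold (0..255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Map a grayscale image (list of rows) to 1 = text, 0 = background.

    Assumption: text is brighter than the background.
    """
    t = otsu_threshold([p for row in gray for p in row])
    return [[1 if p > t else 0 for p in row] for row in gray]
```

Noise and missing strokes show up in such an output as isolated 1s in the background and 0s inside characters, which is the imperfection the slide refers to.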
Text Extraction
Reference
Method
Weakness
(Haneda and Bouman, 2011)
Blockwise segmentation and global segmentation.
(Liu and Wang, 2012)
Problem Statement
Sub. Questions:
Research Objectives
Scopes
Target text has a stroke width greater than 7 pixels.
Limited to video encoded with the H.264/MPEG-4 codec.
Text recognition is performed by open-source OCR software.
Research Methodology
Frame Extraction
Educational Video
Text Images
Text Localization
I-frames extraction
Filtered Frames
Text Detection
Identify sharp changes of intensity that exceed the height and width thresholds
Text Extraction
Morphological dilation on edge images
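The detection step in the pipeline above can be sketched as follows: mark pixels where the intensity jumps sharply relative to the left neighbour, then accept the frame region as a text candidate only if the marked area is tall and wide enough. This is one possible reading of the flowchart, not the author's implementation; the function names and the default values of `min_jump`, `min_height`, and `min_width` are illustrative assumptions.

```python
def edge_map(gray, min_jump=40):
    """Mark pixels whose intensity differs sharply from their left neighbour."""
    return [
        [1 if x > 0 and abs(row[x] - row[x - 1]) >= min_jump else 0
         for x in range(len(row))]
        for row in gray
    ]


def bounding_box(edges):
    """Return (top, left, bottom, right) of all marked pixels, or None."""
    coords = [(y, x) for y, row in enumerate(edges)
              for x, v in enumerate(row) if v]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))


def is_text_candidate(gray, min_jump=40, min_height=2, min_width=3):
    """Accept a region only if its edge area reaches both size thresholds."""
    box = bounding_box(edge_map(gray, min_jump))
    if box is None:
        return False
    top, left, bottom, right = box
    return (bottom - top + 1) >= min_height and (right - left + 1) >= min_width
```

A flat region produces no marked pixels and is rejected, while a region with repeated dark-to-bright transitions, as caption text typically has, passes the size check.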
Video Sequence
Layer
Group of Picture
Layer
Decoder Sequence
Picture-1
Picture-2
Picture-N
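To make the layering concrete, here is a minimal sketch of I-frame selection once the picture types of a sequence are known. It assumes the decoder has already reported each picture's type as "I", "P", or "B" in coding order; how those types are obtained from an H.264/MPEG-4 stream is outside this sketch, and both function names are hypothetical.

```python
def iframe_indices(picture_types):
    """Return the positions of intra-coded (I) pictures in the sequence."""
    return [i for i, t in enumerate(picture_types) if t == "I"]


def split_into_gops(picture_types):
    """Group pictures so that each group starts at an I picture.

    Simplification: every I picture is treated as the start of a new
    Group of Pictures.
    """
    gops = []
    for t in picture_types:
        if t == "I" or not gops:
            gops.append([])
        gops[-1].append(t)
    return gops
```

Keeping only the I pictures reduces the number of frames passed to later stages, since each one can be decoded without reference to its neighbours.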
Text Extraction for Educational Video
[Figure: intensity profiles along two scan lines (Line 1 and Line 2); x-axis: Image Width, y-axis: Intensity]
Text Extraction
Fill in the strokes of text
Before
After
Apply the stroke width transform (SWT) on the outer and inner pixels
Remove the original dilation
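The idea of measuring stroke width can be illustrated with a much simpler stand-in than the full stroke width transform of Epshtein et al. (2010): on a binary row, the length of each run of text pixels is taken as the horizontal stroke width, and runs narrower than the scope limit are discarded. This is an assumption-laden sketch, not SWT itself; the function names are hypothetical, and `min_width=8` reflects the stated scope of strokes wider than 7 pixels.

```python
def horizontal_stroke_widths(row):
    """Return (start, width) for each run of consecutive text pixels (1s)."""
    runs = []
    start = None
    for x, v in enumerate(row):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x - start))
            start = None
    if start is not None:
        runs.append((start, len(row) - start))
    return runs


def keep_wide_strokes(row, min_width=8):
    """Keep only runs whose width reaches min_width; clear everything else."""
    out = [0] * len(row)
    for start, width in horizontal_stroke_widths(row):
        if width >= min_width:
            out[start:start + width] = [1] * width
    return out
```

A real SWT follows the gradient direction from each edge pixel to the opposite edge, so it also handles strokes that are not horizontal.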
Broken edge
Morphological dilation in the horizontal and vertical directions until edges from different directions meet
Broken Edge Image
Apply the stroke width transform (SWT) on the outer and inner pixels
Remove the original dilation
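The dilation step that closes a broken edge can be sketched as below, using a plus-shaped structuring element so that pixels grow only horizontally and vertically. The function names and the fixed `iterations` count are assumptions made for illustration; the slides describe dilating until edges from different directions meet, which would need a stopping test that this sketch leaves out.

```python
def dilate_once(img):
    """Binary dilation with a plus-shaped (4-neighbour) structuring element."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                continue
            neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            # Turn a background pixel on if any in-bounds neighbour is set.
            if any(0 <= ny < h and 0 <= nx < w and img[ny][nx]
                   for ny, nx in neighbours):
                out[y][x] = 1
    return out


def dilate(img, iterations=1):
    """Apply dilate_once repeatedly."""
    for _ in range(iterations):
        img = dilate_once(img)
    return img
```

In the example of a horizontal edge with a one-pixel gap, a single pass is enough to join the two halves; the extra thickness it adds is what the later "remove the original dilation" step takes away.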
Significance of Research
Conclusion
This research aims to extract text information from educational video.
The enhanced algorithms are intended to improve the performance and accuracy of text extraction in video.
Thank You
References
Bai, B., Yin, F. and Liu, C.L. (2012), A Fast Stroke-Based Method for Text Detection in Video, 10th IAPR International Workshop on Document Analysis Systems (DAS), pp. 69-73.
Epshtein, B., Ofek, E. and Wexler, Y. (2010), Detecting text in natural scenes with stroke width transform, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2963-2970.
Haneda, E. and Bouman, C.A. (2011), Text Segmentation for MRC Document Compression, IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1611-1626.
Huang, X.D. (2011), A novel video text extraction approach based on Log-Gabor filters, 4th International Congress on Image and Signal Processing (CISP), vol. 1, pp. 474-478.
Huang, X.D. (2012), Automatic Video Text Detection and Localization Based on Coarseness Texture, Fifth International Conference on Intelligent Computation Technology and Automation (ICICTA), pp. 398-401.
Li, M.H., Bai, M., Wang, C.H. and Xiao, B.H. (2010), Conditional random field for text segmentation from images with complex background, Pattern Recognition Letters, vol. 31, issue 14, pp. 2295-2308.
Liu, X.Q. and Wang, W.Q. (2012), Robustly Extracting Captions in Videos Based on Stroke-Like Edges and Spatio-Temporal Analysis, IEEE Transactions on Multimedia, vol. 14, no. 2, pp. 482-489.
Pan, Y.F., Hou, X.W. and Liu, C.L. (2011), A Hybrid Approach to Detect and Localize Texts in Natural Scene Images, IEEE Transactions on Image Processing, vol. 20, no. 3, pp. 800-813.
Sharma, N., Shivakumara, P., Pal, U., Blumenstein, M. and Tan, C.L. (2012), A New Method for Arbitrarily-Oriented Text Detection in Video, 10th IAPR International Workshop on Document Analysis Systems (DAS), pp. 74-78.
Wei, Y.C. and Lin, C.H. (2012), A robust video text detection approach using SVM, Expert Systems with Applications, vol. 39, issue 12, pp. 10832-10840.
Yang, H.J., Siebert, M., Luhne, P., Sack, H. and Meinel, C. (2011), Automatic Lecture Video Indexing Using Video OCR Technology, IEEE International Symposium on Multimedia (ISM),
Video Watermarking Technology for Semantic Search