
Running head: USING CAPSNETS FOR TRAFFIC LIGHT IMAGE RECOGNITION 1

How can Capsule Neural Networks be used to improve traffic

light image recognition applications for autonomous driving?

Keshav Shenoy

Kennesaw State University



Table of Contents

Rationale of Study ……………………………………………………………………………… 3

Concept Map …………………………………………………………………………………… 4

Chapter 1: The Problem and its Setting ………………………………………………………… 5

A. Statement of the Problem ……………………………………………………………… 5

B. Context for the Study …………………………………………………………………… 5

C. Subproblems and Hypotheses …………………………………………………………… 5

D. Definition of Terms ……………………………………………………………………… 7

E. Assumptions ……………………………………………………………………………… 8

F. Delimitations and Limitations …………………………………………………………… 9

G. Importance of Study ……………………………………………………………………… 10

Chapter 2: Review of Literature ………………………………………………………………… 11

Chapter 3: Methodology ………………………………………………………………………… 15

A. Evaluation of End Product ……………………………………………………………… 15

B. Type of Design and Data ………………………………………………………………… 16

C. Development and Prototyping of Solution ……………………………………………… 16

D. Management Plan and Timeline ………………………………………………………… 18

E. Data Tables ……………………………………………………………………………… 20

References ……………………………………………………………………………………… 21

Rationale of Study

Autonomous driving is a growing field of study with large numbers of potential commercial

applications. So far, contemporary autonomous driving researchers have constructed

Convolutional Neural Networks (CNN) to recognize traffic light signals, but CNNs have

significant flaws, most notably in their ability to evaluate positional data. As a result, this

research will investigate the possible benefits of implementing Capsule Neural Networks

(CapsNets) in place of CNNs. Specifically, the research will look for improvements in final

accuracy with faster minimization of loss. The basis for this hope can be found in the work of

Kumar, Arthika, and Parameswaran (2018), who implemented CapsNets in traffic sign

classification with positive results and a 97.6% accuracy (p. 4546). CNNs have been the leading

edge of image recognition for a long period of time and, as such, an alternative with a significant

improvement in performance could provide a substantial boost to the development of road-ready

vehicles. The researchers wish to observe the benefits and the drawbacks of CapsNet architecture

over that of CNNs. The comparison is multifaceted and could lead to further research by the

researchers.

Research Concept Map



Chapter 1: The Problem and its Setting

A. Statement of the Problem

a. Overarching Question

How can CapsNets be used to improve traffic light image recognition applications for

autonomous driving?

b. Purpose of the Study

This study will determine the potential of CapsNets to improve the final accuracy of traffic light

image recognition. This necessitates that the CapsNet can minimize loss at a faster rate than

current CNNs.

B. Context for the Study

Currently, according to Hinton (2018), CNNs are the predominant machine learning technique

being used for detecting and identifying objects within images (7:00). This has led to their

inclusion within multiple language libraries like Keras and TensorFlow. Unfortunately, like any

other emerging technology, there are a number of flaws with CNNs. Hinton (2018) has proposed

that many of these flaws can be remedied through the alteration of CNNs into a new, similar

neural network structure called a CapsNet (3:09). This research applies Hinton’s assertion to the

field of autonomous driving, where CNNs are used to assist autonomous motor vehicles in

identifying the objects around them.

C. Subproblems and Hypotheses

Foundation Sub-problem 1: What do image recognition applications currently use for

autonomous driving?

Hypothesis: The most utilized current image recognition applications are CNN models, which

utilize feedforward supervised learning.

Independent Variable: The model of the image recognition application.

Dependent Variable: The utilization of the model.

Foundation Sub-problem 2: What is the benefit of Capsule Neural Networks over current image

recognition applications?

Hypothesis: CapsNets will have more accurate results because of positional data preservation

and dynamic routing.

Independent Variable: The model of the image recognition application.

Dependent Variable: The accuracy of the model’s results.

Applied Sub-problem 1: How do CapsNets perform when implemented in traffic light image

recognition?

Hypothesis: A trained CapsNet will recognize traffic lights from multiple autonomous driving

related image datasets with more than 85% final validation accuracy.

Independent Variables: The model, build, and design of the machine learning (ML) system that

has been trained and tested.

Dependent Variable: Final accuracy of CapsNet operating on validation data after training.

D. Definition of Terms

a. Terms

− Artificial Intelligence: Nilsson (2010) defines it as “Artificial intelligence is that activity

devoted to making machines intelligent...” (as cited in Stone et al., 2016, p. 12)

− Artificial Neural Network: Machine Learning using a collection of interconnected nodes,

“neurons,” loosely based on the organization of certain neurons in human brains (Rawat

& Wang, 2017, p.2354).

− Autonomous driving: An emerging technology in which artificial intelligence will control

the movement of transport vehicles instead of humans. (Stone et al., 2016, p. 7)

− Convolutional Neural Network: A type of feedforward artificial neural network built

from convolutional and pooling layers (Rawat & Wang, 2017, p. 2354).

− Capsule Neural Network: A type of artificial neural network that modifies convolutional

neural networks by segmenting groups of neurons into capsules for the better evaluation

of positional data. (Hinton, 2018, 3:08)

− Image Recognition (or Image Classification): “…the task of categorizing images into one

of several predefined classes…” (Rawat & Wang, 2017, p. 2352)

− Convolutional Layers: “…serve as feature extractors, and thus they learn the feature

representations of their input images…” (Rawat & Wang, 2017, p.2355)

− Machine Learning: “…the design of learning algorithms, as well as scaling existing

algorithms, to work with extremely large data sets.” (Stone et al., 2016, p. 9)

− Pooling Layer: LeCun et al. (1989a), LeCun et al. (1989b), LeCun et al. (1998), and

Ranzato et al. (2007) claimed that pooling layers “…reduce the spatial resolution of the

feature maps and thus achieve spatial invariance to input distortions and translations” (as

cited in Rawat & Wang, 2017, p. 2356).

− Pose: A specific type of positional data, including position, orientation, scale,

deformation, velocity, color, and more, which is recorded by CapsNets. (Hinton, 2018,

3:23)
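Sabour, Frosst, and Hinton (2017) make this presence-plus-pose idea concrete: a capsule outputs a vector whose direction encodes pose and whose length, compressed to below 1 by a “squash” nonlinearity, encodes the probability that the feature is present. A minimal numpy sketch of that nonlinearity follows; the input vector is an invented example, not output from any trained network:

```python
import numpy as np

def squash(s, eps=1e-9):
    """Scale a capsule's raw output vector s so its length lies in [0, 1)
    while its direction (the pose encoding) is preserved
    (Sabour et al., 2017)."""
    sq_norm = np.sum(s ** 2)
    scale = sq_norm / (1.0 + sq_norm)          # length -> presence probability
    return scale * s / np.sqrt(sq_norm + eps)  # direction -> pose

raw = np.array([3.0, 4.0])  # hypothetical capsule output with length 5
v = squash(raw)
# The squashed vector keeps the direction of raw but has length
# 25/26 ≈ 0.96, i.e. a long input maps to a probability near 1.
```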

b. Acronyms and Abbreviations

− API: Application Programming Interface

− CapsNet: Capsule Neural Network

− CNN: Convolutional Neural Network

− ML: Machine Learning

− TF: TensorFlow

− SVM: Support Vector Machine

E. Assumptions

It is assumed that the datasets accurately represent the population of traffic lights that

autonomous motor vehicles would encounter in practice. While traffic lights are not very

variable, some alterations exist in structure and orientation based on locale.

It is assumed that the performance of the produced CapsNet after training accurately models the

performance of a theoretical CapsNet that is trained more extensively.

It is assumed that the dataset developers annotated the datasets with the correct bounding boxes

and signal.

It is assumed that the processing power of the computer used during training and testing will not

affect accuracy results.

It is assumed that the ML algorithm can be created and tested at full potential within the TF API

and Python language.

F. Delimitations and Limitations

There are many different types of ML algorithms and neural networks. The research will be

confined to the performance of CNNs and CapsNets due to their relevance and current use within

the field.

The research will limit itself to the study of accuracy, with the understanding that a high final

accuracy indicates the ability for optimization in terms of performance and speed on more

suitable processing equipment.

The research will only investigate the performance of CapsNets within the TF framework and

will not attempt to reconstruct the design within Caffe or any other ML framework.

The research will limit itself to a few levels of image quality and dimensions with the

understanding that practically applied autonomous driving applications will have similar or

greater levels of image quality.

While planning to identify relatively small traffic lights with artificial neural

networks, the research will set a minimum size of approximately 4 px in width, given the

impracticality of classifying objects smaller than that.

The research is limited to the ML area of artificial intelligence and will not examine other areas

of artificial intelligence within autonomous driving.



G. Importance of Study

By showing the performance of CapsNet technology within traffic light image recognition in

autonomous driving, this research can support or fail to support a shift in resources towards

further CapsNet research. The potential for a more powerful and accurate alternative is very

significant, because CNNs currently stand at the forefront of object recognition (Hinton, 2018,

7:00). Improving upon the capabilities of CNNs with CapsNets could change how researchers

approach image recognition problems and push further forward the adoption of autonomous

motor vehicles globally as well as the incorporation of artificial intelligence and ML in common

objects.

Chapter 2: Review of Literature

Currently, the field of image object detection and recognition within ML is increasing in

importance for a number of different applications. Specifically, Fairfield and Urmson (2011)

discuss its growing significance in the field of autonomous driving, where it has been used to

build perception systems in combination with cameras (p. 1). They specifically cite the issue of

traffic light image recognition, which cannot be performed by alternative measures like sonar or

radar (p. 1), because it requires knowledge of color. As such, a large amount of development has

gone into designing the best learning algorithms for traffic light image recognition problems. So

far, Huang et al. (2017) found that the leading models used are CNNs (p. 1). Lim et al. (2017)

discuss this, describing CNN architecture as one where image data is fed through a series of

deep (convolutional) and pooling layers, as well as a kernel, to extract features for classification

(p. 11). They explain that CNN technology is state-of-the-art, needing only one network to

accurately classify traffic signs (Lim et al., 2017, p. 10).
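The convolution-then-pooling pipeline Lim et al. describe can be illustrated in plain numpy; the 2×2 kernel below is a hypothetical vertical-edge filter, not one learned by any cited model:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a kernel over a single-channel image ('valid' padding),
    producing a feature map -- the feature-extraction step of a CNN."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """2x2 max pooling: keep only the strongest response in each window,
    discarding its exact position within the window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(16.0).reshape(4, 4)           # toy 4x4 grayscale image
kernel = np.array([[1.0, -1.0], [1.0, -1.0]])   # hypothetical edge filter
pooled = max_pool(conv2d_valid(image, kernel))  # 3x3 feature map -> 1x1
```

Max pooling retains only the strongest response per window, which is precisely the information loss Hinton criticizes later in this review.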

Despite this, there are still significant issues with the CNN model. One significant

problem Liu et al. (2016) identify is balancing speed performance and accuracy (p. 1). To

alleviate some of this, Liu et al. (2016) suggest SSD (Single Shot MultiBox Detector) – a “deep

network based object detector that does not resample pixels or features for bounding box

hypotheses and is as accurate as approaches that do” (p. 2). By replacing bounding box

proposals with a convolutional filter, Liu et al. (2016) are able to construct a model that operates

at higher frames per second than previous approaches with Faster R-CNN (p. 16). However, in

contrast with Liu et al.’s research, Huang et al. (2017) suggest that Faster R-CNN can be

improved to a similar speed as SSD by minimizing proposals while maintaining accuracy on

very small objects, a task SSD struggles with (p. 14). Meanwhile, in the similar field of traffic

sign image recognition, Lim et al. (2017) took a unique approach to the optimization problem by

combining a Support Vector Machine (SVM) model – an ML system which does not utilize

neural networks – with CNN technology to improve results (p. 2). SVMs were utilized first to

verify the sign and a CNN afterwards to classify the sign (Lim et al., 2017, p. 2). Lim et al.’s

(2017) combination worked out, forming a system able to classify images at real-time with

97.9% average accuracy and with improved accuracy specifically in poor lighting (p. 19). It is

difficult to compare Lim et al.’s (2017) sign model to the traffic light models of Liu et al. (2016)

or Huang et al. (2017), but the improvements of Huang et al. and Lim et al. over Liu et al. in

such a short time frame show the speed of significant advancements occurring within

autonomous vehicle image recognition applications.

Outside of the actual model usage, multiple researchers have attempted to make

improvements through external changes to structure or learning strategy. A strong example of

this is Fairfield and Urmson (2011), who show the ability of mapped traffic lights to improve

detection results within a model (p. 6). By mapping the location of traffic lights against current

location of the vehicle, a network can predict when it should expect to detect traffic lights and

when it should expect not to, reducing false positives and false negatives (Fairfield & Urmson,

2011, p. 6). Ghahramani (2015) takes a more technical approach, exploring the ability for

probabilistic frameworks – models which “make predictions about future data, and take

decisions that are rational given these predictions” (p. 1) – to increase accuracy. Tyukin et al.

(2018) mirror this by considering the use of multiple ML models within a teacher-student

model, speeding up the training of classification algorithms and improving the universality of

models in application to data (p. 1). They improve on previous work in the field by creating a

framework for the teacher-student model which requires less raw data and training (Tyukin et al.,

2018, p. 2). Though not implemented within the context of automated driving, the success of the

model within CNN image recognition suggests its potential for the field.

More than anything else, however, the biggest challenge that has been issued against

CNNs is from Hinton (2014), who references their lack of structure as a major flaw with their

performance in handling positional data (1:47). As a way to fix this, Hinton (2011; 2014)

proposes CapsNets, similar to CNNs but with layers loosely replaced with “capsules” (p. 2;

3:09). According to Hinton (2014), capsules would output the likelihood that a feature is present

and “pose” information, which would include a large amount of positional information (3:09).

First, Hinton (2014) claims, capsules would improve massively on the current CNN

practice of max pooling, which reduces the available information in a subsampling procedure

(6:57). CapsNets get rid of pooling completely, instead using coincidence filtering to find

clusters of inputs at high dimensions, removing unwanted background inputs while keeping

useful data (Hinton, 2014, 5:26). Secondly, Sabour, Frosst, and Hinton (2017) point out the

benefits of capsules for the dynamic routing of information, specializing specific capsules for

certain tasks (p. 2). This contrasts with max pooling, which Sabour et al. (2017) state will

“throw away information about the precise position of the entity within the region” (p. 2);

dynamic routing instead considers multiple input vectors, not just the most active one. These two effects, the

removal of subsampling and the introduction of dynamic routing, could lead to improvements in

a number of fields, including digit segmentation and separation, like that performed by Hinton,

Ghahramani, and Teh (2000, p. 1) and Sabour et al. (2017); traffic sign image recognition, like that

done by Lim et al. (2017); and shape analysis, like that described by Hinton (2014, 15:15). In

fact, Kumar, Arthika, and Parameswaran (2018) have already implemented CapsNets in traffic

sign image recognition with strong results: 97.6% accuracy and 0.0311038 loss at the end of

validation (p. 4546). Unfortunately, it does not seem like CapsNets have yet been applied to the

primary issue of this research, traffic light recognition. Based on the results of Kumar et al.

(2018), however, the CapsNet architecture should have a strong accuracy rating when

implemented within traffic light detection.
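The routing-by-agreement procedure that Sabour et al. (2017) describe can be sketched in a few lines of numpy; the prediction vectors here are random stand-ins for the outputs of trained lower-level capsules:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, axis=-1, eps=1e-9):
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def route(u_hat, iterations=3):
    """Dynamic routing between capsules (after Sabour et al., 2017).
    u_hat: predictions from each of n_in lower capsules for each of
    n_out higher capsules, shape (n_in, n_out, dim)."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))              # routing logits
    for _ in range(iterations):
        c = softmax(b, axis=1)               # coupling coefficients
        s = (c[..., None] * u_hat).sum(0)    # weighted sum per output capsule
        v = squash(s)                        # output capsule vectors
        b = b + (u_hat * v[None]).sum(-1)    # agreement updates the logits
    return v

rng = np.random.default_rng(0)
v = route(rng.normal(size=(8, 3, 4)))  # 8 input capsules, 3 outputs, dim 4
# each output vector has length below 1, interpretable as a presence probability
```

Note how agreement between a prediction and an output vector increases the corresponding coupling coefficient, specializing lower capsules toward particular higher capsules.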

Discussion

From the literature, it becomes clear that there are numerous areas for potential

improvement within CapsNets that do not exist in CNNs. These include the elimination of

information loss from down-sampling suggested by Hinton et al. (2011, p. 7) and by Hinton

(2014, 6:55), as well as within dynamic routing between capsules to enable specialization

(Sabour et al., 2017, p. 2). Sabour et al. (2017) go so far as to state that “The fact that a simple

capsules system already gives unparalleled performance at segmenting overlapping digits is an

early indication that capsules are a direction worth exploring” (p. 9). This supports the

conclusion that CapsNets, if given the level of development that CNNs have enjoyed, should become

one of the leading approaches towards image classification.

Further Research: Beyond just this research’s exploration of the utilization of CapsNets

in traffic light imaging, research should also be conducted into applying the improvements made

within CNN architecture to CapsNets. As an example of that, the emulation of Fairfield and

Urmson’s (2011) traffic light mapping (p. 1) or Lim et al.’s (2017) utilization of SVMs as a pre-

processing measure (p. 1) within a CapsNet framework could provide valuable evidence towards

the potential of CapsNets within autonomous driving applications.



Chapter 3: Methodology

A. Evaluation of End Product

Applied Sub-Problem 1: How do CapsNets perform when implemented in traffic light image

recognition?

Need: A trained CapsNet will recognize traffic lights from multiple autonomous driving related

image datasets with more than 85% final validation accuracy.

Research Basis: This need provides a good basis from which to begin examination, because it

establishes clear proof of concept from which the CapsNet can improve. As detailed in the

review of literature, state-of-the-art image recognition technology has reached the point of greater

than 90% accuracy after a reasonable number of iterations. Hinton (2014) describes how CNNs

have been extensively developed and improved by researchers for many years now (7:02). As

such, 85% accuracy is an ambitious, but reasonable level of accuracy to expect from an emerging

model for learning. Reaching that level supports the idea that there is potential for CapsNet

architecture to improve to the point of replacing CNN architecture in traffic light image

recognition applications in autonomous driving systems, but is not too high a bar for the newer

system to jump over.

Independent Variable: The model, build, and design of the machine learning (ML) system that

has been trained and tested.

Dependent Variable: Final accuracy of CapsNet operating on validation data after training.

B. Type of Design and Data

Type of Design: The research will utilize the Engineering Design Process to implement a

CapsNet ML system within traffic light image recognition. If the design does not meet the

evaluation criteria, another iteration will be undertaken, continuing until the best CapsNet

structure achievable within the timeframe is produced. The design process assumes that the

product is possible to construct and that the successful implementations of previous researchers

will cross apply to this work, as well as the assumptions listed previously in Chapter 1.

Type of data: The data is numerical. The final accuracy is a single number taken at the end of

validation from a table of accuracy over iteration, while loss will be measured per epoch as a

residual sum of squares. Final accuracy is the number being used to evaluate the success of the

product, while loss will simply inform the researchers of how the model’s accuracy increased

over training and validation. The data is descriptive, because it is summarizing the success of the

model in classification. It also encompasses the whole scope of the network, not just a sample of

it.
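As a sketch of how those two numbers would be computed, with small invented arrays standing in for real validation output over three hypothetical signal classes (e.g. red / yellow / green):

```python
import numpy as np

def rss_loss(predicted, target):
    """Residual sum of squares over one epoch's outputs."""
    return float(np.sum((predicted - target) ** 2))

def accuracy(predicted_classes, true_classes):
    """Fraction of validation examples classified correctly."""
    return float(np.mean(predicted_classes == true_classes))

# Hypothetical class probabilities for a 4-example validation pass.
probs = np.array([[0.9, 0.05, 0.05],
                  [0.2, 0.7, 0.1],
                  [0.1, 0.2, 0.7],
                  [0.6, 0.3, 0.1]])
truth = np.array([0, 1, 2, 2])       # the last example is misclassified
one_hot = np.eye(3)[truth]

epoch_loss = rss_loss(probs, one_hot)              # tracks training progress
final_acc = accuracy(probs.argmax(axis=1), truth)  # 3 of 4 correct -> 0.75
```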

C. Solution Development and Prototyping

Testing: The model evaluation is done as part of its operation, during the validation section of

the code. This section will test the code against images it has not yet seen, but that are of the

same type as those the model was trained on. The accuracy of the model in recognizing traffic

lights within these images will be the final accuracy.

Analysis: The only way to analyze the model’s accuracy data is by direct greater-than, less-than

comparison of accuracies to previous versions and the evaluation objective, because the network

is built entirely around minimizing loss and increasing accuracy. As such, it would not make

sense to analyze the model by any metric other than its own accuracy. A model which

reaches the threshold of 85% final accuracy is successful, while a model which does not is not a

success.
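The comparison described here reduces to a single predicate; a minimal sketch, with the 85% objective as the default threshold:

```python
def meets_objective(final_accuracy, threshold=0.85):
    """Success criterion: the model passes only if its final validation
    accuracy strictly exceeds the 85% evaluation objective."""
    return final_accuracy > threshold

meets_objective(0.87)  # True  -> this iteration succeeds
meets_objective(0.84)  # False -> iterate on the design again
```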

Validity:

− Internal Validity: Internal validity will be increased by the randomization of all possible

assignments within the design process. These include the order the examples are read during

training and validation, which subsamples of data are used for training, and which subsamples of

data are used for validation. It will also be improved by controlling as many variables

as possible, including the number of steps allowed within training and validation, the dataset

used, and the computer the algorithm is executed on.

− External Validity: This will be increased by trying to have as universal a coverage as possible

of traffic light images. By incorporating every type of traffic light, the model will be

applicable to almost the entire population. This can be done by using multiple datasets, as this

research will, and by using datasets with many, diverse sets of images from multiple

locations, as this research will.

− Criterion-related Validity: If the results of the produced CapsNet in traffic light image

recognition are similar to results of other CapsNets produced for traffic sign image

recognition or other perception problems, it suggests that the model is operating correctly in

relation to the body of contemporary work.
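The randomized assignments described under internal validity can be sketched with numpy's seeded generator; the 80/20 split fraction and example count are illustrative assumptions, not figures from this study:

```python
import numpy as np

def shuffle_and_split(n_examples, val_fraction=0.2, seed=42):
    """Randomly order the dataset and carve out validation indices.
    A fixed seed keeps the assignment reproducible across reruns,
    supporting the controls described above."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_examples)   # random reading order
    n_val = int(n_examples * val_fraction)
    return order[n_val:], order[:n_val]   # train indices, validation indices

train_idx, val_idx = shuffle_and_split(100)
# 80 training examples and 20 validation examples, with no overlap
```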

Reliability:

− Test-Retest Reliability: If the model, run twice on the same dataset, produces the same

results, it will indicate strongly that the model is reliable.



− Inter-rater Reliability: If the model, run on two different datasets, returns similar results, it

suggests that the model is operating correctly and is not overfitted to a specific set of data

samples.

Consistency:

Both forms of reliability addressed above and the factors discussed within internal validity can

be applied to measure the consistency of the model. Additionally, because Python programs

execute deterministically, as long as testing conditions (including random seeds) are controlled,

the program should operate in the same manner each time.

Feasibility:

Kumar et al.’s (2018) successful construction of a CapsNet model for traffic sign image

recognition (p. 4547) demonstrates the feasibility of the system within artificial intelligence and

ML broadly, as well as within specifically autonomous driving perception problems.

D. Management Plan and Timeline

Week 1-2: Building on Empirical Examples: The first step is to examine previously implemented

capsule neural networks and convolutional neural networks built for similar problems.

By founding the most basic areas of the design from models shown to previously have success,

the research can establish the model on a stable basis from which to start the design process.

Most of these are open-source projects published by university researchers or Google/TF

employees on GitHub.

Week 3: Implement Data Processing: The first stage within both CNN and CapsNet frameworks

is the processing of the input data into a format understandable by the neural networks. This is

achieved through a Python program which reads each pixel of the images from the traditional file

format into a 3-dimensional array of pixel values. Each image will be represented by one array,

with height, width, and RGB making up the 3 dimensions. The final input dimensions would be

Height × Width × 3. The pre-annotated datasets utilized by this research have this data already

established with labels and ground truths within a JSON, config, or similar file. The program will

read the labelling and truth information from the file and send it to the neural network for the

training portion of the network.
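A minimal sketch of that preprocessing step, using Pillow and the standard json module; the file paths and the annotation field names ("boxes", "label") are hypothetical placeholders for whatever the chosen datasets actually provide:

```python
import json
import numpy as np
from PIL import Image

def load_example(image_path, annotation_path):
    """Read one image into a Height x Width x 3 array of RGB pixel
    values and pair it with its ground-truth annotation."""
    pixels = np.asarray(Image.open(image_path).convert("RGB"))  # (H, W, 3)
    with open(annotation_path) as f:
        annotation = json.load(f)  # e.g. {"boxes": [...], "label": "green"}
    return pixels, annotation
```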

Week 4-12: Implement, Train, and Test Capsule Neural Network: The second step of the design

process is to create the actual capsule network. Both the training and the validation portions of

the network will be implemented in Python using the TF framework. At the end of validation, the

program will produce and plot an accuracy and loss curve as well as record the data into a csv or

similar data file. If the

objective of 85% accuracy is not reached, the researchers will analyze further where losses in

performance could have occurred and renovate the CapsNet, iterating the design until 85%

accuracy is reached.
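The recording step at the end of validation could look like the following, using Python's standard csv module; the per-epoch numbers are invented placeholders for whatever a real training run produces:

```python
import csv

def record_history(path, history):
    """Write per-epoch accuracy and loss to a CSV file so the curves
    can be plotted and archived after validation."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch", "accuracy", "loss"])
        for epoch, (acc, loss) in enumerate(history, start=1):
            writer.writerow([epoch, acc, loss])

# Hypothetical training history: (accuracy, loss) for each of three epochs.
record_history("capsnet_history.csv", [(0.62, 1.9), (0.78, 1.1), (0.86, 0.7)])
```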

Tools: The entire project is done within Python, a popular machine learning language.

Additionally, the research utilizes the TF Python framework, which implements a large number

of classes, functions, and objects for ML. The TF framework provides simple, pre-implemented

methods for developing the ML algorithm, measuring the change in the dependent variable

(accuracy), and checkpointing the artificial neural network every few steps.

E. Data Tables

CapsNet Accuracy

                        Epochs
                        1    2    3    4    5    6    7    8    9
Training Accuracy (%)
Testing Accuracy (%)

Note: The actual data table will have more than this number of epochs, depending on the

iteration amount chosen and number of data samples. The whole table is not shown for ease of

viewing.

CapsNet Loss

                          Epochs
                          1    2    3    4    5    6    7    8    9
Training Loss (no units)
Testing Loss (no units)

Note: The actual data table will have more than this number of epochs, depending on the

iteration amount chosen and number of data samples. The whole table is not shown for ease of

viewing.

References

Fairfield, N., & Urmson, C. (2011). Traffic light mapping and detection. IEEE International

Conference on Robotics and Automation. doi:10.1109/icra.2011.5980164

Ghahramani, Z. (2015, May 28). Probabilistic machine learning and artificial

intelligence. Nature, 521(7553), 452-459. doi:10.1038/nature14541

Hinton, G. E., Ghahramani, Z., & Teh, Y. W. (2000). Learning to parse images. Advances in

Neural Information Processing Systems. Retrieved from the NIPS Proceedings database.

Hinton, G. E., Krizhevsky, A., & Wang, S. D. (2011). Transforming auto-encoders. Lecture

Notes in Computer Science Artificial Neural Networks and Machine Learning – ICANN

2011, 44-51. doi:10.1007/978-3-642-21735-7_6

Hinton, G. E. (2018, April 12). What's wrong with convolutional nets? [Video file]. Retrieved

from https://techtv.mit.edu/collections/bcs/videos/30698-what-s-wrong-with-

convolutional-nets

Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., . . . Murphy, K. (2017).

Speed/accuracy trade-offs for modern convolutional object detectors. 2017 IEEE

Conference on Computer Vision and Pattern Recognition (CVPR).

doi:10.1109/cvpr.2017.351

Kumar, A. D., Arthika, R. K., & Parameswaran, L. (2018). Novel deep learning model for traffic

sign detection using capsule networks. International Journal of Pure and Applied

Mathematics, 118(20), 4543-4548. Retrieved from the Academic Publications database.



LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D.

(1989a). Handwritten digit recognition with a back-propagation network. In D. S.

Touretzky (Ed.), Advances in neural information processing systems, 2 (pp. 396–404).

Cambridge, MA: MIT Press.

LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D.

(1989b). Backpropagation applied to handwritten zip code recognition. Neural

Computation, 1(4), 541–551. doi:10.1162/neco.1989.1.4.541

LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to

document recognition. Proceedings of the IEEE, 86(11), 2278–2324.

doi:10.1109/5.726791

Lim, K., Hong, Y., Choi, Y., & Byun, H. (2017). Real-time traffic sign recognition based on a

general purpose GPU and deep-learning. PLoS ONE, 12(3).

doi:10.1371/journal.pone.0173317

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., & Berg, A. C. (2016). SSD:

Single shot MultiBox detector. Computer Vision – ECCV 2016 Lecture Notes in

Computer Science, 21-37. doi:10.1007/978-3-319-46448-0_2

Nilsson, N. J. (2010). The quest for artificial intelligence: A history of ideas and achievements.

Cambridge: Cambridge University Press. Available from Google Books database.

Ranzato, M. A., Huang, F. J., Boureau, Y., & LeCun, Y. (2007). Unsupervised learning of

invariant feature hierarchies with applications to object recognition. In Proceedings IEEE

Conference on Computer Vision and Pattern Recognition,1-8.

doi:10.1109/CVPR.2007.383157

Rawat, W., & Wang, Z. (2017). Deep convolutional neural networks for image classification: A

comprehensive review. Neural Computation, 29(9), 2352-2449.

doi:10.1162/neco_a_00990

Sabour, S., Frosst, N., & Hinton, G. E. (2017). Dynamic routing between capsules. 31st

Conference on Neural Information Processing Systems. Retrieved from the NIPS

Proceedings database.

Stone, P., Brooks, R., Brynjolfsson, E., Calo, R., Etzioni, O., Hager, G., … Teller, A. (2016,

September). Artificial intelligence and life in 2030. One Hundred Year Study on

Artificial Intelligence, 1-52. Retrieved from http://ai100.stanford.edu/2016-report

Tyukin, I. Y., Gorban, A. N., Sofeykov, K. I., & Romanenko, I. (2018, August 13). Knowledge

transfer between artificial intelligence systems. Frontiers in Neurorobotics, 12.

doi:10.3389/fnbot.2018.00049
