
52 pages including cover

Knowledge Digest for IT Community


50/-
ISSN 0970-647X

Volume No. 41 | Issue No. 9 | December 2017

www.csi-india.org

COVER STORY
Cyber Physical Systems (CPS) and its Implications 8

RESEARCH FRONT
Enterprise Information Security Risk Management 20

TECHNICAL TRENDS
Machine Learning in Advanced Python 11

ARTICLE
Application Security using Blockchain in Cyber Physical System 25

SECURITY CORNER
Security Issues in Cyber Physical Systems 31
CSI  COMMUNICATIONS VOLUME NO. 41 • ISSUE NO. 9 • DECEMBER 2017

Chief Editor
S S AGRAWAL
KIIT Group, Gurgaon

Editor
PRASHANT R. NAIR
Amrita Vishwa Vidyapeetham, Coimbatore

Published by
A. K. NAYAK
Hony. Secretary
For Computer Society of India

Editorial Board:
Arun B Samaddar, NIT, Sikkim
Bhabani Shankar Prasad Mishra, KIIT University, Bhubaneswar
Debajyoti Mukhopadhyay, MIT, Pune
J. Yogapriya, Kongunadu Engg. College, Trichy
M Sasikumar, CDAC, Mumbai
R Subburaj, SRM University, Chennai
R K Samanta, Siliguri Inst. of Tech., West Bengal
R N Behera, NIC, Bhubaneswar
Sudhakar A M, University of Mysore
Sunil Pandey, ITS, Ghaziabad
Shailesh K Srivastava, NIC, Patna
Vishal Mehrotra, TCS

Design, Print and Dispatch by
GP OFFSET PVT. LTD.

Contents

Cover Story
Cyber Physical Systems (CPS) and its Implications 8
S. Suseela and T. Kavitha

Technical Trends
Machine Learning in Advanced Python 11
Suchithra M S and Maya L Pai
Blockchain: A Primer 15
Durgesh Barwal, Rajat Kumar Behera and Abhaya Kumar Sahoo

Research Front
Enterprise Information Security Risk Management 20
K. Srujan Raju and M. Varaprasad Rao

Articles
Application Security using Blockchain in Cyber Physical System 25
Poonam N. Railkar, Sandesh Mahamure and Dr. Parikshit N. Mahalle
Cyber Physical Systems and Smart Cities 29
Nishtha Kesswani and Sanjay Kumar

Security Corner
Security Issues in Cyber Physical Systems 31
Swati Maurya and Anurag Jain
Cyber Security and Human Rights 34
Subrata Paul, Anirban Mitra and Brojo Kishore Mishra

Practitioner Workbench
Fun with Digital Image Processing in PHP on Windows and Linux Platform 36
Baisa L. Gunjal

PLUS
Know Your CSI 2nd Cover
ICANN|60 6
CSI Patna Chapter Report 7
Report on CSI Student Conventions : Karnataka & Haryana State Level convention 40
State Student Convention 2017, West Bengal 41
Latex Workshop & Workshop on Python - Programming Tool for Data Science 41
CSI Reports 42
Student Branches News 44
CSI Calendar 2017-18 3rd Cover
The 2017 India-Africa ICT Summit Back Page

Please note:
CSI Communications is published by Computer Society of India, a non-profit organization.
Views and opinions expressed in the CSI Communications are those of individual authors,
contributors and advertisers and they may differ from policies and official statements of CSI.
These should not be construed as legal or professional advice. The CSI, the publisher, the
editors and the contributors are not responsible for any decisions taken by readers on the
basis of these views and opinions.
Although every care is being taken to ensure genuineness of the writings in this publication,
CSI Communications does not attest to the originality of the respective authors’ content.
© 2012 CSI. All rights reserved.
Instructors are permitted to photocopy isolated articles for non-commercial classroom use
without fee. For any other copying, reprint or republication, permission must be obtained in
writing from the Society. Copying for other than personal use or internal reference, or of
articles or columns not owned by the Society without explicit permission of the Society or
the copyright owner is strictly prohibited.

Printed and Published by Prof. A. K. Nayak on behalf of Computer Society of India, Printed at G.P. Offset Pvt. Ltd.
269 / A2, Shah & Nahar Industrial Estate, Dhanraj Mill Compound, Lower Parel (W), Mumbai 400 013 and published from
Computer Society of India, Samruddhi Venture Park, Unit-3, 4th Floor, Marol Industrial Area, Andheri (East), Mumbai 400 093.
Tel. : 022-2926 1700 • Fax : 022-2830 2133 • Email : hq@csi-india.org

3
CSI COMMUNICATIONS | DECEMBER 2017
Editorial
Dear Fellow CSI Members,
The theme for the Computer Society of India (CSI) Communications (The Knowledge Digest for IT
Community) December 2017 issue is Cyber Physical Systems.
“Cyber-Physical Systems or ‘smart’ systems are co-engineered interacting networks of physical and
computational components. These systems will provide the foundation of our critical infrastructure,
form the basis of emerging and future smart services, and improve our quality of life in many areas.”
National Institute of Standards & Technology (NIST), USA

After a series of thematic issues focusing on ICT in applications such as education, governance,
agriculture and health, and an issue on the research topic of machine learning, CSI Communications
focuses on cyber physical systems in this issue. The next issue is also based on a research theme,
Machine Intelligence.
Cyber Physical Systems (CPS) are poised to bring advances in personalized health care, emergency
response, traffic flow management, and electric power generation and delivery. This technology
builds on embedded systems, that is, computers and software embedded in devices whose principal
mission is not computation, such as cars, toys, medical devices, and scientific instruments. CPS
integrates the dynamics of the physical processes with those of the software and networking,
providing abstractions and modeling, design, and analysis techniques for the integrated whole.
The Cover story in this issue is “Cyber Physical Systems (CPS) and its Implications” by S. Suseela &
T. Kavitha. In the cover story, the authors have traced the evolution and described the architecture,
applications, platforms and functions of CPS.
The technical trends showcased are “Machine Learning in Advanced Python” by Suchithra M.S. &
Maya L Pai and “Blockchain: A Primer” by Durgesh Barwal, Rajat Kumar Behera & Abhaya Kumar
Sahoo.
In Research front, we have “Enterprise Information Security Risk Management” by K. Srujan
Raju & M. Varaprasad Rao, who throw light upon current research and approaches for enterprise
information security risk management.
Other articles in this issue on CPS cover its applications in smart cities, by Nishtha Kesswani &
Sanjay Kumar, and “Application Security using Blockchain in Cyber Physical System” by Poonam N.
Railkar, Sandesh Mahamure & Parikshit N. Mahalle.
The Security Corner has 2 contributions, “Security Issues in Cyber Physical Systems” by Swati
Maurya & Anurag Jain and “Cyber Security and Human Rights” by Subrata Paul, Anirban Mitra &
Brojo Kishore Mishra.
We have revived the Practitioner’s Workbench in this issue with “Fun with Digital Image Processing
in PHP on Windows and Linux Platform” by Baisa L. Gunjal.
This issue also contains a collage of ICANN 60 participation by CSI, MoU with Cisco, CSI activity
reports from chapters & student branches and a calendar of events.
We are thankful to the entire ExecCom for their continuous support in bringing out this issue
successfully.
We wish to express our sincere gratitude to the CSI publications committee, editorial board, authors
and reviewers for their contributions and support to this issue.
We look forward to receiving constructive feedback and suggestions from our esteemed members
and readers at csic@csi-india.org.

With kind regards,

Prof. (Dr.) S. S. Agrawal, Chief Editor
Prof. Prashant R. Nair, Editor

TECHNICAL TRENDS

Machine Learning in Advanced Python


Suchithra M S
School of Arts & Sciences, Amrita University, Kochi, India.
Email: suchithrams194@gmail.com

Maya L Pai
School of Arts & Sciences, Amrita University, Kochi, India.
Email: mayalpai@gmail.com

Machine learning is a growing field, and a motivated developer can quickly pick it up and start making
real and useful contributions. Algorithms are a big part of machine learning, and they contain a lot of
mathematics and theory. However, we do not need to know how the algorithms work internally to be able
to implement them and apply them to achieve real and valuable results. This is made possible by
different machine learning tools. In this study, we explain machine learning and machine learning
algorithms. The usage of machine learning tools such as Weka, R and Python, and a review of recent
trends in machine learning, are also given due attention.

Index Terms - machine learning, algorithms, tools, Python.


I. Introduction
A machine learning developer is a developer who builds machine learning systems. These systems
contain algorithms that can learn from data. Applied machine learning can be overwhelming, because
there are so many things to try and explore on a given problem. The developer can follow a structured
process, just as a structured process is used to develop software [1]. A template for a multi-step
process when using machine learning to address a complex problem is:
1. Define the problem.
2. Prepare the data.
3. Spot check various learning algorithms.
4. Tune well-performing learning algorithms.
5. Visualize the results.
To speed up the process, it helps to look at the problem briefly from several different perspectives:
• What is the problem?
• Why does the problem need to be solved?
• How would I solve the problem?
This last question helps us to understand why the problem is complex and requires a machine learning
based solution. To get the best results, we must understand how algorithms work. Mathematics plays an
important role in understanding algorithms, but there is a much easier way that uses the language and
methods developers already know:
• Simple and clear algorithm descriptions.
• Code examples without libraries.
We can build up functions to evaluate predictions, estimate the skill of models and even implement
the learning algorithms themselves. A machine learning professional uses machine learning to solve
real-world problems.

II. Applied machine learning
An understanding of the following four areas is needed for designing applied machine learning
projects [2].
1. Data Preparation:
In this step, the developer loads the data from the standard CSV file format and prepares it for
machine learning algorithms. The performance of an algorithm on test data can be estimated using
algorithm evaluation techniques. Scoring methods are used to evaluate the quality of predictions
made on unseen data. Baseline modeling techniques give a reference result for a problem, against
which any improvement can be measured. Once we have a test harness that we can trust, we select and
evaluate 5 to 10 standard workhorse algorithms. This gives us an idea of how difficult our problem is
and which algorithms might be worth spending some time on tuning. A test harness is used to evaluate
different methods on the same problem by comparing their results.
2. Linear Algorithms:
• Simple Linear Regression [3]: It is used for numerical value prediction when the dataset contains
only a single input.
• Multivariate Linear Regression: It is also used for numerical value prediction, when the dataset
contains more than one input. It is trained using Stochastic Gradient Descent.
• Logistic Regression: This method is used for class value prediction on two-class problems and is
trained using Stochastic Gradient Descent.
• Perceptron: The perceptron is the simplest neural network model for classification problems and is
trained using Stochastic Gradient Descent.
3. Nonlinear Algorithms:
• Regression and Classification Trees: These are decision trees that are applied to regression and
classification problems.
• Naive Bayes: It is an application of Bayes’ Theorem to classification problems. The theory of
probability is the basis of Naive Bayes.
• Backpropagation: It is the commonly used method for training artificial neural networks. It is
widely applicable to supervised learning or classification, and it underlies the broader field of
deep learning.
• k-Nearest Neighbors (KNN): This algorithm predicts categorical or numerical outputs directly from
the training data.
• Learning Vector Quantization (LVQ): LVQ is a widely used neural network method that is more
efficient than KNN.
4. Ensemble Algorithms:
• Bootstrap Aggregation: It involves an ensemble of decision trees and is also known as bagging.
• Random Forest: This is an extension of bagging which results in faster training and better
performance.
• Stacked Aggregation: This ensemble method learns how to combine the predictions from multiple
models efficiently. It is also known as blending or stacking.
Many complex machine learning problems can be reduced to one of four core problem types:
Classification, Regression, Clustering and Rule extraction. If we can map everyday problems to one of
these types, we can then find and start testing algorithms that address them. Examples of machine
learning problems:
1. Spam Detection: Given an email message in an inbox, identify it as spam or not.
2. Credit Card Fraud Detection: Given the transactions for a customer in a month, identify those
that were not made by the customer.
3. Digit Recognition: Given handwritten zip codes on envelopes, identify the digit for each
handwritten character.
4. Speech Understanding: Given an utterance from a user, identify the specific request made by the
user.

IV. Machine learning algorithms
Machine learning is closely related to many fields; that is, it is a multidisciplinary field, and it
is difficult to separate it from related fields. Machine learning is built on the fields of computer
science and mathematics. Knowing these foundational fields helps us to understand why certain
mathematical language, such as vectors, matrices, functions and distributions, is used when
describing algorithms. Three specific foundational fields are:
• Probability: the study of characterizing the likelihood of random events.
• Statistics: the study of processes to collect, analyze, explain and present data.
• Artificial Intelligence: the construction and study of intelligent computational systems.
Machine learning also has sibling fields that sit alongside it. These fields give context to machine
learning algorithms. They include:
• Computational Intelligence: the study and construction of complex systems.
• Data Mining: the construction and study of computational systems that discover useful
relationships and patterns in large data sets.
A useful way to group algorithms is by their similarity in structure or learning style [4]. Five such
classes of machine learning algorithm are:
1. Regression: linear regression, logistic regression and stepwise regression.
2. Instance-based Methods: k-nearest neighbor, learning vector quantization and self-organizing map.
3. Decision Tree Learning: C4.5, CART and ID3.
4. Kernel Methods: support vector machine, radial basis network and linear discriminant analysis.
5. Artificial Neural Networks: Perceptron, Hopfield and back-propagation.
Our goal is to use our time with algorithms effectively; that is, to build a robust test harness so
that we can throw algorithms in and very quickly learn what works and what doesn’t. There are two
concerns when building a test harness:
• What performance measure is used to evaluate the algorithms?
• What data is used to train and test the algorithms?
Once we have a test harness that we can trust, we select and evaluate 5 to 10 standard workhorse
algorithms. This gives us an idea of how difficult our problem is and which algorithms might be worth
spending some time on tuning. This technique is called spot-checking. There are two main tactics
that we can use to get the most out of machine learning algorithms: algorithm tuning and ensembles.
Generally, machine learning algorithms can be described as learning an output function (f) that maps
input variables (P) to an output variable (Q):
Q = f(P)
Our goal in evaluating different algorithms, and even different configurations of an algorithm, is
to find a good approximation of the output function (f) so as to get really good predictions (Q) [5].
We can often get a boost in performance by combining the predictions from multiple well-performing
models. These techniques are called ensemble machine learning algorithms and are often internally
simpler than we first think. When investigating how machine learning algorithms work, we recommend
looking into two ensemble methods:
1. Bagging (e.g. Random Forest)
2. Boosting (e.g. AdaBoost)
These are two very simple foundations of very powerful ensemble machine learning algorithms [6].

V. Machine Learning Tools
1. Weka Tool
The best machine learning tool for beginners is Weka. There are three main reasons for beginners to
use Weka:
• It has a graphical interface, which means that no programming is required.
• It offers a suite of state-of-the-art machine learning algorithms, including ensemble methods.
• It is free and open source software.
The Weka platform allows us to quickly design and run experiments. We must experiment to discover how
to get good results, and the Weka Experimenter allows us to do this:
1. Start Weka.
2. Design a new experiment:
   • Select a dataset.
   • Select one or more algorithms or algorithm configurations.
3. Run the experiment.
4. Review the results and use statistics to check for significance.
With a few clicks we can quickly design experiments to test our ideas and intuitions on our problem.
It is a very powerful feature that few machine learning platforms offer.
2. R Tool
R is a platform that is used by some of the best data scientists in the world. The reason is not its
strange scripting language but the vast number of techniques available. Academics who develop new
machine learning algorithms use R, which means that new algorithms often appear on the R platform
before any other. With packages like caret, we can access hundreds of the top machine learning
algorithms in R through a consistent interface, which is ideal for spot-checking techniques on our
dataset.
3. Python
Python cannot be ignored in machine learning. It is rapidly catching up with platforms like R in
terms of capability and adoption. The reason is the scikit-learn library for machine learning, which
is built on top of the SciPy stack and harnesses the speed and power of Python libraries such as
NumPy for fast data manipulation at C-like speeds. The scikit-learn library is fully featured,
offering a suite of algorithms to choose from as well as data preparation schemes and clever
Pipelines that allow us to design how data flows from one element to the next.
Python is the fastest-growing platform for applied machine learning among data scientists. We cannot
get started with machine learning in Python until we have access to the platform. We must download
and install the Python 2.7 platform on our computer. We also need to install the SciPy platform and
the scikit-learn library. We can install everything at once with Anaconda, which is recommended for
beginners. We can load our own data from CSV files. The general structure for working through a
machine learning problem in Python with Pandas and scikit-learn can be divided into 6 steps:
1. Install the Python and SciPy platform.
2. Load a standard dataset.
3. Summarize the data using statistical functions in Pandas.
4. Visualize the data using plotting functions in Pandas.
5. Evaluate machine learning algorithms in scikit-learn.
6. Develop a final model and make some predictions on new data.
The better we understand our data, the better and more accurate the models we can build. The first
step to understanding our data is to use descriptive statistics, which can be computed with the
helper functions provided on the Pandas DataFrame. A second way to improve our understanding of our
data is to use data visualization techniques (e.g. plotting). We can use plotting in Python to
understand attributes alone and their interactions. Data visualization is the fastest way to learn
more about our data. Pandas offers a number of ways to plot, and so better understand, our machine
learning data. The types of plot commonly used are as follows:
• Box and Whisker Plots
• Histograms
• Correlation Matrix Plot
• Density Plots
• Scatterplot Matrix
Scikit-learn provides a range of supervised and unsupervised learning algorithms through a
consistent interface in Python. It is built upon SciPy (Scientific Python), which must be installed
before scikit-learn can be used [9]. This library stack includes:
• SciPy: the basic library for scientific computation.
• NumPy: the base n-dimensional array package.
• Matplotlib: comprehensive 2D/3D plotting.
• Pandas: an effective data analysis and data structuring tool.
• SymPy: symbolic mathematics.
• IPython: an enhanced interactive console.
The modules or extensions for SciPy are commonly named SciKits. A Python library called Theano is
used for fast numerical computation and helps in the development of deep learning models [8]. Theano
acts as a compiler for mathematical expressions in Python. Another Python library, TensorFlow [10],
is also used to develop deep learning models. It is a platform that cannot be ignored by machine
learning experts. It is backed by Google, which uses it in some of its production systems, and it is
used by the Google DeepMind research group. The advantage of TensorFlow is its capability to run on
CPUs, GPUs and large clusters, and because of this it has more of a production focus. The difficulty
with both Theano and TensorFlow is that it takes a lot of code to develop even very simple neural
network models. The Keras library addresses this problem by providing a wrapper for both Theano and
TensorFlow. Its clean and simple API makes it possible to define and evaluate deep learning models
in just a few lines of code, and its ease of use lets it harness the power of Theano and TensorFlow.
For applied deep learning, Keras is quickly becoming the prominent library. The life-cycle of a
Keras model can be summarized as follows:
1. Define our Sequential model.
2. Add configured layers.
3. Compile our model.
4. Fit our model.
5. Make predictions.

VI. Conclusion
This paper has introduced machine learning concepts and the different types of machine learning
algorithms. It has described how machine learning algorithms can be selected based on the problem
and how Python helps to solve machine learning problems. The impressive growth of Python is
illustrated in Fig. 1. The paper also highlights the most advanced techniques in Python that support
machine learning.
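
As a rough illustration of the test harness, baseline and spot-checking ideas discussed in this article, the sketch below uses only the Python standard library, in the spirit of "code examples without libraries". It is not the authors' code. The toy dataset, the function names (make_toy_data, zero_rule, knn, accuracy, cross_validate) and the choice of 5 folds and k = 5 are all assumptions made for illustration, and the sketch targets Python 3.8 or later rather than the Python 2.7 mentioned in the article.

```python
import random
from collections import Counter
from math import dist  # Euclidean distance between two points (Python 3.8+)


def make_toy_data(n_per_class=30, seed=0):
    """Hypothetical two-class dataset: two Gaussian blobs in 2-D."""
    rng = random.Random(seed)
    rows = []
    for label, centre in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            rows.append((point, label))
    return rows


def zero_rule(train, test_points):
    """Baseline model: always predict the most common class in the training fold."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return [majority for _ in test_points]


def knn(train, test_points, k=5):
    """k-nearest neighbours: majority vote among the k closest training rows."""
    predictions = []
    for p in test_points:
        nearest = sorted(train, key=lambda row: dist(row[0], p))[:k]
        votes = Counter(label for _, label in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


def accuracy(actual, predicted):
    """Performance measure: fraction of predictions that match the true labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def cross_validate(rows, algorithm, n_folds=5, seed=1):
    """Test harness: mean accuracy of `algorithm` over k-fold cross-validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        predicted = algorithm(train, [point for point, _ in test])
        scores.append(accuracy([label for _, label in test], predicted))
    return sum(scores) / len(scores)


# Spot-check: run each candidate through the same harness and compare.
data = make_toy_data()
results = {
    "zero rule (baseline)": cross_validate(data, zero_rule),
    "k-nearest neighbours": cross_validate(data, knn),
}
for name, score in results.items():
    print(f"{name}: mean accuracy {score:.2f}")
```

Every candidate is scored on the same folds with the same measure, so the numbers are directly comparable, and the baseline shows whether a learning algorithm is adding anything at all. In practice the same pattern would more likely be written with scikit-learn, which the article recommends for this step.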
Fig. 1 : Share of Python, R, Both, or Other platforms usage for Analytics, Data Science,
Machine Learning, 2016 vs 2017 [7]

References
[1] Brownlee, Jason. “Machine learning mastery.” URL: http://machinelearningmastery.com/discover-feature-engineering-how-toengineer-features-and-how-to-get-good-at-it (2014).
[2] Brownlee, Jason. “A tour of machine learning algorithms.” Machine Learning Mastery (2013).
[3] Brownlee, J. “Linear Regression for Machine Learning-Machine Learning Mastery.” Machine Learning Mastery (2017).
[4] Brownlee, Jason. “How to Prepare Data for Machine Learning.” Machine Learning Mastery 25 (2013).
[5] Brownlee, J. “Machine Learning Algorithms.” Machine Learning Mastery (2015).
[6] Brownlee, Jason. “Supervised and Unsupervised Machine Learning Algorithms.” Machine Learning Mastery (2016).
[7] https://www.kdnuggets.com/2017/08/python-overtakes-r-leader-analytics-data-science.html
[8] Al-Rfou, Rami, et al. “Theano: A Python framework for fast computation of mathematical expressions.” arXiv preprint (2016).
[9] Raschka, Sebastian. Python machine learning. Packt Publishing Ltd, 2015.
[10] Abadi, Martín, et al. “TensorFlow: A System for Large-Scale Machine Learning.” OSDI. Vol. 16. 2016.

About the Authors


Dr. Maya L Pai was born on July 21, 1961. She received the M.Sc. and Ph.D. degrees from Cochin University of
Science and Technology (CUSAT), Kerala, India in 1983 and 2016, respectively.
In 2000, she joined the Amrita Institute of Computer Technology, Kochi, India, as a Senior Lecturer. In 2003,
Amrita Institute of Computer Technology became Amrita University. She now works at Amrita University
as Assistant Professor (Senior Grade) and HOD, Department of Computer Science and IT. She has published
papers in refereed national and international journals. Her research interests include Data Mining, Machine
Learning and Discrete Mathematics.
Suchithra M S was born on March 20, 1989. She received the M.E. degree in Computer Science and Engineering from
Anna University, Chennai, India in 2013.
She worked as Assistant Professor in Computer Science and Engineering from 2014 to 2016 in colleges
under Calicut University. In 2016, she joined the School of Arts and Sciences, Amrita University, Kochi, India,
as a Research Scholar. She has published papers in refereed national and international journals. Her research
interests include Data Mining, Machine Learning and Soil Science.
