FINAL REPORT ON
ENERGY BIG DATA
BY
H. JIANG et al.
PRESENTED
BY
TONY KIEMA
3116999147
tonykiema@ymail.com
INTRODUCTION
Overview of the industry
Energy is one of the most important aspects of human life. The smart grid, a significant
application in the energy sector, is a power grid that integrates the conventional power
grid with advanced communication networks and pervasive computing capabilities to
improve the control, efficiency, reliability and safety of the grid. It is a form of electric
power network that combines all power utility resources, both renewable and
non-renewable. The smart grid is an important innovation because it provides automatic
monitoring, protection and optimization of the operation of the interconnected systems,
in addition to serving industrial consumers and home users through thermostats, electric
vehicles and intelligent appliances.
Characterized by the bidirectional flow of electricity and data, the smart grid forms an
automated, widely distributed power delivery network. Taking advantage of modern
communications, it can not only deliver real-time data and information but also balance
demand and supply instantaneously. Operations on energy storage, demand response and
communication between terminal users and power companies make the grid a truly
large-scale data-communicating system. This adds complexity to the conventional grid
and promotes sustainable progress for utilities and customers. To match this complexity,
many technologies, such as wireless networks from telecommunications, sensor networks
from manufacturing, and big data from computer science, are exploited in the smart grid
domain.
Source Of Data
There are various sources from which huge amounts of data are generated, through diverse
measurements acquired by Intelligent Electronic Devices (IEDs) in the smart grid: data
on the power utilization habits of users; data from Phasor Measurement Units (PMUs) for
situation awareness; energy consumption data measured by the widespread smart meters;
energy market pricing and bidding data collected by the Automated Revenue Metering
(ARM) system; data from the management, control and maintenance of devices and
equipment in electric power generation, transmission and distribution in the grid; and
data from operating utilities, such as financial data and large data sets that are not
obtained directly through network measurement.
Data Quality
A Data Quality (DQ) dimension describes a feature of data that can be measured or
assessed against defined standards in order to determine the quality of the data. Each
dimension is measured by Assessment, Continuous measurement, Discrete measurement, or
a combination of these.
Completeness is the proportion of stored data against the potential of "100% complete";
it can only be measured by Assessment. Uniqueness means that no item is recorded more
than once, based upon how that item is identified; its measurement form is Discrete.
Timeliness is the degree to which data represent reality from the required point in time;
it is measured by both Assessment and Continuous measurement. Validity means that data
conforms to the syntax (format, type, range) of its definition; it can be measured by
Assessment, Continuous or Discrete measurement. Accuracy is the degree to which data
correctly describes the "real world" object or event being described. It takes all the forms
of measurement: Assessment, e.g. primary research or reference against trusted data;
Continuous measurement, e.g. the age of students derived from the relationship between
the students' dates of birth and the current date; and Discrete measurement, e.g. the date
of birth recorded. Consistency is the absence of difference when comparing two or more
representations of a thing against a definition; it can be measured through either
Assessment or Discrete measurement.
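As a rough illustration of how some of these dimensions could be computed, the sketch below scores completeness, uniqueness and validity on a small set of smart-meter records, and derives an age from a date of birth as in the accuracy example. The records, field names, valid range and helper functions are all assumptions made for illustration; they are not taken from the report by Jiang et al.

```python
from datetime import date

# Hypothetical smart-meter records; field names and values are illustrative only.
readings = [
    {"meter_id": "M001", "timestamp": "2020-01-01T00:00", "kwh": 1.2},
    {"meter_id": "M001", "timestamp": "2020-01-01T00:00", "kwh": 1.2},   # duplicate
    {"meter_id": "M002", "timestamp": "2020-01-01T00:00", "kwh": None},  # missing value
    {"meter_id": "M003", "timestamp": "2020-01-01T00:00", "kwh": -5.0},  # out of range
]


def completeness(records, field):
    """Proportion of records that have a stored (non-None) value for `field`."""
    filled = sum(1 for r in records if r.get(field) is not None)
    return filled / len(records)


def uniqueness(records, key_fields):
    """Proportion of records that are distinct under the identifying key."""
    keys = {tuple(r[f] for f in key_fields) for r in records}
    return len(keys) / len(records)


def validity(records, field, low, high):
    """Proportion of stored values that fall inside the range [low, high]."""
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if low <= v <= high) / len(values)


def age_in_years(date_of_birth, today):
    """Continuous measurement: age derived from date of birth and current date."""
    before_birthday = (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - int(before_birthday)


print(completeness(readings, "kwh"))                    # 0.75
print(uniqueness(readings, ("meter_id", "timestamp")))  # 0.75
print(validity(readings, "kwh", 0.0, 100.0))
print(age_in_years(date(2000, 6, 15), date(2020, 6, 14)))  # 19
```

Each score is a simple ratio, so a value of 1.0 means the data set fully satisfies that dimension under the chosen rule. The range 0 to 100 kWh is an arbitrary assumption and would need to come from the actual definition of the field.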
DATA STORING AND ROUTING- The large amount of data generated by the
widespread sensor networks in the smart grid requires scalability, self-organization,
efficient routing algorithms and efficient data dissemination. Data-centric storing and
routing technologies have been developed in which data is named and routed by its
name instead of by the address of the storage node. Users can query the data after
defining certain conditions and adopting schemes such as Data-Centric Storage (DCS),
which maps data names to storage locations, and the Greedy Perimeter Stateless Routing
(GPSR) algorithm, which forwards packets according to geographic position.
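The idea of storing data by name can be sketched as follows: a data name is hashed to a point in the sensor field, the node nearest that point acts as the storage node, and packets are forwarded hop by hop toward the point. This is a minimal sketch under stated assumptions. The node positions, links and function names are hypothetical, and only the greedy forwarding part of GPSR is shown; the perimeter mode that GPSR uses to escape local minima is omitted.

```python
import hashlib
import math

# Hypothetical sensor nodes with (x, y) positions on a 100 x 100 field.
NODES = {
    "A": (10.0, 10.0),
    "B": (40.0, 20.0),
    "C": (70.0, 30.0),
    "D": (90.0, 80.0),
}

# Hypothetical radio links between nodes (undirected neighbor lists).
NEIGHBORS = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}


def name_to_location(name, width=100.0, height=100.0):
    """Hash a data name to a point in the field (the data-centric step)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32 * width
    y = int.from_bytes(digest[4:8], "big") / 2**32 * height
    return (x, y)


def home_node(name):
    """Node closest to the hashed location; it stores data with this name."""
    target = name_to_location(name)
    return min(NODES, key=lambda n: math.dist(NODES[n], target))


def greedy_route(source, target):
    """Greedy forwarding only: hop to the neighbor closest to the target point.

    Stops when no neighbor is closer than the current node (a local minimum).
    Full GPSR would switch to perimeter forwarding at that point.
    """
    path = [source]
    current = source
    while True:
        best = min(NEIGHBORS[current], key=lambda n: math.dist(NODES[n], target))
        if math.dist(NODES[best], target) >= math.dist(NODES[current], target):
            return path
        path.append(best)
        current = best


print(greedy_route("A", NODES["D"]))  # ['A', 'B', 'C', 'D']
print(home_node("meter/voltage"))     # one of the node names in NODES
```

Because every node applies the same hash, a producer and a consumer agree on where data with a given name lives without knowing any node address in advance. The name "meter/voltage" is only an example.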
DATA VISUALIZATION- The visualization of big data has three main
components: historical information visualization, geographical visualization, and
3-dimensional visualization, which can intuitively and accurately express information
about the electric power system. Visualization technologies are used to monitor
real-time power system status, which heightens the degree of automation of the power
system. Combined with complex network theory, they can also identify potential
relationships and patterns in the smart grid.
Data Quality Issues
A data quality issue is a condition of data that is an obstacle to a consumer's use of that data.
Data security
Since the large amount of data is stored across multiple distributed grids, it is difficult
to monitor the security of data storage in real time. Since the data involves personal
privacy and financial information, it is also exposed to data theft. The main
vulnerabilities in security include: