
Geographical Information System (GIS)

Introduction to GIS: Introduction

A large variety of information systems are available for various applications. The figure gives different types of such systems. This module will focus on the Geographical Information System (GIS), one of the important spatial information systems (spatial: pertaining to, involving, or having the nature of space), with the capability to handle spatial information (information distributed in space).

Definitions:

Experts from many disciplines, such as geodesy, remote sensing, engineering, cartography, geography, geology, environment, and planning, are users of spatial data. When spatial data is used in an information system, one tends to speak of a spatial information system. Several informal definitions conveying the capabilities of GIS are available, as given below:
o A map with database(s)
o Cartographic features represented as points, lines or polygons, with attributes describing these features
o A system with the ability to:
   - store and retrieve data
   - display particular attribute characteristics spatially
   - identify locations by querying attribute data
   - identify spatial patterns of attribute occurrence
   - describe features by spatial query
   - analyze spatial and attribute data and relationships among several features through system operations
   - generate new information through system operations
   - automate procedures

On a more formal front, a typical GIS can be understood through various other definitions, as given below (Ref):

Toolbox-based definitions:

o A powerful set of tools for collecting, storing, retrieving at will, transforming and displaying spatial data from the real world.
o A system for capturing, storing, checking, manipulating, analysing, and displaying data which are spatially referenced to the earth.
o An information technology which stores, analyses, and displays both spatial and non-spatial data.

Database definitions:

o A database system in which most of the data are spatially indexed, and upon which a set of procedures operates in order to answer queries about spatial entities in the database.
o Any manual or computer-based set of procedures used to store and manipulate geographically referenced data.

Organisation-based definitions:

o An automated set of functions that provides professionals with advanced capabilities for the storage, retrieval, manipulation and display of geographically located data.
o An institutional entity, reflecting an organisational structure that integrates technology with a database, expertise, and continuing financial support over time.
o A decision support system involving the integration of spatially referenced data in a problem-solving environment.

Some other definitions

o An information system that is designed to work with data referenced by spatial or geographic coordinates. In other words, a GIS is both a database system with specific capabilities for spatially referenced data and a set of operations for working with the data. In a sense, a GIS may be thought of as a higher-order map.
o A system of hardware and software that links mapped objects with text information that describes them, and provides tools for the storage, retrieval and manipulation of both types of data.
o A system of computer hardware, software and procedures designed to support the capture, management, manipulation, analysis, and display of spatially referenced data for solving complex planning and management problems.
o A system that contains spatially referenced data that can be analyzed and converted to information for a specific set of purposes, or application. The key feature of a GIS is the analysis of data to produce new information.

Thus a GIS is an information system designed to work with data referenced by spatial or geographical coordinates: it is both a database system with specific capabilities for spatially referenced data and a set of operations for working with the data. It may also be considered a higher-order map.

Various terms have been used to describe systems designed for specific GIS applications, such as:
o LIS, Land information system: concerned primarily with the management and administration of land parcels.
o FIS, Facilities information system: concerned primarily with the management of transport, communication, and service facilities such as roads, railways, sewage, water, power and telephone lines, although these may also be concerned with areal features such as buildings, their characteristics, and their distribution.
o NRMIS or RIS, Natural resource management information system: concerned primarily with the management of natural resources distributed over an area, such as water, soil, and vegetation resources.

Why GIS is needed:

The use of GIS is advocated on account of the following observations:
o Poorly maintained geospatial data
o Out-of-date maps and statistics
o Inaccurate data and information
o Absence of a data retrieval service
o Absence of data sharing
o Digital format data is compact, and large quantities can be maintained and retrieved at greater speed and lesser cost
o Planning scenarios, decision models and interactive processes are normal functions of GIS
o Ability to perform complex spatial analysis rapidly
o Ability to manipulate different types of data efficiently

Benefits of GIS

o Geospatial data is better maintained in a standard format
o Revision and updating are easier
o Geospatial data and information are easier to search, analyze and represent
o Value-added products can be generated
o Geospatial data can be shared and exchanged freely
o Productivity and efficiency of staff are improved
o Savings in time and money
o Better decision making

It should be noted that GIS has benefited from technical and conceptual developments over time in various areas such as surveying, cartography, photogrammetry, remote sensing, CAD, spatial analysis, etc.

GIS Synonyms

Many terms related to geographical information are also frequently used, as given in the following figure.

4Ms of GIS

In a GIS, we measure environmental parameters, develop maps portraying earth characteristics, monitor changes in our surroundings over space and time, and model alternative actions and processes operating in the environment. Measuring, mapping, monitoring and modelling are called the four Ms of GIS. A typical analysis methodology in a GIS is provided in the figure.

The figure shows that the data is collected from the real world. The data is input to the GIS in a computer-compatible format and subjected to a variety of queries and transformations. This data can also be used for modelling the real world, on the basis of which certain decisions can be taken about the real world.

Commonly asked questions in GIS:

GIS enables the user to ask a variety of questions. A typical sample of questions is given below (Burrough, 1986):
1. Which data are related? (Relational question: analyzes the spatial relationship between objects or geographic features.)
2. What if? (Model-based question: computes and displays an optimum path, suitable land, areas at risk from disasters, etc., based on a model.)

Due to its strong analysis, modelling and visualization capabilities, GIS is able to provide answers to a variety of questions. An indicative list of such questions is given below (Burrough, 1986; Holdstock, 1998):

Aspatial queries - attribute:
o What is the value of function Z at position X?
o How large is B (area, perimeter, etc.)?
Condition: Where is object A?
Spatial queries - topological:
o How many occurrences of type A are there within distance D of B?
o Where is A in relation to B?
o What objects are next to objects having a certain combination of attributes?
Location: What is present at points X1, X2, etc.?
Trends: What has changed since ...?
Patterns: How much growth has occurred?
Proximity: What are the characteristics of an area surrounding a feature?
Reclassify: Regroup objects having a certain combination of attributes.
Networking: Which is the path of least cost, resistance or distance along the ground from X to Y along pathway P?
Logical operations:
o What is unique about a feature A?
o What is the result of intersecting various kinds of spatial data?
Modelling:
o Using the digital database as a model of the real world, simulate the effect of process P over time T for a given scenario S.
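To make one of these questions concrete, the topological query "How many occurrences of type A are there within distance D of B?" can be sketched in a few lines of Python. This is only an illustration: the feature list, the names `features` and `count_within`, and the use of planar (projected) coordinates are assumptions, not part of any particular GIS package.

```python
import math

# Hypothetical feature records: (feature type, x, y) in planar map units.
features = [
    ("well", 2.0, 3.0),
    ("well", 8.0, 1.0),
    ("temple", 4.0, 4.0),
    ("well", 5.0, 5.0),
]


def count_within(features, feature_type, centre, distance):
    """Count features of one type lying within `distance` of `centre`."""
    cx, cy = centre
    return sum(
        1
        for kind, x, y in features
        if kind == feature_type and math.hypot(x - cx, y - cy) <= distance
    )


# How many wells lie within 2.5 map units of the temple at (4, 4)?
wells_near_temple = count_within(features, "well", (4.0, 4.0), 2.5)
```

A real system would normally use a spatial index rather than scanning every feature, but the logic of the query is the same.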

Components of GIS:

GIS consists of the following three components:
o Hardware: used to store, process and display data
o Software: used to control and perform operations
o Expertise: the human element required to drive the system to meet requirements

Major hardware components:

The general hardware components of a GIS comprise a computer or CPU. It is connected to a disk drive unit, which provides storage space for data that have been converted from maps and documents into digital form and sent to the computer. A plotter is used to present the results of the data processing, and a tape or CD/DVD drive is used for storing data or programs, or for communicating with other systems. The user controls the computer and the peripherals via a visual display unit (VDU) or terminal. A scanner or digitizer is required to convert analogue data into digital form.

Major software components:

GIS software comprises several functionally related components that carry out a variety of operations. These can be grouped as follows (Table 1):
o Data acquisition/input
o Data processing and preprocessing
o Database management (storage and retrieval)
o Spatial data manipulation and analysis
o Product generation: output and visualization

Table 1: Functional elements of GIS software

Component | Sub-components
Data acquisition/input | Digitizing; editing; topology building; projection transformation; format conversion; attribute assignment, etc.
Data processing and preprocessing | Data archival; hierarchical modelling; relational modelling
Database management (storage and retrieval) | Relational modelling; attribute query; object-oriented database
Spatial manipulation and analysis | Measurement operations; buffering; overlay operations; connectivity operations
Graphical output and visualization | Scale transformation; generalization; topographic maps; statistical maps

a) Data input:

Data input is the operation of encoding the data and writing them to the database; it creates the foundation for a useful GIS. However, creating a good database is a very time-consuming and complex operation, and the usefulness of the GIS depends on it. Data input involves data acquisition, including the identification and collection of the data required for applications. It covers all aspects of transforming data captured from existing maps, field observations, and sensors into a compatible digital form. A wide range of computer tools is available for this purpose, including the digitizer, lists of data in text files, scanners, and the devices necessary for reading data already written on magnetic media such as tapes, drums and disks. Various sources for data input may be:
o text files
o existing maps
o aerial photographs
o satellite imagery
o airborne scanners
o field measurements
o other GIS databases

Data can be input to a GIS in the following stages:
(a) Entering the spatial data (digitising);
(b) Entering the non-spatial, associated attributes; and
(c) Linking the spatial to the non-spatial data.

The two principal approaches for inputting spatial data are manual cartographic digitizing and automatic scanning. Non-spatial associated attributes are those properties of an entity that need to be handled in the GIS but which are not themselves spatial in nature. For example, a road can be digitised as a set of contiguous pixels, or as a vector line entity. The road can be represented in the spatial (map) part of the GIS by a certain kind of colour, symbol or data location, and data about the kind of road can be included in the range of cartographic symbols normally available. The non-spatial data can be linked to the already digitised points, lines, and areas using a special program that requires only that the digital representations of the points, lines and areas themselves carry unique identifiers, assigned as part of normal digitising.
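Stage (c), linking spatial to non-spatial data through unique identifiers, can be illustrated with a minimal Python sketch. The dictionaries, field names and the `link` function below are hypothetical and stand in for whatever structures a real GIS uses; the only idea taken from the text is that both parts carry the same identifier.

```python
# Spatial part: digitised entities keyed by a unique identifier (made-up data).
geometries = {
    101: {"type": "line", "coords": [(0.0, 0.0), (5.0, 0.0), (9.0, 3.0)]},
    102: {"type": "point", "coords": [(4.0, 7.0)]},
}

# Non-spatial part: attribute records carrying the same identifiers.
attributes = [
    {"id": 101, "name": "Road A", "surface": "asphalt", "width_m": 7.5},
    {"id": 102, "name": "Well 1", "depth_m": 30.0},
    {"id": 103, "name": "Record without geometry"},
]


def link(geometries, attributes):
    """Join attribute records to geometries on the shared identifier.

    Returns the linked features and the identifiers that found no geometry.
    """
    linked, unmatched = {}, []
    for record in attributes:
        fid = record["id"]
        if fid in geometries:
            linked[fid] = {
                "geometry": geometries[fid],
                "attributes": {k: v for k, v in record.items() if k != "id"},
            }
        else:
            unmatched.append(fid)
    return linked, unmatched


linked, unmatched = link(geometries, attributes)
```

Reporting the unmatched identifiers is a design choice made here for illustration: it is a simple way to catch attribute records that were entered without a corresponding digitised entity.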

b) Data Pre-processing/processing

Getting data into a computer-compatible form may also involve several steps known as pre-processing. This involves manipulating the data in different ways to prepare it for further modelling. For example, it may involve format conversion or georeferencing, which consists of geometric correction and resampling. This process establishes a consistent system for recording. It results in a data type, georeferencing system and data structure that are compatible with the system. The end result of the pre-processing phase is a coordinated set of thematic data layers. The essential pre-processing procedures include:
o format conversion
o data reduction and generalisation
o error detection and editing
o merging of points into lines, and of points and lines into polygons, where appropriate
o edge matching
o rectification/registration
o interpolation
o photo-interpretation

c) Data storage and database management

This concerns the way in which the data about the position, linkages (topology), and attributes of geographical elements (points, lines, areas, and more complex entities representing objects on the earth's surface) are structured and organised, both with respect to the way they must be handled in the computer and how they are perceived by the users of the system (Figure 35.6). It provides a consistent method of data entry, update, deletion, and retrieval. The computer program used to organise the database is known as a database management system (DBMS).

d) Data analysis and modeling

This is the most important capability of a GIS as far as the user is concerned, and it facilitates spatial analysis using spatial and non-spatial attributes. It involves working within the database to derive new information using several basic and advanced tools for statistical analysis and modelling. This is achieved through a set of functions or software modules, as given below:
o Retrieval, (re)classification, and measurement functions
o Overlay functions
o Neighbourhood functions, including search operations, topographic functions, and interpolation
o Connectivity operations, including contiguity, proximity, network, and spread operators

Modelling involves a simplified representation of reality. It is one of the strongest capabilities of GIS and facilitates better decision making.
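As a small illustration of the first two function groups, the sketch below reclassifies a land-use layer, overlays it with a slope layer, and measures the result by counting cells. The layers, class names and the 10-degree threshold are invented for the example, and plain nested lists are used instead of any GIS library.

```python
def reclassify(layer, mapping):
    """Replace every cell value by its new class from `mapping`."""
    return [[mapping[value] for value in row] for row in layer]


def overlay(layer_a, layer_b, combine):
    """Combine two registered layers cell by cell with `combine`."""
    return [
        [combine(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


def count_cells(layer, value):
    """Measurement: number of cells holding `value`."""
    return sum(row.count(value) for row in layer)


# Two hypothetical, geometrically registered layers of the same area.
land_use = [["forest", "crop", "crop"],
            ["forest", "urban", "crop"]]
slope_deg = [[3, 12, 4],
             [20, 2, 6]]

is_crop = reclassify(land_use, {"forest": 0, "crop": 1, "urban": 0})
is_gentle = [[1 if s < 10 else 0 for s in row] for row in slope_deg]

# Boolean AND overlay: cropland on gentle slopes.
suitable = overlay(is_crop, is_gentle, lambda a, b: a & b)
suitable_cells = count_cells(suitable, 1)
```

The overlay only makes sense when both layers share the same grid, which is why the pre-processing steps (rectification, registration) come before analysis.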

e) Data output

This concerns the ways in which the data are displayed and the results of analyses are reported to the users. Data may be presented as maps, tables, and figures in a variety of ways, ranging from the ephemeral image on a cathode ray tube, through hard-copy output drawn on a printer or plotter, to information recorded on magnetic media in digital form.

There are several professional GIS packages on the market, such as:
o ARCGIS
o ILWIS
o ERDAS
o IDRISI
o MAPINFO
o GRASS

The widely used database systems are Oracle and dBase.

Introductory concept of geospatial data

A few commonly used terms in GIS are now introduced. The real world is represented inside a computer in the form of spatial or geographical objects. Spatial objects are delimited geographical areas with a number of different kinds of associated attributes or characteristics (figure below). Any given spatial object will be one of the following types:
o Point
o Line
o Polygon

The data can be arranged in raster (grid) form or vector form. Details of these will be explained later.

Point: a spatial object with no area. One of the key spatial attributes of a point is its geodetic location, often represented as a pair of coordinates (such as latitude-longitude, or northing-easting). The non-spatial attributes may include a range of information associated with the point, depending on the application.

Line: a spatial object made up of a connected sequence of points. Lines have no width; thus a specified location must be on one side of the line or the other, but never on the line itself.

Polygon: a closed area. Simple polygons are undivided areas, while complex polygons are divided into areas of different characteristics.

Nodes: special kinds of points, usually indicating the junction between lines or the ends of line segments.

Chains: special kinds of line segments, which correspond to a portion of the bounding edge of a polygon.

Applications of GIS

Facility management:
o Locating underground cables
o Planning facility maintenance
o Telecommunication network services
o Energy use tracking and planning

Environmental and natural resource management:
o Agriculture crop suitability
o Management of forests, agricultural land, water resources, wetlands, etc.
o Environmental Impact Analysis (EIA)
o Disaster management and mitigation
o Waste dumping sites location

Street networking:
o Car navigation
o Locating houses and streets
o Site selection
o Ambulance services
o Transportation planning

Planning and engineering:
o Urban planning
o Regional planning
o Route selection for highways
o Public facility development

Land information system:
o Cadastre administration
o Taxation
o Zoning of land use
o Land acquisition

GIS data

In order to manipulate geographical data efficiently in a computer, it needs to be stored in a structured manner in the computer database.
1. The organization of data in an information system is referred to as the data structure.
2. Different kinds of spatial data are arranged in the form of a theme, data layer or data plane.

Spatial data in each layer can be reduced to three basic spatial entities:
1. Point: location of tube-wells, water tanks, rain gauge stations, etc.
2. Line: roads, railway lines, canals, streams
3. Polygon: reservoirs, lakes, district, state and country boundaries, etc.

Thus every geographical phenomenon is, in principle, a point, line, or area plus a label saying what it is. The labels could be the actual names, numbers that cross-reference with a legend, or special symbols. All these techniques are used in conventional mapping.

Layer concept in GIS:
o The concept of a layer or theme is very important in GIS.
o In a topographic map, all components are seen on a single sheet.
o In a GIS, each of the components is presented in a separate layer, for example geometrically registered layers of buildings, topography, land use, etc.
o This makes it possible to switch a layer on or off while preparing final maps.
o This aids decisions based on spatial queries.

Geospatial data:

Geospatial data in each GIS layer are described in the following terms:
o Location: specified with reference to a known coordinate system. This is sometimes known as the spatial attribute.
o Attributes: these indicate characteristics which are not inherently related to geographical location. This is also known as the non-spatial attribute.
o Topology: indicates spatial and adjacency relationships among the geographical features, which are characterized by location and attributes.

It may be noted that paper maps convey information by graphic symbolization (points, lines, and polygons) and convey attributes by colour, symbolization and pattern. A GIS, on the other hand, conveys information by graphic symbolization (points, lines, and polygons) and retains spatial relationships mathematically through the concept of topology.
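One way to see how adjacency can be retained mathematically is to record, for each boundary chain, the polygons lying on its two sides, and then derive the neighbours of each polygon from that table. The sketch below assumes this left/right convention and uses made-up chain and polygon labels; it is an illustration of the idea rather than the storage scheme of any specific GIS.

```python
# Hypothetical chains: (chain id, polygon on the left, polygon on the right).
# None marks the outside of the mapped area.
chains = [
    ("c1", "A", "B"),
    ("c2", "B", "C"),
    ("c3", "A", None),
    ("c4", "C", None),
]


def adjacency(chains):
    """Build a polygon -> set of neighbouring polygons table from the chains."""
    neighbours = {}
    for _chain_id, left, right in chains:
        if left is not None and right is not None and left != right:
            neighbours.setdefault(left, set()).add(right)
            neighbours.setdefault(right, set()).add(left)
    return neighbours


neighbours = adjacency(chains)
```

With such a table, a question like "what objects are next to polygon B?" becomes a dictionary lookup instead of a geometric computation.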

Typical non-spatial data for a district database (sectoral data and content)

1. Demography sector
   (a) Population data: Village population, town/city population, male population, female population, area, scheduled tribe population and scheduled caste population. (Latest data to be considered first. Past data would also be useful.)
2. Economic profile
   (a) Occupational data: No. of cultivators, agricultural labourers, workers in household industries, manufacturing, servicing/repairs, and other workers.
   (b) Income data: Income from the agriculture sector in different years, income from non-agriculture sectors in different years, total income in different years.
   (c) Accessibility data: Type and length of approach road to village.
3. Agriculture sector
   (a) Crop data: Type, acreage, season, yield per hectare, fertiliser usage, food grain production.
   (b) Soils data: Types, texture, depth, fertility, permeability, drainage, etc.
   (c) Facilities: Non-irrigated area, wells, tanks, fertiliser outlets, banks, agricultural markets, co-operative societies, seed centres, agricultural implement centres, etc.
4. Forestry sector
   (a) Forest product data: Type and quantity of major and minor forest produce.
   (b) Natural mishap: Fire incidents, pest incidents, etc. data (if any).
5. Mineral resources
   (a) Reserves data: Type of mineral resources, quality, quantity of mined and unmined minerals, value.
6. Water resources
   (a) Water bodies: Depth of lakes and ponds, withdrawal and well discharge data.
7. Power development sector
   (a) Power pattern data: Power station location, type, rated capacity, total installed capacity.
   (b) Consumption data: Energy consumed for domestic, industrial, agricultural and commercial purposes, no. of houses electrified, etc.
8. Health sector
   (a) Facility data: No. of hospitals, primary health centres, sub-health centres, dispensaries and other facilities; no. of doctors, nurses, beds, etc.
   (b) Usage of facility: No. of patients in various health centres, no. of out-patients, no. of births, no. of deaths (infant, others), no. of people vaccinated, no. of people affected by cholera, typhoid, etc. Movement of people (as available).
9. Education sector
   (a) Facility data: No. of primary, secondary and higher secondary schools, colleges, and technical schools; no. of teachers at various levels; capacity of each facility.
   (b) Usage of facility: No. of literate males, no. of literate females, no. of students enrolled and their attendance (primary, secondary, higher secondary, technical school and college), no. of students dropping out at various levels, migration/movement for education, etc.
10. Settlement and service centre
   (a) Service centre data: Retail shops, cottage industries, bus stops, railway stations, oil depots, petrol pumps, banks (both number and distances), movement of people, etc.

The spatial object is an important concept in GIS. It represents a geographic area with associated attributes and characteristics. There are three fundamental types of spatial objects: point, line and polygon. These are briefly described below.

Point: a spatial object with no area.
o Attributes: geodetic location (spatial attribute) and other data (non-spatial attributes) based on the application.
o Some representative non-spatial attributes for a water well may include ownership, depth (deep/shallow, or quantitatively), quality of water, pumping volume and rate, date of boring, expected life, etc.

Line: a spatial object made of a connected sequence of points.
o Attributes: no width; a spatial location lies on either side of the line.

Polygon: a spatial object that is a closed areal entity.
o Simple polygons indicate an undivided area.
o Complex polygons indicate an area divided into parts of different characteristics.

Nodes: the intersection of lines or the end of a line segment.

Chains: a special kind of line segment which corresponds to a portion of the boundary edge of a polygon.
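The three spatial object types can be sketched as simple Python data classes, each carrying its geometry (the spatial attribute) and a dictionary of non-spatial attributes. The class and field names below are illustrative assumptions, and planar coordinates are assumed so that length and area can be computed directly.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    x: float  # easting
    y: float  # northing
    attributes: dict = field(default_factory=dict)


@dataclass
class Line:
    points: list  # connected sequence of Point objects
    attributes: dict = field(default_factory=dict)

    def length(self):
        """Sum of the straight segments between consecutive points."""
        return sum(
            math.hypot(b.x - a.x, b.y - a.y)
            for a, b in zip(self.points, self.points[1:])
        )


@dataclass
class Polygon:
    boundary: list  # vertices in order; the ring is closed implicitly
    attributes: dict = field(default_factory=dict)

    def area(self):
        """Area of a simple polygon by the shoelace formula."""
        pts = self.boundary
        total = 0.0
        for a, b in zip(pts, pts[1:] + pts[:1]):
            total += a.x * b.y - b.x * a.y
        return abs(total) / 2.0


# Example objects with made-up attribute values.
well = Point(2.0, 3.0, {"ownership": "public", "depth_m": 40.0})
road = Line([Point(0, 0), Point(3, 4), Point(3, 10)], {"kind": "road"})
lake = Polygon([Point(0, 0), Point(4, 0), Point(4, 3), Point(0, 3)], {"kind": "lake"})
```

Here `road.length()` and `lake.area()` are examples of the measurement functions that a GIS applies to line and polygon objects.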

Raster and vector data structures

There are two fundamental ways of representing spatial data, which can be summarised as follows:
o Raster
o Vector

i) Raster representation

This representation consists of a set of cells located by coordinates; each cell is independently addressed and holds the value of an attribute.
o Raster data structures are cellular or grid-based organizations of spatial data.
o The map is represented as an array of rectangular or square cells.
o The image is arranged in the form of cells at a regular interval, and the parameters of interest are stored in these cells.
o Data is arranged in rows and columns:
   - Rows: in the east-west direction
   - Columns: in the north-south direction
o The origin of a raster image is generally at the top left corner: position (0, 0).
o The distance between cells in the row and column directions is constant.
o The volume of raster data is quite large.
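The row/column addressing and the data volume of a raster can be sketched as follows. This is a minimal illustration assuming square cells, a top-left origin at (row 0, column 0), and north-up orientation; the class name `RasterGrid` and its fields are hypothetical. The sample scene uses a 2048 x 2048 grid of 5.8 m cells with 7-bit values.

```python
from dataclasses import dataclass


@dataclass
class RasterGrid:
    rows: int
    cols: int
    cell_size: float    # ground size of one square cell, in metres
    x_min: float        # easting of the left edge of column 0
    y_max: float        # northing of the top edge of row 0
    bits_per_cell: int

    def cell_centre(self, row, col):
        """Map coordinates of the centre of cell (row, col)."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return x, y

    def cell_at(self, x, y):
        """Row and column of the cell containing map position (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("position lies outside the raster")
        return row, col

    def total_bits(self):
        """Uncompressed storage: one value per cell."""
        return self.rows * self.cols * self.bits_per_cell

    def ground_area(self):
        """Ground area covered by the raster, in square metres."""
        return self.rows * self.cols * self.cell_size ** 2


# Sample scene: 2048 x 2048 cells of 5.8 m, 7 bits per cell.
scene = RasterGrid(2048, 2048, 5.8, 0.0, 2048 * 5.8, 7)
scene_bits = scene.total_bits()        # 7 x 2048 x 2048
scene_area_m2 = scene.ground_area()    # 5.8 x 5.8 x 2048 x 2048
```

Because rows are counted downward from the top edge, the northing decreases as the row index increases, which is why `cell_centre` subtracts from `y_max`.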

o Raster cells are also called pixels (short for picture elements) in an image.
o In satellite data from the IRS-1C PAN sensor, 2048 x 2048 pixels of 5.8 m x 5.8 m each cover an area of 5.8 x 5.8 x 2048 x 2048 m2 (about 141 km2). For 7-bit data, total bits = 7 x 2048 x 2048 = 29,360,128 bits.

Due to the large volume of raster data, data compression techniques are used, such as:
o quadtree
o run-length encoding
o block coding
o chain coding

a) Run-length encoding

ii) Vector representation

A vector is defined with reference to its:
o starting point, associated displacement and direction (or bearing)

Vector representation consists of three main geographical entities: points, lines and areas. Lines and areas are sets of interconnected coordinates (sequences of points) that can be linked to given attributes. An entity may be associated with a value or attribute identifying the type of entity it is, for example forest, temple, well, etc. Most computer graphics and CAD systems use a vector data structure.

Various common vector data structures are:
o whole polygon structure
o dual independent map encoding (DIME)
o arc-node
o relational structure
o digital line graph (DLG)
o topologically integrated geographical encoding and referencing system (TIGER)
o triangulated irregular network (TIN)

Vector data may have two main representations (Kersey, 1991):
o Entity-by-entity structures: lists of entities with no relationship information
o Topological structures: record entity-to-entity adjacency relationships

Entity-by-entity representation

In run-length encoding of raster data, the original data is replaced by data pairs or tuples. For example, if the following data is given in a row:

12, 12, 15, 15, 15, 15, 17, 17, 17, 17, 17, 17, 17

it can be encoded as (2, 12) (4, 15) (7, 17), where the first number within each pair of brackets indicates the run length (i.e. how many times the value occurs without a break) and the second number is the value itself. This coding reduces the row from 13 elements (13 data values) to 6 elements (three pairs with two elements in each pair, each pair combining a run length and a data value) and yields good compression when repeating data are present. The coding is efficient only if substantially large areas have repeating values.

Chain code

The map is considered as a spatially referenced object placed on a background. An area is stored as a record in which the first two numbers give the origin point on the border of the object. The remaining information gives the sequence of cardinal directions of the cells that form the boundary of the object. Freeman codes indicate the four directions (0, 1, 2, and 3).

Starting from the object origin at (3, 3), the object boundary could be encoded with the position of the origin followed by the principal directions as (3, 3, 1, 2, 1, 0, 1, 1, 0, 1, 0, 3, 0, 3, 0, 3, 0, 3, 2, 3, 2, 2, 2, 2), where the first two elements indicate the origin of the object and the remaining elements indicate the four cardinal directions (0, 1, 2, or 3) in the form of Freeman codes.

Block codes

This coding scheme breaks down a homogeneous region into square blocks and stores each block's dimension and origin. For example, in the following figure, starting with the address of the bottom left cell as (1, 1), the objects in raster form can be coded as (2, 4, 10), (3, 6, 7), (1, 6, 10), and (1, 9, 8). The first number within the brackets indicates the size of the square block, and the remaining two numbers indicate the origin of the first cell in the block (first the x coordinate, then the y coordinate).
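The three encodings discussed above (run-length, chain and block codes) can be sketched in Python using the numbers from the text. Two conventions are assumptions made only for this illustration, since the text and its missing figures do not state them: the Freeman directions are taken as 0 = east, 1 = north, 2 = west, 3 = south, and a block's origin is taken as its lower-left cell.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a row as (run length, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (run length, value) pairs back into the original row."""
    row = []
    for count, value in pairs:
        row.extend([value] * count)
    return row


# Assumed direction convention: 0 = east, 1 = north, 2 = west, 3 = south.
STEPS = {0: (1, 0), 1: (0, 1), 2: (-1, 0), 3: (0, -1)}


def chain_decode(code):
    """Turn (x0, y0, d1, d2, ...) into the list of boundary positions."""
    x, y = code[0], code[1]
    path = [(x, y)]
    for direction in code[2:]:
        dx, dy = STEPS[direction]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def block_decode(blocks):
    """Expand (size, x, y) square blocks into the set of covered cells.

    Assumes (x, y) is the lower-left cell of each block.
    """
    cells = set()
    for size, x0, y0 in blocks:
        for dx in range(size):
            for dy in range(size):
                cells.add((x0 + dx, y0 + dy))
    return cells


row = [12, 12, 15, 15, 15, 15, 17, 17, 17, 17, 17, 17, 17]
encoded_row = rle_encode(row)

boundary = chain_decode(
    (3, 3, 1, 2, 1, 0, 1, 1, 0, 1, 0, 3, 0, 3, 0, 3, 0, 3, 2, 3, 2, 2, 2, 2)
)

covered = block_decode([(2, 4, 10), (3, 6, 7), (1, 6, 10), (1, 9, 8)])
```

Whatever direction convention is used, a valid chain code for a closed boundary should bring the decoded path back to its origin, which gives a quick consistency check on an encoded object.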

Quadtrees

A quadtree is a special type of raster structure in which the information is stored in inter-related multiple layers (also understood as a pyramidal structure). This arrangement subdivides a given area into quadrants. Further subdivision takes place if a quadrant contains mixed values. Pixels at each higher level have twice the width and height of those at the previous level (four times the area), giving a four-fold reduction in the number of pixels in each successive layer.

The structure represents a tree in the form of nodes and leaves, and results in better:
o sorting and searching
o saving of computational time, as some processing steps are not required

Consider a raster dataset of size 32 x 32. This dataset requires 32 x 32 = 1024 cells. Considering the higher-level cells as well, we require:

Layer | Cells per side | Total cells
1 | 32 | 1024
2 | 16 | 256
3 | 8 | 64
4 | 4 | 16
5 | 2 | 4
6 | 1 | 1
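The layer counts of the pyramid and the subdivision rule ("subdivide a quadrant only if it contains mixed values") can both be sketched briefly. The function names and the nested-list representation of the tree are assumptions made for illustration; the grid side is assumed to be a power of two.

```python
def pyramid_levels(side):
    """List (layer, cells per side, total cells) for a side x side raster."""
    if side < 1 or side & (side - 1):
        raise ValueError("side must be a power of two")
    levels = []
    layer = 1
    while side >= 1:
        levels.append((layer, side, side * side))
        side //= 2
        layer += 1
    return levels


def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value for a homogeneous quadrant, else four sub-quadrants.

    Sub-quadrants are ordered NW, NE, SW, SE.
    """
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    ):
        return first
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]


def count_leaves(node):
    """Number of leaves (stored values) in the quadtree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1


levels = pyramid_levels(32)

# A made-up 4 x 4 raster: only the south-east quadrant is mixed.
grid = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
tree = build_quadtree(grid)
```

In this small example the tree stores 7 leaf values instead of 16 cells, because three of the four quadrants are homogeneous and need no further subdivision.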
