
Socio-Economics Survey Using a Structured Questionnaire

Hue, 3/2006

Table of contents
ABBREVIATION
STRUCTURE OF THIS MANUAL
I INTRODUCTION ON QUESTIONNAIRE SURVEY
I.1 Concept of questionnaire survey
I.2 Why use a questionnaire?
I.3 Sampling
I.4 Descriptive and analysis design of question
I.5 Quantitative and qualitative questions
I.6 Questionnaire survey and PRA
II QUESTIONNAIRE SURVEY PROCESS
II.1 Preparation stage
II.1.1 Goals and objectives identification
II.1.2 Process identification
II.1.3 Stakeholders identification
II.1.4 Study areas and study sites identification
II.1.5 Contact and consult with local authority
II.1.6 Define the parameters and sub-parameters
II.1.7 Identify the assessment team
II.2 Plan the survey
II.2.1 Assess secondary data
II.2.2 Plan the survey
II.2.2.1 Decide sampling unit
II.2.2.2 Decide the key informants, respondents (number of sample)
II.2.2.3 Build up the questionnaire
a. Build up the draft questionnaire
b. Pre-test of questionnaire
c. Adjust the questionnaire
II.2.2.4 Data tracking
II.2.2.5 Develop a coding system
II.2.2.6 Define plans for analysis
II.2.2.7 Establish the field survey teams
II.2.2.8 Define the schedule for field data collection
II.2.2.9 Train field teams in the data collection methods using questionnaire and skill of survey using questionnaire
II.2.2.10 Provide a summary on locality (culture, custom...)
II.2.2.11 Logistics arrangement
II.3 Conduct survey
II.3.1 Principles
II.3.2 Conduct survey
II.4 Data management and analysis
II.4.1 Basic principles of data analysis
II.4.2 Conduct the data analysis
II.5 Reporting
II.5.1 Field report
II.5.2 Outline the final report
II.5.3 Final report
III DATA ANALYSIS USING MS EXCEL
III.1 Data entering in MS Excel
III.2 Data analysis in MS Excel
III.2.1 Attention in data analysis using MS Excel
III.2.2 Descriptive statistics
III.2.3 AVERAGE function
III.2.4 COUNT function
III.2.5 COUNTA function
III.2.6 COUNTBLANK function
III.2.7 COUNTIF function
III.2.8 RANK function
III.2.9 SUM function
III.2.10 SUMIF function
III.3 Exporting results
IV DATA ANALYSIS USING SPSS
IV.1 Variable
IV.1.1 Concept
IV.1.2 Variable types
IV.1.3 Data coding and entering
IV.1.4 Missing data
IV.2 Data analysis
IV.3 Exporting results
ANNEXES
i. Sampling and sample selection
ii. Structure of a questionnaire
iii. Types of question
iv. Principles in designing question
v. Individual interview and Public interview
vi. Interview methods: Difficulties and Advantages
vii. Statistics
viii. Tips in conducting interview
ix. Tips for field data collection
GLOSSARY
REFERENCES

List of tables:
Table 1. Suggested sampling method
Table 2. Total sample of households interviewed and the number of valid questionnaires used for the analysis within the IMOLA socio-economic baseline study
Table 3. Differences between descriptive survey and analysis survey
Table 4. Differences between questionnaire survey and PRA
Table 5. Five main purposes corresponding to objectives and examples
Table 6. Some common list of parameters and complex variables
Table 7. Example of PRA Co-ordination schema
Table 8. Example of a Co-ordination schema in a questionnaire survey
Table 9. Typical sources of secondary data in Vietnam

List of figures:
Figure 1. General stages of a questionnaire survey process
Figure 3. Data analysis using MS Excel
Figure 4. Descriptive statistic in MS Excel
Figure 5. Descriptive statistic in MS Excel (continued)
Figure 6. Data analysis using SPSS
Figure 8. List of variables


Abbreviation
IMOLA: IMOLA Hue Project GCP/VIE/029/ITA
NACA: Network of Aquaculture Centres in Asia-Pacific
QS: Questionnaire survey
PRA: Participatory Rural Appraisal
DOFI: Department of Fisheries
NPD: National Project Director
DPC: District People's Committee
PPC: Provincial People's Committee
GSO: General Statistics Office
DSO: Provincial Department Statistics Office
MOLISA: Ministry of Labour, Invalids and Social Affairs
DOLISA: Provincial Departments of Labour, Invalids and Social Affairs
GoV: Government
NGO: Non-Government Organisation
H.H: Household

Structure of this Manual


There is no single best way to conduct a socioeconomic assessment; the steps involved may be carried out in many ways. This manual arranges these steps in the most likely order and organises them into four chapters, including data management and analysis and an introduction to MS Excel and SPSS.

The first chapter covers the general concepts of a questionnaire survey and related issues such as sampling, sample size, types of question and the distinction between a questionnaire survey and PRA.

The second chapter is the main part of this Manual and introduces the detailed steps of conducting a questionnaire survey. It covers the whole questionnaire survey process, divided into 5 stages: Preparation, Plan, Field data collection, Data management and analysis, and Report. To help the reader, each stage is divided into smaller steps, and each step is presented with its objectives, contents and method.

The third chapter introduces the use of MS Excel in data management and analysis. It covers only simple MS Excel skills: descriptive analysis to describe data, and several key functions to calculate and analyse data simply. Each function is presented with its definition, purpose and method (path in MS Excel).

The fourth chapter introduces the use of SPSS in data management and analysis. SPSS is among the most popular software packages for statistical analysis, which is why this chapter is somewhat more complicated than the third. It introduces the key functions of SPSS, such as frequencies, descriptives and crosstabs, without going into deeper statistics that could confuse the reader. It also covers variables, missing data and how to code data. The statistics are presented function by function, each with its contents, its path in SPSS and examples.

Real examples drawn from the survey of the IMOLA project in Hue in 2006 are provided throughout this Manual to illustrate the issues further. The manual concludes with references and a glossary that clarifies various terms, so that the reader can access more in-depth information.

I Introduction on Questionnaire survey

Surveys and assessments are increasingly conducted in every sector to collect and update data in time for the decision-making process. To obtain precise results, a survey needs to be designed so that it meets all the data collection requirements while working within the limits of time, finance and human resources. There are many methods for setting up a survey; this Manual concentrates on the survey using a questionnaire.

I.1 Concept of questionnaire survey

Put simply, a survey using a questionnaire means that the survey conductor uses a set of questions to collect data. The questionnaire may cover several issues related to a community, a location or an individual, or it may focus on a single issue that needs to be assessed. It is normally used to collect data from households or individuals as a sample representing the general condition of the community. The questionnaire usually uses closed-ended, structured questions with closed answers or answers chosen from given options, such as yes/no, increased/decreased or true/false. A survey using a questionnaire does not encourage open questions or questions that ask for explanation.

I.2 Why use a questionnaire?


A survey using a questionnaire aims to:
- provide quantitative data for statistical analysis of the given issues;
- provide statistical data representing a larger group in a community, or the whole community, depending on the selected sample size;
- help define and describe variables as characteristics or attributes of groups or communities, such as education, age, income or population;
- help make comparisons within and between larger populations and find correlations between variables.

I.3 Sampling

Sampling is the process of selecting units (e.g. people, organizations) from a population of interest. It is a very useful tool for assessments when resources (time, finance, personnel etc.) are limited. In general, there are two ways of selecting a sample:
- Random selection: units are selected fully at random in the community where the survey is to be conducted, without any specific criteria;
- Non-random selection: units are still selected at random, but within groups defined by stratification criteria related to the main objectives of the survey, such as occupation or wealth. With this method, the number of samples should be distributed in a relatively balanced way across the groups to ensure that the sample is representative.
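The two ways of selecting a sample described above can be sketched in Python. This is only an illustration: the household list, the field name `occupation` and the function names are hypothetical and are not part of the IMOLA survey tools.

```python
import random
from collections import defaultdict


def simple_random_sample(households, n, seed=None):
    """Random selection: pick n households with no criteria at all."""
    rng = random.Random(seed)
    return rng.sample(households, n)


def stratified_sample(households, key, n_per_group, seed=None):
    """Selection by stratification criteria: group the households first
    (e.g. by occupation), then pick the same number at random in each
    group so that the groups stay balanced."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for hh in households:
        groups[key(hh)].append(hh)
    picked = {}
    for name, members in sorted(groups.items()):
        # A small group cannot give more households than it contains.
        k = min(n_per_group, len(members))
        picked[name] = rng.sample(members, k)
    return picked


# Hypothetical community of 120 households in 3 occupation groups.
occupations = ["aquaculture", "capture fishery", "agriculture"]
households = [
    {"id": i, "occupation": occupations[i % 3]} for i in range(120)
]

random_pick = simple_random_sample(households, 30, seed=1)
balanced_pick = stratified_sample(
    households, lambda hh: hh["occupation"], 10, seed=1
)
```

With the stratified version, each occupation contributes exactly 10 households, whereas the simple random pick of 30 may by chance over-represent one occupation.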

Sample selection principle: the sample size should reach at least 10% of the total population where the survey is conducted. If the total population has fewer than 1000 sample units, the sample should be increased (to more than 10%) to improve the precision of the analysis. However, the number of samples sometimes has to be weighed against other factors such as the resources available for the survey (personnel, finance, time), and the survey may have to accept lower precision because of them. Sample size depends on 4 main factors:

- Basic statistical requirements (rules on minimum sample size in general statistical practice, or specific requirements of computer programs such as SPSS, STATA or SAS);
- The number of distinct small groups in the community with different interests or attributes: the sample size is roughly directly proportional to the number of such groups;
- The level of complexity of the groups, which is also directly proportional to the sample size;
- The limitation of survey resources.

In the data collection process, missing data is unavoidable, and the interviewer needs to be aware of this to avoid gaps in the information. For example, if the required sample size is 30, distributed evenly across 3 occupations in the area, then more than 30 interviews (40 or 50) should be taken in the field to ensure there is enough data for the later analysis. This also depends on the experience of the interviewers: the less experienced they are in questionnaire surveys, the greater the chance of invalid questionnaires, partially or wrongly filled in and therefore not usable for the analysis.

Table 1. Suggested sampling method

Population/total number of H.H    Sample size (person/H.H)
100                               20-25
200                               30-40
300                               50-60
400                               60-70
500                               70-80
1000                              90-100

Notes: the sample sizes in the table above are only a reference. Surveys that require data collection from many different groups will need a different sample size to ensure representation.
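The arithmetic behind Table 1 and the advice to interview more households than strictly required can be sketched as follows. The sketch rests on several assumptions that are not stated in the Manual: the two figures per row of Table 1 are read as a low-high range, a population between two rows uses the next larger row, populations above 1000 fall back to the 10% rule, and the expected share of valid questionnaires is a planning guess supplied by the organizer. The function names are hypothetical.

```python
import math

# Table 1 read as (population up to, low, high); the range reading is an assumption.
TABLE_1 = [
    (100, 20, 25),
    (200, 30, 40),
    (300, 50, 60),
    (400, 60, 70),
    (500, 70, 80),
    (1000, 90, 100),
]


def suggested_sample_size(population):
    """Return the (low, high) range of the first Table 1 row covering the
    population; above 1000 units, fall back to the 10% rule."""
    for limit, low, high in TABLE_1:
        if population <= limit:
            return low, high
    n = math.ceil(population / 10)
    return n, n


def interviews_to_plan(required_valid, expected_valid_rate):
    """Number of interviews to take so that, after discarding invalid
    questionnaires, about `required_valid` usable ones remain."""
    if not 0 < expected_valid_rate <= 1:
        raise ValueError("expected_valid_rate must be in (0, 1]")
    return math.ceil(required_valid / expected_valid_rate)


print(suggested_sample_size(250))    # (50, 60)
print(interviews_to_plan(30, 0.75))  # 40
```

If three quarters of the questionnaires are expected to be valid, a required sample of 30 leads to 40 planned interviews, which matches the "40 or 50" suggested in the text.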

Table 2. Total sample of households interviewed and the number of valid questionnaires used for the analysis within the IMOLA socio-economic baseline study

(Input / total data samples)

Commune       Aquaculture   Capture Fishery   Agriculture   Total         Percent
Dien Hai      30/35         31/34             43/52         104/121       86.0
Huong Phong   30/38         29/29             49/56         108/123       87.8
Quang Cong    30/39         30/49             30/48         90/136        66.2
Quang Loi     32/35         30/40             30/40         92/115        80.0
Quang Phuoc   27/28         31/32             32/32         90/92         97.8
Vinh Xuan     30/46         28/30             31/34         89/110        80.9
Vinh Hien     31/38         29/36             32/45         92/119        77.3
Vinh Hung     47/59         29/31             35/36         111/126       88.1
Loc Binh      33/40         30/34             27/37         90/111        81.1
Loc Tri       32/41         34/39             34/49         100/129       77.5
Vinh Phu      32/37         30/35             31/40         93/112        83.0
Grand total   354/436       331/389           374/469       1,059/1,294   81.8
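The "Total" and "Percent" columns of Table 2 follow directly from the per-sector counts. The short Python sketch below recomputes them from the figures in the table; the data structure and the function name `valid_rate` are illustrative choices and are not taken from the IMOLA study.

```python
# (valid, total) questionnaires per commune and sector, copied from Table 2.
TABLE_2 = {
    "Dien Hai":    {"aquaculture": (30, 35), "capture fishery": (31, 34), "agriculture": (43, 52)},
    "Huong Phong": {"aquaculture": (30, 38), "capture fishery": (29, 29), "agriculture": (49, 56)},
    "Quang Cong":  {"aquaculture": (30, 39), "capture fishery": (30, 49), "agriculture": (30, 48)},
    "Quang Loi":   {"aquaculture": (32, 35), "capture fishery": (30, 40), "agriculture": (30, 40)},
    "Quang Phuoc": {"aquaculture": (27, 28), "capture fishery": (31, 32), "agriculture": (32, 32)},
    "Vinh Xuan":   {"aquaculture": (30, 46), "capture fishery": (28, 30), "agriculture": (31, 34)},
    "Vinh Hien":   {"aquaculture": (31, 38), "capture fishery": (29, 36), "agriculture": (32, 45)},
    "Vinh Hung":   {"aquaculture": (47, 59), "capture fishery": (29, 31), "agriculture": (35, 36)},
    "Loc Binh":    {"aquaculture": (33, 40), "capture fishery": (30, 34), "agriculture": (27, 37)},
    "Loc Tri":     {"aquaculture": (32, 41), "capture fishery": (34, 39), "agriculture": (34, 49)},
    "Vinh Phu":    {"aquaculture": (32, 37), "capture fishery": (30, 35), "agriculture": (31, 40)},
}


def valid_rate(pairs):
    """Sum (valid, total) pairs and return (valid, total, percent valid)."""
    pairs = list(pairs)
    valid = sum(v for v, _ in pairs)
    total = sum(t for _, t in pairs)
    return valid, total, round(100 * valid / total, 1)


# Per-commune totals, as in the "Total" and "Percent" columns.
per_commune = {name: valid_rate(sectors.values()) for name, sectors in TABLE_2.items()}

# Grand total over all communes and sectors.
grand = valid_rate(pair for sectors in TABLE_2.values() for pair in sectors.values())

print(per_commune["Dien Hai"])  # (104, 121, 86.0)
print(grand)                    # (1059, 1294, 81.8)
```

A check of this kind is a quick way to confirm that the number of valid questionnaires still meets the required sample size in every commune before the analysis starts.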

I.4 Descriptive and analysis design of question

Descriptive questions commonly start with "How many...". For example, "How many people are there in your family?" or "How many fishermen are there in the community?" These questions are used for quick answers. They are closed-ended questions.

Analysis questions commonly start with "Why...". For example, "Why did you make a loss on the last shrimp crop?" or "Why do people in the community not like to take up agricultural activities?" These questions take the respondent a long time to answer, and sometimes the respondent provides imprecise information or cannot answer. They are open-ended questions.

In practice, both types of question are usually used together to diversify the collected information.

Table 3. Differences between descriptive survey and analysis survey

Descriptive design                                            Analysis design
Can be used only for fixed issues                             Can be used for any issue
Respondent only answers, without explanation                  Respondent's level of participation is very high
Saves time                                                    Takes a long time
Collected information is relatively homogeneous               Different interviewers may receive different answers
More precise information, not dependent on respondent bias    Information may be incorrect due to respondent bias

I.5 Quantitative and qualitative questions

These two criteria (quantitative and qualitative) deserve particular attention when building the questionnaire because they directly influence the result of the analysis.

A qualitative question is one whose answer describes the characteristics of an issue. For example, "How do you intend to develop your production next year?" or "Why are the fish in your cages frequently diseased?"

A quantitative question is one whose answer can be quantified. For example, "How many cages do you intend to develop next year?" or "How frequently do the fish in your cages get diseased?"

In practice, quantitative and qualitative questions are commonly combined in a questionnaire to meet the varied demands of an assessment, and which type to use depends on the specific case. For example, descriptive and quantitative questions are usually used for issues of population, production or area, while analysis and qualitative questions are usually used for issues of disease, market or sources of credit. There is no fixed rule for every case; the organizer needs to decide which type of question is the best choice based on the specific conditions of the survey.

I.6 Questionnaire survey and PRA

Survey using questionnaire and participatory rural appraisal (PRA) are the most popular methods used in assessments at present, especially in sectors such as agriculture, fisheries or environment. Although they are 2 separate methods with different objectives and ways of working, they are frequently used together in assessments to obtain more comprehensive and precise data and sometimes to economize on resources. In real surveys, especially those related to development issues, the combination of the 2 methods is very effective: general, open information and recommendations, comments or discussion are collected through PRA, while detailed, independent and specific information is collected through the questionnaire. However, keep in mind that the questionnaire in this combination has to be simplified as much as possible to avoid wasting resources, because most of the general information is already collected by PRA with the same goals.

Table 4. Differences between questionnaire survey and PRA

Survey using questionnaire                                                PRA
Closed-ended questions => information focused on concrete objectives      Open-ended questions => open information
One-way information                                                       Can have multi-way information
Quantitative information                                                  Qualitative information
Interview is boring when a question is repeated many times                PRA guide is more interesting
Takes time and effort to get information, but is more scientific          Takes less time and effort to get information
Information may be unrepresentative (depends on the specific respondent)  Information is more representative, as it is discussed in groups
Respondents are less comfortable, having to answer many questions         Participants feel more comfortable, as they are free to discuss

II Questionnaire survey process

A survey using a questionnaire is highly structured and includes many consecutive stages and activities. In general, there are 5 main stages:
1. Preparation stage
2. Plan the survey
3. Conduct survey
4. Data management and analysis
5. Reporting
Figure 1: General stages of a questionnaire survey process

Preparation stage
  1. Goals and objectives identification
  2. Assessment process identification
  3. Stakeholder identification
  4. Study areas and sites identification
  5. Contact and consult with local authority
  6. Define the parameters and sub-parameters
  7. Define the assessment team
Plan the survey
  1. Assess the secondary data
  2. Plan the field survey
Conduct survey
  1. Principles
  2. Conduct the survey using questionnaire
Data management and analysis
  1. Basic principles of data analysis
  2. Conduct the data analysis
Report
  1. Field report
  2. Outline the final report
  3. Final report

II.1 Preparation stage


This is very important work in developing a survey. It makes the work convenient for both the team and local stakeholders and strongly affects the success of the assessment. This stage could include:

II.1.1 Goals and objectives identification

Objective: This helps to identify how to set up an assessment, to decide the complexity and scale of the survey, and to select the right stakeholders.

Content: As a general Manual, this has 5 types of goals and objectives:
1. Management: the assessment may be designed to study the potential socioeconomic impacts of resources management strategies intended to protect and conserve the fisheries resources.
2. Research: the assessment may aim to increase knowledge about the social and economic conditions of stakeholders and to show how the condition of the resources is directly linked to human activities.
3. Development: the assessment may aim to identify socioeconomic issues that need to be addressed during development activities to improve the conditions of stakeholders.
4. Monitoring: the assessment may help establish a baseline for assessing socioeconomic changes over time in communities linked with fisheries.
5. Policy: the assessment may be designed to provide socioeconomic information and make recommendations to guide decision-makers and policy-makers.

Once the goals are identified, the objectives can be defined to clarify the focus of the assessment. These objectives should be identified based on the interests and needs of the stakeholders and end-users. A general guide to objectives as they relate to the goals is included as Table 5. The organizer should consider how the assessment findings will be used when drafting the form of the final product of the assessment, because this form is determined by the intended use and by who the end-user is. It is also necessary to determine whether quantitative or qualitative results are preferred, and how much descriptive socioeconomic background information on the stakeholder groups is required.
Table 5. Five main purposes corresponding to objectives and examples

Goal: Management
  Objective: To collect information to design a management scheme appropriate to the local socioeconomic conditions.
  Example: A manager wants to increase the effectiveness and acceptability of management measures on Hue lagoon by adapting them to local conditions, taking into account the culture, tradition and patterns of resource use. A socio-economic assessment to describe those local conditions and to identify ways of making management more appropriate was conducted there in March 2006.
  Objective: To establish a process of participatory management.
  Example: In that survey, ownership, association membership etc. were covered to gather information to support the management model proposed later, which may include participatory management.

Goal: Research
  Objective: To identify and understand socioeconomic issues and stakeholders.
  Example: A socioeconomic assessment is commissioned on the behavior and attitudes of primary stakeholders of fisheries resources as the basis for putting pressure on policy makers to make fisheries resources protection a priority.

Goal: Development
  Objective: To collect information to design strategies to mitigate the socioeconomic impacts of development.
  Example: A new plan is established for Hue lagoon, but the development will displace local fishermen and may also damage the fisheries resources. A socioeconomic assessment is planned to identify ways of limiting the negative impacts and providing compensation and alternative income for local people.
  Objective: To establish a process of analysis and planning to identify and understand socioeconomic issues relating to resources, and to collect information to help planning of appropriate development activities.
  Example: To alleviate pressure on the resources by helping local people identify and initiate alternative activities that are not resource dependent and are less damaging, a socioeconomic assessment is carried out in co-operation between NGOs and Hue's institutions to catalyze this process with the full involvement of resources stakeholders.

Goal: Monitoring
  Objective: To establish baseline data for monitoring socioeconomic impacts of development activities.
  Example: Socioeconomic information is required before new planning starts in Hue lagoon, so the planners commission a socioeconomic assessment that will include data on local incomes and livelihood patterns so that they can monitor impacts over time.
  Objective: To establish baseline data to monitor the socioeconomic impacts of management strategies.
  Example: A baseline socioeconomic assessment is done to learn how fishermen use the resources and the benefits they get from them. This information will be compared with future data to determine changes in activities, income, well-being etc.

Goal: Policy
  Objective: To identify and understand socioeconomic issues relating to resources use to guide wider policy development.
  Example: The IMOLA Hue project and NACA commission socioeconomic assessments in various resource areas to collect basic information on resource use to help policy development.

Method: The assessment team, including the organizer, should identify the need for and benefit of the assessment, as well as the end-users of its results, in order to define the goals and objectives of the assessment.

Ex: A product of goals and objectives defining for an assessment

Goal: Management. To help define better management in Tam Giang lagoon for all stakeholders and also for the resources; to provide guidance and awareness for managers in other locations with similar conditions.
  Objectives:
  - To collect information to help establish management appropriate to local conditions
  - To establish participatory management in lagoon areas
  Specific objectives:
  - To study in more depth the awareness of stakeholders related to management strategy
  - To study in more depth the conflicts between stakeholders related to the lagoon, especially between fishermen, and how to solve them based on management strategies

Goal: Monitoring. To establish a database for monitoring.
  Objectives:
  - To establish a database for monitoring the changes in socio-economic conditions in the future
  Specific objectives:
  - To survey the contribution of fisheries to the current local economy and its potential for the future local economy
  - To survey the gender issue among the stakeholders
  - To understand the property system

10

II.1.2 Process identification
Goals: To clarify and schedule the steps of the assessment so that the manager can easily control the assessment process and ensure the effectiveness and quality of the collected information.
Contents: After setting the goals, the next step is to set up the process for conducting the socioeconomic assessment for each stakeholder. It is also important to determine what resources and how much time are needed for the assessment. The following list is a general guide; the organizer should add or remove items depending on the real conditions:
- Car rental, boat rental, other transportation to/from sites (e.g. buses, taxis);
- Consultant fees (e.g. economist);
- Accommodation for non-resident team members;
- Camera, binoculars, tape recorder, video camera;
- Maps, nautical charts, Global Positioning System (GPS);
- Copying and other office-related expenses;
- Notepads, flipcharts/poster board, pens/pencils, markers; and
- Expenses related to hospitality for the communities (e.g. drinks, small gifts, usually food items).

The time required for each socioeconomic assessment varies depending on the size of the area, the number of stakeholder groups, the parameters included and the interviewers' experience. For example, in the IMOLA Hue survey each group of 5-6 interviewers took 3 days per commune to complete 90-110 samples (QS).
Method: The organizer should decide this issue in consultation with the whole team and experts from related sectors, and should also try to consult the local stakeholders to ensure that everything fits local conditions. The organizer also needs to clarify the main activities in each stage, focusing on the field survey stage, because the result and quality of the field survey directly affect the overall final result of the assessment. The organizer also needs to consider who is able to work on the assessment, how much time is available to conduct it and what resources are available. Based on this information, the organizer can set a timetable and allocate the funds and other resources needed.
Ex: A result of assessment process identification
(Stage: time; things to be prepared; personnel required)
- Preparation: 1 week; notebook; head of groups
- Field survey planning: 2 weeks; office, notebook, notepad, logistics; 3-4 persons
- Field survey conducting: 2 weeks; office, notebook, means of transportation, logistics; 10-12 persons
- Data analysis and final report: 3 weeks; office, computer, notebook, notepad, logistics; 3-4 persons

II.1.3 Stakeholders identification
Goals:

To identify exactly and adequately the main stakeholders that need to be surveyed, including local organizations, NGOs, local authorities, key informants, interviewers... However, the primary and most important stakeholders are the individuals/households, local authorities and organizations related to the issues concerned.
Contents: The organizer needs to identify the lagoon stakeholder groups to determine which ones should be the focus of the assessment. Stakeholders may be listed in three groups:
- Primary stakeholders: people who directly depend on the resources for a living and who make direct use of the resources (e.g. fishermen, aquaculturists, farmers...);
- Secondary stakeholders: people in the areas being assessed who do not use the resources directly but make use of products or services sourced from the sector being assessed (e.g. fish traders), or whose actions may affect that sector (e.g. farmers, salt makers, handcrafters...);
- Relevant organizations: organizations with direct responsibility for managing activities affecting the sector being assessed, or with an interest in the primary or secondary stakeholders, including government agencies, informal or traditional organizations, universities and non-governmental organizations (NGOs). These might also be the end-users, such as working teams, associations...
This identification should be based on the activities of the individuals or groups that affect the sector being assessed, whether directly or indirectly. In some cases a stakeholder could fall into several groups; the organizer then has to use the specific objective of each assessment to place them in a suitable group for interviewing. When it is not possible to study all the stakeholders on the list, it may be necessary to set priorities for which stakeholders to study.
This can be done by noting three main factors:
- their proximity to the areas concerned;
- the impact that their activities may be having on the areas; and
- their relative levels of dependence on the related areas and sector.
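The three factors above can be turned into a simple ranking aid. The sketch below is only an illustration: the 1-3 scoring scale, the equal weighting and the example scores are assumptions made for this example, not part of the manual or of the IMOLA survey, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StakeholderGroup:
    name: str
    proximity: int   # 1 = far from the area, 3 = inside it (hypothetical scale)
    impact: int      # 1 = little impact on the area, 3 = strong impact
    dependence: int  # 1 = low dependence on the sector, 3 = high dependence


def priority_score(group: StakeholderGroup) -> int:
    """Equal-weight sum of the three factors (an assumed rule, not from the manual)."""
    return group.proximity + group.impact + group.dependence


def rank_groups(groups):
    """Return the groups ordered from highest to lowest priority score."""
    return sorted(groups, key=priority_score, reverse=True)


# Illustrative scores only; real values would come from the team and local knowledge.
groups = [
    StakeholderGroup("Fish traders", proximity=2, impact=1, dependence=2),
    StakeholderGroup("Fishermen", proximity=3, impact=3, dependence=3),
    StakeholderGroup("Farmers", proximity=3, impact=2, dependence=1),
]

ranked = rank_groups(groups)
for group in ranked:
    print(group.name, priority_score(group))
```

Such a score is only a starting point for discussion; the team can revise the scores as it learns more about the locality.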

In fact, the organizer's knowledge of these factors may be very limited, and the selection may not be exact at the beginning of the assessment, but it can be refined and adjusted as the assessment team learns more about the locality. Local knowledge is therefore very helpful at this stage to avoid wrong decisions in the selection.
Method: List every subject in the area, whether individual or organization, identify its relation to the sector being assessed and set priorities for selection. The number of individuals and organizations selected depends on how dependent these subjects are on the sector being assessed and on the resources of the assessment.
Ex: Result of stakeholders identification for the IMOLA baseline survey
(Related group/activity: main stakeholder; related organization)
- Capture Fisheries: Fishermen; Fisheries Department, Fishermen Association...
- Aquaculture: Aquaculturist; Fisheries Department, Aquaculture Association...
- Agriculture: Farmer; Agriculture Department, Co-operative, Farmer Association...
- General (Management, Education, Research...): People's Committee at all levels, Specialized Management Department...

Identification of level of stakeholder participation

This is needed to clarify at which step the stakeholders should be involved, and it also provides information for the manager to decide who could be considered end-users of the assessment's products. This identification is very useful for the assessment's manager in saving time and budget. There are several levels of participation, depending on the occupation, location... of the stakeholders:
- Informed: people who are made aware of the assessment and its goals and objectives, but are not involved in determining the goals and objectives or implementing them;
- Consulted: people who are directly involved in discussions on the goals and objectives of the assessment and may contribute to its design. These people may be affected by management decisions arising from the assessment and therefore need to be fully aware of how the assessment was designed;
- Partnership: people who work closely with the assessment team, such as staff of organizations assisting in the assessment. These people could take part in the assessment planning and also in the data collection process;
- Ownership: in most cases, the organizations that initiated the assessment will own its results. Ownership depends on various factors such as the objectives of the assessment, its contents or even the awareness of the community...

The level of participation should be defined based on: social status (women, youth, authority...); level of interest (farmers may not be interested in the fisheries environment or resources); resource potential (each group has a different level of participation depending on its available time and finance); and political context (some groups will hesitate to express their opinions because of habit, awareness or political reasons...).

Ex: Identification of level of stakeholders' participation
Informed:
- Main stakeholder (fishermen): coastal population, aquaculturist, farmer
- Secondary stakeholder: village/commune management board
- Institutions: local authority, Coastal Guard
Consulted:
- Secondary stakeholder: aquaculturist, farmer
- Institutions: local agriculture department, universities, researchers, international organizations
Partnership:
- Fishermen, fisheries trader, fisheries association
- Local fisheries department, extension, local authority, local NGOs
Ownership:
- Assessment organizers, DOFI

II.1.4 Study areas and study sites identification
Goals: To clearly identify the sites and areas on which the assessment team should focus during the survey, in order to avoid wasting resources and to improve the quality of the collected information. This step helps ensure the significance of the assessment.
Contents: The study areas could be fixed, but sometimes the stakeholders are highly mobile and spread over a much wider area. In this case the organizer should consider the available resources (time, people, finance...) to decide on a reasonable study area. Study sites can be selected using the following approaches:
- Random selection: the organizer decides on the number of study sites that can be assessed, each small area or community is numbered and the required number is picked at random;
- Convenience selection: the organizer bases the decision on the convenience of access and other logistic considerations. If study sites are defined in this way, the organizer should be aware that the factors making these sites more accessible for the assessment team may mean that these sites have social and economic characteristics different from sites that are harder to access;
- Purposive selection: the organizer selects study sites deliberately according to factors such as the diversity of conditions in the area (i.e. sites where all the main types of related stakeholders and their activities are represented), the willingness of the communities to cooperate and whether issues of particular interest to the assessment occur at the site.
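The random selection approach described above (number every candidate site, then pick the required number at random) can be sketched as follows. This is a minimal illustration using Python's standard `random` module; the site names and the function name are hypothetical placeholders, not real survey sites.

```python
import random


def select_sites_at_random(candidate_sites, n_sites, seed=None):
    """Number every candidate site, then draw n_sites numbers without replacement."""
    if n_sites > len(candidate_sites):
        raise ValueError("cannot select more sites than there are candidates")
    # Step 1: give each small area or community a number, starting at 1.
    numbered = dict(enumerate(candidate_sites, start=1))
    # Step 2: pick the required count of numbers at random.
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for the record
    picked_numbers = rng.sample(sorted(numbered), n_sites)
    # Step 3: return the selected sites in numbering order.
    return [numbered[number] for number in sorted(picked_numbers)]


# Hypothetical candidate list used only for illustration.
candidates = ["Site A", "Site B", "Site C", "Site D", "Site E", "Site F"]
selected = select_sites_at_random(candidates, n_sites=3, seed=2006)
print(selected)
```

Recording the seed together with the numbered list lets the team document and reproduce the draw if the selection is later questioned.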

Method: The selection should combine all three of the above approaches (random, convenience and purposive) in order to meet the requirements of the assessment in terms of time, finance and human resources, and to increase the quality of the collected information. This step must involve the local authority, the related management Department and related local associations, people... if available, so that decisions are made with sufficient knowledge.
Ex: IMOLA Hue project's selection of sites for survey, 2006
For the QS to be significant, it had to take place in at least 10 communes rather than only the five plus one PRA communes. Since more than 10 communes had already been selected on the basis of criteria agreed upon by the Project Management Unit, it was decided that, besides the already selected PRA communes, additional communes were to be selected from this list. This seemed more practical than identifying and involving communes that were not yet involved at this stage. An additional reason was to avoid disappointment and ensure continued support from the communes whose staff were at that time already involved in the training.

Selected QS Communes
(District: communes)
- Phong Dien: Dien Hai
- Quang Dien: Quang Phuoc, Quang Cong, Quang Loi
- Huong Tra: Huong Phong
- Phu Vang: Vinh Phu, Vinh Xuan
- Phu Loc: Loc Binh, Vinh Hung, Loc Tri, Vinh Hien

This selection was made from the communes that were selected by the IMOLA National Project Director (NPD) and the Lagoon Districts' People's Committees (DPCs).

Figure 2: Location of selected communes for QS in T.T.Hue

Legend: PRA-training / QS commune; PRA / QS communes; Additional QS communes

II.1.5 Contact and consult with local authority
Goals:
- To ensure the assessment is legal and is agreed to by all stakeholders;
- To ensure the concerns and priorities of as many stakeholders as possible are included in the planning of the assessment;
- To ensure the co-operation of stakeholders, particularly the relevant organizations, in the implementation of the assessment;
- To increase the stakeholders' sense of ownership of the assessment and its eventual findings;
- To increase the stakeholders' understanding of, and commitment to, the assessment's recommended actions;
- To provide access to local knowledge, resources and assistance, which is particularly useful to managers with limited resources; and
- To increase public and political support for the assessment and for management measures in general.

Method: This step can be conducted in several ways:
- One meeting between the assessment organizer and representatives of the stakeholders and local authority; or
- Discussions through existing forums, such as periodic planning meetings held by local authorities or co-ordination meetings involving different non-governmental organizations.

In fact, this is the first step of the field study. It is a 1-2 day trip that includes making appointments with the local authority, social associations, key informants, respondents... who will take part in the coming survey. It also includes the logistics arrangements, if available. The last task is to visit some households, places... to get an overall idea of the geographic, natural, social and environmental conditions of the field. If possible, the team should obtain the full list of the population/households of the site to help select the appropriate people to take part in the survey, and should agree with the local authority on the team's schedule at the sites. If PRA has not yet been implemented, the team should gather secondary data from the related local institutions to support the subsequent survey.
Ex: IMOLA Hue project, 2006
The working team decided to organise a 1-2 day field trip to:
- discuss and settle all administrative procedures with local authorities;
- collect the list of commune households;
- identify the size of the study site (should be the village);
- collect or define the list of activities;
- identify the venue for conducting the pre-test and QS (only if it is decided to call households (HHs) to one place instead of going from HH to HH);
- decide with local authorities how, when and by whom villagers will be informed about the forthcoming QS.

II.1.6 Define the parameters and sub-parameters
Goals: To identify clearly, exactly and adequately the parameters and sub-parameters that will be used in the data collection process to serve the goals and objectives of the assessment. The parameters may seem very similar between surveys, but there is a real distinction: each survey has its own goals and objectives and therefore needs different information to be gathered. For each survey, the manager should therefore review and decide which parameters and sub-parameters to include according to the specific goals and objectives of the survey, using the common scheme provided and fitting it to the real conditions of the survey.
Contents: After defining the objectives of the socioeconomic assessment, the organizer should decide which socioeconomic parameters to assess. These determine the substance of the assessment and form the basis for deciding what questions will be asked in the field. The organizer needs to identify the parameters relevant to the assessment depending on the goals and objectives, the situation and the interests and needs of the end-users and other stakeholders. For example, if the objective is to establish a participatory process, the organizer may focus on understanding parameters such as perceptions rather than markets and prices. Where most stakeholders are relatively new to the area, it may not be useful to try to collect information on traditional knowledge.
Method: The team can start from the common list of parameters usually used for socio-economic assessments and add or remove parameters to fit the real conditions. The main steps should be:
- Identify the main variables corresponding to the survey's assessment;
- Set up a Co-ordination schema;
- Hold group meetings to fill in the Co-ordination schema;
- Cross-check whether the needed information is adequate and whether there is a question to gather that information.

A co-ordination schema can be defined as a plan to facilitate the establishing of linkages between each identified parameter under research and its lowest-order measurements in a stepwise specification and definition process. The main purpose is to consolidate the survey design, to establish order, to ensure coherence and completeness, and to avoid leaving gaps as well as overlapping or even duplicating. A co-ordination schema is based on the identified problem areas, each of which is broken down into parameters that are rather broad. Under each parameter, the related complex and/ or simple variables are identified. Each complex variable is then broken down into simple variables which could be further distinguished into qualitative and quantitative variables (Weber and Tiwari, 1992).
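A co-ordination schema can be represented as a nested structure, which makes the cross-check for gaps and duplicates easy to automate. The sketch below is an assumption about one possible representation, not the tool used by the project; the entries are taken from the questionnaire schema example (Table 8), one simple variable is deliberately left without questions to show how a gap is reported, and the function names are hypothetical.

```python
# parameter -> complex variable -> simple variable -> list of questions
schema = {
    "Production activities and socio-economic pattern": {
        "Fisheries capture": {
            "Common information": ["Fishing gear?", "Fisheries species?"],
            "Yield": ["Trend over years?"],
            "Services and extension": [],  # left empty on purpose to show a gap
        },
        "Aquaculture": {
            "Common information": ["Aquaculture species?", "Disease?"],
            "Services and extension": ["Credit?", "Training?"],
        },
    },
}


def find_gaps(schema):
    """List every simple variable that has no question assigned to it yet."""
    gaps = []
    for parameter, complex_vars in schema.items():
        for complex_var, simple_vars in complex_vars.items():
            for simple_var, questions in simple_vars.items():
                if not questions:
                    gaps.append((parameter, complex_var, simple_var))
    return gaps


def find_duplicates(schema):
    """List questions asked more than once within the same complex variable."""
    duplicates = []
    for complex_vars in schema.values():
        for complex_var, simple_vars in complex_vars.items():
            seen = set()
            for questions in simple_vars.values():
                for question in questions:
                    if question in seen:
                        duplicates.append((complex_var, question))
                    seen.add(question)
    return duplicates


print(find_gaps(schema))
print(find_duplicates(schema))
```

Duplicates are only checked within one complex variable because the same question (for example on credit) is legitimately repeated for each production activity.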
Table 6: Some common list of parameters and complex variables
(Parameter: complex variables)
- Production pattern: Related activities; Stakeholder; Technical aspect of related activities; Use rights; Location of activities and stakeholder; Timing and seasonality
- Characteristics of stakeholders: Inhabitants and households; Residency status; Ethnicity, caste and religious background; Age and gender; Education; Social status; Household economic status; Community livelihoods; Stakeholder livelihoods
- Gender issues: Practical gender issues; Strategic gender issues
- Stakeholder perception: Resources conditions; Threats to the resources; Management; Stakeholder culture and belief
- Organization and resources governance: Political context; Governance administrative structure; NGOs; Use rights and property rights; Management efforts
- Traditional knowledge: Folk taxonomy; Local knowledge of resources; Variations in knowledge
- Community services and facilities: Medical services; Educational and religious; Public utilities; Communication facilities; Markets and retail outlets; Transportation; Other facilities
- Market attributes: Supply; Demand; Market prices; Market structure; Market infrastructure and operation

Each survey has its own parameter schema, so different variables will be used in each survey. Tables 7 and 8 below are two example schemas used to define the variables for the PRA and for the questionnaire survey; the corresponding way to collect the information is also provided.
Table 7: Example of PRA Co-ordination schema
Parameter: Natural resources
- Complex variable: Water resources
  Simple variables: Fisheries capture; Aquaculture; Aquatic trees; Irrigation water; Use rights of water surface
  Information collection method: Secondary data; Resources maps; Semi-structured interview; Seasonal calendar; Time schedule; Diagram; Related subjects analysis
- Complex variable: Land resources
  Simple variables: Forestry land; Residential land; Agricultural land; Land use rights
  Information collection method: Secondary data; Resources map; Semi-structured interview; Seasonal calendar; Time schedule; Map; Related subjects analysis
Other parameters: Human resources; Financial source; Social resources; Material resources

Table 8: Example of a Co-ordination schema in a questionnaire survey
Parameter: Production activities and socio-economic pattern
- Complex variable: Socio-economic pattern
  Common information: Name? Number of members in family?
  Socio-economic conditions: Who participates in which production activity in the household? Income of each activity?
- Complex variable: Fisheries capture
  Common information: Fishing gear? Fisheries species? Transportation means (boat)? Quantity of each fishing gear?
  Yield: Trend over years?
  Services and extension: Credit? Training? Possibility to take part in social organization?
- Complex variable: Aquaculture
  Common information and production activities: Aquaculture species? Production technologies? Disease?
  Services and extension: Credit? Training? Possibility to take part in social organization?
- Complex variable: Agriculture
  Common information: Species and cultivation form
  Services and extension: Credit? Training? Possibility to take part in social organization?

II.1.7 Identify the assessment team
Goals: To identify who can participate in the assessment team through to completion of the final report. This ensures the best human resources for the assessment and the best quality of the assessment report.
Contents: Ideally, the assessment team should involve scientists and experts from all sectors related to the issue being assessed. In general, the team should include three groups: natural scientists, social scientists and assessment experts. There are two stages of assessment: one is implemented during the field survey and the other after the field survey, before the final analysis and reporting. The members of the assessment team may differ between these two stages:
- In the first stage, in the field, the team should try to involve some representatives of local stakeholders to make full use of their knowledge of the locality;
- In the later stage, the team needs only scientists, specialists and researchers to analyze the data using tools such as Excel or SPSS...

Method: The assessment's organizer needs to select the specialists for the assessment team based on the assessment's objectives and financial sources. The main steps should be:
- To review the objectives;
- To review the available resources;
- To review the schedule of conducting;
- To contact specialists;
- To identify who can take part in the assessment team.

In brief: The effectiveness and quality of an assessment depend greatly on the experience of the assessment organizer and of the assessment team or interviewers who help to collect the data. A survey may arise from special requirements, such as testing a theory or hypothesis in research, or completing and updating data to describe the results of a previous PRA. By using a questionnaire survey, the manager can quantify the identified issues and work out their causes and influences, and the assessment can support policy-makers in developing strategies, policies and plans to improve the situation. However, PRA normally influences the situation faster than a questionnaire survey, because the people and resource users are more involved and their awareness improves faster, so that they make changes themselves in their daily production and life, while most changes arising from a questionnaire survey only come about after a long administrative process. An important tip to emphasize here is that PRA is usually an effective tool for developing a feasible plan at the commune and community levels. It is the most effective and reliable way of collecting information from the community and can describe the common needs of the community. The analysis results of PRA, such as the Problem tree, can help to identify the key problems, interventions, supports and investment level (finance, human resources, time...) needed to build a feasible survey plan.

II.2 Plan the survey


This step decides what data should be collected, including secondary and primary data. It also aims to define the sampling unit and key informants and to build the questionnaire. The pre-test of the questionnaire should be done in this stage, and the assessment team has to decide on the method of data tracking to ensure that the data are sufficiently collected. The team also has to define the coding system for entering the data into the computer for analysis, and the schedule for data management and analysis is also decided in this stage.
II.2.1 Assess secondary data
Goals:
- To identify gaps in existing knowledge in preparation for the field data collection;
- To ensure the field data collection does not collect information that has already been collected;
- To provide a basis for cross-checking information collected during the field data collection;
- To provide supporting documentation for field data collection (e.g. maps of the study area); and
- To refine the lists of objectives, stakeholder groups, study sites and parameters.

Contents and conducting method: The team should start by reviewing the available secondary data that are relevant to the identified parameters. Secondary data are normally collected, analyzed and published in various forms, including:
- official and unofficial documents;
- statistical reports;
- reports of previous assessments and surveys;
- research reports;
- documentation of previous or ongoing projects, including monitoring and evaluation reports;
- maps;
- aerial photographs and satellite images;
- historical documents and accounts; and
- websites on the internet...

Assessing secondary data includes compiling, assessing and reviewing the data related to the identified criteria of the assessment.

The collected secondary data can be used throughout the field survey process and in the final data analysis. Where the secondary data include detailed descriptions and complex information, the team should compile them. The assessment team should read and review the secondary data to identify information related to the assessment's criteria, including information on the basic characteristics of the stakeholders, such as the size of groups, their location and types of use. This information will be particularly useful during the reconnaissance survey and while planning the field data collection. The assessment team should also consider whether the objectives, stakeholder groups, study sites and criteria need to be modified. Not all secondary data will be of the same quality, so the team should assess the reliability of the secondary data sources. This can be time-consuming, but it is necessary to ensure that important documents and information have been generated from reliable sources using reliable methods.
Table 9: Typical sources of secondary data in Vietnam
(Specific source: types of secondary data)
Government agencies, local councils and institutions
- Regional and local elected bodies, administrative offices (PPCs, DPCs, CPCs): Voter lists, development plans at local and regional levels
- Technical services (agriculture, fisheries, forestry, enterprise records, extension services) at all levels: Project reports, monitoring & evaluation reports, activity minutes of planning & co-ordination meetings, reports on enforcement activities
- Health and social services: Population data, health reports
- Enforcement agencies (police, coastguard, fisheries & environmental protection): Records of conflicts, legal action, enforcement activities
- Land Registries (Provincial, District Land Office): Land use surveys, records of auctions & leasing of government lands, land value assessments
- Statistics Services (GSO, DOLISA): Census data, statistical survey data
Non-governmental organizations (environmental organization, fisheries cooperative, tourism association...)
- NGO offices: Needs assessments, poverty assessments, records on assessment and monitoring
- Project offices: Project reports, needs assessment and assessment and monitoring reports
- Religious organizations: List of population and religious associations
Universities
- Natural Departments: Maps, satellite pictures, research reports
- Social Departments: Research reports, social impact assessment reports
- Libraries: Historical documents, research reports
Web-sites
- Sites for the above organizations: Maps, satellite pictures and background information

The main indicators of the quality of a secondary data source are:
1. The source should have a description of how the information was obtained or generated.
2. The source should have a description of the sources of information and how they were selected (sampling strategy).
3. Where there are statistical data, there should be some indication of the degree of variability in that information.
4. Where a source contains descriptive or qualitative information, there should be some indication of what level of variability there was in that information.
5. The source should discuss possible biases that could have affected the information generated and how these were overcome.
6. Where the source includes accounts based on work in the field, it should indicate that the researchers spent sufficient time in the field.
7. The source should describe the background of the researcher, which should include sufficient experience to have conducted the field data collection.
8. Research documents should include complete literature reviews.

II.2.2 Plan the survey
Goals: To identify the steps for conducting field data collection. This planning also aims to standardize the field data collection according to a reasonable schedule and to ensure the representativeness and precision of the collected data.
Contents and conducting method:
II.2.2.1 Decide sampling unit
The team should define the basic sampling unit, which is the type of person(s) the team plans to interview and survey. The sampling unit could be individuals, households or some other unit; e.g. the crew of commercial fishing boats could be the sampling unit for commercial fishers, and owners or managers of aquaculture activities could be the sampling unit for aquaculture businesses. The assessment team should define the sampling unit carefully, since some terms have different meanings in different areas and cultures; e.g. in some places a household is a nuclear household of parents and their children, whereas in other cultures it refers to a much more extended unit with a range of related people living together in a compound, cooking and eating together and sharing certain resources and tasks. If the team is not aware of these differences, they can lead to mistakes in calculation, analysis, statistics, etc.
This problem is quite popular in Vietnam between regions. Traditionally, Vietnamese family includes many generations in a same house, sometime there could be up to 5 generations. However, in that house they are separated into smaller families; these small families have different/separated production activities and just contribute their part for eating together in big family. So, they must to be considered as a sample in questionnaire survey e.g. household. For example, a fisherman has 5 sons and all of their sons still live in a same house or compound with him but their life are entire independent particularly in economics although they still eat together and stay in same house; so when surveying that family must to be considered as 5 samples which are interviewed with 5 questionnaires and if sampling unit of that survey is household then they are 5 interviewed households. It always needs to keep in mind that household in survey is a unit of statistics but not a house or family as defined in social way, especially in Vietnamese context. II.2.2.2 Decide the key informants, respondents (number of sample) Next, the team should determine who to interview and survey, including how many informants within each stakeholder group they should contact and how to select those people. It is usually not possible to interview and survey all of the stakeholders because of the time and resources required, unless the stakeholder groups are small. Also, this may not be desirable since the team may get more in-depth information from interviewing a few key informants than from interviewing everyone. Therefore, the team should select a sample of the group, which will be used to

22

understand the entire group; e.g. if fishing operations are the target, then a sample of fishermen, boat owners or fishing labors... should be interviewed or surveyed. Ex: Sample size selection of IMOLA-Hue project, 2006 For IMOLA QS the criteria used are Activity Type: (i) Aquaculture, (ii) Capture Fisheries, and (iii) Agriculture. A census is meant to give a complete enumeration of an entire population in a given country (expensive & difficult). A representative sample provides an accurate picture of the entire population. The Sampling Fraction is defined as the number of Household interviewed divided the total number of Households in that place. There are about 900 Households in a commune; if we take 30 Households for each activity then we do need to interview: 30 HHs * 3 Activities = 90 HHs per commune Therefore IMOLA sampling fraction is equal to: 90 HHs / 900HHs * 100 = 10% For large areas the Sampling Fraction should be much lower e.g. 1% or even less. The most important factor in the selection of respondent process is that it should be a random selection. A selection can be considered random when every member of a population has a statistically equal chance of being selected. II.2.2.3 Build up the questionnaire a. Build up the draft questionnaire The selection of information/variables needs to base on: o o Studied results from PRA Parameters need to be assessed following the projects objectives

- Combine the variables into separate parts
- Select which type of question should be used (quantitative, qualitative...)
- Identify the number of questions in each part
- Identify the analysis method which will be used during the assessment
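The sampling fraction and the random selection of respondents described in II.2.2.2 can be sketched in a few lines of Python. This is only an illustration: the figures 30, 3 and 900 come from the IMOLA example, while the household codes, the size of the aquaculture list and the seed value are made up for the example.

```python
import random

def sampling_fraction(interviewed, total):
    """Return the sampling fraction in percent."""
    return interviewed * 100 / total

def draw_random_sample(household_ids, n, seed=None):
    """Draw n households without replacement; each one has the same chance."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(household_ids, n)

# Figures from the IMOLA example in the text
ACTIVITIES = ["Aquaculture", "Capture Fisheries", "Agriculture"]
HOUSEHOLDS_PER_ACTIVITY = 30
HOUSEHOLDS_IN_COMMUNE = 900

interviewed = HOUSEHOLDS_PER_ACTIVITY * len(ACTIVITIES)
fraction = sampling_fraction(interviewed, HOUSEHOLDS_IN_COMMUNE)
print(f"{interviewed} HHs / {HOUSEHOLDS_IN_COMMUNE} HHs = {fraction:.0f}%")

# Hypothetical list: 300 aquaculture households identified by made-up codes
aquaculture_households = [f"AQ-{i:03d}" for i in range(1, 301)]
selected = draw_random_sample(aquaculture_households, HOUSEHOLDS_PER_ACTIVITY, seed=2006)
print(selected[:5])
```

In practice the list of households would come from the commune or village records, and the draw would be repeated for each activity type.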

Information from the PRA is very important for questionnaire design because it helps to reduce the number of questions and to avoid collecting the same data twice. The purpose of a socio-economic survey is to collect as much useful information as possible. In the past, designers therefore usually produced large questionnaires with many questions covering a wide variety of information. Such questionnaires are now rarely used because they easily bore both interviewer and respondent, which can lead to poor-quality information. An appropriate questionnaire should be designed by the team based on the information collected through the PRA and from secondary data.
- The questionnaire includes the identified parameters and additional information from the PRA, and leaves out issues already covered by secondary data;
- The questionnaire should state all the survey's objectives (to explain them to the respondents), and this part should be placed at the beginning;
- The questionnaire should be presented clearly and simply to gain the confidence of the respondents;
- Reserve enough space for open-ended questions, because informants sometimes want to provide explanations or descriptions, which should be noted in full for later data analysis.


b. Pre-test of questionnaire
- The pre-test should be carried out before the real survey to verify the questions and the feasibility of the whole questionnaire for collecting data;
- Identify the average time needed to fill in a questionnaire;
- Identify gaps and mistakes in the questionnaire, including the form and type of the questions and the way they are asked;
- Go to the field and conduct the survey with 10-15 questionnaires.

Representatives of stakeholders are always the best option for the pre-test because they know the stakeholder groups very well and can provide insight into the kind of answers respondents usually give. If these people are not available, the team should randomly select some other local people to test the questionnaire. In the pre-test, the team should try to obtain feedback on the above points and adjust the questionnaire accordingly.
c. Adjust the questionnaire
- Adjust the questionnaire based on the results of the pre-test;
- Discuss the changes with the people concerned;
- Print the questionnaire in the number of copies corresponding to the sample size;
- Deliver the questionnaires to the interviewers;
- Finalize the schedule of the official survey:
  o With the local authority
  o With the interviewers

Pre-test the interview guides and questionnaires
Before using the interview guides and questionnaires in the field, the assessment team needs to test them to ensure that:
- the questions are easily understood, are not confusing and are not too long;
- the questions flow naturally from one to another;
- the questions are sensitive to the local cultural and political context;
- the questions elicit the desired type of response;
- responses can be recorded quickly and clearly; and
- the survey or interview takes an acceptable length of time (a maximum of 45 minutes is highly recommended).
The time needed to fill in a questionnaire depends on its length and complexity and on the experience of the interviewers. Moreover, it should be kept in mind that there is a learning curve for the interviewer, which shortens the time needed per questionnaire once a few have been filled in. Therefore, if the average time for the first 3-4 questionnaires is about 60 minutes each, we may expect the following ones to take about 45 minutes each.

II.2.2.4 Data tracking
Because many people in the assessment team are involved in data collection, the team should develop a system to keep track of all the information being collected, to ensure it is adequate and to avoid wasting resources on unnecessary information. Each sub-group should take time every day to summarize its results, in order to follow progress, combine the collected data, and identify missing or implausible entries so that the questionnaire and survey can be adjusted. This should be done by each field data collection group after each day of field survey.
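A daily tracking check of this kind could look like the following Python sketch. It is only an assumption about how a team might organize the check: the field names, the plausibility limit of 30 household members and the sample records are all hypothetical.

```python
# Hypothetical list of fields every completed questionnaire should contain
REQUIRED_FIELDS = ["household_id", "commune", "activity", "members", "income"]
MAX_PLAUSIBLE_MEMBERS = 30  # assumed upper limit, to be set by the team

def check_questionnaire(record):
    """Return a list of problems found in one questionnaire record."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")
    members = record.get("members")
    if isinstance(members, (int, float)) and not 1 <= members <= MAX_PLAUSIBLE_MEMBERS:
        problems.append("implausible members")
    return problems

def daily_summary(records):
    """Summarize one day of field work: how many collected, which need follow-up."""
    flagged = {}
    for record in records:
        problems = check_questionnaire(record)
        if problems:
            flagged[record.get("household_id") or "unknown"] = problems
    return {"collected": len(records), "flagged": flagged}

# Made-up records for one day of one field group
day_records = [
    {"household_id": "HP-001", "commune": "Huong Phong", "activity": "aquaculture",
     "members": 6, "income": 1500000},
    {"household_id": "HP-002", "commune": "Huong Phong", "activity": "",
     "members": 4, "income": 900000},
    {"household_id": "HP-003", "commune": "Huong Phong", "activity": "agriculture",
     "members": 60, "income": None},
]

summary = daily_summary(day_records)
print(summary["collected"], "questionnaires collected")
for household, problems in summary["flagged"].items():
    print(household, "->", ", ".join(problems))
```

The flagged households can then be revisited the next day while the team is still in the area.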


For most questionnaire surveys, this step is very important and needs to be taken seriously to get the best information.

II.2.2.5 Develop a coding system
A coding system is a summary of the questions which the team uses to gather information in the field during the questionnaire survey. The questions are abbreviated and named as variables so that they can be entered into the computer to build a database. The coding system is needed to put the survey data into a consistent form that the computer can handle for data management and analysis. It can be developed at one of two points: in the first stage, before the field survey is conducted, or after the field survey is finished, since the coding system is only needed before data are entered into the computer. However, to avoid misunderstanding and to keep things simple for field survey teams made up of many people with different levels of education, awareness and knowledge, the coding system should be developed after, and separately from, the field survey process. This step is described in detail in the part on Data Management and Analysis.

II.2.2.6 Define plans for analysis
It is important to understand thoroughly how the data will be analyzed before starting field data collection. More detailed planning is needed for analyzing quantitative data, which often involves designing a database. The team should determine what type of information they expect to produce from the analysis and decide how they will use the results. The team should consider:
- what kind of analyses will be done, including simple calculations, descriptive statistics and more advanced statistical analyses such as inferential statistics, depending on the objective of the survey;
- what tables, figures and graphs will need to be produced; and
- how these tables will be used, to explain which parameters and for which stakeholders.
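To make the idea of a coding system (II.2.2.5) concrete, the sketch below shows one possible codebook in Python. The question numbers, variable names and numeric codes are hypothetical and are not taken from the IMOLA questionnaire; the only point is that each question becomes a short variable name and each answer a value the computer can handle.

```python
# Hypothetical codebook: question -> variable name, label and answer codes.
# "codes": None means the answer is already a number and is entered as it is.
CODEBOOK = {
    "Q1": {"variable": "activity", "label": "Main activity of the household",
           "codes": {"aquaculture": 1, "capture fisheries": 2, "agriculture": 3}},
    "Q2": {"variable": "gender", "label": "Gender of the respondent",
           "codes": {"male": 1, "female": 2}},
    "Q3": {"variable": "members", "label": "Number of household members",
           "codes": None},
}

def encode_answer(question, answer):
    """Translate one written answer into (variable name, coded value)."""
    entry = CODEBOOK[question]
    codes = entry["codes"]
    if codes is None:
        return entry["variable"], answer
    key = str(answer).strip().lower()
    if key not in codes:
        raise ValueError(f"answer {answer!r} is not in the codebook for {question}")
    return entry["variable"], codes[key]

def encode_questionnaire(answers):
    """Translate all answers of one questionnaire into a database row."""
    return dict(encode_answer(question, answer) for question, answer in answers.items())

row = encode_questionnaire({"Q1": "Aquaculture", "Q2": "female", "Q3": 6})
print(row)
```

Rejecting answers that are not in the codebook, rather than guessing, keeps entry mistakes visible before the analysis starts.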

The team should design a database to record, analyze and produce the required sets of information. There are many computer programs for creating databases, such as Excel, Access, SAS, dBase, Lotus 1-2-3, SPSS, E-views or LIMDEP. The team should select the program that allows the data to be manipulated and analyzed to fit their needs. Normally the whole team should be very clear about the database structure, but sometimes the organizer may separate the groups that collect data from the group that enters and analyzes data.

II.2.2.7 Establish the field survey teams
This is very important to ensure the success of the field survey using a questionnaire. Each interviewer needs to understand:
  o The survey's objectives
  o The meaning of each question and answer
  o The way of asking questions, i.e. avoiding leading questions
Field survey members can be selected in 2 ways:
  o 1st way: usually applied to surveys whose members are specialists from research institutions or a private consultancy company. This way is usually more scientific and academic;
  o 2nd way: usually applied to development projects whose field team members are project staff, local officers or local representatives; they therefore need to be trained in the necessary skills before conducting field data collection. This is one of the most effective ways of building capacity among local people and officers, as in the IMOLA project, which employed GoV staff and representatives of the TT Hue province local authorities to implement its socio-economic baseline survey.

II.2.2.8 Define the schedule for field data collection
The assessment team should prepare a schedule for conducting the field data collection, including a timetable and the allocation of tasks to team members. This will help the team determine whether the data are being collected on time or whether plans need to be modified; e.g. if the assessment team finds that the interviews take longer than expected, they may decide to cut out one study site, reduce the number of informants and/or change the questionnaire. The assessment team should also consider issues like seasonality and local events. For example, it may be better to wait until the end of the fishing season to interview the fishers, when they will have more time to talk and the interview will not be an imposition. When drawing up the schedule, the assessment team needs to take geography and accessibility into account, so that remote and hard-to-access areas are not left short of resources while easily reached areas are over-resourced.

Example of a survey schedule

Location      Interviewee groups      Interviewer group   Time
Huong Phong   Farmer, aquaculturist   Nhu, Dao, Tu        1 week
Quang Loi     Fisherman               Tin, Nhat, Tao      1 week
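When preparing such a schedule, a rough estimate of the number of field days can help. The Python sketch below is one simple way to make that estimate; the 90 questionnaires and 45 minutes per interview come from earlier examples in this manual, while the travel time between houses and the length of the working day are assumptions to be replaced with local figures.

```python
import math

def estimate_field_days(questionnaires, interviewers, interview_minutes,
                        travel_minutes, working_minutes_per_day):
    """Estimate the number of field days needed to complete all questionnaires."""
    minutes_per_questionnaire = interview_minutes + travel_minutes
    per_interviewer_per_day = working_minutes_per_day // minutes_per_questionnaire
    if per_interviewer_per_day == 0:
        raise ValueError("the working day is too short for a single interview")
    per_day = per_interviewer_per_day * interviewers
    return math.ceil(questionnaires / per_day)

days = estimate_field_days(
    questionnaires=90,            # 30 households x 3 activities in one commune
    interviewers=3,               # e.g. one group of three interviewers
    interview_minutes=45,         # recommended maximum interview time
    travel_minutes=15,            # assumed time to move between houses
    working_minutes_per_day=360,  # assumed 6 hours of interviewing per day
)
print(f"About {days} field days are needed")
```

Under these assumptions the result is 5 field days, which is consistent with planning 1 week per commune as in the example schedule.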

II.2.2.9 Train field teams in questionnaire-based data collection methods and survey skills
All field team members should be trained to carry out the range of questionnaire survey techniques and to ensure they understand the goals and objectives of the socio-economic assessment. In this way, team members will be able to follow the interview guides and, more importantly, ask follow-up and exploratory questions. It is essential that team members understand why the questions are being asked, what they mean and the expected type of responses. The level of training will vary depending on the experience and background of the team members, which should have been determined when selecting them. Training is essential for inexperienced team members, particularly to learn how to design and administer questionnaires. More experienced team members should take the lead, explain the various methods and techniques, and work alongside inexperienced team members. In most cases, however, the best training is practicing the methods and techniques in the field.

II.2.2.10 Provide a summary on locality (culture, custom...)
The assessment and field teams should understand as much as possible about the local culture before starting the field data collection. They should be briefed on local customs, the way outsiders are treated, and forms of respect towards the opposite sex, elders and figures of authority. Many rural communities (both coastal and inland) have particular customs, traditions and behaviors that need to be followed, especially by visitors. Similarly, there may be particular etiquette regarding hospitality, e.g. some people will be offended if a guest does not accept a drink. This briefing is essential for teams of outsiders and can be provided by the stakeholder representatives or key local informants who are very familiar with the stakeholders and the locality.

II.2.2.11 Logistics arrangement
This is also a very important step and can strongly affect the results of the field survey. It should therefore be prepared very carefully. There are two cases, with slightly different logistics arrangements: interviews held in the informants' houses, and interviews held at a single venue to which the informants come.


Where informants are invited to one venue, the arrangements can include selecting a base for operations, accommodation and transportation, and this can be hard work if locations are very large or hard to access and the number of participants is high. If possible, field teams should be lodged close to the communities so they can gather more information through discussion with local people around a tea table or over drinks. Groups of 4-5 persons or more should include one person who is familiar with the locality to do this work, in case the group cannot return every day to stay with the whole team during data collection. Where team members have to go to each informant's house to interview, the only difference is that a base for operations is not necessary; the other arrangements are still needed, particularly transportation if the informants' houses are hard to access. Groups should inform the local authority and the stakeholders' representatives of their schedule to obtain official, and also unofficial, permission to work in the locality. The last thing to be arranged is transportation, normally car and motorcycle, or boat where a location can only be reached by boat. With this, everything is in place to start the field data collection.

II.3 Conduct survey


This is the main part of the survey. With all the above preparation done, the team travels to the assigned sites to conduct the survey using the questionnaire. The team should follow the principles below to ensure that the information gathered is sufficient and reliable, as required by the goals and objectives of the assessment.

II.3.1 Principles
- Respect the stakeholders and communities: the team needs to respect the stakeholders, particularly their knowledge, time and habits, to put all participants at ease. The time for an interview should also be reasonable (30-45 minutes is acceptable). The team should avoid disturbing the habits and traditions of local people, to ensure that the survey can collect enough data of good quality.
- Clarify the goals and objectives of the survey: to avoid hesitation on the part of respondents, especially with sensitive questions related to finance, management methods or property rights. The team also needs to affirm that none of the information requested will be disclosed to anyone in a way that could affect the informant.
- Develop a good relationship with stakeholders: to increase trust between interviewer and respondent and to encourage respondents to express their ideas. A good relationship between the two sides is key to ensuring the adequacy and precision of the collected data.
- Recognize the limitations of data: the team should remember that data are always incomplete and imprecise to some degree, regardless of the efforts of the field teams. This is due to the limited resources for the assessment, and to the fact that information recalled from the respondent's memory inevitably contains error and bias. The assessment team should be aware of this to avoid mistaken conclusions. This is why statistics are always reported with a confidence interval; how narrow or wide it is depends on the requirements of the assessment, with significance levels of 1%, 5% or 10% being typical.
- Recognize bias: this is also a way to reduce error in data collection, which arises when some informants provide imprecise information for their own benefit or for other sensitive reasons. To recognize these issues, the team needs to understand the informants well and to conduct a reconnaissance survey or cross-check the data. Bias can also originate with the team itself, because of information provided before the field survey. The team therefore has to be watchful, cross-checking data frequently or asking the same question in different ways to ensure the quality of the collected data.
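As an illustration of the confidence interval mentioned under the limitations of data, the Python sketch below computes an interval for a surveyed proportion. It uses the simple normal approximation, which is an assumption made here for illustration (this Manual does not prescribe a method), and it ignores the correction that could be applied when the sample is a large share of the population. The figure of 27 households out of 90 is made up.

```python
import math

# Standard normal critical values for common confidence levels
Z_VALUES = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

def proportion_confidence_interval(successes, sample_size, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p = successes / sample_size
    standard_error = math.sqrt(p * (1 - p) / sample_size)
    margin = Z_VALUES[confidence] * standard_error
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical result: 27 of 90 surveyed households report a given problem
p, low, high = proportion_confidence_interval(27, 90, confidence=0.95)
print(f"Estimate {p:.1%}, 95% confidence interval {low:.1%} to {high:.1%}")

# A stricter requirement (99% confidence) gives a wider interval
_, low99, high99 = proportion_confidence_interval(27, 90, confidence=0.99)
print(f"99% confidence interval {low99:.1%} to {high99:.1%}")
```

The wider interval at 99% shows the trade-off the text refers to: the stricter the requirement of the assessment, the less precise the statement that can be made from the same sample.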


II.3.2 Conduct survey
There are various ways of conducting a survey with a questionnaire, e.g. by mail, by telephone or by direct interview; this manual focuses only on the direct interview. The team travels to the field and meets the households' representatives to interview them directly. Information is gathered through the prepared questionnaires, and the informants' answers are noted directly on the questionnaire by the interviewer.

Household survey:
Definition: Surveys use questionnaires with highly structured, close-ended questions. The questionnaire has specific questions with limited answers (e.g. multiple choice, true/false), resulting in quantitative data that can be analyzed statistically. This method does not encourage explanatory answers or follow-up questions.
Goals:
- Generates quantitative data on specific topics;
- Generates data that can be statistically representative of the larger stakeholder group or community, depending on the sample size;
- Helps determine the distribution of variables (e.g. education levels, income) between and within stakeholder groups and the larger community;
- Helps draw comparisons between and within stakeholder groups and the larger communities and examine correlations between parameters.

Requirements:
- A questionnaire built on the parameters needed for the assessment;
- A trained field survey team (interviewers);
- A list of respondents with their addresses;
- Local people to introduce the field survey team.

Method:
- Select the respondents based on the local key informants, the main activity carried out by the households (data from the CPC and heads of villages), and/or any other stratification criteria deemed relevant to the survey;
- Arrange the time and location of the interviews. Decide whether to survey house to house or to gather the respondents together;
- Introduce yourself and the objectives of the survey to each respondent;
- Conduct the interview following the questionnaire.

Strengths:
- generates information statistically representative of the larger stakeholder group or community if a statistically representative sample is used;
- generates quantitative data amenable to statistical analyses;
- does not require a highly trained person to administer the questionnaire;
- generates data targeted to the needs of the assessment (i.e. little extraneous data);
- relatively easy to administer;
- relatively easy to code and interpret data; and
- requires little time of informants compared to interviews.

Weaknesses:
- has limited boundaries of inquiry, which discourage informants from raising relevant issues that the team member does not know about;
- is difficult to determine error and bias;
- discourages local people from becoming involved in data collection due to the rigid nature of the survey;
- is difficult to ask questions about sensitive issues; and
- provides ready-made answers that may not reflect completely what the informant thinks.

Notes:
- Keep the questionnaire design simple and short to avoid unexpected problems caused by either interviewer or respondent;
- Do not ask several questions at the same time;
- The order of the questionnaire need not be followed strictly, so that data can be collected more easily following the respondent's train of thought;
- A pre-test is needed to ensure the questions are designed in the best way and respondents feel free to provide information.

II.4 Data management and analysis


Normally, there are 2 stages of data management and analysis: one during field data collection and the other after data collection is finished. Analysis in the field aims to assess briefly the adequacy and precision of the collected data. It allows the team to adjust immediately and collect better data. This step can take place throughout the field data collection process. The final analysis aims to draw out the detailed lessons learnt, conclusions etc., based on statistical quantitative and qualitative analysis with computer software such as Excel or SPSS. The use of this software for data management and analysis is described in more detail later in this Manual; here it is only described as one of the stages of the assessment process, with its principles and how to conduct it. Please also note that there are many methods of statistical data analysis. This Manual, however, focuses only on descriptive analysis and not on regression, chi-square, t-test etc., because these are more complex methods that should be presented in a separate, advanced statistics manual.

II.4.1 Basic principles of data analysis
- Involve all the team members in this stage to ensure that all the data are collected, compiled and analyzed, including information that exists only in the minds of the participants;
- Prioritize quality, not quantity: this is one of the most important rules of data analysis, to ensure the validity and usefulness of the final report. Quality is judged by:
  o the extent to which the reported findings reflect the collected information; and
  o the usefulness of the findings to end-users;
- Focus on the main research objectives rather than merely describing the collected data, to save resources and to make the final report attractive, because no policy-maker, manager or scientist is likely to be interested in a report containing only figures. The figures must be used to explain, describe or prove issues already identified in the goals and objectives of the assessment before it started.

II.4.2 Conduct the data analysis
As mentioned above, the data analysis is carried out in two stages. The field analysis should be conducted every day and includes a preliminary analysis of quantitative and qualitative data to evaluate the precision of the information, missing data, the appropriateness of the questions and new lessons from the survey process; in other words, it checks whether the questionnaires have been filled in correctly and completely. The full data analysis, carried out before drawing conclusions, relies mainly on the computer programs introduced in detail in the Part on Data Management and Analysis using MS Excel and the Part on Data Management using SPSS of this Manual.

II.5 Reporting
This is the last step needed to complete an assessment. The report includes the analysis based on the collected data and on experience. It also includes the conclusions, lessons learnt etc. of the assessment team, to describe and assess the study areas. In addition, the assessment team should present its comments and recommendations for improving the situation in the study areas.

II.5.1 Field report
After data collection is completed, each field team should produce a field report summarizing the data collection process, its results, and the team's conclusions or new lessons learnt about the location and sector assigned to it. The report aims to describe faithfully the whole data collection process of each field survey team and gives the assessment team an overview of the entire survey and its findings. It should be short and simple, and all team members should contribute to it to ensure that all data and findings are included. The field report also needs to be completed as soon as possible after field data collection has finished, to give the assessment team a basis and reference for data analysis and for finalizing the report.

II.5.2 Outline the final report
The assessment team should define the main contents of the final report on the basis of the main objectives of the survey. The team also needs to decide which type of report is most appropriate and useful for the end-users. For example, end-users such as senior policy or decision-makers may have little interest in a general description of the area and communities studied, but may be interested in issues, problems and potential solutions. Other end-users, such as researchers, scientists and development agencies planning to work in the area, may want detailed descriptions of all socio-economic conditions and factors relating to the stakeholders.
A typical report includes the following sections:
- Introduction to the main and specific objectives of the assessment;
- Introduction to the main issues related to the socio-economic parameters;
- Discussion of the key lessons learnt;
- Methodology;
- Detailed information and analysis, which can be shown in the Annex;
- A summary of how the survey was conducted, which should be included in the first part of the report.

II.5.3 Final report
- The final report is built on the main contents agreed in the report outline;
- The key lessons and findings should be supported by data, figures, descriptions or analysis results;
- The analysis and description of the surveyed variables should be placed in separate parts or presented in the Annex;
- Results from various sources, such as secondary data, PRA and the survey, need to be combined to increase the validity of the report.


In brief:
- Organizing and conducting a questionnaire survey successfully is a long and hard process that demands considerable effort and resources, such as skilled organizers and interviewers, researchers or specialists, and the related tools. Moreover, a questionnaire survey consumes a lot of resources, which normally limits how extensively it can be used.
- A questionnaire survey needs to be carefully limited in scale to avoid wasting resources on data collection.
- The effectiveness with which a questionnaire survey is organized and conducted improves in parallel with the skills of those conducting it.
- To organize and plan a feasible questionnaire survey at community level, use the information on community needs gathered through the PRA: the survey will be easier if local people perceive that it is necessary for their lives and production activities, and they will also be more open in providing information and supporting the team during the survey.


III Data analysis using MS Excel

As we know, this is a complicated process, but it has to be applied to the data collected in the survey in order to reach the right conclusions. Put simply, the data are entered into the computer in software such as MS Excel, and the analyst uses that software to analyze the entered data and obtain the results. These results help experts to assess the situation according to the objectives of the assessment, and policy-makers or managers can use the assessment in their work for development purposes. Many statistical terms need to be clarified, and they are relatively difficult to understand, particularly for people who do not work professionally in statistics. Please refer to Annex vii for clarification of these terms. MS Excel is powerful calculation software that serves comparison, assessment and statistical purposes in different sectors, particularly the socio-economic sector. MS Excel manages data by row and column. It accepts all types of data (numbers and text), which makes data entry easier for the user but data control harder.
Figure 3: Data analysis using MS Excel (flowchart: data entering in MS Excel; data analysis in MS Excel, covering attention in data analysis, descriptive statistics and the AVERAGE, COUNT, COUNTA, COUNTBLANK, COUNTIF, RANK, SUM and SUMIF functions; export results)


III.1 Data entering in MS Excel


- Data are entered into MS Excel by row (each survey sample is one row);
- Normally, the first column should be used for the name, address etc. of the respondent and his/her family;
- Because MS Excel has no function for defining variables, the first row should be used for this purpose (stating the content of each column);
- The answers to the questions are the values entered into MS Excel, exactly as they appear in the questionnaire, whether they are digits or letters.

Notes: because MS Excel accepts all entered values and does not require the data to be coded, the user should be very careful in data entry, particularly with text characters, so that MS Excel does not produce mistakes in the analysis.
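Because MS Excel itself does not check what is typed, an entry check can also be run outside Excel, for example on a sheet saved as CSV. The Python sketch below assumes the layout described above (one row per sample, variable names in the first row); the column names and the three sample rows are made up for illustration.

```python
import csv
import io

# Made-up sheet saved as CSV: first row holds the variable names
SAMPLE_SHEET = """respondent,commune,activity,members,income
Household A,2,1,6,1500000
Household B,2,1,five,1200000
Household C,2,1,8,
"""

# Columns that should only contain numbers (hypothetical choice)
NUMERIC_COLUMNS = ["commune", "activity", "members", "income"]

def find_entry_problems(text, numeric_columns):
    """Return (row number, column, problem) for blank or non-numeric cells."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column in numeric_columns:
            value = (row[column] or "").strip()
            if value == "":
                problems.append((row_number, column, "blank"))
                continue
            try:
                float(value)
            except ValueError:
                problems.append((row_number, column, "not numeric"))
    return problems

problems = find_entry_problems(SAMPLE_SHEET, NUMERIC_COLUMNS)
for row_number, column, problem in problems:
    print(f"row {row_number}, column {column}: {problem}")
```

The row numbers match the row numbers in the spreadsheet, so the person entering data can go straight to the cell and compare it with the paper questionnaire.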

III.2 Data analysis in MS Excel


This part focuses on the key functions in MS Excel that can be used for managing and analyzing data from questionnaire surveys.

III.2.1 Attention in data analysis using MS Excel
- Functions: all functions in MS Excel are reached from the Insert menu.
  Path: MS Excel => Insert => Function (fx)
  The option All should be selected under "select a category" to display all the functions included in MS Excel; the user can then select the function to use from the list in the box below;
- Filter: because a large amount of data is entered into MS Excel, the user should use the Filter tool to narrow down exactly the data they need to work with. For example, to select the aquaculture households of a coastal community, or the surveyed households of one commune out of all the surveyed samples.
  Path: MS Excel => Data => Filter => Auto Filter
  When this is done, a small sign (triangle) appears in the first cell of each column, allowing the user to change the filter criteria according to the objectives of the analysis.
- Chart and graph: allows the user to present data in the form of charts and graphs. This is one of the strengths of MS Excel, and the user can choose among many different types of chart and graph according to what needs to be shown.
  Path: MS Excel => Insert => Chart
  The user should select the type of graph and all other related options under Chart Options according to their own needs.
- MS Excel has many functions covering many fields, such as calculation, prediction or probability, but this manual focuses on simple analysis using several simple and common functions. MS Excel also provides tools for statistical analyses such as ANOVA and regression; similarly, this manual focuses only on simple descriptive statistics;
- Descriptive statistics provide the main descriptive parameters such as mean, mode, sum, range or standard deviation, but if the user needs more detailed calculations, several built-in functions should be used. For example, to calculate the percentage of households with 5 members or more, use the COUNTIF function to count the households meeting that condition, use the COUNT function to count the total number of households, and finally use ordinary division to calculate the percentage.

III.2.2 Descriptive statistics
To use this function, the Data Analysis tool must be activated. The user should check whether it appears in the Tools menu. To activate the Data Analysis tool:
- In MS Excel, click Tools => select Add-Ins => select Analysis ToolPak => OK;
- MS Excel may ask for the MS Office installation CD; if so, insert the CD into the drive, let the computer search for the path and the needed files automatically, then click OK.
- MS Excel is now ready, with the Data Analysis function in the Tools menu.

Descriptive statistics allow the user to summarize all the data entered in MS Excel. The output is a table in which each variable in the database is summarized in one column, with values for mean, mode, sum, standard deviation etc.
Notes: this function applies only to numeric variables; it cannot be used with variables that include characters.
Ex: Descriptive statistics
Path: MS Excel => Tools => Data Analysis => Descriptive Statistics
- Input range: select the range of data to be described
- Do not tick Labels in first row
- Output range: select the cell where the description should appear
- Select Summary statistics
Figure 4: Descriptive statistic in MS Excel
Figure 5: Descriptive statistic in MS Excel (continued)

The columns in the table are named Column 1, Column 2... corresponding to the variables they describe, from left to right in the MS Excel database. These names should be changed to reflect their contents for easier understanding. E.g.
                     Age      Gender   Member   Commune   Activities
Mean                 55.33    1.11     6.89     2         1
Standard Error       4.59     0.11     0.79     0         0
Median               55.00    1.00     6.00     2         1
Mode                 55.00    1.00     6.00     2         1
Standard Deviation   13.77    0.33     2.37     0         0
Sample Variance      189.75   0.11     5.61     0         0
Kurtosis             -0.95    9        1.76     #DIV/0!   #DIV/0!
Skewness             0.30     3        1.47     #DIV/0!   #DIV/0!
Range                39       1        7        0         0
Minimum              37       1        5        2         1
Maximum              76       2        12       2         1
Sum                  498      10       62       18        9
Count                9        9        9        9         9

The user can use all the measures provided in the above table or only some of them, depending on the objectives of the assessment.

III.2.3 AVERAGE function
This function calculates the average value of a variable (one column in MS Excel); it works only with numeric variables and cannot be used with variables that include characters. For example, it can calculate the average turnover of the surveyed households or the average number of members in the surveyed households.
Path: MS Excel => Insert => Function => AVERAGE
- Number 1: select the whole data column whose average is to be calculated, e.g. A1:A52
- Click OK and MS Excel will show the average value
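The summary measures of III.2.2 and the AVERAGE function can be cross-checked outside MS Excel. The Python sketch below uses the standard statistics module; the nine household sizes are hypothetical values chosen so that the count, sum, mean, median and standard deviation agree with the Member column of the example table. They are not the actual survey data.

```python
import math
import statistics

def describe(values):
    """Compute the main measures reported by Descriptive Statistics."""
    n = len(values)
    standard_deviation = statistics.stdev(values)  # sample standard deviation
    return {
        "Mean": statistics.mean(values),           # same result as AVERAGE
        "Standard Error": standard_deviation / math.sqrt(n),
        "Median": statistics.median(values),
        "Standard Deviation": standard_deviation,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": n,
    }

# Hypothetical household sizes (number of members) for 9 surveyed households
members = [5, 5, 5, 6, 6, 6, 8, 9, 12]

summary = describe(members)
for name, value in summary.items():
    print(f"{name:<20}{value:.2f}")
```

Mode, kurtosis and skewness are left out of this sketch to keep it short.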


III.2.4 COUNT function
This function counts the number of cells in a row or column; in other words, it counts the number of variables in the questionnaire or the number of households in the survey.
Path: MS Excel => Insert => Function => COUNT
Value 1: select the data column or row to be counted, e.g. A2:A52 or A2:AB2.
Click OK and MS Excel will show the number of households (samples) in the survey or the number of variables in the questionnaire.
Notes: the COUNT function counts only cells that contain numeric values, not characters.
III.2.5 COUNTA function
Like COUNT, this function counts the number of cells in a row or column, that is, the number of variables in the questionnaire or the number of households in the survey.
Path: MS Excel => Insert => Function => COUNTA
Value 1: select the data column or row to be counted, e.g. A2:A52 or A2:AB2.
Click OK and MS Excel will show the number of households (samples) in the survey or the number of variables in the questionnaire.
Notes: the COUNTA function counts all non-blank cells, whether they contain numbers or characters.
III.2.6 COUNTBLANK function
This function counts the blank cells in the database; in other words, it counts the missing data. The count is made along a row or a column.
Path: MS Excel => Insert => Function => COUNTBLANK
Range: select the data column or row to be counted, e.g. A2:A52 or A2:AB2.
Click OK and MS Excel will show how many cells are blank, that is, the number of questions that have not been answered or the number of missing values in the variable.
III.2.7 COUNTIF function
This function counts the number of cells that meet a condition, e.g. the number of households with an average income per capita above 2 million VND/month, or the number of households with a shrimp culture productivity below 1 ton/ha/crop...
Path: MS Excel => Insert => Function => COUNTIF
Range: select the data range to be counted
Criteria: set the condition for counting (it can be typed directly in this box, or a cell that already contains the condition can be selected).
Click OK and MS Excel will show the number of cells that meet the condition.
III.2.8 RANK function
This function gives the rank of a value within a variable in the database, in a fixed order (descending or ascending), e.g. the rank of an income of 10 million VND/crop within the household income variable.
Path: MS Excel => Insert => Function => RANK
Number: the value to be ranked, or the cell that contains it
Ref: the data range used as the reference for ranking (the whole variable)
Order: 0 ranks in descending order; any other value ranks in ascending order.
Click OK and MS Excel will show the rank of the value within the variable.
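The behaviour of the counting and ranking functions above can be illustrated outside MS Excel. The following Python sketch is only an approximation under stated assumptions: the helper names and the income figures are hypothetical, None stands for a blank cell, and the COUNTIF condition is passed as a Python function rather than as an Excel criteria string such as ">2".

```python
def is_number(cell):
    """True for numeric cells; text and blank cells are not numeric."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def average(cells):
    nums = [c for c in cells if is_number(c)]
    return sum(nums) / len(nums)


def count(cells):
    """Like COUNT: numeric cells only."""
    return sum(1 for c in cells if is_number(c))


def counta(cells):
    """Like COUNTA: every non-blank cell, numeric or text."""
    return sum(1 for c in cells if c not in (None, ""))


def countblank(cells):
    """Like COUNTBLANK: blank cells, i.e. missing data."""
    return sum(1 for c in cells if c in (None, ""))


def countif(cells, condition):
    """Like COUNTIF: numeric cells that satisfy the condition."""
    return sum(1 for c in cells if is_number(c) and condition(c))


def rank(value, cells, order=0):
    """Like RANK: order 0 ranks descending, any other value ascending."""
    nums = [c for c in cells if is_number(c)]
    if order == 0:
        return 1 + sum(1 for c in nums if c > value)
    return 1 + sum(1 for c in nums if c < value)


# Hypothetical income per capita (million VND/month) for seven households.
income = [1.5, 2.4, None, 3.0, "n/a", 0.8, 2.4]

print(average(income))                   # mean of the five numeric cells
print(count(income), counta(income), countblank(income))
print(countif(income, lambda x: x > 2))  # households above 2 million VND/month
print(rank(2.4, income), rank(2.4, income, order=1))
```

The difference between count and counta on the same column is a quick way to find cells where text was typed into a numeric variable.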


III.2.9 SUM function
This function calculates the total of the values in a range. It applies only to numeric variables; if a variable includes some character values, they are ignored. E.g. it can calculate the total water surface area of fish aquaculture of the surveyed households, or the total turnover from agricultural activities of the surveyed households...
Path: MS Excel => Insert => Function => SUM
Number 1: the data range to be totalled, e.g. D3:D24
Click OK and MS Excel will show the total.
III.2.10 SUMIF function
This function calculates the total of the values in a range that meet a certain condition. It applies only to numeric variables and also skips cells that contain characters. E.g. it can calculate the total water surface area of fish aquaculture households that own more than 1 ha, or the total turnover from agriculture of households that earn more than 10 million VND/household/crop...
Path: MS Excel => Insert => Function => SUMIF
Range: the data range to be evaluated, e.g. D3:D24
Criteria: the condition used to select the values to be totalled, e.g. bigger than, smaller than or equal to a given value. It can be typed directly into this box, or a cell that contains the condition can be selected.
Click OK and MS Excel will show the total of the values in the data range that meet the condition.
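As a rough illustration of how SUM and SUMIF treat a column that mixes numbers and text, here is a minimal Python sketch. The function names and the pond areas are hypothetical, and the condition is written as a Python function instead of an Excel criteria string.

```python
def is_number(cell):
    """True for numeric cells; text and blank cells are skipped."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def total(cells):
    """Like SUM: adds the numeric cells and ignores text."""
    return sum(c for c in cells if is_number(c))


def sumif(cells, condition):
    """Like SUMIF: adds only the numeric cells that meet the condition."""
    return sum(c for c in cells if is_number(c) and condition(c))


# Hypothetical water surface area (ha) of six aquaculture households.
pond_area = [0.5, 1.2, "unknown", 2.0, 0.8, 1.5]

all_ponds = total(pond_area)                    # total area of all households
large_ponds = sumif(pond_area, lambda a: a > 1) # households owning more than 1 ha

print(all_ponds, large_ponds)
```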

III.3 Exporting results


The results in MS Excel are very easy to reuse because of the compatibility within MS Office. They can be transferred to MS Word, PowerPoint, etc. simply by using copy and paste.


IV Data analysis using SPSS

SPSS is very popular software in the field of statistics. It is a powerful tool for analysing data for statistical purposes: it reports statistical indicators such as mean, standard deviation and frequency, and it includes the commonly used statistical functions. Like MS Excel, SPSS manages data in rows and columns, where rows are samples and columns are variables.
Figure 6: Data analysis using SPSS
(Diagram: Variable [Concept; Variable types; Data coding and entering; Missing data], Data analysis, Exporting results)

IV.1 Variable
IV.1.1 Concept
A variable is simply understood as an attribute of a research subject. The subject in a socio-economic study could be an individual, a farmer household, a school, a city or a nation. If an attribute does not vary, it becomes a constant: e.g. if all the organisations have the same man/woman ratio, the attribute sex (or gender) is considered a constant and is not included in the study.
Variables: age, gender, population, yield, price, disease in aquaculture, number of fishing boats...
IV.1.2 Variable types
A questionnaire survey usually uses question types such as closed-ended questions, encoded questions and optional questions. These different types of question determine the type of variable when the data are entered into the computer.
Numeric variable: age, time, population, area...
Binomial variable: allows only one of two given values to be selected, such as yes/no, male/female, true/false...

Ordinal variable: the values of this variable are ordered categories (a question with several given options). E.g. arranging age into the groups 20-30, 31-40, 41-50 and above 50 transforms a numeric variable into an ordinal variable; likewise, a question about how often shrimp are fed may offer the answers 1 time/day, 2 times/day, 1 time/week or 2 times/month...


Dummy variable: is similar to a ranking (ordinal) variable but also covers variables whose values are not in any order. Usually, dummy variables are used only when building test or appraisal models (e.g. ARIMA), to identify the correlation level between variables more precisely. However, this manual does not go deeper into statistical analysis and stops at very simple analysis so that it does not become too difficult to understand; The two common types of variable used in SPSS are the numeric variable and the string variable;

Quantitative variable: a variable whose values can be measured in units such as kilogram, square meter, hectare...;
Qualitative variable: a variable whose values cannot be measured and are normally expressed with qualitative values such as good/bad, increase/decrease, many/few...
IV.1.3 Data coding and entering
As mentioned above, data are entered into SPSS in rows and columns. However, to serve statistical purposes the data must be encoded as a given series of values. Real data vary over a wide range depending on their attributes, so if they are not encoded it is very difficult for SPSS to analyse them and for the reader to read the analysis results. That is why simple series of values are frequently used to represent the real data in SPSS; this is the data coding process.
* There are 2 ways of coding data. The first is to code the data before entering them into the computer, and the other is to code them after entering. In practice both ways are usually needed because the demands of analysis vary.
Coding data based on the questionnaire:
Give a name to each question in the questionnaire; these names are the names of the variables that SPSS uses for data entry. A name must not include spaces, e.g. nuoiTS, not nuoi TS;
Give possible values to the answers to those questions. These values are assigned numeric codes such as 1, 2 or 3... to meet the requirements of SPSS in data analysis.

For example:
Question: Does your household practise aquaculture?
Variable name: NuoiTS
Assigned values of answer: 1=yes; 2=no
Question: If yes, which species do you raise?
Variable name: LoaiTSnuoi
Assigned values of answer: 1=shrimp; 2=tilapia; 3=grass carp; 4=common carp...
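The coding list in the example can be kept as a small lookup table so that every data-entry operator applies the same codes. The sketch below is in Python and is only illustrative: the variable names and codes for NuoiTS and LoaiTSnuoi come from the example, while the function names and the group codes 1 to 4 for the age groups (20-30, 31-40, 41-50, above 50) are assumptions made for this sketch.

```python
# Coding list: variable name -> answer text -> numeric code.
CODEBOOK = {
    "NuoiTS": {"yes": 1, "no": 2},
    "LoaiTSnuoi": {"shrimp": 1, "tilapia": 2, "grass carp": 3, "common carp": 4},
}


def encode(variable, answer):
    """Return the numeric code for an answer; raises KeyError if it is not listed."""
    return CODEBOOK[variable][answer.strip().lower()]


def age_group(age):
    """Recode a numeric age into an ordinal group (group codes are assumed here)."""
    if age < 20:
        return None  # outside the groups used in the manual's example
    if age <= 30:
        return 1     # 20-30
    if age <= 40:
        return 2     # 31-40
    if age <= 50:
        return 3     # 41-50
    return 4         # above 50


print(encode("NuoiTS", "Yes"), encode("LoaiTSnuoi", "tilapia"))
print([age_group(a) for a in (25, 35, 50, 51)])
```

Raising an error for an unlisted answer, instead of guessing a code, makes gaps in the coding list visible while the data are still being entered.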

Notes: To facilitate data entry and avoid mistakes, the principles of data coding (naming variables, assigning codes to values...) need to be unified.
Variable establishment:
Name: the user should give a name that is easy to remember and indicates the contents of the variable;
Label: helps the user recognise the variable and makes the results of data analysis easier to read;
Size: is not very important when establishing a variable. Normally, it is set automatically by SPSS;


Type: as mentioned above, the 2 common types of variable used in SPSS are numeric and string. The numeric type is used more often, to meet the requirements of SPSS in data analysis; the string type is used only for names, locations, addresses...

Figure 7: Define variable in SPSS

Data entering:

Entering data into SPSS is very simple: put the values (answers) from the questionnaire into the corresponding cells. Like MS Excel, SPSS manages the database in rows and columns, where one row corresponds to one sample (household, H.H) and one column corresponds to one variable.
o Data are normally entered row by row, that is, all the variables in the database are filled in for one sample (one questionnaire). Data entry for the next sample starts only after all the data of the previous sample have been entered, to ensure that the two are not mixed up;
o For encoded values, use the coding list when entering the data to avoid mistakes. Variables that are not yet encoded are entered with their original values and can be encoded when needed;
o SPSS accepts all the values from the questionnaire. Just remember to set the type of variable as mentioned above and enter characters or numbers according to the string or numeric type.

Notes: after establishing and encoding the variables, print out the list of variables, including all the related information, to support the data entry process. This list is very important because it gives the user a complete description of every variable that will be handled during data entry. It also helps the user when entering encoded values. The list of variables can be obtained as follows: SPSS --> File --> Display Data File Information --> Working File, then from the Output SPSS Viewer --> File --> Print.


Figure 8: List of variables

IV.1.4 Missing data
When the interviewer omits a question, or the respondent skips a question or does not want to answer, a missing value is created. That value should be encoded as 0 or as any value that is never used in the database, which makes it easy to remember: e.g. if the data are encoded from 1 to 8 only, missing data could be set to 10 or 100 to avoid confusion or duplication. Alternatively, the cell can simply be left blank and SPSS will treat it as missing data, which is easier to understand and to enter.
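The reason for reserving a code that never occurs in the real data becomes clear when a statistic is calculated. The Python sketch below is a hypothetical illustration, not SPSS behaviour: the answers, the missing code 100 and the function name valid_values are assumptions, and None stands for a blank cell.

```python
import statistics

MISSING_CODE = 100  # assumed code for "no answer"; real answers run from 1 to 8


def valid_values(column, missing_code=MISSING_CODE):
    """Drop blank cells (None) and cells holding the missing-data code."""
    return [v for v in column if v is not None and v != missing_code]


# Hypothetical coded answers of six respondents to one question.
answers = [1, 3, 100, 8, None, 2]

valid = valid_values(answers)
n_missing = len(answers) - len(valid)

# Mean of the valid answers, and the distorted mean obtained when the
# missing-data code is wrongly treated as a real answer.
correct_mean = statistics.mean(valid)
distorted_mean = statistics.mean(v for v in answers if v is not None)

print(valid, n_missing, correct_mean, distorted_mean)
```

In this sketch the distorted mean is far outside the 1 to 8 range of real answers, which is the kind of error that declaring the code as missing is meant to prevent.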

IV.2 Data analysis
This part relies heavily on statistical terminology. For easier understanding, please also refer to Annex vii Statistics for the basic statistical concepts mentioned below.

Frequency: shows the frequency of each value of a specific variable
Path: Analyze --> Descriptive Statistics --> Frequencies
Frequency table: shows the frequency of each value of a specific variable in table format;
Frequency graph: shows the analysis results in graph format (pie, bar, line... types);
Mean: the arithmetic mean of all values of a variable (e.g. mean age, mean income...). It is not meaningful for dummy variables or string variables;
Median: the mid-point value of the variable when its values are sorted in ascending order. If the sample size is an even number, the median is the mean of the two values at the centre of the sorted values;
Mode: the value with the highest frequency in the distribution, that is, the value that appears most often in the variable. The mode can be used with every variable, whether numeric or string;
Sum: the total of all values of the variable. It is often only of statistical interest (e.g. the total number of samples), but it can also be meaningful arithmetically, e.g. total area, total population...
Example: Frequency table
Statistics: Respondent's Sex
N        Valid      1517
         Missing    0
Mean                1.58
Median              2.00
Mode                2

Respondent's Sex
                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Male      636         41.9      41.9            41.9
        Female    881         58.1      58.1            100.0
        Total     1517        100.0     100.0
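The figures in the frequency example can be recalculated from the two counts alone. The Python sketch below rebuilds the coded variable from the frequencies in the table (636 male and 881 female respondents) and assumes the coding 1=Male, 2=Female, which is consistent with the reported mean of 1.58 but is not stated in the output itself.

```python
import statistics
from collections import Counter

LABELS = {1: "Male", 2: "Female"}  # assumed coding of Respondent's Sex

# Rebuild the coded variable from the frequencies shown in the table.
sex = [1] * 636 + [2] * 881

freq = Counter(sex)
n = len(sex)

rows = []
cumulative = 0.0
for code in sorted(freq):
    percent = 100 * freq[code] / n
    cumulative += percent
    rows.append((LABELS[code], freq[code], round(percent, 1), round(cumulative, 1)))

mean = round(statistics.mean(sex), 2)
median = statistics.median(sex)
mode = statistics.mode(sex)

for row in rows:
    print(row)
print(n, mean, median, mode)
```

The mean of 1.58 has no direct interpretation for a coded variable such as sex; it only indicates that code 2 is more frequent than code 1, which the mode shows more clearly.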

Descriptive analysis:
Path: Analyze --> Descriptive Statistics --> Descriptives
Standard deviation: a measure of the dispersion of a variable. For a normally distributed variable (please see Annex vii Statistics), about 68% of the samples fall within one standard deviation of the mean (mean ± std), about 95% fall within two standard deviations (mean ± 2std) and about 99.7% fall within three standard deviations (mean ± 3std). For example, if the average age of a group of fishermen is 45 and the standard deviation is 10, then about 95% of the samples fall between 25 and 65.
Range: the difference between the maximum and minimum values of a numeric variable.
Min value: the minimum value of a variable
Max value: the maximum value of a variable
Besides these, the descriptive analysis offers several other parameters, e.g. variance, standard error of the mean...
Example: descriptive analysis
Descriptive Statistics
                      N      Range   Minimum   Maximum   Mean    Std. Deviation
Age of Respondent     1514   71      18        89        45.63   17.808
Valid N (listwise)    1514
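The relation between the standard deviation and the share of samples it covers can be checked with the normal distribution in Python's standard library. This is a minimal sketch; the function names spread_range and coverage are hypothetical, and the percentages hold only if the variable is normally distributed.

```python
from statistics import NormalDist


def spread_range(mean, std, k):
    """Interval from mean - k*std to mean + k*std."""
    return (mean - k * std, mean + k * std)


def coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


# Example from the text: average age 45, standard deviation 10.
print(spread_range(45, 10, 2))

for k in (1, 2, 3):
    print(k, round(100 * coverage(k), 1))
```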

Crosstabulation:
Path: Analyze --> Descriptive Statistics --> Crosstabs
This is one of the most important analysis functions in socio-economic analysis and assessment. A crosstabulation table shows the relationship between variables (e.g. age and education level, gender and health, or occupation and income). It presents each value of one variable against each value of the other variable. For every combination it gives the count, the percentage within each of the two variables and the percentage of the total sample, so it also shows the share of each value within a variable. Analysing this table reveals how the relationship differs from one value of a variable to another and helps the user recognise the most important value of a variable in that relationship. For time-series variables, it can show the development of the values of the variables, including declining or increasing trends.
Example: Crosstabulation Table
Most Important Problems in Last 12 Months * Respondent's Sex Crosstabulation

                                                       Respondent's Sex
                                                       Male      Female    Total
Health
  Count                                                35        57        92
  % within Most Important Problems in Last 12 Months   38.0%     62.0%     100.0%
  % within Respondent's Sex                            26.5%     27.9%     27.4%
  % of Total                                           10.4%     17.0%     27.4%
Finances
  Count                                                56        77        133
  % within Most Important Problems in Last 12 Months   42.1%     57.9%     100.0%
  % within Respondent's Sex                            42.4%     37.7%     39.6%
  % of Total                                           16.7%     22.9%     39.6%
Lack of Basic Services
  Count                                                2         2         4
  % within Most Important Problems in Last 12 Months   50.0%     50.0%     100.0%
  % within Respondent's Sex                            1.5%      1.0%      1.2%
  % of Total                                           .6%       .6%       1.2%
Family
  Count                                                15        33        48
  % within Most Important Problems in Last 12 Months   31.3%     68.8%     100.0%
  % within Respondent's Sex                            11.4%     16.2%     14.3%
  % of Total                                           4.5%      9.8%      14.3%
Personal
  Count                                                9         10        19
  % within Most Important Problems in Last 12 Months   47.4%     52.6%     100.0%
  % within Respondent's Sex                            6.8%      4.9%      5.7%
  % of Total                                           2.7%      3.0%      5.7%
Miscellaneous
  Count                                                15        25        40
  % within Most Important Problems in Last 12 Months   37.5%     62.5%     100.0%
  % within Respondent's Sex                            11.4%     12.3%     11.9%
  % of Total                                           4.5%      7.4%      11.9%
Total
  Count                                                132       204       336
  % within Most Important Problems in Last 12 Months   39.3%     60.7%     100.0%
  % within Respondent's Sex                            100.0%    100.0%    100.0%
  % of Total                                           39.3%     60.7%     100.0%
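The three percentages reported in each cell of the crosstabulation can be recalculated from the counts. The Python sketch below uses the counts of the example table; the function name cell_percentages and the short keys of the result are hypothetical.

```python
# Counts from the example: most important problem by respondent's sex.
counts = {
    "Health": {"Male": 35, "Female": 57},
    "Finances": {"Male": 56, "Female": 77},
    "Lack of Basic Services": {"Male": 2, "Female": 2},
    "Family": {"Male": 15, "Female": 33},
    "Personal": {"Male": 9, "Female": 10},
    "Miscellaneous": {"Male": 15, "Female": 25},
}


def cell_percentages(table, row, col):
    """Return the row, column and total percentages for one cell."""
    n = table[row][col]
    row_total = sum(table[row].values())
    col_total = sum(r[col] for r in table.values())
    grand_total = sum(sum(r.values()) for r in table.values())
    return {
        "% within row": round(100 * n / row_total, 1),      # within the problem
        "% within column": round(100 * n / col_total, 1),   # within the sex
        "% of total": round(100 * n / grand_total, 1),
    }


health_male = cell_percentages(counts, "Health", "Male")
finances_female = cell_percentages(counts, "Finances", "Female")
print(health_male)
print(finances_female)
```

Reading the two kinds of percentage side by side shows the difference in meaning: 38.0% of the respondents who named health are men, while 26.5% of all men named health.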

Besides the tables, which are considered the main results of the analysis, SPSS can also produce a series of graphs in the descriptive statistics function; All analysis results, including frequencies, descriptives and crosstabs, are saved in SPSS viewer files (*.spo).

IV.3 Exporting results

To export the viewer file (*.spo) to MS Word (*.doc)
Path: SPSS viewer (output) --> File --> Export --> select All Objects and Word/RTF file (*.doc)
To export the viewer file (*.spo) to MS Excel (*.xls)
Path: SPSS viewer (output) --> File --> Export --> select All Objects and Excel file (*.xls)
Notes: Options should be set with Output document; the Browse button sets the path where the file is saved.


ANNEXES
i. Sampling and sample selection
As mentioned above (see I.3 Sampling), there are two ways to select people to sample: random sampling and non-random sampling.
Table: Approaches for sampling, including advantages and disadvantages of each

Sampling method: Non-Random Sampling
Use: Oral history, focus group, observation, survey, semi-structured interview
Advantages: Relatively inexpensive, not time-consuming, uncomplicated, does not require a well-defined stakeholder group, helps achieve better representation of diversity in the group
Disadvantages: Resulting data are not statistically representative of the stakeholder group

Sampling method: Random Sampling
Use: Survey, semi-structured survey
Advantages: Data are statistically representative of the stakeholder group
Disadvantages: Expensive, time-consuming, complicated, requires a well-defined stakeholder group (e.g. list of all stakeholders)

NON-RANDOM SAMPLING
In this approach the team selects specific people as informants to gain a better understanding of the different viewpoints, attitudes, perceptions and concerns of the whole group. Because the informants are selected and not taken randomly from a clearly defined group, the information is not representative of the whole group (i.e. the information is not statistically representative). To overcome the statistical weakness of non-random sampling, the team should select people who can represent different perceptions and viewpoints. These people can help the team understand the complex patterns of how different people view local conditions and particular issues. By crosschecking information from these different people, the team can increase their confidence that the information represents the whole group. The team can be reasonably confident that these opinions and perceptions are held by the whole group, but these will be impressions and not statistically sound findings.
When to use non-random sampling
Non-random sampling is typically used when:
the team does not have the resources to conduct a full, statistically representative sample;
the team wants to interview specific people;
the stakeholder group is not well enough defined to select people at random; or
the team does not expect to analyse the data statistically (e.g. qualitative information).

This approach is most useful for focus group interviews which involve interviewing particular people or observing specific events. This produces qualitative information, which usually cannot be analysed statistically. Non-random sampling is often used for semi-structured interviews because these interviews can be time-consuming and the results are usually qualitative. This approach can also be used for surveys when there is not enough time or resources to survey a statistically representative group, or the team wants a rapid overview of the stakeholder group. The main advantage of non-random sampling is that it is less expensive, takes less time and is less complicated than random sampling. It also does not require a well-defined stakeholder group and can help gain a better representation of diversity in the group. However, the resulting data cannot be statistically analysed and cannot, therefore, be taken as necessarily representing the perceptions of the whole stakeholder group.
How to select informants
The most common approach for non-random sampling is purposive sampling, in which team members use their judgment to select the stakeholders to sample. Usually these stakeholders are key informants who can provide insights about the larger stakeholder group, e.g. the oldest fisherman in a fishing community or the head of a fishing association may be selected as key informants for the capture fisheries industry. This approach is most valuable for focus group interviews which involve interviewing particular people. A common type of purposive sampling is snowball sampling, in which the selected informants are asked to give the names of other key informants in the same stakeholder group. Each new informant is asked for the names of other key informants until the team keeps hearing the same names, at which point the group can be regarded as fully sampled. The snowball approach is best used when the group being sampled is small enough to have almost complete coverage. Another type of purposive sampling is sidewalk sampling (or convenience sampling), in which the team interviews people who pass by and are willing to participate in the study. This allows the team to assess a large number of people at minimal cost. This approach is most useful for conducting semi-structured interviews, observations and rapid surveys. The team should be sure that the full range of perceptions is represented when using any of these non-random sampling approaches, e.g. older fishermen may have different perceptions of the cultural value of fishing than younger ones, so the team should interview both older and younger fishermen about the cultural value of fishing. Important factors to consider in identifying this range of perceptions include:

Gender; Age (e.g. young fishermen, older fishermen); Socioeconomic levels (i.e. wealth, education, social standing); Occupational group (e.g. small-scale farmers, plantation farmers); Ethnic group; and Location (e.g. fishermen living by landing site, fishermen living inland)

The team should use all they have learned during the previous steps to identify those people who would provide the full range of perceptions. It may also help to make simple sampling rules, such as selecting every fifth person coming out of a shop. This helps ensure that the full range of people is surveyed, not just particular types of people, e.g. only wealthy tourists or older people. The number of stakeholders who need to be assessed should be based on the team's best judgment. A general rule is to interview people until the answers become repetitive and no new information is being generated.
RANDOM SAMPLING
If the assessment team feels that it is important to be highly confident that the results of their assessment are statistically representative of the whole group, then they should select a random sample of informants. A random sample means that the people talked to have been selected without bias influencing the team's selection - the probability of each person being selected as an informant is equal. In random sampling the team assesses a statistically representative sample of the group, so the data are statistically representative of the whole group.
When to use random sampling


Random sampling is typically used when the team wants statistically representative data and has the time and resources to conduct this intensive approach. This approach requires that the stakeholder group is well-defined so that the team can randomly select people. The group can be defined in a comprehensive list of all stakeholders, e.g. a list of fishermen or a list of aquaculturists registered with the Fisheries Department. Alternatively, a map of their locations can allow the team to randomly select their sample and then locate those people to interview. This approach is most appropriate for surveys, which are designed to gain quantitative data for statistical analysis. Observations can also be conducted using random sampling. For example, if the assessment team is interested in the percentage of fishing boats that influence fisheries resources when fishing, then the team could select a statistically representative number of fishing boats and observe their fishing practices. Informants for semi-structured interviews can also be selected using random sampling; however, the results from semi-structured interviews are typically qualitative due to the exploratory nature of the questions, and usually cannot be analysed statistically. The main disadvantages of this approach are that it is expensive, time-consuming and complicated and that it requires a well-defined stakeholder group; in addition, determining the appropriate sample size often requires a statistician, because the statistical principles and requirements that must be met to obtain reasonably accurate results are quite strict. However, the advantage is that the data are statistically representative of the whole group, because they are free of the bias that the subjective views of the survey's organiser introduce when samples are selected non-randomly.
How to select informants
As a general rule, when selecting informants for random sampling, the larger the sample size, the greater the level of accuracy and the more certain the assessment team can be that the results from the sample represent the whole group. To determine how many informants to interview, the assessment team must first decide on two interrelated factors: their confidence interval and their level of confidence. The confidence interval indicates the accuracy of the results, e.g. if the confidence interval is 10%, then the results are accurate +/- 10%. If the average age is 50 and the confidence interval is 10%, then the average age is considered to be 50 +/- 10%, or between 45 and 55. The level of confidence expresses how certain the team wants to be of the results, that is, how much error the team is willing to accept, e.g. if the level of confidence is 95%, the team can be 95% certain the results, including the confidence interval, are correct. Putting these two factors together, if the team selects a 95% level of confidence and a 10% confidence interval, then they can be 95% certain their results are representative of the whole group plus or minus 10%. Therefore, if the sample informants' average age is 50 and the informants were selected using a 95% level of confidence and a 10% confidence interval, then the team can be 95% certain that the average age of the larger stakeholder group is between 45 and 55. There is no rule for selecting a level of confidence or a confidence interval. The team should determine these factors on a case-by-case basis, taking into consideration the specific goals and objectives of the study as well as time and budget constraints. The team should consider the sensitivity of the study results, including the potential consequences of these results if they are incorrect. If the study is particularly sensitive, the team may decide to use a high level of confidence and a high confidence interval (e.g. 99% level of confidence and a 1% confidence interval). In general 99% is considered a high level of confidence, 95% is average and 90% is low. Similarly a 1% confidence interval is high, 5% is average, and 10% is low. In most situations it is widely accepted to use a 95% confidence level and a 5% confidence interval. The table below lists the sample sizes for various stakeholder group sizes for a confidence interval of 5% and levels of confidence of 95% and 99%. In general, the larger the group, the larger the sample size. However, the smaller the group, the larger the portion of people that should be interviewed. This is because the smaller the sample size, the greater the effect of biases on the results. To prevent a small number of people from biasing the results, the sample size should be as large as possible for small groups, especially if biases are known to be present in the group. In general, for groups of less than 500 people, no more than half of the group should be interviewed. The exact sample sizes for these small groups vary depending on several factors particular to the situation and beyond the scope of this manual (see Rea and Parker 1997 for more information).
Table: Number of informants to interview for various stakeholder group sizes (Rea and Parker 1997)

Stakeholder group size   95% Level of Confidence,   99% Level of Confidence,
                         5% Confidence Interval     5% Confidence Interval
Less than 500            Generally no more than     Generally no more than
                         half the group             half the group
500                      218                        250
1,000                    278                        399
1,500                    306                        460
2,000                    323                        498
3,000                    341                        544
5,000                    357                        586
10,000                   370                        622
20,000                   377                        642
50,000                   382                        655
100,000                  383                        659

Having determined how many people to survey, the team now needs to determine who to survey. The assessment team can use the simple random sampling approach or the systematic random sampling approach. In the simple random sampling approach the team numbers all the stakeholders, either on the list of stakeholders or on the map of their locations, and then selects stakeholders by:

selecting numbers from a table of random numbers (e.g. the first 2 digits of phone numbers in a telephone book); or

putting the numbers on small cards in a bowl or a hat and pulling a number, making sure to replace the card chosen so as to maintain the probability of choosing any card with each draw.

This selection process should be repeated until the desired sample size is reached.
Systematic random sampling is used when the stakeholder group is large, making it difficult to assign numbers to people for simple random selection. In this approach, the team selects informants from the list at fixed intervals. The informants are selected in proportion to the percent of the group the sample should represent. For example, if the assessment team has identified 1,000 fishing households and has determined that the sample size should be 250, then the assessment team should survey 250/1,000 households, or 1 in 4 households. The team would then randomly choose a starting point between the first and fourth household on the list, and work their way down the list selecting every fourth name to survey. In the case of a map, the team could walk through the area selecting every fourth household to survey. This approach can be made more random by selecting the house on the left or right based on the flip of a coin.
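The two steps described in this annex, deciding how many informants to interview and then picking them at fixed intervals from a list, can be sketched in Python. This is only an illustration under stated assumptions: the sample size uses the common formula for a proportion with a finite population correction and the most cautious proportion of 0.5, which is not necessarily the method behind the table of Rea and Parker (1997), so results can differ from published tables by a unit because of rounding. The function names and the numbered household list are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_size(population, confidence=0.95, interval=0.05, proportion=0.5):
    """Sample size for a proportion, corrected for a finite population (assumed formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    n0 = z ** 2 * proportion * (1 - proportion) / interval ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))


def systematic_sample(names, size, rng=None):
    """Pick every k-th name after a random start, where k = len(names) // size."""
    if size <= 0 or size > len(names):
        raise ValueError("sample size must be between 1 and the list length")
    rng = rng or random.Random()
    step = len(names) // size
    start = rng.randrange(step)  # random starting point within the first interval
    return names[start::step][:size]


# Sample sizes at a 95% level of confidence and a 5% confidence interval.
for group in (1000, 5000, 10000):
    print(group, sample_size(group))

# A numbered list of 1,000 households sampled at 1 in 4.
households = list(range(1, 1001))
chosen = systematic_sample(households, 250, random.Random(1))
print(len(chosen), chosen[:5])
```

Passing a seeded random.Random makes the starting point reproducible, which helps when the selection has to be documented or repeated by another team member.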

ii. Structure of a questionnaire


There is no fixed form of questionnaire that suits every survey. The following structure is a very general form, and whoever designs a questionnaire should adapt both its form and its contents to the real conditions and requirements of his/her survey. The structure of a questionnaire depends mainly on the requirements of the research objectives. However, there are normally 5 main parts, including: - First part: usually covers the general information of the household, such as name, age, address, education level, number of family members, overall living conditions....


Second part: covers the possibilities and capabilities in terms of technique, capital and production experience. This includes questions related to training activities, credit accessibility, interest rates...
Third part: deals directly with the production activities carried out by the household. This includes:
Basic investment: all the initial costs that allow production to start. It includes the cost of land, pond building, equipment purchasing...
Cost of production: including fixed cost and variable cost. Fixed cost is the cost that the producer has to pay regardless of whether production takes place or not, e.g. depreciation cost, maintenance cost, interest payment... Variable cost is the cost that depends on the scale of production, such as the cost of fuel, fingerlings, feed...
Yield and turnover: includes questions about the yield and average price of products. It should not ask about loss or profit, because these are sensitive issues; that information should be gathered in another way or calculated by the team itself.
Fourth part: Market and Environment;
Fifth part: related to the advantages and disadvantages, and the recommendations of the people to the local community and local authority;
At the end, there should be a sentence of thanks if the questionnaire is sent to the respondent by post; otherwise the interviewer should thank the respondent.
Closed-ended questions: are questions that provide the informants with given or fixed answers (yes/no, true/false...), so the informants simply tick or mark an X on the answer they select. The informants could also provide concrete data, but it takes time because they have to recall many things that they usually have not noted down, e.g. daily cost of production or daily yield.... Closed-ended questions are usually easy to answer.
Open-ended questions: are the questions which are established with the suggestion to provide to the informants for their answers. It is easy to ask an open-ended question but it is quite difficult to answer it and analysis it. The open-ended question needs a long answer for description, explanation... so it needs to give enough space for the answer when designing questionnaire. Optional questions: are the questions which its answer is the options are already given by the designer to try to obtain the statistic conclusion for qualitative questions. Of course, it is also useful for the quantitative questions. This type of question is used for the issues which their attributes are most recognized or known already and the surveys organiser just want to make a statistical analysis but not to try to understand about the related reasons or characteristics. This is a type of closed-ended question and the difference here is the general closed-ended question just gives the simple option such as yes/no, true/false etc. but optional question could give much more detail options relate to the interested issues even explanation or description options to support later statistic analysis these issues. For example, an issue is predicted to have explanation and description answers but need to be statistical analyse; so the questionnaire maker need to try to list all the possible answer to facilitate the later coding and analysing data and also keep the interviewer try to exactly understand answers of the informants, clarify and choose appropriate option in questionnaire. However, this type of question still remains one option other for answers which are not yet predicted by the questionnaire maker. Ranking questions: are the questions which aim to provide the relatively comparison between comparable items e.g. resources, production activities or income... 
This usually provide a hierarchy of items such as very important, important, less important or not important at all to help the informants in ranking issues. For example, assessment requires to obtain description of importance of income from fisheries in total income of household in Tam Giang lagoon areas then questionnaire maker should give the scale very important, important, less important or not important at all to question Importance of income from fisheries to your family?

iii. Types of question


-

48

iv. Principles in designing question


- Use closed-ended questions only, including true/false answers, ranges of answers and multiple-choice answers;
- The question has to be short, simple and easy to understand. If a question is too long, it should be divided into several short sentences to help the informants;
- Avoid questions whose options are joined by the conjunction "or", e.g. "Do you like motorcycles or cars?" or "Do you like to study in Hanoi or in HCM city?". If the answer is "yes", the analyser cannot know which option the informant selected;
- Avoid proverbs or very popular sayings, especially when assessing the view of the informant on an issue, because they make it easy for the informant to agree with the common view instead of saying what they really think;
- Avoid double-negative questions, e.g. "Waste should not be discharged into the lagoon, should it?". If the answer is "Disagree", it could be understood as the informant agreeing to discharge waste into the lagoon, so the opinion of the informant will be misunderstood or at least be very difficult to determine. It is therefore better to use affirmative questions;
- Avoid questions that the informant does not know about or cannot reasonably answer, e.g. "Do you agree that the Central Government is playing its role very well?". Most informants, particularly local rural informants, could not answer this question. Be realistic about what informants know;
- Use unambiguous wording, clear and simple syntax, and local vocabulary, including local taxonomies and nomenclature, to avoid misunderstanding;
- Questions should be asked in an open way and, as far as possible, should use fixed answers, options or ranking;
- Place those questions that will influence other questions last. The questions should be placed in a logical order (e.g. by time, subject or sector...);
- Put sensitive questions last (e.g. "How much money do you make in a week?") or use alternative related questions to understand the situation;
- Avoid leading questions, e.g. "Not very many women fish in this area, do they?" (ask instead: "How many women fish in the community?");
- If working in two or more languages or dialects, translate and back-translate from one to the other until all differences are resolved;
- A question should not be too long (20-25 words per question is reasonable).

v. Individual interview and Public interview


The individual interview is conducted with each informant in turn. It normally takes place at the informant's house, at their production site or in a public place. The important point is that the interview should not take too long and should not be interrupted from outside. Comfort, privacy and quiet during the interview greatly encourage the informant to provide information. The informant should feel that they are helping the survey team and having an interesting talk with a specialist. If many informants are gathered in the same place, the interviewer should skillfully separate them for individual interviews.

Table 9: Home and public interview
Home interview | Public interview
Accurate information on the views of the informant | Views of the informant could be influenced by others
Data can be checked against the real conditions of the household, and additional data can be provided by other family members | Data cannot be checked against the real conditions of the household, and additional data may be provided by non-family members
Requires more resources (time, money...) from the interviewer | Saves resources (time, money...) for the interviewer
Saves time for the informant | Requires more time from the informant
Improves the importance of the survey |

Comparison between individual and group interview
Group interview | Individual interview
Saves money | High cost
Saves time | Time-consuming
Easier to arrange, with informants gathered in one place at the same time | Difficult to make an appointment with every informant at a different time
Requires skillful and experienced personnel |
Difficult to conduct if there is a prominent informant in the group, or if the group is divided by too many different views/opinions | Easy to conduct with only one informant, who is also not influenced by others
Difficult to check the quality of quantitative or sensitive information | Quantitative or sensitive information of households can be checked when interviewing there
Information could be influenced by the surroundings | Information is more accurate when collected from each informant separately

In brief, the type of interview used for collecting data depends largely on the requirements and objectives of the assessment. In practice, both are usually used together to obtain deeper, broader and more accurate information. The limitations of the assessment also need to be considered before deciding which type of interview will be used.

vi. Interview methods: Difficulties and Advantages

Main interview forms:
- Interview by mail
- Interview by e-mail
- Interview by telephone
- Direct interview
Nowadays in Vietnam, interviews are usually conducted directly, and a few by telephone. Interviews by mail or e-mail are very rare in Vietnam, except among the international organisations located in Vietnam.

Table 8: Advantages and disadvantages of interview forms
Interview forms | Disadvantages | Advantages
Interview by mail | Many people don't answer; addresses of informants may be wrong; informants may not fully understand the questions; no suggestion from the interviewer for open-ended questions | Low cost; may reach a very large number of informants; needs few people to conduct
Interview by e-mail | Most Vietnamese people do not yet have an e-mail address | Very low cost
Interview by telephone | Many people, especially in rural areas, do not yet have a telephone | High possibility of getting feedback; small human resources needed
Direct interview | High cost; large human resources needed; very complex data management and analysis | Can exchange much necessary information; obtains enough answers with good quality information; obtains more information from open-ended questions thanks to the suggestions of the interviewer


vii. Statistics
Statistics

The word "statistics" is used in several different senses. In the broadest sense, "statistics" refers to a range of techniques and procedures for analyzing data, interpreting data, displaying data, and making decisions based on data. This is what courses in "statistics" generally cover. In a second usage, a "statistic" is defined as a numerical quantity (such as the mean) calculated in a sample. Such statistics are used to estimate parameters. The term "statistics" sometimes refers to calculated quantities regardless of whether or not they are from a sample. For example, one might ask about a baseball player's statistics and be referring to his or her batting average, runs batted in, number of home runs, etc. Or, "government statistics" can refer to any numerical indexes calculated by a governmental agency. Although the different meanings of statistics have the potential for confusion, a careful consideration of the context in which the word is used should make its intended meaning clear.
Parameters

A parameter is a numerical quantity measuring some aspect of a population of scores. For example, the mean is a measure of central tendency. Greek letters are used to designate parameters. The table below shows several parameters of great importance in statistical analyses and the Greek symbol that represents each one. Parameters are rarely known and are usually estimated by statistics computed in samples. To the right of each Greek symbol is the symbol for the associated statistic used to estimate it from a sample.

Quantity | Parameter | Statistic
Mean | $\mu$ | M
Standard deviation | $\sigma$ | s
Proportion | $\pi$ | p
Correlation | $\rho$ | r
Arithmetic Mean

The arithmetic mean is what is commonly called the average: When the word "mean" is used without a modifier, it can be assumed that it refers to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula in summation notation is: $\mu = \Sigma X / N$ where $\mu$ is the population mean and N is the number of scores. If the scores are from a sample, then the symbol M refers to the mean and N refers to the sample size. The formula for M is the same as the formula for $\mu$. The mean is a good measure of central tendency for roughly symmetric distributions but can be misleading in skewed distributions since it can be greatly influenced by extreme scores. Therefore, other statistics such as the median may be more informative for distributions such as reaction time or family income that are frequently very skewed. The sum of squared deviations of scores from their mean is lower than their squared deviations from any other number. For normal distributions, the mean is the most efficient and therefore the least subject to sample fluctuations of all measures of central tendency. The formal definition of the arithmetic mean is $\mu = E[X]$ where $\mu$ is the population mean of the variable X and E[X] is the expected value of X.
Median


The median is the middle of a distribution: half the scores are above the median and half are below the median. The median is less sensitive to extreme scores than the mean and this makes it a better measure than the mean for highly skewed distributions. The median income is usually more informative than the mean income, for example. The sum of the absolute deviations of each number from the median is lower than is the sum of absolute deviations from any other number.

The mean, median, and mode are equal in symmetric distributions. The mean is higher than the median in positively skewed distributions and lower than the median in negatively skewed distributions.
Computation of Median

When there is an odd number of numbers, the median is simply the middle number. For example, the median of 2, 4, and 7 is 4. When there is an even number of numbers, the median is the mean of the two middle numbers. Thus, the median of the numbers 2, 4, 7, 12 is (4+7)/2 = 5.5.
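The two rules above, and the different sensitivity of the mean and the median to extreme scores, can be checked with a short Python sketch. The helper name `describe` and the extreme value 500 are illustrative assumptions, not part of the manual.

```python
from statistics import mean, median

def describe(scores):
    """Return (mean, median) of a list of numbers."""
    return mean(scores), median(scores)

# Odd number of scores: the median is the middle number.
print(describe([2, 4, 7]))

# Even number of scores: the median is the mean of the two middle numbers.
print(describe([2, 4, 7, 12]))

# Adding one extreme (hypothetical) score shifts the mean far more
# than it shifts the median.
print(describe([2, 4, 7, 12, 500]))
```

This is why the median is often preferred for skewed survey variables such as household income.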
Median standard error

The standard error of the median for large samples and normal distributions is: $\sigma_{\text{median}} = 1.253\,\sigma / \sqrt{N}$, where $\sigma$ is the standard deviation and N is the sample size. Thus, the standard error of the median is about 25% larger than that for the mean. It is thus less efficient and more subject to sampling fluctuations. This formula is fairly accurate even for small samples but can be very wrong for extremely non-normal distributions. For non-normal distributions, the standard error of the median is difficult to compute.
Efficiency

The efficiency of a statistic is the degree to which the statistic is stable from sample to sample. That is, the less subject to sampling fluctuation a statistic is, the more efficient it is. The efficiency of statistics is measured relative to the efficiency of other statistics and is therefore often called the relative efficiency. If statistic A has a smaller standard error than statistic B, then statistic A is more efficient than statistic B. The relative efficiency of two statistics may depend on the distribution involved. For instance, the mean is more efficient than the median for normal distributions but not for some extremely skewed distributions. The efficiency of a statistic can also be thought of as the precision of the estimate: The more efficient the statistic, the more precise the statistic is as an estimator of the parameter.
Percentile rank

A percentile rank is typically defined as the proportion of scores in a distribution that a specific score is greater than or equal to. For instance, if you received a score of 95 on a math test and this score was greater than or equal to the scores of 88% of the students taking the test, then your percentile rank would be 88. You would be in the 88th percentile.
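As a minimal sketch of this definition, the function below counts the scores that a given score is greater than or equal to. The function name and the test scores are hypothetical.

```python
def percentile_rank(score, scores):
    """Percentage of scores that `score` is greater than or equal to."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical test results for five students.
results = [50, 60, 70, 95, 99]
print(percentile_rank(95, results))
```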
Trimean

The trimean is computed by adding the 25th percentile plus twice the 50th percentile plus the 75th percentile and dividing by four. What follows is an example of how to compute the trimean. The 25th, 50th, and 75th percentile of the dataset "Example 1" are 51, 55, and 63 respectively. Therefore, the trimean is computed as:

$$\text{Trimean} = \frac{51 + 2 \times 55 + 63}{4} = \frac{224}{4} = 56$$

The trimean is almost as resistant to extreme scores as the median and is less subject to sampling fluctuations than the arithmetic mean in extremely skewed distributions. It is less efficient than the mean for normal distributions. The trimean is a good measure of central tendency and is probably not used as much as it should be.
Trimmed mean

A trimmed mean is calculated by discarding a certain percentage of the lowest and the highest scores and then computing the mean of the remaining scores. For example, a mean trimmed 50% is computed by discarding the lower and higher 25% of the scores and taking the mean of the remaining scores. The median is the mean trimmed 100% and the arithmetic mean is the mean trimmed 0%. A trimmed mean is obviously less susceptible to the effects of extreme scores than is the arithmetic mean. It is therefore less susceptible to sampling fluctuation than the mean for extremely skewed distributions. It is less efficient than the mean for normal distributions. Trimmed means are often used in Olympic scoring to minimize the effects of extreme ratings possibly caused by biased judges.
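The trimean and the trimmed mean described above can be sketched in Python as follows. The function names are illustrative, the percentiles are assumed to be already known, and the trimming percentage is assumed to be below 100 so that at least one score remains.

```python
def trimean(p25, p50, p75):
    """(25th percentile + 2 * 50th percentile + 75th percentile) / 4."""
    return (p25 + 2 * p50 + p75) / 4

def trimmed_mean(scores, percent):
    """Mean after discarding percent/2 of the scores from each end.

    Assumes 0 <= percent < 100.
    """
    ordered = sorted(scores)
    k = int(len(ordered) * percent / 200)  # number of scores dropped per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Percentiles 51, 55 and 63 give a trimean of 56.
print(trimean(51, 55, 63))

# Hypothetical scores with one extreme value.
scores = [1, 2, 3, 4, 100]
print(trimmed_mean(scores, 0))   # ordinary arithmetic mean
print(trimmed_mean(scores, 40))  # drops the lowest and the highest score
```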
Skew

A distribution is skewed if one of its tails is longer than the other. The first distribution shown has a positive skew. This means that it has a long tail in the positive direction. The distribution below it has a negative skew since it has a long tail in the negative direction. Finally, the third distribution is symmetric and has no skew. Distributions with positive skew are sometimes called "skewed to the right" whereas distributions with negative skew are called "skewed to the left."

Distributions with positive skew are more common than distributions with negative skews. One example is the distribution of income. Most people make under $40,000 a year, but some make quite a bit more with a small number making many millions of dollars per year. The positive tail therefore extends out quite a long way whereas the negative tail stops at zero. For a more psychological example, a distribution with a positive skew typically results if the time it takes to make a response is measured. The longest response times are usually much longer than typical response times whereas the shortest response times are seldom much less than the typical response time. A histogram of the author's performance on a perceptual motor task in which the goal is to move the mouse to and click on a small target as quickly as possible is shown below. The X axis shows times in milliseconds.


Negatively skewed distributions do occur, however. Consider the following frequency polygon of test grades on a statistics test where most students did very well but a few did poorly. It has a large negative skew.

Skew can be calculated as:

$$\text{skew} = \frac{\Sigma (X - \mu)^3}{N \sigma^3}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation. The normal distribution has a skew of 0 since it is a symmetric distribution. As a general rule, the mean is larger than the median in positively skewed distributions and less than the median in negatively skewed distributions. There are counterexamples. For example, it is not uncommon for the median to be higher than the mean in a positively skewed bimodal distribution or with discrete distributions.
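A minimal Python sketch of a skew calculation follows. It assumes the common definition of skew as the sum of cubed deviations from the mean divided by N times the cubed standard deviation (population form); the function name and the sample data are hypothetical.

```python
from math import sqrt

def skew(scores):
    """Sum of cubed deviations from the mean, divided by N * sigma^3."""
    n = len(scores)
    mu = sum(scores) / n
    sigma = sqrt(sum((x - mu) ** 2 for x in scores) / n)  # population SD
    return sum((x - mu) ** 3 for x in scores) / (n * sigma ** 3)

print(skew([1, 2, 3, 4, 5]))   # symmetric data: no skew
print(skew([1, 1, 1, 10]))     # long tail to the right: positive skew
print(skew([1, 10, 10, 10]))   # long tail to the left: negative skew
```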
Normal Distribution

Normal distributions are a family of distributions that have the shape shown below.

Normal distributions are symmetric with scores more concentrated in the middle than in the tails. They are defined by two parameters: the mean ($\mu$) and the standard deviation ($\sigma$). Many kinds of behavioral data are approximated well by the normal distribution. Many statistical tests assume a normal distribution. Most of these tests work well even if the distribution is only approximately normal and in many cases as long as it does not deviate greatly from normality. The formula for the height (y) of a normal distribution for a given value of x is:

$$y = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$$
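The height of a normal curve can be evaluated with the standard normal density formula, sketched below in Python. The function name `normal_height` is an illustrative assumption.

```python
from math import exp, pi, sqrt

def normal_height(x, mu, sigma):
    """Height of the normal curve with mean mu and SD sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# The curve is highest at the mean and symmetric around it.
print(normal_height(0, 0, 1))
print(normal_height(1, 0, 1), normal_height(-1, 0, 1))
```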

Correlation

The correlation between two variables represents the degree to which variables are related. Typically the linear relationship is measured with either Pearson's correlation or Spearman's rho. It is important to keep in mind that correlation does not necessarily mean causation. For example, there is a high positive relationship between the number of fire fighters sent to a fire and the amount of damage done. Does this mean that the fire fighters cause the damage?

Or is it more likely that the bigger the fire, the more fire fighters are sent and the more damage is done? In this example, the variable "size of the fire" is the causal variable, correlating with both the number of fire fighters sent and the amount of damage done.
Linear relationship

When two variables are perfectly linearly related, the points of a scatterplot fall on a straight line as shown below. If you know the score of a subject on one variable then you can determine the score on the other variable exactly. With behavioral data, there is almost never a perfect linear relationship between two variables. The more the points tend to fall along a straight line the stronger the linear relationship. The figure below shows two variables (X and Y) that have a strong but not a perfect linear relationship.

Pearson's correlation

The correlation between two variables reflects the degree to which the variables are related. The most common measure of correlation is the Pearson Product Moment Correlation (called Pearson's correlation for short). When measured in a population the Pearson Product Moment correlation is designated by the Greek letter rho ($\rho$). When computed in a sample, it is designated by the letter "r" and is sometimes called "Pearson's r." Pearson's correlation reflects the degree of linear relationship between two variables. It ranges from +1 to -1. A correlation of +1 means that there is a perfect positive linear relationship between variables. The scatterplot shown on this page depicts such a relationship. It is a positive relationship because high scores on the X-axis are associated with high scores on the Y-axis. A correlation of -1 means that there is a perfect negative linear relationship between variables. The scatterplot shown to the right depicts a negative relationship. It is a negative relationship because high scores on the X-axis are associated with low scores on the Y-axis. A correlation of 0 means there is no linear relationship between the two variables. The second graph shows a Pearson correlation of 0. Correlations are rarely if ever 0, 1, or -1. Some real data showing a moderately high correlation are shown on the next page.

The scatterplot below shows arm strength as a function of grip strength for 147 people working in physically-demanding jobs. The plot reveals a strong positive relationship. The value of Pearson's correlation is 0.63.
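Pearson's r for a sample can be computed from the deviations of each variable from its mean. The sketch below uses the standard product-moment formula; the function name and the small data sets are hypothetical and are not the arm and grip strength data mentioned above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfect positive relationship
print(pearson_r([1, 2, 3], [6, 4, 2]))   # perfect negative relationship
print(pearson_r([1, 2, 3, 4], [2, 1, 4, 3]))  # positive but not perfect
```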


Variables

A variable is any measured characteristic or attribute that differs for different subjects. For example, if the weight of 30 subjects were measured, then weight would be a variable.
Quantitative and Qualitative

Variables can be quantitative or qualitative. (Qualitative variables are sometimes called "categorical variables.") Quantitative variables are measured on an ordinal, interval, or ratio scale; qualitative variables are measured on a nominal scale. If five-year old subjects were asked to name their favorite color, then the variable would be qualitative. If the time it took them to respond were measured, then the variable would be quantitative.
Independent and Dependent

When an experiment is conducted, some variables are manipulated by the experimenter and others are measured from the subjects. The former variables are called "independent variables" or "factors"; the latter are called "dependent variables" or "dependent measures." For example, consider a hypothetical experiment on the effect of drinking alcohol on reaction time: Subjects drank either water, one beer, three beers, or six beers and then had their reaction times to the onset of a stimulus measured. The independent variable would be the number of beers drunk (0, 1, 3, or 6) and the dependent variable would be reaction time.
Continuous and Discrete

Some variables (such as reaction time) are measured on a continuous scale. There is an infinite number of possible values these variables can take on. Other variables can only take on a limited number of values. For example, if a dependent variable were a subject's rating on a five-point scale where only the values 1, 2, 3, 4, and 5 were allowed, then only five possible values could occur. Such variables are called "discrete" variables.
Continuous variable

A continuous variable is one for which, within the limits the variable ranges, any value is possible. For example, the variable "Time to solve an anagram problem" is continuous since it could take 2 minutes, 2.13 minutes etc. to finish a problem. The variable "Number of correct answers on a 100 point multiple-choice test" is not a continuous variable since it is not possible to get 54.12 problems correct. A variable that is not continuous is called "discrete."
Discrete variable

A discrete variable is one that cannot take on all values within the limits of the variable. For example, responses to a five-point rating scale can only take on the values 1, 2, 3, 4, and 5. The variable cannot have the value 1.7. A variable such as a person's height can take on any value. Variables that can take on any value and therefore are not discrete are called continuous. Statistics computed from discrete variables have many more possible values than the discrete variables themselves. The mean on a five-point scale could be 3.117 even though 3.117 is not possible for an individual score.
Ordinal scale

Measurements with ordinal scales are ordered in the sense that higher numbers represent higher values. However, the intervals between the numbers are not necessarily equal. For example, on a five-point rating scale measuring attitudes toward gun control, the difference between a rating of 2 and a rating of 3 may not represent the same difference as the difference between a rating of 4 and a rating of 5. There is no "true" zero point for ordinal scales since the zero point is chosen arbitrarily. The lowest point on the rating scale in the example was arbitrarily chosen to be 1. It could just as well have been 0 or -5.
Interval scale

On interval measurement scales, one unit on the scale represents the same magnitude on the trait or characteristic being measured across the whole range of the scale. For example, if anxiety were measured on an interval scale, then a difference between a score of 10 and a score of 11 would represent the same difference in anxiety as would a difference between a score of 50 and a score of 51. Interval scales do not have a "true" zero point, however, and therefore it is not possible to make statements about how many times higher one score is than another. For the anxiety scale, it would not be valid to say that a person with a score of 30 was twice as anxious as a person with a score of 15. True interval measurement is somewhere between rare and nonexistent in the behavioral sciences. No interval-level scale of anxiety such as the one described in the example actually exists. A good example of an interval scale is the Fahrenheit scale for temperature. Equal differences on this scale represent equal differences in temperature, but a temperature of 30 degrees is not twice as warm as one of 15 degrees.
Ratio scale

Ratio scales are like interval scales except they have true zero points. A good example is the Kelvin scale of temperature. This scale has an absolute zero. Thus, a temperature of 300 Kelvin is twice as high as a temperature of 150 Kelvin.
Nominal Scale

Nominal measurement consists of assigning items to groups or categories. No quantitative information is conveyed and no ordering of the items is implied. Nominal scales are therefore qualitative rather than quantitative. Religious preference, race, and sex are all examples of nominal scales. Frequency distributions are usually used to analyze data measured on a nominal scale. The main statistic computed is the mode. Variables measured on a nominal scale are often referred to as categorical or qualitative variables.
Frequency Table

A frequency table is constructed by dividing the scores into intervals and counting the number of scores in each interval. The actual number of scores as well as the percentage of scores in each interval are displayed. Cumulative frequencies are also usually displayed. A frequency table for the tournament players from the example dataset "chess" is shown below.

Cumulative distribution

A cumulative frequency distribution is a plot of the number of observations falling in or below an interval. The graph shown here is a cumulative frequency distribution of the scores on a statistics test. Thirty-eight students took the test. The X-axis shows various intervals of scores (the interval labeled 35 includes any score from 32.5 to 37.5). The Y-axis shows the number of students scoring in the interval or below the interval. A cumulative frequency distribution can show either the actual frequencies at or below each interval (as shown here) or the percentage of the scores at or below each interval. The plot can be a histogram as shown here or a polygon.
Histogram

A histogram is constructed from a frequency table. The intervals are shown on the X-axis and the number of scores in each interval is represented by the height of a rectangle located above the interval. A histogram of the response times from the dataset Target RT is shown below.

The shapes of histograms will vary depending on the choice of the size of the intervals. A bar graph is much like a histogram, differing in that the columns are separated from each other by a small distance. Bar graphs are commonly used for qualitative variables.

Stem and Leaf Plots

A stem and leaf display (also called a stem and leaf plot) is a graphical method of displaying data. It is particularly useful when the data are not too numerous. A stem and leaf plot is similar to a histogram except it portrays a little more precision. A stem and leaf plot of the tournament players from the dataset "chess" as well as the data themselves are shown below.

The largest value, 85.3, is approximated as 10 x 8 + 5. This is represented in the plot as a stem of 8 and a leaf of 5. It is shown as the "5" in the first line of the plot. Similarly, 80.3 is approximated as 10 x 8 + 0; it has a stem of 8 and a leaf of 0. It is shown as the "0" in the first line of the plot. Depending on the data, each stem is displayed 1, 2, or 5 times. When a stem is displayed only once (as on the plot shown above), the leaves can take on the values from 0-9. When a stem is displayed twice, (as in the example shown below) one stem is associated with the leaves 5-9 and the other stem is associated with the leaves 0-4.

Finally, when a stem is displayed five times, the first has the leaves 8-9, the second 6-7, the third 4-5, and so on. If positive and negative numbers are both present, +0 and -0 are used as stems as they are in the plot to the right. A stem of -0 and a leaf of 7 is a value of (-0 x 1) + (-.1 x 7) = -.7. There is a variation of stem and leaf displays that is useful for comparing distributions. The two distributions are placed back to back along a common column of stems. The figure below shows such a graph. It compares two distributions. The stems are in the middle, the leaves to the left are for one distribution, and the leaves to the right are for the other. For example, the second-to-last row shows that the distribution on the left contains the values 11, 12, and 13 whereas the distribution on the right contains two 12's and three 14's.

       11 | 4 |
          | 3 | 7
      332 | 3 | 233
     8865 | 2 | 889
 44331110 | 2 | 001112223
987776665 | 1 | 56888899
      321 | 1 | 22444
        7 | 0 | 69
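The basic single-stem display described in this section can be sketched in Python. This is only an illustration under stated assumptions: values are non-negative, each value is truncated to a whole number, the tens form the stem and the units form the leaf, and each stem is displayed once. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem and leaf plot, largest stem first."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v), 10)  # e.g. 85.3 -> stem 8, leaf 5
        groups[stem].append(leaf)
    lines = []
    for stem in range(max(groups), min(groups) - 1, -1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

for line in stem_and_leaf([85.3, 80.3, 72.1, 55.0]):
    print(line)
```

Stems without any value (6 in this example) are still printed, so that gaps in the distribution stay visible.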

Box Plot

A box plot provides an excellent visual summary of many important aspects of a distribution. The box stretches from the lower hinge (defined as the 25th percentile) to the upper hinge (the 75th percentile) and therefore contains the middle half of the scores in the distribution. The median is shown as a line across the box. Therefore 1/4 of the distribution is between this line and the top of the box and 1/4 of the distribution is between this line and the bottom of the box. The "H-spread" is defined as the difference between the hinges and a "step" is defined as 1.5 times the H-spread. Inner fences are 1 step beyond the hinges. Outer fences are 2 steps beyond the hinges. There are two adjacent values: the largest value below the upper inner fence and the smallest value above the lower inner fence. For the data plotted in the figure, the minimum value is above the lower inner fence and is therefore the lower adjacent value. The maximum value is beyond the upper inner fence, so it is not the upper adjacent value. As shown in the figure, a line is drawn from the upper hinge to the upper adjacent value and from the lower hinge to the lower adjacent value.

Every score between the inner and outer fences is indicated by an "o"; a score beyond the outer fences is indicated by a "*".
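The quantities used to draw a box plot can be computed as in the Python sketch below. It assumes the hinges are taken as the 25th and 75th percentiles computed by `statistics.quantiles` with the "inclusive" method; other percentile conventions give slightly different hinges. The function name, the dictionary keys and the data are hypothetical.

```python
from statistics import quantiles

def box_plot_summary(scores):
    """Hinges, fences, adjacent values and outside scores for a box plot."""
    q1, med, q3 = quantiles(scores, n=4, method="inclusive")
    step = 1.5 * (q3 - q1)                      # 1.5 times the H-spread
    inner = (q1 - step, q3 + step)              # 1 step beyond the hinges
    outer = (q1 - 2 * step, q3 + 2 * step)      # 2 steps beyond the hinges
    inside = [s for s in scores if inner[0] <= s <= inner[1]]
    return {
        "hinges": (q1, q3),
        "median": med,
        "inner_fences": inner,
        "outer_fences": outer,
        "adjacent": (min(inside), max(inside)),
        # Between inner and outer fences: drawn as "o".
        "outside": sorted(s for s in scores
                          if outer[0] <= s <= outer[1] and s not in inside),
        # Beyond the outer fences: drawn as "*".
        "far_outside": sorted(s for s in scores
                              if s < outer[0] or s > outer[1]),
    }

summary = box_plot_summary([1, 10, 11, 12, 13, 14, 15, 16, 40])
for name, value in summary.items():
    print(name, value)
```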


It is often useful to compare data from two or more groups by viewing box plots from the groups side by side. Plotted are data from Example 2a and Example 2b. The data from 2b are higher, more spread out, and have a positive skew. That the skew is positive can be determined by the fact that the mean is higher than the median and the upper whisker is longer than the lower whisker. Some computer programs present their own variations on box plots. For example, SPSS does not include the mean. JMP distinguishes between "outlier" box plots, which are the same as those described here, and quantile box plots that show the 10th, 25th, 50th, 75th, and 90th percentiles.
Frequency polygon

A frequency polygon is a graphical display of a frequency table. The intervals are shown on the X-axis and the number of scores in each interval is represented by the height of a point located above the middle of the interval. The points are connected so that together with the X-axis they form a polygon. A frequency table and a relative frequency polygon for response times in a study on weapons and aggression are shown below. The times are in hundredths of a second.

Lower    Upper    Count    Cumulative    Per Cent    Cumulative
Limit    Limit             Count                     Per Cent
25       30       1        1             3.12        3.12
30       35       4        5             12.48       15.62
35       40       8        13            24.96       40.62
40       45       15       28            46.80       87.50
45       50       3        31            9.36        96.88
50       55       1        32            3.12        100.00

Note: Values in each category are > the lower limit and ≤ the upper limit.

Frequency polygons can be based on the actual frequencies or the relative frequencies. When based on relative frequencies, the percentage of scores instead of the number of scores in each category is plotted.
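As a rough illustration, the points of a frequency polygon can be derived from the interval limits and counts in the table above. This sketch is in Python; the function name polygon_points is hypothetical. Percentages are computed exactly from the counts (for example 4/32 = 12.5), so they can differ slightly in the second decimal from the rounded figures printed in the table.

```python
def polygon_points(limits, counts, relative=False):
    """Return (x, y) points: x is the interval midpoint, y the count or per cent."""
    total = sum(counts)
    points = []
    for (lower, upper), count in zip(limits, counts):
        midpoint = (lower + upper) / 2
        # Relative polygons plot the percentage of scores instead of the count.
        height = 100 * count / total if relative else count
        points.append((midpoint, height))
    return points


# Interval limits and counts taken from the response-time table above.
limits = [(25, 30), (30, 35), (35, 40), (40, 45), (45, 50), (50, 55)]
counts = [1, 4, 8, 15, 3, 1]

frequency_polygon = polygon_points(limits, counts)
relative_polygon = polygon_points(limits, counts, relative=True)
print(frequency_polygon)
print(relative_polygon)
```

Connecting these points in order, and joining the ends down to the X-axis, gives the polygon.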


In a cumulative frequency polygon, the number of scores (or the percentage of scores) up to and including the category in question is plotted. A cumulative frequency polygon is shown below.
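The running totals behind a cumulative frequency polygon can be sketched in the same way. In this Python example the function name cumulative_points is hypothetical, and placing each point above the interval midpoint is an assumption carried over from the ordinary frequency polygon; the text does not state where the cumulative points are drawn.

```python
from itertools import accumulate


def cumulative_points(limits, counts, relative=False):
    """Return (x, y) points where y is the running total up to each interval."""
    total = sum(counts)
    running = list(accumulate(counts))  # number of scores up to and including each interval
    if relative:
        running = [100 * c / total for c in running]
    # Assumption: x is the interval midpoint, as for the ordinary polygon.
    midpoints = [(lower + upper) / 2 for lower, upper in limits]
    return list(zip(midpoints, running))


# Interval limits and counts from the response-time frequency table.
limits = [(25, 30), (30, 35), (35, 40), (40, 45), (45, 50), (50, 55)]
counts = [1, 4, 8, 15, 3, 1]

cumulative_counts = cumulative_points(limits, counts)
cumulative_percent = cumulative_points(limits, counts, relative=True)
print(cumulative_counts)
print(cumulative_percent)
```

The running totals reproduce the Cumulative Count column of the frequency table, and the percentages agree with the Cumulative Per Cent column after rounding.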

viii. Tips for conducting interviews


1. Interview by mail:
- The envelope should look neat and formal, to show respect for the informant;
- The letter needs to clearly explain the sampling method and give guidance for answering, so that the informant can provide information clearly, adequately and accurately;
- A stamped, pre-addressed return envelope needs to be enclosed to make it easy for the informant to send back their answers.
2. Interview by telephone: The introduction of both the interviewer and the objectives of the interview needs to be very simple and clear. It should be made in a way that encourages the informant to provide information freely.
3. Interview by e-mail: Besides a detailed introduction, the message also needs to clearly explain the sampling method and give guidance for answering, so that the informant can provide information clearly, adequately and accurately.

ix. Tips for field data collection


- A field survey is easier to conduct if participants are encouraged with small gifts, particularly when the survey is ongoing or has been conducted before. However, this needs delicate handling, depending on the actual conditions of the survey and local customs. Gifts should be objects or souvenirs, but can sometimes be a small amount of money, depending on the donor. Care is needed, because gifts can make it harder to gather information in a later survey that offers none;
- The relationship with the informants can be improved by taking part in local events during the survey period, e.g. weddings, festivals, sports games;
- The earnestness of the field survey team while doing the survey will substantially increase the importance of the survey in the eyes of the local people;
- Supporting activities such as PRA, RRA or a reconnaissance survey are very useful in providing background information for the household survey that follows.


GLOSSARY
Assessment - a study to collect data at one time
Assessment team - the people who do the socioeconomic assessment
Complex variable - a main part of an issue; e.g. land resources and water resources are complex variables (main parts) of the issue "natural resources"
End-users - people or organisations that use assessment findings to make decisions and policy about management, identify research needs, or plan development in coastal areas

Facilitator - team member who guides interviews by explaining the process, asking the questions and follow-up questions, and engaging people in discussion and analysis
Field team - small team that collects field data
Informants - people who answer surveys or participate in interviews

Key informants - people with rank, experience or knowledge who can provide extensive insight on socioeconomic conditions
Maximum - the highest value
Mean - the sum of all the scores divided by the number of scores
Median - the middle of a distribution: half the scores are above the median and half are below the median
Minimum - the smallest value
Mode - the most frequently occurring score in a distribution
Non-random sampling - samples are selected at the discretion of the researcher
Parameters - the elements, components or topics that are the focus of an assessment
Random sampling - a sampling procedure that assures that each element in the population has an equal chance of being selected is referred to as simple random sampling
Range - the difference between the greatest and the least value in the data
Skewness - a distribution is skewed if one of its tails is longer than the other
Standard deviation - a statistic that tells you how tightly the values in a set of data are clustered around the mean
Standard error of a statistic - the standard deviation of the sampling distribution of that statistic
Sub-parameter - one of the various forms in which a parameter appears in reality
Sampling - the process of selecting units (e.g. people, organizations) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen
Secondary data - data that have been collected, analysed and published
Simple variable - a variable that describes a complex variable; e.g. agriculture land, forestry land and aquaculture land are simple variables of the complex variable "land resources"


Socioeconomic assessment - study of the social, cultural, economic and political conditions of people, groups, communities and organisations
Stakeholder representatives - people who represent the views of stakeholders because of the positions they hold in formal or informal organisations
Stakeholders - people, groups, communities and organisations who use and depend on the related resources, whose activities affect those resources or who have an interest in these activities, including government agencies, non-government organisations, local users, universities and researchers
Study area - the area covered by the socioeconomic assessment
Study sites - small areas or communities within the study area
Variable - an attribute of a research issue



