Research Methodology in Development Studies
Indira Gandhi National Open University
School of Extension and Development Studies
Block 1
FUNDAMENTALS OF SOCIAL SCIENCE RESEARCH
UNIT 1 Social Science Research: An Overview
UNIT 2 Components of Social Science Research
UNIT 3 Research Designs
UNIT 4 Research Project Formulation
PROGRAMME DESIGN COMMITTEE
Prof. Amita Shah, Gujarat Institute of Development Research, Ahmedabad
Prof. S. K. Bhati, Jamia Millia Islamia, New Delhi
Prof. J. S. Gandhi (Rtd), Jawaharlal Nehru University, New Delhi
Prof. Gopal Krishnan (Rtd), Punjab University, Chandigarh
Prof. S. Janakrajan, Madras Institute of Development Studies, Chennai
Prof. Kumar B. Das, Utkal University, Bhubaneswar
Prof. Nadeem Mohsin (Rtd), A.N.Sinha Institute of Social Sciences, Patna
Prof. P. Radhakrishan, Madras Institute of Development Studies, Chennai
Prof. Ramashray Roy (Rtd), Centre for Study of Developing Societies, Delhi
Prof. R. P. Singh (Rtd), Ex-Vice-Chancellor, MPUAT, Udaipur
Prof. K. Vijayaraghavan, Indian Agriculture Research Institute, New Delhi
Dr. Nilima Shrivastava, IGNOU, New Delhi
Prof. B. K. Pattanaik, IGNOU, New Delhi
Dr. Nehal A. Farooquee, IGNOU, New Delhi
Dr. P.V.K.Sasidhar, IGNOU, New Delhi
PROGRAMME DESIGN COMMITTEE (REVISED)
Prof. T.S. Papola, Institute for Studies in Industrial Development, New Delhi
Prof. S. Janakrajan, Madras Institute of Development Studies, Chennai
Prof. S. K. Bhati, Jamia Millia Islamia, New Delhi
Prof. Preet Rustagi, Institute for Human Development, Delhi
Prof. Gopal Iyer (Rtd), Panjab University, Chandigarh
Dr. S. Srinivasa Rao, Jawaharlal Nehru University, New Delhi
Dr. S. Rubina Naqvi, Hindu College, University of Delhi, Delhi
Prof. Nadeem Mohsin (Rtd), A.N.Sinha Institute of Social Sciences, Patna
Prof. Rajesh, University of Delhi, New Delhi
Prof. B. K. Pattanaik, IGNOU, New Delhi
Prof. Nehal A. Farooquee, IGNOU, New Delhi
Prof. P.V.K. Sasidhar, IGNOU, New Delhi
Dr. Pradeep Kumar, IGNOU, New Delhi
Dr. Nisha Varghese, IGNOU, New Delhi
Dr. Grace Don Nemching, IGNOU, New Delhi
COURSE PREPARATION TEAM
Unit Writer:
Dr. Nadeem Mohsin (Rtd), A.N.Sinha Institute of Social Sciences (Unit 1 and 2)
Dr. Nisha Varghese, SOEDS, IGNOU (Unit 3)
Prof. Mamata Swain, Revenshaw University, Cuttack (Unit 4)
Editors:
Prof. V.K.Jain (Rtd) (Content Editor), NCERT, New Delhi
Mr. Praveer Shukla (Language Editor), New Delhi
Dr. Nisha Varghese, IGNOU
Prof. B. K. Pattanaik, IGNOU
Prof. Nehal A. Farooquee, IGNOU
Prof. P. V. K. Sasidhar, IGNOU
UNIT 1 SOCIAL SCIENCE RESEARCH:
AN OVERVIEW
Structure
1.1 Introduction
1.2 The Meaning and Concept of Social Science Research
1.3 The Differences between Natural and Social Science Research
1.4 Approaches to Social Science Research
1.5 Types of Social Science Research
1.6 Let Us Sum Up
1.7 References and Selected Readings
1.8 Check Your Progress – Possible Answers
1.1 INTRODUCTION
In a world that grows every day in population and knowledge, it is
important to understand how different societies work and influence each
other. Social science research is an important tool for understanding how
society functions and how human beings in society influence each other.
It deals with social phenomena and with the attitudes of human beings as
members of a society under different circumstances and situations. It helps
every nation to formulate legislation, policies, schemes and programmes
on socio-economic issues, and it has been an essential tool for the
government and the people.
Let us take an example of a social problem which research can help to
solve: the low level of literacy in the country. If the literacy level is low,
the country is not expected to flourish. Low levels of literacy lead to lower
levels of employment and lower incomes. Families with lower incomes tend
to send their children to work as child labour. Children who work and earn
incomes are not able to go to school and, hence, remain illiterate.
Illiteracy results in unemployment, and unemployment causes poverty. Social
science research helps in finding the causes of problems such as illiteracy,
unemployment and poverty, thereby assisting the government in formulating
legislation, policies, schemes and programmes for the eradication of
illiteracy.
Social science research is oriented toward building knowledge. It describes
the methods by which results are known; it sets up the inquiry process so
that evidence from all sides of a problem can be examined; and it
generalizes knowledge beyond the specific instances that are examined.
Research is one possible way through which knowledge can be generated.
After studying this unit, you should be able to
• explain the meaning and concept of social science research
• distinguish between natural and social science research
• describe the various types and approaches to social science research
• discuss the recent trends in social science research and understand the
basic concepts related to research
1.2 THE MEANING AND CONCEPT OF SOCIAL
SCIENCE RESEARCH
Social science research can be conducted, for example, in a school setting
where absenteeism is a major issue. What causes children to remain absent
from school? Are teachers very strict with students? Is teaching in the
school not up to the mark? Is the principal unable to handle the teachers
well? Is the school very far from most of the children’s homes? Are there
no proper toilet facilities for girls in the school? These are some of the
issues that could be addressed by research on absenteeism. Solutions for
absenteeism can also be drawn from the findings of the research. Research
can, thus, be an important tool in bringing about major policy changes in
the school.
What does Social Science research actually mean? This is not a very easy
question to answer. At one level, such research deals with social sciences
– Sociology, History, Geography, Psychology, Political Science, Economics,
etc., and, thus, all research based on these disciplines is social science
research. For example, a study of the living conditions of tribal communities
is research in Sociology. Similarly, a study of life during the time of
Ashoka is research in History; a study of the voting patterns of newly
eligible voters in a parliamentary election is research in Political Science;
a study of the behaviour of adolescents during parties is research in
Psychology; and a study of the income and expenditure patterns of middle
class urban households during a phase of inflation is research in Economics.
Research, in itself, is designed as a method of enquiry into issues concerning
human beings. According to Globusz Publishers, social science research
is a systematic method of exploring, analysing and conceptualising human
life in order to extend, correct or verify knowledge of human behaviour
and social life. In other words, social science research “seeks to find
explanations to unexplained social phenomena, to clarify the doubtful and
correct the misconceived facts of social life”. Research is not an arbitrary
activity; it follows certain rules and procedures.
Some of the features of social science research are as follows:
a) social science research is a method of enquiry to gain further
knowledge or enhance existing knowledge
b) social science research is essential to understand issues of human
concerns
c) social science research involves time and money
d) social science research is useful for the formulation of legislations,
policies, schemes, and programmes
There are broadly six types of social science research that are undertaken,
outlined below.
1. Applied Research
2. Fundamental Research
3. Action Research
4. Participatory Research
5. Participatory Action Research
6. Experimental Research
1.5.1 Applied Research
Applied research is empirical research in which the goal is to apply the
research findings to solve a problem. It is designed to solve practical
problems of the modern world, rather than to acquire knowledge for
knowledge’s sake. One might say that the goal of the applied scientist is
to improve the human condition. The primary purpose of applied research
is to discover, interpret and develop methods and systems for the
advancement of human knowledge on a wide variety of scientific matters
of our world and the universe.
For example, applied researchers may be involved in
• Research that addresses the effectiveness of a fundraising technique
• Research that addresses the effectiveness of a strategy to deal with
urban solid waste
• Research that studies the effect of minimum tillage on wheat yield.
Some scientists feel that the time has come for a shift in emphasis away
from purely basic research and toward applied research. This trend, they
feel, is necessitated by the problems resulting from global overpopulation,
pollution, and the overuse of the earth’s natural resources. Applied research
is used in business, education and the medical field to find cures for
diseases, solve scientific problems or develop new technologies. As applied
research delves into real-world problems, applied researchers are more
concerned about the external validity of their studies. They attempt to
observe behaviours that can be applied to real life situations.
1.5.2 Fundamental Research
Fundamental research or basic research (sometimes pure research) is
research carried out to increase understanding of fundamental principles.
It is carried out with a view to acquire knowledge for knowledge’s sake,
and not for finding immediate solutions to problems. Many times, the end
results have no direct or immediate benefits. Basic research can be thought
of as arising out of curiosity. However, in the long term it is the basis
for much applied research. Basic research is mainly carried out by
universities. It is less oriented towards immediate solutions to problems
and tends to deal with ideas of interest to the researcher. Fundamental
research is carried out to understand, for example, how teenagers in a
society behave or how a community in a society lives. It does not find
answers to how teenagers should actually behave, what causes teenagers
to behave in a particular way, or how a particular community in a society
should actually live. Any research undertaken to understand how a
particular system works falls in the category of fundamental research.
1.5.3 Action Research
Action research is a process of progressive problem solving led by
individuals working with others in teams or as part of community work
to improve the way they address issues and solve their own problems.
Action research can also be undertaken by larger organizations or institutions,
assisted or guided by professional researchers, with the aim of improving
their strategies, practices, and knowledge of the environments within which
they work. As designers and stakeholders, researchers work with others
to propose a new course of action to help their community improve its
work practices. Kurt Lewin, then a professor at MIT, first coined the term
‘action research’ in about 1944, and it appears in his 1946 paper, Action
Research and Minority Problems. In that paper, he described action research
as “a comparative research on the conditions and effects of various forms
of social action and research leading to social action” that uses “a spiral
of steps, each of which is composed of a circle of planning, action and
fact-finding about the result of the action”. According to Nunan (1990),
classroom action research does not require the standard formalization of
a research project with a literature search, hypothesis testing, treatment
conditions, etc. Instead, it consists of seven basic steps to investigate a
problem. They are as follows.
1. After determining that there is a potential problem, survey what is
happening through observation - via video, audio, hash marks, or
whatever relevant means are available.
2. Code the observation based on the problem, and what was seen (i.e.,
the code is created solely for that problem/session).
3. Based on the coded information, determine one change that could
impact the problem in a positive manner.
4. Implement the change in the course/classroom.
5. Observe the class/course (as in Step 1) while implementing the change.
6. Code the new observation(s) as in Step 2.
7. Finally, compare the coded sessions to determine the results of the
change.
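Steps 2, 6 and 7 above amount to tallying coded observations before and
after the change and then comparing the tallies. The following is only a
minimal sketch in Python of that bookkeeping. The code labels (off_task,
question), the function names and the session data are hypothetical
illustrations and do not come from Nunan.

```python
from collections import Counter


def code_session(observations):
    """Tally how often each code was recorded in one observed session."""
    return Counter(observations)


def compare_sessions(before, after):
    """Return the change in count (after minus before) for every code seen."""
    codes = sorted(set(before) | set(after))
    return {code: after[code] - before[code] for code in codes}


# Hypothetical hash marks: the labels and counts are illustrative assumptions.
baseline = code_session(["off_task", "off_task", "question", "off_task"])
follow_up = code_session(["off_task", "question", "question", "question"])

change = compare_sessions(baseline, follow_up)
print(change)
```

A negative value means the coded behaviour was recorded less often after the
change was implemented, and a positive value means it was recorded more
often. Whether that difference is meaningful remains a matter for the
teacher's judgement.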
1.5.4 Participatory Research
Participatory research is an action oriented research activity in which
ordinary people address common needs arising in their daily lives, and in
the process, generate knowledge. Participatory research differs from both
basic and applied social science research in terms of people’s involvement
in the research process, integration of action with research, and the practice-
based nature of the knowledge that is entailed. It sets itself apart even
from other forms of action-oriented research because of the central role
that non-experts play. In contrast to other forms of action-oriented research,
in which outsiders have an important role in determining what problems
to address, often taking charge of the research process and implementing
action, in participatory research people who share problems in common
decide what problems to tackle and directly get involved in research and
social change activities. Action-minded researchers with technical backgrounds
often get involved in this process, but mainly as facilitators. The reason
for this emphasis on popular participation is that participatory research is
not just a convenient instrument for solving social problems through
technically efficacious means, but it is also a practice that helps marginalized
people attain a degree of emancipation as autonomous and responsible
members of society. It is allied to the ideals of democracy, and in that
spirit it is proper to call it research of the people, by the people and for
the people.
Participatory research deals with issues that affect classes of people in such
wide-ranging areas as inner city and rural poverty, health, education,
agriculture, environment, housing, community development, mental health,
disability, domestic violence, women’s oppression, and immigration. The more
obvious purpose of participatory research is to bring about changes by
improving the material circumstances of affected people. To this end, people
engage in three different kinds of activity: inquiring into the nature of the
problem to be solved by understanding its causes and meaning; getting
together by organizing themselves as community units; and mobilizing
themselves for action by raising their awareness of what should be done on
moral and political grounds. For this reason, gathering and analyzing
necessary information, strengthening community ties, and sharpening the
ability to think and act critically emerge as the three main objectives of
participatory research, requiring three different kinds of knowledge.
1.5.5 Participatory Action Research
Action research or participatory action research has emerged in recent years
as a significant methodology for intervention, development and change
within communities and groups. It is now promoted and implemented by
many international development agencies and university programmes, as
well as countless local community organizations around the world.
Participatory action research is a recognized form of experimental research
that focuses on the effects of the researcher’s direct actions of practice
within a participatory community with the goal of improving the performance
quality of the community or an area of concern. Action research involves
utilizing a systematic cyclical method of planning, taking action, observing,
evaluating (including self evaluation) and critically reflecting prior to
planning the next cycle. The actions have a set goal of addressing an
identified problem in the workplace, for example, reducing the illiteracy
of students through use of new strategies or improving communication and
efficiency in a hospital emergency room. It is a collaborative method to
test new ideas and implement action for change. It involves direct
participation in a dynamic research process, while monitoring and evaluating
the effects of the researcher’s actions with the aim of improving practice.
At its core, action research is a way to increase understanding of how
change in one’s actions or practices can mutually benefit a community.
Essentially, Participatory Action Research (PAR) is research which involves
all relevant parties in actively examining together current action (which
they experience as problematic) in order to change and improve it. They
do this by critically reflecting on the historical, political, cultural, economic,
geographic and other contexts which make sense of it. Participatory action
research is not just research which it is hoped will be followed by action.
It is action which is researched, changed and re-researched, within the
research process, by participants. It aims to be active research, by and for
those to be helped. It tries to be a genuinely democratic process whereby
those to be helped determine the purposes and outcomes of their own
research.
PAR should not be confused with PRA - Participatory Rural Appraisal.
PRA is an assessment technique that could form part of a PAR process,
but does not encompass the full action-reflection cycle. PAR has evolved
through the 1990s and into the 21st century as it has been applied to various
fields. Other research approaches that often fall under the label of PAR
include participatory research, critical action research, classroom action
research, action learning, etc. Additionally, more methods have been
developed to add value to the PAR technique, such as Participatory
Development Communication (PDC) and Participatory Video (PV).
Practitioners have also recently tried to move away from the word
“research”, because of its abstract meaning, towards terms that stress action
for change within a community or group. Thus, new names are being used,
such as Participatory Action Learning, Participatory Learning Action, and
Participatory Action Development. PAR is a popular method used in teaching
adult learners in low income communities, and others, how to explore,
challenge, and react to their own needs. It is gaining popularity among
community youth workers, as well as middle and senior high school teachers,
as a successful methodology for engaging youth voice in the classroom.
Youth PAR projects mostly address issues such as educational justice, access
to quality healthcare, the criminalization of youth, gang violence and police
brutality. PAR is also increasingly used in service learning projects.
1.5.6 Experimental Research
Experimental research is commonly used in sciences such as sociology,
psychology, physics, chemistry, biology and medicine. In scientific
research, an experiment (Latin: ex-periri, ‘to try out’) is a method of
investigating causal relationships among variables: a scientific investigation
in which an investigator manipulates and controls one or more independent
variables to determine their effect on the outcome being studied.
Experimental research is also called randomized controlled research or
randomized controlled trials. In experimental research, the researcher
manipulates one variable and controls the rest of the variables. Experimental
research has an experimental group and a control group. The subjects are
randomly assigned to the groups and the researcher tests one effect at a time.
For example, suppose a researcher wants to look at the effects of a
Computer Assisted Instruction (CAI) programme on children with autism.
Suppose the researcher chooses a sample size of 50 children with autism.
In the sample, 25 children are placed in the treatment group, which will be
administered CAI, and the remaining 25 in the control group. This kind of
research comes under experimental research.
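The random assignment in the example above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
identifiers child_01 to child_50, the function name assign_groups and the
fixed seed are hypothetical, and a real study would follow its own sampling
and ethics protocol.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(subjects)  # copy, so the original list is not reordered
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical identifiers for the sample of 50 children.
children = [f"child_{i:02d}" for i in range(1, 51)]
treatment, control = assign_groups(children, seed=42)

print(len(treatment), len(control))
```

Because every child has the same chance of landing in either group,
differences between the groups at the end of the study can more plausibly
be attributed to the CAI programme than to how the groups were formed.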
In this section, you have studied the types of social science research:
applied research, fundamental research, action research, participatory
research, participatory action research and experimental research. Now
answer the questions given in Check Your Progress 3.
Check Your Progress 3
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Do you remember the seven basic steps of a classroom action research
as given by Nunan? If yes, please put them down point-wise.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. Write the salient features of a Participatory Action Research.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
UNIT 2 COMPONENTS OF SOCIAL
SCIENCE RESEARCH
Structure
2.1 Introduction
2.2 Concept
2.3 Objectives
2.4 Definition
2.5 Hypothesis
2.6 Variables
2.7 Let Us Sum Up
2.8 References and Selected Readings
2.9 Check Your Progress – Possible Answers
2.1 INTRODUCTION
Social science research is interesting, but requires a clear understanding
of its basic components. If the components on which social science research
is based are not understood, it will be difficult to undertake research. If
terms like hypothesis, concept, construct, variable, etc., are not understood,
a researcher will not be able to understand the basic principles underlying
social science research. It is, therefore, essential that these terms are
understood. They will be explained here in simple language, through
relevant examples wherever required, in order to give a basic
understanding of what social science research is all about.
After studying this unit, you should be able to
• explain the meaning of concept, types of concept, difference between
concept and construct;
• discuss the meaning of objectives, types of objectives, and the criteria
for judging the objectives;
• describe how to develop a theoretical and conceptual framework;
• explain the meaning and characteristics of definition, types, and the
role of definition;
• discuss the meaning and importance of hypothesis; and
• describe the meaning, types, and the role of variables.
2.2 CONCEPT
A concept is a general idea derived, or, inferred, from specific instances
or occurrences. It is something formed in the mind; a thought or notion.
A concept is a cognitive unit of meaning - an abstract idea or a mental
symbol sometimes defined as a ‘unit of knowledge’, built from other units
which act as a concept’s characteristics. A concept is typically associated
with a corresponding representation in a language or symbol such as a
word.
2.2.1 Meaning of Concept
A concept can be defined as a verbal response evoked by objects of the
class to which the concept applies. For example, light, temperature, sound,
age, sex, accidents, etc., are all class names applied to stimuli, subjects
or responses of a specific kind. These are all examples of concepts which
cannot be directly observed, but whose instances can be located. There are
other concepts like mental strength, drive, attitude, motivation, etc., whose
instances too cannot be directly observed because they are presumed to
be located inside the organism. They are called hypothetical constructs.
A concept is a property, or characteristic of some case, or unit of analysis
in which one might be interested. It is essentially an idea about some aspect
of some phenomenon, for example, gender, self-esteem, bureaucracy, social
classification, etc. A case (unit of analysis) is that defined entity that is
sampled and scored, or measured, on variables of interest in a research
project. A case is defined in terms of its major characteristics and their
location in time and place. In Sociology, a case is often a human individual,
a group, an organisation or a society. It can also be social entities such
as the father-child role relationship. In research, a sample or population
of these cases is targeted for examination. Research involves special
concepts such as total family income, self-employment, and economic
returns. These are generally technical terms that point to some phenomenon
that is an important aspect of a topic to be researched. Such concepts must
be defined carefully so that others understand specifically what they mean.
The concept of total family income, for example, is defined to have a range
of possible values. Thus, it is called a variable in a given piece of research.
To be useful to researchers, however, the abstract definition of a concept
is not enough. A set of indicators must be developed in order to actually
measure or classify families in terms of their total family incomes. Families
can be classified, for example, into low, middle and high income groups.
They can further be categorised into: less than Rs.5,000 per month; between
Rs.5,000 and Rs.10,000 per month; and Rs.10,000 and above per month.
2.2.2 Types of Concepts
Concepts are basically of two types:
1. Concepts which cannot be directly observed, but whose instances can
be located. Examples of such concepts are: temperature, sound, age,
etc.
2. Concepts whose instances cannot be directly observed because they
are presumed to be located inside the organism. They are called
hypothetical constructs. Examples of such concepts are: mental strength,
drive, attitude, motivation, etc.
A concept can have a conceptual definition as well as an operational
definition. In the above example of the total family income, the conceptual
definition is the total monthly earnings of the family (that is, all earning
members in the family) through all sources. In the same example, the
operational definition would be the range of categories of income such
as less than Rs.5,000 per month, between Rs.5,000 and Rs.10,000 per month
and Rs.10,000 and above, per month.
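The difference between the two definitions can be illustrated with a short
Python sketch. It is not a prescribed procedure. The function names and the
sample family are hypothetical, and because the categories above overlap at
exactly Rs.10,000, the sketch assumes that such a family falls in the
‘Rs.10,000 and above’ category.

```python
def total_family_income(earnings_by_member):
    """Conceptual definition: monthly earnings of all earning members, summed."""
    return sum(earnings_by_member.values())


def income_category(total_income):
    """Operational definition: place a monthly income (in Rs.) in a category."""
    if total_income < 5000:
        return "less than Rs.5,000"
    if total_income < 10000:
        # Assumption: exactly Rs.10,000 is counted in the top category.
        return "Rs.5,000 to Rs.10,000"
    return "Rs.10,000 and above"


# Hypothetical family: the members and amounts are illustrative only.
family = {"mother": 4500, "father": 3000}
total = total_family_income(family)

print(total, income_category(total))
```

The first function states what the concept means, while the second turns it
into categories that a researcher can actually record for each family.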
2.2.3 Concepts and Constructs
A concept, as already defined, is a property or characteristic of some case
or unit of analysis in which one might be interested. It is essentially an
idea about some aspect of some phenomenon, for example, gender, self
esteem, bureaucracy, social classification, etc.
A construct, too, is a verbal response evoked by objects of the class to
which the concept applies. Some concepts such as temperature, sound, age,
sex, etc. cannot be directly observed, but their instances can be located.
Other concepts such as mental strength, drive, attitude, motivation, etc.,
can neither be directly observed nor can their instances be located as they
are presumed to be located inside the organism, and are called constructs.
A construct is a concept. It has the added meaning, however, of having
been deliberately and consciously invented or adopted for a special scientific
purpose. Constructs play a very important role in theory building. Many
theories such as the memory trace theory, the frustration-aggression theory,
etc. have emerged out of this. Constructs cannot be observed and, thus,
are called non-observables. Constructs are also known as intervening
variables. An intervening variable is a term invented to account for internal
and directly unobservable psychological processes that, in turn, account for
behaviour. In other words, an intervening variable is an in-the-head variable
which cannot be seen, heard, or felt. It is inferred from the behaviour of
the individual. Hostility is inferred from aggressive acts: when we display
aggression, it reflects hostility. Learning is inferred from test scores: we
exhibit learning when we perform well in tests. Similarly, anxiety
is inferred from skin response, heart beats, etc. When we are nervous or
anxious (maybe at the time of facing an interview, or before the
announcement of a result), the hair on our skin rises, or our hearts start
beating faster. In research, the name for these terms is ‘invented constructs’,
the reality of which is inferred from human behaviour. For example, while
studying the effect of motivation, a researcher is aware that motivation is
an intervening variable, a construct invented by people to account for
persistently motivated behaviour.
2.2.4 Role of Concepts in Research
Concepts play an important role in research. In fact, research cannot be
conducted without concepts. Every research is based on a concept, as
research tries to establish relationship between two concepts, one of which
is dependent on the other. Let us explain this through an example. ‘Vitamins
supplement the growth in babies’ is a topic of research. This is a hypothesis
which needs to be tested (as we are hypothesizing that vitamins supplement
the growth in babies). The statement could be true or false. In this topic
of research, as in any other research, we are dealing with concepts. One
concept that we have identified is ‘vitamins’ and the other concept is
‘growth in babies’. According to the hypothesis, the higher the dose of
vitamins (up to a certain level), the healthier the growth among babies.
Here, we are dealing, as already mentioned, with two concepts. One
concept, ‘vitamins’, is an independent variable and the other concept,
‘growth in babies’, is a dependent variable. Concepts also help in
understanding the cause and effect relationship in research. In this
example, the dose of vitamins is the presumed cause and the growth of
babies is the effect, which shows the role of independent and dependent
variables in research. Concepts are used in all types of research. We have
just given an example of the use of concepts (which are also variables) in
experimental research. We can also examine
the importance, or role of concept in other types of research. In case study
research, for example, the role of concept is equally important. A case study
is one of several ways of doing research, whether it is social science related,
or even socially related. It is an intensive study of a single group, incident,
or community. Case study is a method of exploring and analysing the life
of a single social unit – be it a person, a family, an institution, cultural
group or even an entire single community – these entities are all concepts.
Similarly, concepts are used in historical as well as descriptive research,
as in all types of research we are dealing with individuals, families,
institutions, communities, etc., which are all concepts. Thus, research is
incomplete without concepts.
2.3 OBJECTIVES
Objectives are extremely useful for conducting research. If we take a
broader definition of the term, objective, it is something which is useful
for conducting any action. Every act that we perform has an intention, or
objective. For example, if a mother scolds a child or hits the child, the
action is conducted to achieve a certain goal. The immediate objective,
or goal, is to make the child avoid doing something that is considered
wrong. The long term objective or goal is to discipline the child and make
him differentiate between good and bad. The action of the mother is not
‘research’; it is simply an ‘action’. An example of an objective in social
science research could come from a study on ‘Good parenting skills’, where
the objective would be to make children and parents responsible and
knowledgeable. Here, the action is to provide inputs on good parenting,
and the result of this action is moving the children from a state of being
less responsible and knowledgeable to a state where they become more
responsible and knowledgeable.
2.3.1 The Meaning of Objectives
When we talk about an objective or a goal we mean a state which we
would like to achieve in the future. The term, state, is however interpreted,
here, comprehensively. An example of a state would be, ‘to have immigrated
within a ten year period to a particular country’. Here, the objective is
to reach the state of migrating to a particular country, and the period for
achieving that state is ten years. Another example is to have successfully
completed a particular educational course in five years. Here, the objective
is to reach the state of completing a particular educational course, for which
the time period specified is five years. Thus, an objective refers to shifting
from one state to another through a certain process, and, within a certain
time frame. Thus, we describe an objective as, where we want to be at
a particular time. On the other hand, the path to this goal, that is how
we wish to achieve this goal as well as the necessary negotiations and
decisions, is not a part of the objective itself. Neither is the time period
specified for achieving the goal a part of the objective. The dictionary
meaning of objective is, ‘something that one’s efforts or actions are intended
to attain or accomplish’.
Let us take an interesting example. You are all interested in studying this
unit. Why? Here, ‘why’ means what is the objective of studying this unit?
The objective of studying this unit is to put in efforts to attain knowledge
about the subjects/topics discussed in this unit. When the knowledge is
attained, the student shifts from one state to another – that is the state
of being less knowledgeable about the topics discussed in this unit to a
state of being more knowledgeable about the topics discussed in this unit.
2.3.2 Types of Objectives
Objectives are generally of three types:
1. Broad and specific
2. Long term and short term
3. Measurable and non-measurable
1. Broad and specific objectives
Broad objectives are those which reflect the goal or purpose, but which
do not, in actual terms, specify the goals or purpose or mention in
clear terms, the specific goals or purpose. Take, for example, a health
project entitled ‘Improved Systems for Private and Public Health’.
Here the broad objective would
be to improve the health conditions of people. There are, however,
underlying (specific) objectives, whereby, the broad objective, that is,
improving the health conditions of people could be achieved.
Specific objectives, on the other hand, refer to specific goals or purpose
that are to be attained and are mentioned in specific terms in the
statement of objectives. In the example of the health project ‘Improved
Systems for Private and Public Health’, specific objectives would be
more than one. It could be a few or it could be many. For example,
specific objectives could be reduction of disease; it could be provision
of requisite medicines in the Primary Health Centres; it could also
be availability of qualified doctors to cater to the needs of people,
and so on.
2. Long term and short term objectives
The long term objective in the example, above, is improving the health
conditions of people. This is a long term objective, since it cannot
be achieved overnight, or within a short span of time. In this example,
when we refer to people, it could mean the entire population of a
country, a state, a district. It could also mean a limited population
within the country, state, district, or, a specific community within a
block, or, a village. Also, in the same example, when we refer to
‘improving health conditions’, it could include only provision of timely
medicines, or provision of health clinics; it could also mean improvement
of sanitation and supply of safe and potable water. Thus, the long term
objective can only be achieved over a span of, say, 5-10 years.
The short term objective remains confined to the achievement of certain
specific objectives, such as, availability of medicines in a particular
Primary Health Centre, or availability of a doctor in a slum community
on a particular day in a week. These are short term objectives which
can be achieved in a short span of time.
3. Measurable and non-measurable objectives
There are certain objectives which can be quantified or measured. In
the example, above, the number of patients attending a Primary Health
Centre every week or every month is a measurable objective. One
indicator of improving the health conditions of people is more and
more patients turning up at the Primary Health Centre every week or
every month. The non-measurable objective, in this example could be
the improved systems of health in the area. However, even this may
be measurable, but only up to a certain extent, since, although it is
possible to quantify certain indicators of health, not all of them can
be quantified.
2.3.3 Criteria for Judging the Objectives
Certain criteria need to be laid down for judging the objectives – whether
the objectives have been thoughtfully identified or not. In social research,
it is very important first to understand what are the goals for undertaking
the research/study and then to ascertain that the objectives/goals are
achievable.
The broad criteria for judging the objectives are:
1. It should lead to action
2. It should be important from the context of the research and should
be understood
3. It should be useful and relevant
4. It should be based on logic
5. It should be achievable
1. It should lead to action
Any research is undertaken with a single objective or a set of objectives
in mind. The first criterion for identifying an objective is that it should
lead to action. For example, if the research is on the topic, “Improved
quality of education leads to enhancement of knowledge of children”,
one of the objectives of research is enhancement of knowledge of
children. The objective of research, in this example, leads to action,
since children’s knowledge will increase by the inputs provided through
quality education. The provision of quality education is an action which
is necessary if the objective of enhancement of knowledge of children
is to be achieved.
2. It should be important from the context of the research and should
be understood
The objective of research in the above example is important from the
context of the research that is being undertaken. The most important
aspect of improving the quality of education is enhancement of
children’s knowledge. Improved quality of education may lead to other
things as well, but, the most important aspect is knowledge enhancement.
The objective in this example is also very clear and direct, and is
not presented in a manner which is complex or confusing. As a result,
it will be easily understood by anyone who is associated with this
research.
3. It should be useful and relevant
Let us use the same example to explain our point. In the research
cited above, “Improved quality of education leads to enhancement of
knowledge of children”, what are the most significant and useful
aspects? The most significant and useful aspects are the improvement
in quality of education and the enhancement in the knowledge of
children. The objective is also relevant in the above research since
the relevance of improvement in the quality of education can only be
judged through the enhancement of knowledge of children.
4. It should be based on logic
Logic is the most important aspect of objective formulation. Any
objective can be formulated only if the researcher has a clear
understanding of the subject matter of research and why it needs to
be conducted. In the above example, once again, the most logical
objective is enhancement of children’s knowledge. There has to be
an application of mind in deciding what could be the best outcome
of an action – improvement in the quality of education. There could
be another objective of this research – increase in the salary of teachers
– but, this would not be the most logical objective of the research.
5. It should be achievable
A very important criterion for the formulation of an objective is that
it should be achievable. In the example given above, the objective,
enhancement in the knowledge of children is achievable. If the quality
of education improves - that is, better teachers are provided to children,
better facilities are provided to children, interaction between teachers
and children is more frequent, parent-teacher meetings are held
frequently, special attention is being provided by teachers to weak
students, and so on, then children’s knowledge is bound to be enhanced.
Thus, the objective is achievable if the inputs provided are of good
and standard quality.
In this section, you have studied the meaning, types and role of concepts,
and the meaning and types of objectives of social science research.
Now answer the questions given in Check Your Progress-1.
Check Your Progress 1
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Discuss with your friends and see what they mean by concepts.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. Explain to your friends the meaning of different types of objectives.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2.4 DEFINITION
Defining a concept is not very different from defining any word. The
objective is to make it very clear to some audience what one is dealing
with. Logically, definitions aim to lay bare the principal features or structure
of a concept, partly in order to make it definite, to delimit it from other
concepts, and, partly, in order to make a systematic exploration of the
subject matter with which it deals.
2.4.1 Meaning and Characteristics of Definition
The dictionary meaning of definition is the act or process of giving an
exact meaning to a concept or word; describing or explaining the scope
of the concept or word; describing or explaining a statement of the meaning
of a concept, or, word or the nature of a thing. All research deals with
concepts which have to be clearly defined. In the absence of a clear cut
definition of a concept, the concept becomes ambiguous and vague, creating
problems for a researcher in carrying out research.
Definitions have the following characteristics:
1. Definitions provide alternate meaning to concepts.
2. Definitions explain the meaning of concepts in clear and explicit terms.
3. Definitions provide reasonable and logical explanations to concepts.
4. Definitions describe concepts in terms which are easily understandable.
5. Definitions use common sense to describe and explain concepts to
make them easily identifiable.
2.4.2 Types of Definitions
Definitions are of two types:
1. Conceptual
2. Operational
1. Conceptual definition
The first step is to define what we mean by any particular concept. Once
that has been done, it will then be possible to develop indicators for that
concept as it has been defined. Conceptual definitions are those which
define a concept. For example, what is the concept of religiousness – or,
how a person can be called more religious or less religious? Similarly,
poverty may be conceptualised in economic terms, perhaps using income
to assess its existence. Similarly, it can be conceptualised in social terms,
using the crime committed in a particular area to assess its magnitude.
There are several dimensions to conceptualise poverty.
1. Poor purchasing power – low capacity to purchase because of low
income.
2. Powerlessness – inability to influence others.
3. Isolation – being cut off from society.
4. Meaninglessness – no purpose of life as basic needs are not fulfilled.
2. Operational definition
Once we are able to specify the different dimensions of a concept, we
will be at a point where we can move from the abstract to the concrete.
The operational definition refers to the process through which indicators
are developed to measure the concepts – that is, to transform them into
observable phenomena. From each of the four dimensions of poverty, a
set of questions can be developed to operationalize each dimension. For
example, indicators can be developed for the first dimension of poverty,
that is, poor purchasing power. In order to get information on this
dimension, the following questions can be asked:
1. What necessities do you purchase for your family’s sustenance?
2. How do you meet the requirements of your child if you are unable
to provide mother’s milk?
3. What are other items apart from food on which you spend money?
The questions identified above indicate one dimension of poverty. Operational
definitions, thus, provide indicators to the conceptual definition of a
concept, in this case, poverty.
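The move from a conceptual to an operational definition can be pictured as a simple mapping from each dimension of a concept to its indicator questions. The sketch below is only illustrative: the dimension names and the three questions come from the text above, while the dictionary layout and the function names (questions_for, dimensions_without_indicators) are assumptions made for this example. Dimensions for which the unit lists no questions are deliberately left empty.

```python
# Illustrative structure only: the dimension names and the three questions
# are taken from the text; the layout and function names are assumptions.
POVERTY_INDICATORS = {
    "poor purchasing power": [
        "What necessities do you purchase for your family's sustenance?",
        "How do you meet the requirements of your child if you are unable "
        "to provide mother's milk?",
        "What are other items apart from food on which you spend money?",
    ],
    # The text lists no questions for these dimensions, so they stay empty.
    "powerlessness": [],
    "isolation": [],
    "meaninglessness": [],
}


def questions_for(dimension):
    """Return the indicator questions recorded for one dimension."""
    return POVERTY_INDICATORS[dimension]


def dimensions_without_indicators():
    """List dimensions defined conceptually but not yet operationalized."""
    return [name for name, questions in POVERTY_INDICATORS.items() if not questions]


if __name__ == "__main__":
    for question in questions_for("poor purchasing power"):
        print(question)
    print("Still to be operationalized:", dimensions_without_indicators())
```

A researcher could use such a listing as a checklist: a concept is fully operationalized only when every dimension has at least one indicator.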
2.4.3 Role of Definitions
Social Science Research revolves around the definition of concepts – both
conceptual and operational. If definitions of concepts are not accurate, the
researcher may not be able to understand what the concept is and how
to ask questions related to the concept from the respondents. It is important
for the researcher to understand the concept as it is defined and then conduct
the interview. If this is not done, bias may be reflected from the way the
questions are addressed to the respondents, and from the response of the
respondents, as well.
Concepts like motivation, learning, perception, etc., are studied in the
Behavioural Sciences (a branch of the Social Sciences) and need to be
clearly defined conceptually, and understood by the researcher, before the
researcher is able to undertake research on these topics. These conceptually
defined concepts, then, need to be made operational (operationally defined)
by devising indicators that can be used to measure these concepts. If the
researcher is unable to understand the conceptual and operational definitions
of these concepts, he or she will not be able to do justice to the research.
It is, therefore, important that definitions of concepts are clearly understood
before they are used by researchers for the purpose of conducting research.
2.5 HYPOTHESIS
What is the purpose of formulating research questions like, “Does children’s
performance in school improve on being given special coaching?”, or “Does
regular use of milk make children healthy?” Well, research questions are
formulated with the purpose of seeking answers to these questions. Two
approaches to an answer are used in research and both require an
examination of facts. The first approach is to gather appropriate factual
data and examine them to see whether an answer to the research question
can be formulated. The second approach uses prior research and thinking
to come up with one or more hypothesized answers to the research
question. Such a tentative answer is called a hypothesis. Then, research
is designed to gather factual data aimed at testing whether the hypothesized
answers are false. If they are not falsified by the factual data, then the
hypothesized answers are presumed to be reasonable answers to the initial
research question for the time being, as further research may show that
the hypotheses are false in some situations.
2.6 VARIABLES
Concepts such as ‘total family income’ are ideas an investigator has about
important characteristics of some entity such as a family. The concept must
be clarified and defined, preferably explicitly, so that researchers can
understand and share the phenomenon that is being studied. The concept
of ‘total family income’ is defined to have a range of possible values. Thus,
it is called a variable in a given piece of research.
2.6.1 Meaning of Variables
A variable is an indicator of some defined concept or characteristic of a
case. For example, a response to the question, “What is your age?” is a
variable that can be used as an indicator of the concept age. The indicator
of the concept age can be: less than one year; between 1 and 5 years;
between 5 and 10 years; between 10 and 15 years; and, so on. The indicator
of the concept age can be the actual age in numerical terms or the date
of birth (date, month, and year). A variable may also be defined as a property
that takes on different values, such as any measurable attribute of objects,
things or beings. Examples of variables could be any concept such as age,
income, community, intelligence, motivation, etc. The term variable more
directly expresses the quantitative meaning. It means, ‘whatever varies’.
The most intricate variations can be expressed in terms of numbers which
are capable of indefinite divisions. A variable has, accordingly, been defined
as ‘a symbol to which numerals or values are assigned’.
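To see how one concept can yield different variables, consider coding the answer to “What is your age?” into the age groups mentioned above. The sketch below is a minimal illustration, not a prescribed coding scheme. The function name age_group is hypothetical, and because the groups in the text share boundary values (5 appears in both “between 1 and 5” and “between 5 and 10”), the code assumes that each group includes its lower limit and excludes its upper limit.

```python
def age_group(age_years):
    """Code an age in years into a five-year group.

    Assumption: each group includes its lower limit and excludes its
    upper limit, so that every age falls into exactly one group.
    """
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years < 1:
        return "less than 1 year"
    if age_years < 5:
        lower, upper = 1, 5
    else:
        lower = int(age_years // 5) * 5
        upper = lower + 5
    return f"{lower} to under {upper} years"


if __name__ == "__main__":
    for age in (0.5, 3, 5, 12):
        print(age, "->", age_group(age))
```

The actual age (a number) and the age group (a category) are two different variables that indicate the same concept.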
2.6.2 Types of Variables
Variables are of different types
1. dependent and independent variables
UNIT 3 RESEARCH DESIGNS
Structure
3.1 Introduction
3.2 Meaning, Concept and Importance of Research Design
3.3 Functions of Research Design
3.4 Types of Research Design
3.5 Let Us Sum Up
3.6 References and Selected Readings
3.7 Check Your Progress – Possible Answers
3.1 INTRODUCTION
This unit focuses on issues involved in designing a study. Any social science
study involves several components which need to be integrated in a coherent
and unambiguous way so as to ensure that the research problem is addressed
effectively. It is the function of the research design to ensure that the
research problem is addressed in an effective and unambiguous way. A
common mistake made by many researchers is that they begin
their research without giving proper attention and importance to research
design. A proper research design serves as a blueprint in the hands of
the researcher and guides the collection,
measurement and analysis of data. This unit covers various aspects
and types of research design. After reading this unit you will be able
to:
• Describe the meaning and concept of a research design
• Discuss the importance of a research design
• Elaborate upon the functions of a research design
• Explain the different types of research design
Finding subjects for study with similar characteristics can be difficult. The
results being static cannot be explained in temporal context. These studies
cannot be used to establish cause and effect relationships. There is no
follow-up to the findings.
6. Descriptive Research Design
Descriptive research designs are the best methods for collecting information
that will demonstrate relationships or describe the current status of the
phenomenon as it exists. Descriptive studies involve surveys or interviews
to collect the necessary information. Descriptive studies typically provide
answers to questions related to who, what, when, where and how; however,
they cannot provide answers to why.
Advantages:
Descriptive studies observe the subjects in their natural environment as they
exist and are generally precursors to qualitative research designs. They are
useful tools in more focused studies.
Caution:
The results from descriptive studies cannot be used to prove or disprove
a hypothesis. The results of descriptive studies cannot be replicated as they
are based on observational methods.
7. Experimental Design
In the simplest forms of experimental designs, two groups are created which
are equivalent to each other. Then one of the groups, called the
treatment group, gets a programme and the other group, called the
control group, does not get the programme. These two groups are similar
in all other aspects. Then the differences are observed in both the groups.
Since both the groups are supposed to be equivalent in all the aspects,
the difference arising between the two is considered to be due to the only
difference between the two groups, which is that one of the groups gets
the programme.
Advantages:
If people are assigned to the two groups on a random basis from a common
pool and if the sample size is large enough to achieve probabilistic
equivalence between the two groups, the experimental research designs are
capable of achieving strong internal validity. Since the experimental designs
allow researchers to control a situation, they also allow researchers to
establish a cause and effect relationship.
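The idea of assigning people to the two groups on a random basis from a common pool can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a full experimental protocol: the function names assign_groups and difference_in_means are hypothetical, the pool is simply split into two halves after shuffling, and the outcome scores in the usage example are made-up numbers used only to show the calculation.

```python
import random
from statistics import mean


def assign_groups(participants, seed=None):
    """Randomly split a common pool into a treatment and a control group."""
    rng = random.Random(seed)      # a seed makes the assignment repeatable
    shuffled = list(participants)  # copy, so the original pool is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def difference_in_means(treatment_outcomes, control_outcomes):
    """Average outcome in the treatment group minus that in the control group."""
    return mean(treatment_outcomes) - mean(control_outcomes)


if __name__ == "__main__":
    treatment, control = assign_groups(range(1, 21), seed=7)
    print("Treatment group:", sorted(treatment))
    print("Control group:  ", sorted(control))
    # Made-up outcome scores, for illustration only.
    print(difference_in_means([62, 70, 68], [60, 61, 65]))
```

Because chance alone decides who receives the programme, the two groups can be expected to be similar on average when the pool is large, which is what allows the observed difference to be attributed to the programme.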
Caution:
Since the experimental designs are artificially established, the results may
not generalize well to the real world. Also, the artificial setting of the
experiments may alter the response of the participants. Experimental designs
requiring special equipment may be costly. There may also be
ethical considerations to be addressed in experimental research designs.
8. Exploratory Research Design
Exploratory research designs are used when the researcher has an
understanding or observation about something and seeks to learn more about
it. It is used in research problems in which there are few or no earlier
studies to refer to or rely upon to predict an outcome. Exploratory designs
are undertaken with one or more of the following goals:
• To develop familiarity with basic details, settings, and concerns.
• To develop a well grounded picture of the existing situation.
• To generate new ideas and assumptions.
• To develop tentative theories or hypotheses.
• To determine the feasibility of the study in future.
• To lay the groundwork which will lead to future studies.
• To develop direction and techniques for future research.
Advantages:
It helps in gaining background information on a particular topic. Exploratory
research designs are undertaken with one of two basic purposes: to define
a new topic or a new angle, or to clarify existing concepts. These research
designs help in generating a formal hypothesis or in developing a more
precise research problem. Exploratory designs are useful in research
prioritization or resource allocation.
Caution:
As the sample sizes used in exploratory designs are small, they cannot be
used to make generalizations about the population. They do not provide
a definitive conclusion for a research problem. The outcome of exploratory
research may not be of much value to the decision makers. As compared
to other designs, this design lacks the rigorous standards of data collection
and analysis.
9. Historical Research Design
Historical research typically involves studying and interpreting past events
to predict future ones. It involves the collection,
verification and synthesis of evidence from the past so as to establish
or refute a hypothesis. It uses both primary and secondary data sources
including documentary evidence such as diaries, reports, archives, official
records etc.
Advantages:
Historical research is unobtrusive and does not affect the result of the
study. The historical approach is typically suited for trend analysis. There is
no possibility of interaction between the researcher and the subject which
could affect the findings of the research.
Caution:
The fulfilment of the objectives of such research depends upon the
availability and quality of documents related to the research problem.
Interpreting historical sources can be very time consuming. Historical
research is weak in terms of the demands of internal validity. Since
it is rare for all the documents needed to fully address a historical
research problem to be available, such gaps need to be acknowledged in the report.
10. Longitudinal Research Design
In a longitudinal study, the same group or sample is followed over time
and repeated observations are made to record the changes.
The researcher then relates them to
variables that might have caused the changes. They are useful in studies
required to establish causal relationship and help in showing both the
magnitude and direction of causal relationships.
Advantages:
Longitudinal studies help in identifying the duration of the particular
phenomenon being studied. These studies help in measuring and analysing
the pattern of change on different variables over time. They also facilitate
predictions to be made on the basis of earlier factors.
Caution:
A large sample size is needed to be able to accurately explain the causal
relationships. It may also be difficult to maintain the integrity of original
sample over time. The data collection method may change over time and
it may also be difficult to show more than one variable at a time.
Longitudinal research is based on the assumption that the present trend
will continue which may not hold true all the time.
11. Meta-analysis Design
Meta-analysis is a technique designed for combining results from independent
studies. It involves evaluation and summarizing of results from a number
of individual studies thus increasing the overall sample size and also
increasing the ability of researcher to study the effects of interest. This
type of study uses synoptic reasoning to develop a new understanding of
the research problem. Meta-analysis includes analysing differences in the
results among studies and increasing the precision by giving due weight
to the size of different studies included. The validity of meta-analysis
depends on the quality of systematic review on which it is based. A well
designed meta-analysis depends upon strict adherence to the criteria used
for selecting studies and the availability of information in those studies
to be able to properly analyse their findings. The larger the heterogeneity in
the results of the individual studies, the more difficult it becomes to
establish validity of the synopsis of the results.
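Giving due weight to the size of different studies can be illustrated with a weighted average. The sketch below assumes the simplest scheme, in which each study's result is weighted by its sample size; the function name pooled_effect and the figures in the usage example are hypothetical. In practice, meta-analyses commonly weight each study by the inverse of its variance, which this sketch does not attempt.

```python
def pooled_effect(studies):
    """Combine study results, weighting each by its sample size.

    `studies` is a list of (effect, sample_size) pairs. Sample-size
    weighting is an assumption made for this illustration.
    """
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(effect * n for effect, n in studies) / total_n


if __name__ == "__main__":
    # Hypothetical results: (observed effect, number of respondents).
    studies = [(0.2, 100), (0.5, 300)]
    print(pooled_effect(studies))
```

The larger study pulls the combined result towards its own value, so the pooled figure lies closer to 0.5 than to 0.2.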
Advantages:
Meta-analysis can be used as a strategy to determine gaps in already existing
literature. Such studies help in reviewing research on a particular topic
conducted over a long period of time and published in various sources. They also
help in overcoming the problem of small samples in individual studies. They
are generally used to generate new hypotheses or suggest problems for future
studies.
Caution:
A large sample may yield reliable but not necessarily valid results. Synthesis
of a heterogeneous collection of studies in terms of literature reviewed,
methods applied or measurements of findings may be difficult. The whole
process can be quite time consuming and small violations in criteria used
for analysing the studies may lead to meaningless findings.
12. Observational Design
This type of research design draws a conclusion by comparing subjects
against a control group; however, the allocation of treatment is not fully
under the control of the researcher. The main difference between experimental
and observational research is the lack of manipulation of independent or
causal variable in case of observational research design. The researcher
simply observes the naturally occurring values of the independent and
dependent variables and then uses techniques to find out if they co-vary.
This design helps ensure that there are no false responses that might be
introduced by manipulative research procedures. An observational study
allows a useful insight into a phenomenon and avoids the ethical and
practical difficulties of setting up a large and cumbersome research project.
Observational designs are of two types: direct observations, in which
subjects know that the researcher is watching them, and unobtrusive
observations, in which the individuals do not know that they are being
observed.
Advantages:
Observational designs are flexible and do not necessarily need to be
structured around a hypothesis. The researcher is able to do an in-depth
investigation of a particular behaviour, which helps in understanding
the interrelation among various dimensions of the study. As the variables
being studied are allowed to operate without any intervention in case of
observational designs, the external validity is often very high. The results
of observational research can be generalised in real life situations.
Caution:
The internal validity of observational research designs is low. Reliability
of the data is low as they are difficult to replicate. The findings may pertain
to a unique sample population and cannot be generalised. The study may
have an element of the researcher’s bias. Since nothing is manipulated,
a cause and effect relationship cannot be established. The presence of a
researcher in case of direct observations may lead to skewed data.
13. Philosophical Design
Philosophical research challenges the assumptions underlying a
particular study. Arguments derived from philosophical theories, models,
concepts or traditions form the tools for this type of study. Such studies
can basically be of three types:
• Ontology – the study that describes the nature of reality; for example,
what is real and what is not, what is fundamental and what is
derivative?
• Epistemology – the study that explores the nature of knowledge; for
example, on what do knowledge and understanding depend,
and how can we be certain of what we know?
• Axiology – the study of values; for example, what values does an
individual or group hold and why? How are values related to interest,
desire, will, experience, and means-to-end? And, what is the difference
between a matter of fact and a matter of value?
Advantages:
This type of research helps the researcher in gaining greater self understanding
about the purpose of research. Philosophical studies help to refine concepts
and theories that are invoked in relatively unreflective modes of thought
and discourse, thus offering clarity to the terms, concepts and ideas.
Caution:
3.5 LET US SUM UP
In this unit, we discussed the meaning, purpose and features of research
designs and found that a good research design is possible through different
phases. Research design, however, depends on research purpose, and is
bound to be different in the case of exploratory or formulative studies from
other studies, such as descriptive or diagnostic ones. No single type of
research design, however, suits all categories of research, and each
category of research needs its own type of design. The
researcher must decide in advance of collection and analysis of data as
to which design would prove to be more appropriate for his/her research
project. The researcher must give due weight to various points such as
the type of universe and its nature, the objective of the study, the resource
list or the sampling frame, desired standard of accuracy, and the like, when
taking a decision with respect of the design for the research project.
UNIT 4 RESEARCH PROJECT FORMULATION
Structure
4.1 Introduction
4.2 Steps in the Formulation of a Research Project Proposal
4.3 Title of Research Project
4.4 Problem Statement
4.5 Review of Literature
4.6 Objectives of Research
4.7 Methodology
4.8 Work Schedule/Time Frame
4.9 Budget
4.10 Dissemination Strategy
4.11 References
4.12 Let Us Sum Up
4.13 References and Selected Readings
4.14 Check Your Progress – Possible Answers
4.1 INTRODUCTION
Research project formulation is one of the important tasks of students
pursuing various programmes in Development Studies. It comprises many
steps, from choosing a research topic to fixing the budget and the
timeline. This unit deals with the details of the various steps involved
in research project formulation.
After reading this unit, you should be able to formulate a research project
proposal by completing the following steps:
• identification, analysis and description of a research problem
• review of relevant literature and other available information
• formulation of research objectives
• development of an appropriate research methodology
• preparation of a work plan for the study
• identification of resources required and preparation of a budget
• development of a strategy for distribution and utilisation of research
results.
[Figure: Researchable Questions (narrowing and clarifying); Generating Questions (finding research issues); Defining a Topic (using creativity and curiosity, considering practicalities)]
In this section, you have studied the various steps adopted for the
formulation of a research project, the title of a research project and the
problem statement. Now answer the questions given in Check Your Progress 1.
Check Your Progress 1
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Define research and write the different components of a research
project proposal.
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
2. What criteria should a researcher take into account while choosing
a research topic?
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
What do I know about this topic? How can I find out more? | Exploring a Topic | Background reading for your own learning, using a broad array of materials, including popular media, texts, journals, etc.
What should I read in order to develop an appropriate question? | Developing a Research Question | Need for more in-depth engagement in research literature so that questions relevant and significant to the field can be developed.
How do I develop a convincing rationale for this study? | Articulating a Rationale | Background/contextual readings that put the significance of the research in a broader societal/scientific context.
What theory or theories will inform my study? | Informing your Study with Theory | Reading of contemporary and/or classic theorists/theory; this may be directly related to the topic or may more broadly inform your thinking approach.
What do I need to read in order to design/apply suitable methods? | Designing Method | A review of past studies can inform your choice of method. Background reading relating to specific methods may also be necessary for your learning.
What research has already been conducted in this area that will inform my review? | Writing a Literature Review | Need for thorough and critical review of the research studies that have been conducted on this and similar topics, generally journal based.
[Figure labels: Using available resources; Honing your search skills; Keeping track of references; Writing relevant annotations; Developing your question; Arguing your rationale; Informing your study with theory; Designing method; Ensuring adequate coverage; Writing purposefully; Working on style and tone]
4.7 METHODOLOGY
4.7.1 Data Collection
The different types of data that are proposed to be collected should be
specifically mentioned. The sources for each type of data and the tools
and techniques that will be used for collecting different types of data should
be specified.
a) For the questionnaire or schedule to be used, the following points
should be covered.
(1) Distribution of questionnaire or schedule in different sections, e.g.
identification data, socio-economic data, questions on various sub-
themes.
(2) The approximate number of questions to be asked of each
respondent.
(3) The scaling techniques to be included in the instrument.
(4) The approximate time needed for the interview.
(5) Plans, if any, for index construction.
(6) A coding plan (whether the questions and responses will be pre-
coded or not; whether the coding is done for computer, or for
hand tabulation).
b) For interviews, provide the following details.
(1) How they are to be conducted (free associational, non-directive,
focused, direct or on telephone)?
(2) Particular characteristics that interviews must have.
c) Describe how the observation technique will be used.
(1) Provide the type of observation: participant, quasi-participant, non-
participant.
(2) Provide the units of observation.
(3) Will this be the only technique or will other techniques also be
employed?
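The coding plan mentioned under (a)(6) can be illustrated with a small sketch of a pre-coded question. This is only an illustration in Python, not a prescribed procedure: the question (occupation), the response categories and the numeric codes are all hypothetical.

```python
# Hypothetical codebook for one pre-coded question; the categories and
# codes are illustrative assumptions, not taken from any real schedule.
OCCUPATION_CODES = {
    "farmer": 1,
    "agricultural labourer": 2,
    "artisan": 3,
    "other": 8,
}
NO_RESPONSE = 9  # separate code so that blanks are not confused with "other"


def code_response(answer):
    """Return the numeric code for a raw answer to the occupation question."""
    if answer is None or not answer.strip():
        return NO_RESPONSE
    # Anything not listed in the codebook is coded as "other".
    return OCCUPATION_CODES.get(answer.strip().lower(), OCCUPATION_CODES["other"])


raw_answers = ["Farmer", "  artisan ", "", "Shopkeeper", None]
coded = [code_response(a) for a in raw_answers]
print(coded)
```

Deciding the codes before fieldwork, including a code for 'no response', is what makes the later editing and tabulation stages straightforward.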
4.7.2 Editing
During editing, look for
• completeness of responses (note that a blank space may mean ‘no
response’ or ‘don’t know’, unless you have made a category for each
of these responses)
• logical inconsistencies, correcting them whenever possible
• the possibility of combining responses, if that is more suitable for
analysis.
Editing should be done by the research team, or under its direction. If
several persons are involved in editing, as in the case of large surveys,
an editing manual should be compiled beforehand.
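The first two editing checks, completeness and logical consistency, can be sketched as a small routine of the kind an editing manual might describe. This is a minimal sketch in Python under stated assumptions: the field names, the records and the two consistency rules are hypothetical and would be replaced by the rules of your own study.

```python
# Hypothetical fields from a household schedule; names are assumptions.
REQUIRED_FIELDS = ["household_id", "age", "marital_status", "years_married"]


def edit_record(record):
    """Return a list of problems found in one completed schedule."""
    problems = []
    # Completeness: a blank is flagged rather than silently read as
    # 'no response' or 'don't know'.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing: {field}")
    # Logical consistency: a never-married respondent should not report
    # years of marriage, and years married cannot exceed age.
    age = record.get("age")
    years = record.get("years_married")
    if record.get("marital_status") == "never married" and years not in (None, "", 0):
        problems.append("inconsistent: never married but years_married reported")
    if isinstance(age, int) and isinstance(years, int) and years > age:
        problems.append("inconsistent: years_married exceeds age")
    return problems


records = [
    {"household_id": "H01", "age": 40, "marital_status": "married", "years_married": 15},
    {"household_id": "H02", "age": 35, "marital_status": "never married", "years_married": 4},
    {"household_id": "H03", "age": None, "marital_status": "married", "years_married": 10},
    {"household_id": "H04", "age": 22, "marital_status": "married", "years_married": 30},
]
report = {r["household_id"]: edit_record(r) for r in records}
for household, problems in report.items():
    print(household, problems or "clean")
```

The routine only flags problems; whether and how to correct them remains a decision for the research team.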
4.7.3 Training of Research Team Members
The research team, including in particular research assistants who join
just before the pre-test, must be given explicit training. They should not
only be able to collect data properly but also understand other procedures
such as the selection of sampling units, map reading and data handling. They
may also be involved in the pre-test and in the adjustment of instruction
sheets and data collection tools after the pre-test.
The training programme usually consists of
• discussion on the objectives and methodology of the study
• reading of manuals or instruction sheets prepared for the study
• interview training
• field experience (this should include participation in the pre-test
described below)
• discussion on data-collection tools and instruction sheets and how they
need to be adjusted (based on field-testing).
The research assistants should be trained together with the whole team,
including possible additional supervisors.
4.7.4 Pre-test or Pilot study of the Methodology
A pre-test usually refers to a small-scale trial of particular research
components. The pre-test should assess the validity of the data-collection
instruments and procedures, as well as the sampling procedures. A pilot
study is the process of carrying out a preliminary study, going through the
entire research procedure with a small sample. A pre-test or a pilot study
serves as a trial run that allows us to identify potential problems in the
proposed study. Although this means extra effort at the beginning of a
research project, the pre-test and/or pilot study enables us, if necessary,
to revise the methods and logistics of data collection before starting the
actual fieldwork. As a result, a good deal of time, effort and money can
be saved in the long run. Pre-testing is simpler, less time-consuming
and less costly than conducting an entire pilot study.
What aspects of your research methodology can be evaluated during
pre-testing?
1. Reactions of the respondents to the research procedures can be
observed in the pre-test to determine
• the availability of the study population and how respondents’ daily
work schedules can best be respected
• the acceptability of the methods used to establish contact with the
study population
• the acceptability of the questions asked
• the willingness of the respondents to answer the questions and
collaborate with the study.
2. Data collection tools can be pre-tested to determine
• whether the tools you use allow you to collect the information you
need, and whether those tools are reliable. You may find that some
of the data collected is not relevant to the problem, or is not in
a form suitable for analysis. This is the time to decide not to
collect this data, or to consider using alternative techniques that
will produce data in a more usable form.
• how much time is needed to administer the interview guide or
questionnaire, to conduct observations or group interviews, and/
or to make measurements.
3. Whether there is any need to revise the format or presentation of
interview guides/questionnaires, including whether
• the sequence of questions is logical
• the wording of the questions is clear
• translations are accurate
• space for answers is sufficient
• there is a need to pre-categorise some answers or to change closed
questions into open-ended questions
• there is a need to adjust the coding system
• there is a need for additional instructions for interviewers (e.g.,
guidelines for ‘probing’ certain open questions).
4. Sampling procedures can be checked to determine
• whether the instructions concerning how to select the sample are
followed in the same way by all staff involved
• how much time is needed to locate individuals to be included
in the study.
5. Staffing and activities of the research team can be checked, while all
are participating in the pre-test, to determine
• how successful the training of the research team has been
• what the work output of each member of the staff is
• how well the research team works together
• whether logistical support is adequate
• the reliability of the results when instruments or tests are
administered by different members of the research team
• whether staff supervision is adequate.
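One way to check the reliability of results across team members during the pre-test is to have two members record or code the same interviews and compare their entries. The sketch below, in Python, computes simple percent agreement and Cohen's kappa (agreement corrected for chance). Cohen's kappa is a standard statistic, but this unit does not prescribe it; the data and function names here are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of cases on which the two team members recorded the same code."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of each member's share for every category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    # Undefined when expected == 1 (both members used a single category).
    return (observed - expected) / (1 - expected)


# Hypothetical codes recorded by two team members for the same ten
# pre-test respondents.
member_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes"]
member_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

agreement = percent_agreement(member_a, member_b)
kappa = cohens_kappa(member_a, member_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

A low value would suggest that the instructions or the training need to be revised before the actual fieldwork starts.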
4.8 WORK SCHEDULE/TIME FRAME
The project should be broken up into suitable stages and the time required
for the completion of each stage of work should be specified. Such stages
may cover
• preparatory work, including selection and appointment of staff and their
training
• literature review
• data collection from secondary sources
• preparation of questionnaire and pre-testing
• pilot study, if any
• drawing of sample
• data collection from primary sources
• data processing (which should include coding, editing, verification,
sorting, and computer analysis)
• data analysis
• report writing.
The timetable should allow adequate time for each stage to carry the project
through to completion. A work schedule is a table that summarises the
tasks to be performed in a research project, the duration of each activity,
and who is responsible for the different tasks.
The work schedule includes
• the tasks to be performed
• the dates each task should begin and be completed
• research team, research assistants and support staff (typists) assigned
to the tasks
• person-days required by research team members, research assistants
and support staff.
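The elements of a work schedule listed above can be sketched as a simple table built in Python. The stages, durations, staff numbers and start date below are hypothetical, and the sketch makes two simplifying assumptions: stages follow one another without overlap, and person-days are taken as calendar days multiplied by the number of staff assigned.

```python
from datetime import date, timedelta

# Hypothetical stages: (task, duration in calendar days, staff assigned).
STAGES = [
    ("Preparatory work and staff training", 14, 3),
    ("Literature review", 21, 2),
    ("Questionnaire preparation and pre-testing", 14, 4),
    ("Primary data collection", 30, 6),
    ("Data processing and analysis", 21, 3),
    ("Report writing", 14, 2),
]


def build_schedule(start, stages):
    """Lay the stages end to end and return one row per task."""
    rows = []
    current = start
    for task, days, staff in stages:
        end = current + timedelta(days=days - 1)
        rows.append({"task": task, "begin": current, "end": end,
                     "person_days": days * staff})
        current = end + timedelta(days=1)  # next stage starts the following day
    return rows


schedule = build_schedule(date(2024, 1, 1), STAGES)
for row in schedule:
    print(f"{row['task']:<45}{row['begin']}  {row['end']}  {row['person_days']:>4}")
total_person_days = sum(row["person_days"] for row in schedule)
print("Total person-days:", total_person_days)
```

In a real project some stages overlap (for example, literature review and secondary data collection), so the dates would be set stage by stage rather than laid end to end.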
4.9 BUDGET
The budget should specify the resources needed to carry out all the tasks
specified above. The budget for research work is essentially a document
of its expenses. While constructing a budget, one should adopt a bottom-up
approach. For example, if fieldwork is involved, the budget for that
item should be calculated by estimating the number of trips required, their
duration and mileage, and so on.
The research proposal is required to outline the capital and running costs
together with the hidden costs such as the use of already existing
laboratories, libraries and computer facilities and technical and secretarial
help, in addition to the costs of travel of researchers and subjects. A portion
of the proposed budget should be reserved for unforeseen costs. A fully
itemized budget is necessary as granting bodies require a detailed breakdown
of the costs of the project. The golden rule is not to ask for too much
or too little. It is wise to find out in advance the likely figure a particular
granting authority will allow for work of the type proposed (which
provides a ceiling for the budget). A narrative portion of the budget is
used to explain any unusual items in the budget. If the costs are
straightforward, no explanation is needed. If a narrative is needed, it can
be structured in one of two ways.
1. Create “notes to the budget” with footnote-style numbers on the line
items in the budget, keyed to the numbered explanations.
2. Or, if extensive or more general explanation is required, structure the
budget narrative as straight text.
The cost of the project is to be estimated in terms of total person-months
and the facilities needed. Calculate it under the headings that follow.
(1) Personnel
(2) Travel
(3) Data Processing
(4) Stationery and printing
(5) Equipment
(6) Books and journals etc.
(7) Contingency expenses including postage
(8) Any other (specify)
(9) Overhead charges
(10) Grand total
While preparing budget estimates for the research proposal, the project
director should take into account the time budget, as well as the various
steps involved in the conduct of the research. The rationale for the
allocation of time and money for the various items of the budget estimates
must be furnished.
4.9.1 How Should a Budget be Prepared?
It is necessary to use the work plan as a starting point. Specify, for each
activity in the work plan, what resources are required. Determine, for each
resource needed, the unit cost and the total cost.
For example, in the work plan of a study to determine factors affecting farmers’
suicides in Andhra Pradesh, it is specified that 5 field investigators will
each visit 20 households, one per working day, as 100 households of
deceased farmers will need to be visited. With two vehicles this amounts
to 20 working days. Each research team member will be accompanied by
one of the research assistants. The budget for the fieldwork component
of the work plan will include funds for personnel salary, transport expenses,
accommodation and daily allowances as outlined below.
(a) Remuneration
(i) Two Field Investigators for 20 days
(ii) Two Field Assistants for 20 days
(b) Expenses for hiring two vehicles for 20 days
(i) Daily Allowance for the two field investigators for 20 days and
two field assistants for 20 days
(c) Hotel accommodation expenses
(i) Two Field Investigators for 20 days
(ii) Two Field Assistants for 20 days
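The bottom-up calculation for this fieldwork component can be sketched as follows. The quantities (two field investigators, two field assistants, two vehicles, 20 days) follow the itemised list above; every rate, and the 5 per cent contingency share, is a hypothetical placeholder chosen only to show the arithmetic.

```python
# All rates below are hypothetical placeholders in rupees; only the
# quantities come from the itemised fieldwork list.
DAYS = 20

line_items = [
    # (description, number of units, days, assumed rate per unit per day)
    ("Remuneration: field investigators", 2, DAYS, 1000),
    ("Remuneration: field assistants", 2, DAYS, 600),
    ("Vehicle hire", 2, DAYS, 2500),
    ("Daily allowance: investigators and assistants", 4, DAYS, 300),
    ("Hotel accommodation: investigators and assistants", 4, DAYS, 1200),
]

CONTINGENCY_RATE = 0.05  # assumed share reserved for unforeseen costs


def item_total(units, days, rate):
    """Total cost of one line item: units x days x unit cost per day."""
    return units * days * rate


for description, units, days, rate in line_items:
    print(f"{description:<52}{item_total(units, days, rate):>8}")

subtotal = sum(item_total(u, d, r) for _, u, d, r in line_items)
contingency = round(subtotal * CONTINGENCY_RATE)
grand_total = subtotal + contingency
print("Subtotal:", subtotal)
print("Contingency:", contingency)
print("Grand total:", grand_total)
```

Keeping the units, days and rates separate in this way also provides the rationale for each line that a granting body is likely to ask for.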
4.11 REFERENCES
You always need to reference all the literature that you refer to in your
proposal. When you use the Vancouver system, you will use consecutive
numbers in the text to indicate your references. At the end of your proposal
you will then list your references in that order, using the format described
above. If your research proposal has annexures, the references will come
before the annexures.
Activity 1: Prepare a research proposal for assessing the impact
of air pollution on the health of inhabitants in Delhi.
Alternatively, you can use the Harvard system and refer to the references
more fully in the text, putting the surname of the author, year of publication
and number(s) of page(s) referred to between brackets, e.g., (Basu 1998:15-
17). If this system of citation is used, the references at the end of the
proposal should be listed in alphabetical order. The Harvard author/date
system of referencing seems easier, as you can change the order of
paragraphs without consequences for your referencing. However,
computer programmes are now available that renumber your
references automatically if you reshuffle the text while using the Vancouver
system.
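The difference between the two systems can be seen in a short sketch in Python. It orders the same set of cited references in both ways: by first citation in the text (Vancouver) and alphabetically by author (Harvard). Apart from the "Basu 1998" label used in the example above, the authors, years and citation order are invented for illustration, and the output shows only the ordering, not the full reference format.

```python
# Hypothetical references, keyed by a short label.
REFERENCES = {
    "basu1998": {"author": "Basu", "year": 1998},
    "rao2005": {"author": "Rao", "year": 2005},
    "ahmed2001": {"author": "Ahmed", "year": 2001},
}

# Order in which the keys are cited in the body of the proposal.
CITATIONS_IN_TEXT = ["rao2005", "basu1998", "rao2005", "ahmed2001"]


def vancouver_list(citations, references):
    """Number references by order of first citation in the text."""
    order = list(dict.fromkeys(citations))  # drop repeats, keep first-seen order
    return [(n, references[key]["author"]) for n, key in enumerate(order, start=1)]


def harvard_list(citations, references):
    """List each cited reference once, alphabetically by author, then year."""
    cited = [references[key] for key in set(citations)]
    cited.sort(key=lambda ref: (ref["author"].lower(), ref["year"]))
    return [f"{ref['author']} ({ref['year']})" for ref in cited]


print(vancouver_list(CITATIONS_IN_TEXT, REFERENCES))
print(harvard_list(CITATIONS_IN_TEXT, REFERENCES))

# Reshuffling the text changes the Vancouver numbering but not the Harvard list.
reshuffled = list(reversed(CITATIONS_IN_TEXT))
print(vancouver_list(reshuffled, REFERENCES))
print(harvard_list(reshuffled, REFERENCES))
```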
In this section, you studied methodology, work schedule/time frame, budget
and dissemination strategy. Now answer the questions given in Check Your
Progress 3.
Check Your Progress 3
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Write short notes on the following: a) pilot study and b) pre-testing.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. Name some data collection tools and techniques used in social
science research.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
ANSWERS
Check Your Progress 1
1. Research is the systematic collection, analysis, and interpretation of
data to answer a certain question or solve a problem. The different
components of a research project proposal are (i) general introduction/
background/ importance of study problem; (ii) review of relevant
literature; (iii) objectives; (iv) methodology; (v) time frame; (vi)
budget, and (vii) references
2. Each topic that is proposed for research has to be judged according
to certain guidelines or criteria. There may be several ideas to choose
from. The guidelines or criteria for selecting a research topic are:
relevance, avoidance of duplication, urgency of data needed, political
acceptability of the study, applicability of results, and ethical acceptability.
Check Your Progress 2
1. The functions of the literature review in the research process are to
a. ensure that work is not duplicated
b. give credit to those who have laid the groundwork for your
research
c. demonstrate knowledge of the research problem
d. demonstrate your understanding of the theoretical and research
issues
e. show an ability to critically evaluate relevant literature
f. indicate the ability to integrate and synthesize the existing
literature.
2. The following are some of the important points in formulating research
objectives. Objectives should
• be clearly related to the statement of the problem
• be clearly phrased in operational terms, specifying exactly what
you are going to do, where, and for what purpose
• be realistic considering local conditions and available resources
• use specific action verbs (e.g., to determine, to identify)
and avoid vague non-action verbs (e.g., to study, to appreciate,
to understand).
Check Your Progress 3
1. a) Pre-test refers to a small-scale trial of specific research components.
The pre-test should assess the validity of the data collection instruments
and procedures, as well as the sampling procedures. b) A Pilot Study
is the process of carrying out a preliminary study, going through the
entire research procedure with a small sample.
2. In social science research, various types of techniques and tools are
used for data collection. The most important tools and techniques
for data collection are: a) questionnaire and schedule b) interviews
c) observation techniques, etc.
MDV-106: RESEARCH METHODOLOGY
IN DEVELOPMENT STUDIES
(6 CREDITS)
BLOCK  UNIT NOS  UNIT TITLES
1 Fundamentals of Social Science Research
1 Social Science Research: An Overview
2 Components of Social Science Research
3 Research Designs
4 Research Project Formulation
2 Development Research
1 Basics of Development Research
2 Methods of Development Research
3 Development Research Applications
3 Measurement and Sampling
1 Measurement
2 Scales and Tests
3 Reliability and Validity
4 Sampling
4 Data Collection
1 Quantitative Data Collection Methods and Devices
2 Qualitative Data Collection Methods and Devices
3 Data Sources
5 Data Analysis
1 Overview of Statistical Tools
2 Use of Computer in Data Analysis
3 Data Processing and Analysis
4 Report Writing
MDV-106
Research Methodology
in Development Studies
Indira Gandhi National Open University
School of Extension and Development Studies
Block
2
DEVELOPMENT RESEARCH
UNIT 1
Basics of Development Research
UNIT 2
Methods of Development Research
UNIT 3
Development Research Applications
COURSE PREPARATION TEAM
Unit Writer:
Prof. N. Narayanasamy, Gandhigram Rural Institute - Deemed University, Gandhigram (Units 1, 2 and 3)
Editors:
Prof. V.K. Jain (Rtd) (Content Editor), NCERT, New Delhi
Mr. Praveer Shukla (Language Editor), New Delhi
Dr. Nisha Varghese, IGNOU
Prof. B. K. Pattanaik, IGNOU
Prof. Nehal A. Farooquee, IGNOU
Prof. P.V.K. Sasidhar, IGNOU
UNIT 1 BASICS OF DEVELOPMENT
RESEARCH
Structure
1.1 Introduction
1.2 Concept and Purpose of Development Research
1.3 Approaches to Development Research
1.4 Types of Research in Development Work
1.5 Conditions and Key Principles of Development Research
1.6 Community Consultation and Development Research
1.7 Conventional Research and Development Research
1.8 The Roles Required of a Development Researcher
1.9 The Outcomes of Development Research
1.10 The Limitations of Development Research
1.11 Let Us Sum Up
1.12 Key Words
1.13 References and Selected Readings
1.14 Check Your Progress - Possible Answers
1.1 INTRODUCTION
We have seen in the previous block that research is an ‘organized inquiry’.
It is a systematic and logical study of an issue, a problem, or a phenomenon
through scientific methods. We have also seen that social sciences research
uses various methods, such as surveys, case studies, experiments, observations,
and so on. They are all time-tested methods and are widely used. They
have inherent merits and strengths. Yet, they do have certain limitations
and shortcomings, especially when they are applied to development-related
issues and initiatives at the micro level. The limitations include:
i) a top-down orientation and approach, with limited scope for the
participation of the respondents or subjects of research; ii) the use of only
one or two methods of enquiry (mono-method bias), which invariably fails to
capture the complexities of various issues at the grassroots level;
iii) rigidity, or a blueprint approach to the research process, which
prevents the researcher from seeing beyond what is in the design;
iv) research that takes a long time to complete, making the data and
information out of date; and v) the absence of avenues to share the findings
of the research with its subjects, which denies the respondents the
opportunity to know the outcome of the research. These limitations act as
barriers and impediments in taking the benefits of development initiatives
and interventions to the people, especially at the grassroots level.
Experience has shown that people, especially the deprived, underprivileged,
and marginalized sections of society, are excluded from the process of
development. Hence, development facilitators, in order to enable people to
be part and parcel of the development process, have come out with
alternative ways of doing research, which include improvisation of existing
methods and innovation of new methods. These methods are employed in
different stages/phases of development such as appraising the existing
situation, identifying and prioritizing problems and needs, finding
solutions, preparing development proposals, implementing, monitoring and
evaluating projects, learning lessons from the process, and reflecting on
them for future guidance. The outcome of all these efforts is the emergence
of the concept of development research.
After studying this unit you should be able to
• explain the meaning, concept and purpose of development research
• differentiate between the types of, and approaches to, development research
• describe the key principles of development research
• list the differences between conventional research and development
research
• identify the roles required of a development researcher
• explain the outcomes and limitations of development research.
Table 1.1: Distinctions between Development Research and Conventional Research
Point of Reference | Conventional Research | Development Research
Assumptions about reality | Assumption of singular, tangible reality. | Assumption of multiple realities that are socially constructed.
Scientific method | Scientific method is reductionist and positivist; complex world split into independent variables and cause-effect relationships; researchers’ perceptions are central. | Scientific method is holistic and post-positivist; local categories and perceptions are central; subject and method-data distinctions are blurred.
Strategy and context of inquiry | Investigators know what they want; pre-specified research plan or design. Information is extracted from respondents or derived from controlled experiments; context is independent and controlled. | Investigators do not know where the research will lead; it is an open-ended learning process. Understanding and focus emerge through interaction; context of inquiry is fundamental.
Who sets priorities? | Professionals set priorities. | Local people and professionals together.
Relationship between all actors in the process | Professionals control and motivate clients from a distance; they tend not to trust people (farmers, rural people etc.) who are simply the object of inquiry. | Professionals enable and engage in close dialogue; they attempt to build trust through joint analysis and negotiation; understanding arises through this engagement, resulting in inevitable interactions between the investigator and the ‘objects’ of research.
UNIT 2 METHODS OF DEVELOPMENT
RESEARCH
Structure
2.1 Introduction
2.2 Methods of Development Research
2.3 Direct Observation
2.4 Interviewing
2.5 Focus Group Discussion
2.6 Case Study
2.7 Process of Development Research
2.8 Let Us Sum Up
2.9 Key Words
2.10 References and Selected Readings
2.11 Check Your Progress - Possible Answers
2.1 INTRODUCTION
You have studied in the previous unit that research has become established
in recent years as a powerful tool for development workers. Development
workers widely differ (from conventional researchers) in their orientation
towards the use of research, as they intend it to be mostly participatory
and utility-oriented in approach, rather than research merely for the sake
of adding to the existing body of knowledge in a given field. Therefore,
research for development is utility-oriented. It is not pure, or basic research
as it is technically known. It is applied or application-oriented in approach.
Development research aims at providing better understanding of a given
context or phenomenon, so as to devise appropriate interventions. The
knowledge generated through development research usually enhances the
way development practice takes place. It expands avenues for better
comprehension of issues and intricacies, which results in enhancing the
effectiveness of development practice or social action.
After studying this unit, you should be able to:
• analyse the various methods of development research
• describe in detail the meaning, features, merits and limitations of
development research methods
• explain the process of development research.
2.4 INTERVIEWING
Interviewing and dialogue skills are essential in development research.
They are all the more important in rural development work, where outside
professionals’ information about and understanding of the rural
situation are limited. Interview, in the true sense, means dialogue. Dialogue
is based on people sharing their own perceptions of a problem and offering
their opinions and ideas on the issues or research questions in hand. A
series or chain of interviews with key informants helps us
gain more accurate insights into rural situations, problems, customs,
practices, systems and values, and the way rural people think, act, and
perceive things. Good interviewing and dialogue facilitate an information
flow that is true, authentic and relevant.
While interviewing is basically about asking questions and receiving
answers, there is much more to it than that, especially in a development
research context. The most common type of interviewing is individual, face-
to-face verbal interchange, but it can also take the form of face-to-face
group interviewing and telephone surveys. Interviewing can be structured,
semi-structured, or unstructured. It can be used for the purpose of
measurement or its scope can be the understanding of an individual or
a group perspective. An interview can be a one-time brief exchange, say
five minutes over the telephone, or it can take place over multiple, lengthy
sessions, sometimes spanning days, as in life history interviewing.
Interviewing can be used in focus group discussions (FGDs) and during
direct observations (DO) as well. It can be fully structured or
semi-structured. Semi-structured interviewing in itself has emerged as a skill
among development professionals involved in development research. Direct
observation always involves dialogue or interviewing. No observation can
take place in a dumbfounded, silent situation. Passive watching cannot be
construed as observation. Observation involves an inquisitive, alert and
questioning mind. Direct observation is not passive watching; it involves
questioning in an attempt to understand the phenomenon or subject being
observed. Therefore, direct observation is also considered a technique of
interviewing.
Similarly, FGD, as the name of the method itself suggests, involves dialogue/interview.
Often during DO and FGD, checklists are used for interviewing. When
the checklist contains only the possible lines of questioning (focused on the
core of the research), the method is known as semi-structured, and not
fully structured. Interviewing progresses based on subsequent responses.
Interviewing is one of the major methods of data collection in development
research. It is defined as ‘a two-way systematic conversation between an
investigator and an informant, initiated for obtaining information relevant
to a specific study’. It involves conversation as well as learning from the
respondent’s gestures, facial expressions, pauses, and body language, etc.
It is useful for collecting a wide range of data from factual demographic
data to highly personal and intimate information relating to a person’s
opinions, attitudes, beliefs, past experiences and future intentions.
Interviewing is often superior to other data-gathering methods, because
people are more willing to talk rather than write. It permits probing into
the context and reasons for answers to questions.
2.4.1 Types of Interviews
Interviews may be broadly classified into the following categories.
(i) Structured or Directive Interview
(ii) Unstructured or Non-Directive Interview
(iii) Semi-Structured Interview (SSI)
(i) Structured Interview is an interview made with a detailed standardised
schedule. The same questions are put to all the respondents and in
the same order. Each question is asked in the same way in each
interview, promoting measurement reliability. This type of interview
is used for large scale formalised surveys.
(ii) Unstructured Interview is the least structured one. The interviewer
encourages the respondent to talk freely about a given topic with
minimum prompting or guidance. Under this type, a detailed pre-
planned schedule is not used. The interviewer avoids channelling the
direction of the interview. Instead, he develops a very permissive
atmosphere. The questions are not standardised and not framed in a
particular way.
(iii) Semi-structured Interviewing (SSI) is a focused interview. Here, the
interviewer attempts to focus the discussion on the actual effects of
a given experience to which the respondents have been exposed. This
type of interview is free from the inflexibility of formal methods, yet
gives the interview a set form and ensures adequate coverage of all
topics.
Which type of interviewing method is used in development research
depends on the demands of the context. As participatory methods are at the
forefront of the work of several development organisations, most of them
prefer to use SSI for development-related research. It is more
conversational while still controlled and structured. In SSI, only
some of the questions and topics are predetermined. Most often it is only
a checklist of topics and subtopics that are used for conducting SSI, and
not the structured questions as such. Questions are framed, drawing points
from the checklist, based on how the interview progresses. It often becomes
a progressive learning process for the interviewer, who has to record all
the conversations using shorthand, scribbling, or by recording on a digital
voice recorder.
SSI, in short, is a conversation where only some of the questions are
predetermined, and new questions or insights arise as a result of discussion
and visualised analysis.
2. What is a case study?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
UNIT 3 DEVELOPMENT RESEARCH
APPLICATIONS
Structure
3.1 Introduction
3.2 Application of Development Research
3.3 Appraisal of Existing Situation
3.4 Formulation and Implementation of Development Research Project
3.5 Monitoring and Evaluation of Development Research Project
3.6 Let Us Sum Up
3.7 References and Selected Readings
3.8 Check Your Progress - Possible Answers
3.1 INTRODUCTION
The most common use of research by development professionals is to inform
the planning and implementation of their programmes. They may want to get to
grips with issues that are perceived to be important by local people, or to
understand in depth why things are the way they are. Sometimes this will mean
a broad look at a wide range of issues affecting a community, and sometimes
research will be targeted at a particular issue.
Increasingly, development professionals engage communities directly in the
research process of deciding priorities for their work, rather than regarding it
as primarily a technical exercise. Research for development has kept spreading,
from rural development projects to a wide set of other issues. The applications
include analysis, planning, implementation, and evaluation in many areas of
development.
The current application of participatory development approaches goes beyond
projects to application of participation in connection with programme strategies
and policies, national policy reforms like poverty reduction plans, and strategies
for decentralization. Development research methods can be practised with a wide
cross-section of audiences, as they are flexible, interactive, semi-structured,
visual and user-friendly. The strength of development research lies in its
methodological pluralism. It is the flexible and innovative combination of
these methods that enriches both the process and the outcome of the research.
After studying this unit, you should be able to:
• examine the areas of application of development research
• define project proposal and discuss the process of project formulation
• describe the contents of the project proposal
• develop a system for monitoring and evaluating a development project
• distinguish between process monitoring and progress monitoring
3.2 APPLICATION OF DEVELOPMENT RESEARCH
Among many development issues that students and development researchers
address, there are five areas where this research can be particularly
interesting, relevant, and rewarding. These include
(i) research on development policy
(ii) working with civil society or Non-Governmental Organisations (NGOs)
(iii) interpreting the imagery of development
(iv) understanding the historical construction of development
(v) using the potential of information and communication technologies
(ICTs) to contribute to development practices.
The three key components of development research application which are
discussed in this unit are
(i) Appraisal of Existing Situation
(ii) Research Project Formulation and Implementation
(iii) Monitoring and Evaluation
3.8 CHECK YOUR PROGRESS - POSSIBLE ANSWERS
Check Your Progress 1
1. The methods used for the appraisal of situation are Needs Assessment,
Stakeholder Analysis, Social Assessment, Beneficiary Assessment,
Social Audit, Participatory Poverty Assessment, Sustainable Livelihood
Analysis, Analysis of Hunger, Vulnerability Analysis, Institutional
Analysis, and Participatory Evaluation.
2. There are five areas where this research can be particularly interesting,
relevant, and rewarding. These include
(i) research on development policy
(ii) working with civil society or Non-Governmental Organisations
(NGOs)
(iii) interpreting the imagery of development
(iv) understanding the historical construction of development
(v) using the potential of information and communication technologies
(ICTs) to contribute to development practices.
Check Your Progress 2
1. A proposal is ‘a statement of purpose and plan’ that is presented for
someone’s acceptance. It intends to persuade that person / agency to
fund your project. Its components are: Title Page, Table of Contents,
Project Summary, Introduction, Project Context, Problem Statement,
Objectives, Anticipated Outcomes or Results, Implementation Plan
(work scope), Project Evaluation, Project Budget, Project Sustainability,
Appendices.
2. Types of Project Proposals can be explained at two levels, based on
time and nature. Projects based on time are at the pilot or experiment
level and at full length project level. Pilot projects are usually meant
for experimentation. They can be for understanding a specific situation
or for testing the validity of a solution at the micro level before it
is scaled up at full length project level. Projects based on nature are
(i) Research Project Proposals
(ii) Action Project Proposals
(iii) Business Project Proposals
Check Your Progress 3
1. Monitoring and evaluation is a management tool for tracking progress
of ongoing projects. The basic idea is to compare actual performance
with plans and to measure actual results against expected results. It
means, checking the progress, watching, tracking, finding the relative
position of a project as to where the project stands compared to where
it should be as per plan.
2. The steps involved in participatory monitoring and evaluation are:
review objectives and activities; review the reasons for evaluation (why
are we doing the evaluation, and what do we want to know?); develop
evaluation questions; decide who will do the evaluation; identify the
direct and indirect indicators; identify the information sources for the
evaluation questions; determine the skills and labour required to obtain
information; determine when information gathering and analysis can be
done; and determine who will gather information.
MDV-106: RESEARCH METHODOLOGY IN DEVELOPMENT STUDIES (6 CREDITS)
Block 1: Fundamentals of Social Science Research
    Unit 1: Social Science Research: An Overview
    Unit 2: Components of Social Science Research
    Unit 3: Research Designs
    Unit 4: Research Project Formulation
Block 2: Development Research
    Unit 1: Basics of Development Research
    Unit 2: Methods of Development Research
    Unit 3: Development Research Applications
Block 3: Measurement and Sampling
    Unit 1: Measurement
    Unit 2: Scales and Tests
    Unit 3: Reliability and Validity
    Unit 4: Sampling
Block 4: Data Collection
    Unit 1: Quantitative Data Collection Methods and Devices
    Unit 2: Qualitative Data Collection Methods and Devices
    Unit 3: Data Sources
Block 5: Data Analysis
    Unit 1: Overview of Statistical Tools
    Unit 2: Use of Computer in Data Analysis
    Unit 3: Data Processing and Analysis
    Unit 4: Report Writing
MDV-106
Research Methodology
in Development Studies
Indira Gandhi National Open University
School of Extension and Development Studies
Block 3
MEASUREMENT AND SAMPLING
UNIT 1 Measurement
UNIT 2 Scales and Tests
UNIT 3 Reliability and Validity
UNIT 4 Sampling
PROGRAMME DESIGN COMMITTEE (REVISED)
Prof. T.S. Papola Prof. Nadeem Mohsin (Rtd)
Institute for Studies in Industrial Development, A.N.Sinha Institute of Social Sciences, Patna
New Delhi Prof. Rajesh
Prof. S. Janakrajan University of Delhi, New Delhi
Madras Institute of Development Studies, Prof. B. K. Pattanaik
Chennai. IGNOU, New Delhi
Prof. S. K. Bhati Prof. Nehal A. Farooquee
Jamia Millia Islamia, New Delhi IGNOU, New Delhi
Prof. Preet Rustagi Prof. P.V.K. Sasidhar
Institute for Human Development, Delhi IGNOU, New Delhi
Prof. Gopal Iyer (Rtd) Dr. Pradeep Kumar
Panjab University, Chandigarh IGNOU, New Delhi
Dr. S. Srinivasa Rao Dr. Nisha Varghese
Jawaharlal Nehru University, New Delhi IGNOU, New Delhi
Dr. S. Rubina Naqvi Dr. Grace Don Nemching
Hindu College, University of Delhi, Delhi IGNOU, New Delhi
UNIT 1 MEASUREMENT
Structure
1.1 Introduction
1.2 Measurement – Meaning and Concept
1.3 Importance of Measurement
1.4 Measurement Postulates
1.5 Kinds of Measurement
1.6 Admissible Statistical Tests for Measurement
1.7 Criteria for Judging the Measuring Instruments
1.8 Sources of Errors in Measurement
1.9 Let Us Sum Up
1.10 Key Words
1.11 References and Selected Readings
1.12 Check Your Progress – Possible Answers
1.1 INTRODUCTION
Generally, the term measurement is understood as the measurement of length,
weight, quantity, etc. In statistics, however, the term is used more broadly
to refer to the ways in which variables are defined or categorised; these
are known as scales of measurement. In research one has to deal with
different kinds of data. For certain variables, the data meet the
requirements with respect to the parameters of the population. Such data
can be subjected to various mathematical as well as statistical operations.
They are known as parametric data. Nonparametric data, on the other hand,
lack those parameters and hence cannot be added, subtracted, multiplied or
divided. Measuring these two different kinds of data requires different
scales of measurement. This unit deals with the different scales of
measurement used in measuring different kinds of data.
After studying this unit you will be able to:
• explain the meaning and concepts of measurement in social science research
• describe the levels of measurement that quantify social variables
• distinguish between the various levels of measurement that have been used in social science research
• describe the importance of measurement.
1.4 MEASUREMENT POSTULATES
There are three postulates that are basic to measurement. A postulate is
an assumption that is an essential prerequisite for carrying out some
operations or line of thinking. In this case, it is an assumption about the
relations between the objects being measured.
The three postulates that are basic to measurement are as follows.
(a) Either (a = b) or (a ≠ b), but not both. This postulate says ‘a’ is either
equal to ‘b’ or not equal to ‘b’, but not both. We must be able
to assert either that one object is the same in a characteristic as another,
or that it is not the same. In measurement ‘the same’ does not necessarily
mean complete identity. It can mean ‘sufficiently the same’ to be
classed as members of the same set.
Example: Duration of variety X is greater than that of variety Y;
yield of variety X is greater than that of variety Y;
height of person X is greater than that of person Y.
(b) If (a = b) and (b = c) then (a = c). This postulate says, “If ‘a’ equals ‘b’,
and ‘b’ equals ‘c’, then ‘a’ equals ‘c’.” If one member of a universe
is the same as another member, and the second member is the same
as a third member, then the first member is the same as the third member.
This postulate enables a researcher to establish the equality of set
members on a characteristic by comparing objects.
Example: If large farmers who have T.V. and radio have the same
level of mass media exposure as small farmers who have T.V. and radio,
and small farmers in turn have the same level as marginal farmers who
have T.V. and radio, then large farmers and marginal farmers have the
same level of mass media exposure.
(c) If (a > b) and (b > c) then (a > c). The third postulate is of more immediate
and practical importance for our purpose. It says, “If ‘a’ is greater than
‘b’, and ‘b’ is greater than ‘c’, then ‘a’ is greater than ‘c’.” Other
relations can be substituted for ‘greater than’ (>), such as ‘less than’ (<),
‘is at a greater distance than’, ‘is stronger than’, and so on. Most
measurement in psychology and education depends on this postulate.
It must be possible to assert ordinal or rank-order statements, such as:
‘a’ has more of a property than ‘b’, ‘b’ has more of the property than ‘c’,
and thus ‘a’ has more of the property than ‘c’.
Example: The yield of variety X is more than that of variety Y, and the
yield of variety Y is more than that of variety Z; therefore, the yield of
variety X is more than that of variety Z.
The rate of adoption is higher among innovators than among early adopters,
and higher among early adopters than among the early majority.
Innovators > early adopters and early adopters > early majority;
therefore, innovators > early majority.
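The third postulate can be checked mechanically on a set of pairwise comparisons. The following is a minimal sketch in Python, not a standard procedure from the unit: the function name `is_transitive` and the "circular" judgements are hypothetical, introduced only for illustration. It shows that the adoption example satisfies postulate (c), whereas a set of circular judgements does not and therefore cannot be placed in rank order.

```python
from itertools import permutations

def is_transitive(greater_than):
    """Return True if the pairs (a, b), each meaning 'a > b', obey postulate (c)."""
    objects = {x for pair in greater_than for x in pair}
    for a, b, c in permutations(objects, 3):
        # Postulate (c): if a > b and b > c, then a > c must also hold.
        if (a, b) in greater_than and (b, c) in greater_than and (a, c) not in greater_than:
            return False
    return True

# Adoption-rate comparisons taken from the example in the text.
adoption = {
    ("innovators", "early adopters"),
    ("early adopters", "early majority"),
    ("innovators", "early majority"),
}

# Hypothetical circular judgements: X > Y, Y > Z, but Z > X.
circular = {("X", "Y"), ("Y", "Z"), ("Z", "X")}

print(is_transitive(adoption))
print(is_transitive(circular))
```

When a check of this kind fails, the judgements do not support an ordinal (rank-order) statement about the objects.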
In the above section you have studied about the meaning and postulates of
measurement. Now, answer the questions given in Check Your Progress-1.
Check Your Progress 1
1.12 CHECK YOUR PROGRESS – POSSIBLE
ANSWERS
Check Your Progress 1
1. Measurement consists of identifying the values which may be assumed
by some variable, and representing these values by some numerical
notation. The numerical notation is systematically and consistently
assigned, that is, it is assigned according to some set of rules.
2. The three postulates basic to measurement can be written as:
(a) Either (a = b) or (a ≠ b), but not both. We must be able to assert
either that one object is the same in a characteristic as another,
or that it is not the same.
(b) If (a = b) and (b = c) then (a = c). This postulate enables a researcher
to establish the equality of set members on a characteristic by
comparing objects.
(c) If (a>b) and (b>c) then (a>c). Most measurement in psychology
and education depends on this postulate.
Check Your Progress 2
1. When a scale has all the properties of ordinal and ordered metric scale,
and, when we have additional information about how large the
distances (intervals) between any two stimuli are, we have achieved
a more powerful measurement, stronger than ordinal. In such a device,
a measurement has been achieved in the sense of an interval scale.
2. The most important criteria to be used in evaluating a measurement
tool are unidimensionality, reliability and validity.
UNIT 2 SCALES AND TESTS
Structure
2.1 Introduction
2.2 Scales: Meaning and Techniques
2.3 Types of Rating Scales
2.4 Uses and Guidelines for Construction of Rating Scales
2.5 Rating Errors
2.6 Tests
2.7 Types of Objective Test Questions
2.8 Test Construction
2.9 Let Us Sum Up
2.10 Key Words
2.11 References and Selected Readings
2.12 Check Your Progress – Possible Answers
2.1 INTRODUCTION
In extension and development research, we quite often encounter the
problem of measurement. This is especially true when the concepts to be
measured are complex and when we do not possess standardised
measurement tools. To overcome this, social science researchers develop
self-reporting measuring instruments to assess people’s knowledge, opinions,
perceptions, attitudes, etc., on extension and development programmes.
Technically speaking, these self-reporting measurement instruments are
called scales and tests. Scales and tests are among the most popular
methods of observation and data collection in the behavioural as well as
the social sciences.
After studying this unit, you should be able to:
• discuss the meaning and applicability of scales and tests
• describe the important types of scales and tests
• explain the test as well as scale construction methodology
Direction: For each of the items listed in this scale, place an ‘X’ in one
of the columns to indicate the extent to which you feel that the student
possesses the particular characteristic / kind of behaviour.

Item                            | Never | Seldom | Sometimes | Usually | Always
Listens to others’ opinions     |       |        |           |         |
Accepts constructive criticism  |       |        |           |         |
Example: Suppose you would like to collect data from your study respondents
on their liking of information sources on development programmes. The
following is an example of a five-point graphic rating scale on liking of
information sources.
How do you like the following information sources for obtaining information
on development programmes?

Liking of information source
Information source         | Like very much | Like somewhat | Neutral | Dislike somewhat | Dislike very much
Institutional Sources
  BDO                      |                |               |         |                  |
  VDO                      |                |               |         |                  |
  Extension Personnel      |                |               |         |                  |
  Any other (please specify) |              |               |         |                  |
Non-Institutional Sources
  Other Beneficiaries      |                |               |         |                  |
  Village Key Personnel    |                |               |         |                  |
  Own Family Members       |                |               |         |                  |
  Any other (please specify) |              |               |         |                  |
Mass Media Sources
  Radio                    |                |               |         |                  |
  TV                       |                |               |         |                  |
2.3.3 Numerical Rating Scale
The numerical scale makes use of numbers to indicate the extent to which
an individual is believed to possess a certain characteristic or kind of
behaviour.
Example of a behaviour rating scale:
Directions: As you rate the student on each of the following items, circle
1 for inferior, 2 for below average, 3 for average, 4 for above average and
5 for superior.
1. Cooperates with students 1 2 3 4 5
2. Cooperates with teachers 1 2 3 4 5
3. Maintains an attractive appearance 1 2 3 4 5
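Once the ratings are circled, they can be recorded and summarised. The sketch below, in Python, is only an illustration: the ratings for the student are hypothetical, and the function name `summarise_ratings` is not from the unit. It also assumes that the researcher is willing to add and average the numbers, which treats the scale as more than ordinal.

```python
from statistics import mean

# Labels follow the directions in the text: 1 = inferior ... 5 = superior.
LABELS = {1: "inferior", 2: "below average", 3: "average",
          4: "above average", 5: "superior"}

def summarise_ratings(ratings):
    """Validate 1-5 ratings and return (total, mean) for one student."""
    for item, score in ratings.items():
        if score not in LABELS:
            raise ValueError(f"{item!r}: rating must be an integer from 1 to 5")
    scores = list(ratings.values())
    return sum(scores), mean(scores)

# Hypothetical ratings for one student on the three items above.
student = {
    "Cooperates with students": 4,
    "Cooperates with teachers": 5,
    "Maintains an attractive appearance": 3,
}

total, average = summarise_ratings(student)
print(total, average, LABELS[round(average)])
```

Rejecting values outside 1 to 5 at the point of entry guards against coding errors before any analysis is done.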
2.3.4 Itemized Rating Scale
It is also referred to as a specific category scale. In this type of scale, the
respondent selects the one category that best characterizes the behaviour
or characteristic of the object being rated. Suppose a teacher’s classroom
behaviour is being rated. The characteristics rated may be, say, alertness
or imaginativeness.
A category item might be ‘how alert is he / she?’ (Check one).
a) Very alert
b) Alert
c) Not alert
d) Not at all alert
A slightly different category item might be ‘how imaginative is he /she?’
(Check one)
a) Extremely imaginative
b) Very imaginative
c) Imaginative
d) Unimaginative
e) Very unimaginative
f) Extremely unimaginative
2.6 TESTS
Tests are frequently used in educational and psychological research,
and more recently in development studies, to measure the achievement and
personality traits of various categories of respondents.
According to the dictionary ‘test’ is defined as a series of questions on
the basis of which some information is sought. According to Bean (1953)
a test is “an organized succession of stimuli designed to measure
quantitatively or to evaluate qualitatively some mental process, trait or
characteristics”.
The two types of tests popularly used are:
• Objective Tests
• Teacher-Made Tests
2.6.1 Objective Tests
There are various types of objective tests viz.,
i. Achievement Test
ii. Diagnostic Test
iii. Intelligence Test
iv. Aptitude Test
v. Personality Test
Achievement Test
Achievement or proficiency test is one, which measures the extent to which
a person has acquired certain information or proficiency as a function of
instruction or training. The achievement test is used in order to assess the
achievement of a person in certain areas. For example a teacher can conduct
a test to assess the student achievement in mathematics.
Diagnostic Test
This test intends to assess the strengths and weaknesses of a person in one
or more areas of his/her activities. It is conducted with a view
to carrying out interventions in weak areas. It also makes an enquiry about
the weak areas of the respondent, who may be a student, employee or worker.
Intelligence Test
$$\mathrm{IDI} = \frac{R}{N}$$
UNIT 3 RELIABILITY AND VALIDITY
Structure
3.1 Introduction
3.2 Reliability
3.3 Methods of Determining the Reliability
3.4 Validity
3.5 Types of Validity
3.6 Reliability Or Validity - Which is More Important?
3.7 Let Us Sum Up
3.8 Keywords
3.9 References and Selected Readings
3.10 Check Your Progress – Possible Answers
3.1 INTRODUCTION
Dear learners, in the first unit of this block, we discussed that measurement
of social and psychological variables is a complex and demanding task.
In development research, the common term for any type of measurement
device is ‘instrument’. Thus, the instrument could be a test, scale,
questionnaire, interview schedule, etc. An important question that is often
asked is: what are the reliability and validity of the measuring instrument?
Therefore, the purpose of this unit is to help you understand the concepts
of reliability and validity and their interrelationship in extension and
development research.
After studying this unit you should be able to:
• discuss the meaning of reliability and the methods of determining the reliability of measuring instruments
• describe the meaning of validity, and the approaches to and types of validation of measuring instruments
• explain the interrelationship between reliability and validity of measuring instruments.
3.2 RELIABILITY
In the context of development research, one of the most important criteria
for the quality of measurement is the reliability of the measuring instrument.
A reliable person, for instance, is one whose behaviour is consistent,
dependable and predictable: what (s)he will do tomorrow and next week
will be consistent with what (s)he does today and what (s)he did
last week. An unreliable person is one whose behaviour is much more
variable; one can say (s)he is inconsistent.
The inherent aspects and synonyms of reliability are:
• dependability
• stability
• consistency
• predictability
• accuracy
• equivalence
3.2.1 What is Reliability of Measuring Instrument?
Reliability means the consistency with which the instrument yields similar
results. Reliability concerns the ability of different researchers to make the
same observations of a given phenomenon if and when the observation
is conducted using the same method(s) and procedure(s).
Stability and Equivalence Aspects of Reliability
Among the different aspects of reliability, stability and equivalence deserve
special attention.
• The stability aspect is concerned with securing consistent results with
repeated measurements by the same researcher using the same
instrument. We usually determine the degree of stability by comparing
the results of repeated measurements.
• The equivalence aspect considers how much error may be introduced
by different investigators or different samples of the items being
studied. A good way to test for the equivalence of measurements by
two investigators is to compare their observations of the same events.
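One common way of "comparing" two sets of results is to correlate them. The following is a minimal sketch in Python under that assumption; the scores and ratings are hypothetical, and the use of the Pearson correlation here is an illustrative choice rather than the only possible index. The same function serves both aspects: two administrations by one researcher (stability) and two investigators observing the same events (equivalence).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of five respondents on the same instrument,
# administered twice by the same researcher (stability).
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

# Hypothetical ratings of the same five events by two investigators (equivalence).
investigator_a = [3, 4, 2, 5, 4]
investigator_b = [3, 5, 2, 4, 4]

stability = pearson_r(first_administration, second_administration)
equivalence = pearson_r(investigator_a, investigator_b)
print(round(stability, 2), round(equivalence, 2))
```

A coefficient close to 1 suggests that the two sets of measurements agree closely; lower values point to error introduced between occasions or between investigators.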
3.2.2 How to Improve Reliability?
The reliability of measuring instruments can be improved in two ways.
i. By standardizing the conditions under which the measurement takes
place i.e. we must ensure that external sources of variation such as
boredom, fatigue etc., are minimized to the extent possible to improve
the stability aspect.
ii. By carefully designing directions for measurement with no variation
from group to group, by using trained and motivated persons to conduct
the research and also by broadening the sample of items used to
improve equivalence aspect.
$$R = \frac{2r}{1 + r}$$
r = estimated correlation between two halves (Pearson r).
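As a worked illustration of this formula (commonly known as the Spearman-Brown correction), suppose the correlation between the two halves is r = 0.60. This value is hypothetical and chosen only for ease of arithmetic.

```latex
R = \frac{2r}{1 + r} = \frac{2(0.60)}{1 + 0.60} = \frac{1.20}{1.60} = 0.75
```

The corrected value is higher than the half-test correlation because the full instrument is twice as long as either half.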
Advantages
• Both the test-retest and alternate-form methods require two test
administrations with the same group of people. In contrast, the
split-half method can be conducted on one occasion.
• Split-half reliability is a useful measure when it is impractical or
undesirable to assess reliability with two tests or to have two test
administrations because of limited time or money.
Limitations
• Alternate ways of splitting the items result in different reliability
estimates, even though the same items are administered to the same
individuals at the same time.
Example: The correlation between the first and second halves of the test
would be different from the correlation between odd and even items.
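This limitation can be seen in a small sketch in Python. The item scores below are hypothetical (six respondents, six items scored 1 or 0), and the function names are introduced only for illustration. The same data are split in two ways, first half against second half and odd against even items, and the formula R = 2r / (1 + r) is applied to each.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores, half_a, half_b):
    """Correlate the two half-test totals, then apply R = 2r / (1 + r)."""
    totals_a = [sum(person[i] for i in half_a) for person in item_scores]
    totals_b = [sum(person[i] for i in half_b) for person in item_scores]
    r = pearson_r(totals_a, totals_b)
    return 2 * r / (1 + r)

# Hypothetical data: each row is one respondent, each column one item.
scores = [
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
]

# Items are indexed from 0, so "odd" items 1, 3, 5 are indices 0, 2, 4.
first_second = split_half_reliability(scores, [0, 1, 2], [3, 4, 5])
odd_even = split_half_reliability(scores, [0, 2, 4], [1, 3, 5])
print(round(first_second, 2), round(odd_even, 2))
```

The two estimates differ although the items, respondents and occasion are identical, so a report of split-half reliability should state how the items were split.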
iv) Independent Criteria
Abstractly speaking, this is an ideal technique but its application is usually
difficult. There are four qualities desired in a criterion measure. In order
of their importance they are:
(a) Relevance: We judge a criterion to be relevant if standing on the
criterion measure corresponds to the scores on the scale.
(b) Freedom from bias: By this we mean that the measure should be one
on which each person has the same opportunity to make a good score.
Examples of biasing factors are variation in the quality
of equipment or conditions of work for a factory worker, or variation
in the quality of teaching received by students in different classes.
(c) Reliability: If the criterion score is one that jumps around from day
to day, so that the person who shows high job performance one week
may show low job performance the next or who receives a high rating
from one supervisor gets a low rating from another, then there is no
possibility of finding a test that will predict that score. A measure that
is completely unstable by itself cannot be predicted by anything else.
(d) Availability: Finally, in the choice of a criterion measure we always
encounter practical problems of convenience and availability. How long
will we have to wait to get a criterion score for each individual? How
much is it going to cost? Any choice of a criterion measure must take
practical limits into account.
However, when good independent criteria are available, validation against
them becomes a powerful tool and is perhaps the most effective of all
techniques of validation.
In this section you have read about validity and various approaches to
validation of a measuring instrument. Now try and answer the questions
given in Check Your Progress -2.
Check Your Progress 2
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Do you agree that ‘one validates not the measuring instrument, but
the purpose for which it is being used’? Write your agreement or
disagreement.
.................................................................................................................
.................................................................................................................
.................................................................................................................
2. Name the four approaches to validation of measuring instrument.
.................................................................................................................
.................................................................................................................
.................................................................................................................
Both content and criterion validities have limited usefulness for assessing
the validity of empirical measures of theoretical concepts employed in
development studies. In this context, construct validity must be investigated
whenever no criterion or universe of content is accepted as entirely adequate
to define the quality to be measured. Examination of construct validity
involves validation not only of the measuring instrument but of the theory
underlying it. If the predictions are not supported, the investigator may have
no clear guide as to whether the shortcoming is in the measuring instrument
or in the theory.
Construct validation involves three distinct steps.
a. specify the theoretical relationship between the concepts themselves
b. examine the empirical relationship between the measures of the
concepts
c. interpret the empirical evidence in terms of how it clarifies the
construct validity of the particular measure.
Indeed, strictly speaking, it is impossible to validate a measure of a concept
in this sense unless there is a theoretical network that surrounds the concept.
In this section you have read about the various types of validity. Now try
and answer the questions given in Check Your Progress – 3.
Check Your Progress 3
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Name the three types of validity.
..................................................................................................................
..................................................................................................................
..................................................................................................................
..................................................................................................................
2. Write the major difference between predictive and concurrent validities.
..................................................................................................................
..................................................................................................................
..................................................................................................................
..................................................................................................................
In Fig. 3.2, the gun is also aimed in the direction of the target, but the
shots are widely scattered, indicating low consistency or reliability. Thus
the poor reliability undermines an attempt to achieve validity.
Fig. 3.2: Unreliable which undermines the valid aim of the gun – Not useful
In Fig. 3.3, the gun is not pointed at the target, making it invalid, but
there is great consistency in the shots in one direction, indicating that it
is reliable (in a sense, it is very reliably invalid).
In Fig. 3.4, the gun is not pointed at the target making it invalid, and
the lack of consistency in the direction of the shots indicates its poor
reliability.
UNIT 4 SAMPLING
Structure
4.1 Introduction
4.2 Sampling: Meaning and Concepts
4.3 Types of Sampling
4.4 Sample Design Process
4.5 Errors in Sampling
4.6 Determination of Sample Size
4.7 Let Us Sum Up
4.8 Key Words
4.9 References and Selected Readings
4.10 Check Your Progress – Possible Answers
4.1 INTRODUCTION
Sampling is an age-old practice in everyday life. Whenever we want
to buy a large quantity of a commodity, we judge the whole lot by
examining a small fraction of it. It has been established that a
sample survey, if planned properly, can give very precise information. Since
in a survey only a part of the population is examined and inferences are drawn
about the whole population, the results are likely to differ from the
population values. The advantage of the sample survey is that this
type of error can be measured and controlled, and it can be reduced to
a great extent by employing properly trained persons in surveys. Other
advantages of sample surveys are that they take less time and cost
less. Usually, the population is too large for the researcher to attempt
to survey all of its members. A small but carefully chosen sample can
be used to represent the population, because such a sample reflects the
characteristics of the population from which it is drawn.
After studying this unit, you should be able to
• discuss the meaning and importance of sampling
• describe the steps and criteria involved in selecting a sampling
procedure
• distinguish between different types of sampling
• explain the process of determining sample size
2. Decide a random starting point in the table. Any point will do,
say the second row in the second column (Appendix 1).
3. Look at the first three digits at that point, because there are three
digits in 600.
4. If the number is less than 600, include it in the sample;
if not, move on to the next number whose first three digits are less
than 600.
5. From that point you can move in any direction. Select only three-digit
numbers that are less than 600, until you have 25 such
numbers.
Note: You can move in any direction in the random number table because
every digit has been placed in the table at random.
For example, if we start from the second row in the third column,
the random numbers are 31684, 09865, 14491, 34691, and so on, continuing
until 25 sample members are selected.
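The table-reading steps above can be sketched in Python. This is only an illustration under stated assumptions, not part of the original procedure: the function name `pick_from_table` is hypothetical, entries are read left to right from the first two rows of Appendix 1, each entry is padded to five digits on the assumption that leading zeros were dropped in printing, repeated numbers are skipped, and serial numbers 001 to 600 are all treated as valid (the unit itself says "less than 600").

```python
def pick_from_table(table_entries, population_size, sample_size):
    """Select serial numbers by reading a random number table in order."""
    width = len(str(population_size))      # 600 has three digits
    chosen = []
    for entry in table_entries:
        digits = str(entry).zfill(5)       # assumption: restore lost leading zeros
        number = int(digits[:width])       # look at the first three digits only
        # keep numbers that point at a real member and are not repeats
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# First two rows of Appendix 1, read left to right.
rows = [17147, 19519, 22497, 16857, 42426, 84822, 92598, 49186, 88247, 39967,
        13748, 4742, 92460, 85801, 53444, 65626, 58710, 55406, 17173, 69776]
sample = pick_from_table(rows, population_size=600, sample_size=5)
print(sample)
```

Only five members are drawn here to keep the output short; passing `sample_size=25` and more rows of the table would follow the same logic.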
2. Systematic Random Sample
Designing a Simple Random Sample is sometimes quite difficult and
time consuming, so a Systematic Random Sample is often used instead. Like a Simple
Random Sample, it uses a list of all members of the population as its
sampling frame. However, instead of using random numbers to select the
sample elements, the researcher applies a skip interval to the list to produce
a sample of the required size.
Skip interval = (number of elements in the population) / (the required sample size)
$$K = \frac{N}{n}$$
K = skip interval
N = Universe size
n = Sample size
For example, if we have to select a sample of 100 persons from a universe
of 1000, then the skip interval is 10. In this case one number between
1 and 10 has to be selected at random. Suppose 5 is selected; then the first sample
member would be the 5th and the next ones the 15th, 25th, 35th, 45th, and so on. One of the
advantages of this method is that it is more convenient than other methods
and simple to design. It is also used with very large populations.
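The skip-interval rule can be tried out with a short Python sketch. It assumes, for simplicity, that N is an exact multiple of n; the function name `systematic_sample` and its `start` parameter are illustrative choices, not terms from the unit.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the serial numbers picked with a fixed skip interval K = N / n."""
    skip = population_size // sample_size          # K = N / n
    if start is None:
        start = random.randint(1, skip)            # random start between 1 and K
    # every K-th member after the starting point
    return [start + i * skip for i in range(sample_size)]

# The example from the text: N = 1000, n = 100, random start 5.
picked = systematic_sample(1000, 100, start=5)
print(picked[:5])
```

Leaving `start` unset lets the function choose the starting point at random, which is what a real survey would do.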
3. Stratified Random Sample
In Stratified Random Sampling, the target population of N units is first
divided into k subpopulations of $N_1, N_2, \ldots, N_k$ units. These
subpopulations are non-overlapping and together they comprise the whole
population, so that
$$N_1 + N_2 + \cdots + N_k = N$$
The subpopulations are called strata. The number of units in each stratum should
be known. A sample is drawn from each stratum independently. The sample
sizes within the k strata are denoted by $n_1, n_2, \ldots, n_k$ respectively. If
a total sample of size n is to be drawn from the target population, then
$$n_1 + n_2 + \cdots + n_k = n$$
If a simple random sample is drawn in each stratum, the whole procedure
is described as stratified random sampling.
Stratified random sampling requires more than making a list of elements
(and estimating the number of elements on the list). It also involves
ordering that list by sub groups (or strata) and then sampling
randomly or systematically within those sub groups. This method of
sampling is used for the following reasons.
• It can reduce the errors in the statistical estimates calculated from the
sample.
• It allows you to create a sample that is a good representative of the
various sub groups in the population that you find to be of special
interest.
For example, the selected village may have households of SC, ST, OBCs,
Others, and Minority groups. The village population may first be divided into
smaller sub groups (strata) for these sections of the population, so that the
village sample consists of households from each stratum and
contains all the important characteristics of the village population. In the
case of SRS, some strata or sub groups may not be
included or covered adequately.
• This method helps in conducting and managing a large scale survey
in a country like India. The agency conducting the
survey may have field offices in different locations; each one can
supervise the survey for a part of the population.
• The basic idea is that it sub-divides the heterogeneous population into
homogeneous sub-populations. If each stratum is homogeneous in itself,
a precise estimate of any stratum mean can be obtained from a small
sample, thus saving a lot of time and cost.
There are two types of stratified samples.
A proportionate stratified sample selects the number of elements from
each stratum so that the stratum sample sizes ($n_1, n_2, \ldots, n_k$)
are proportional to their respective stratum population sizes
($N_1, N_2, \ldots, N_k$).
Consider the following examples:
• A selected village may have households of SC (10%), ST (5%), OBCs
(45%), Others (30%), and Minority (10%). A village sample of 100 may
be made up of households of the various castes in the above
proportions, so that the sample contains all important characteristics
of the village population.
• Hospital patients are stratified according to age, dividing the population
into those who are aged 50 years or above and those who are under
50. If there are twice as many people aged 50 or above admitted
to the hospital as those under 50, a proportionate stratified sample
will include twice as many people aged 50 or above.
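The village example can be worked through in a few lines of Python. This is a minimal sketch: the function name `proportionate_allocation` is hypothetical, and simple rounding is assumed, which in general may make the stratum sizes add up to slightly more or less than the intended total.

```python
def proportionate_allocation(stratum_percent, total_sample):
    """Give each stratum a share of the sample equal to its share of the population."""
    return {name: round(total_sample * pct / 100)
            for name, pct in stratum_percent.items()}

# Population shares (in percent) from the village example in the text.
village = {"SC": 10, "ST": 5, "OBC": 45, "Others": 30, "Minority": 10}
allocation = proportionate_allocation(village, total_sample=100)
print(allocation)
```

With a sample of 100, each stratum's sample size equals its percentage share, so the sample mirrors the composition of the village.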
A disproportionate stratified sample selects the number of elements from
each stratum so that the stratum sample size is not proportional to the
stratum population size. The most common reason for selecting this type
of sample is to study a relatively rare but important
subpopulation, such as younger patients suffering from heart disease.
Proportionate stratification may result in too few such elements being selected,
so that little, if any, statistical analysis can be done. Consequently, even
if these patients represent only 1% of the population, you might decide
to make them 10% of the final sample. However, when the values
from all strata are combined, each stratum must be weighted by its actual
share of the population so that the over-sampled stratum does not dominate
the result; this is called a weighted estimate.
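A small Python sketch may help show why the readjustment matters. All figures below are hypothetical and chosen only for illustration; the function name `weighted_estimate` is likewise an assumption. The sketch combines stratum means using each stratum's share of the population (1% and 99%) and compares the result with what would be obtained if the sample shares (10% and 90%) were used by mistake.

```python
def weighted_estimate(stratum_means, shares):
    """Combine stratum means using the given share for each stratum."""
    return sum(stratum_means[s] * shares[s] for s in stratum_means)

# Hypothetical stratum means, for illustration only.
means = {"younger patients": 40.0, "other patients": 20.0}
population_shares = {"younger patients": 0.01, "other patients": 0.99}
sample_shares = {"younger patients": 0.10, "other patients": 0.90}

weighted = weighted_estimate(means, population_shares)   # uses true population shares
unweighted = weighted_estimate(means, sample_shares)     # ignores the over-sampling
print(round(weighted, 2), round(unweighted, 2))
```

The unweighted figure is pulled towards the over-sampled group, while the weighted figure restores each stratum to its real importance in the population.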
4. Probability Proportional to Size (PPS) Sample
It has been observed that the elementary units of a population vary in
size. Such ancillary information about the size of the unit can be utilized
in selecting the sample so as to get better and more efficient estimates of the
population parameter. For example, villages with a larger geographical area
are likely to have a larger area under food crops; therefore, in estimating
the production, it would be desirable to adopt a sampling scheme in which
villages are selected with probability proportional to geographical area.
When units vary in size and the variable under study is directly related
to the size of the unit, the selection probabilities may be assigned in proportion
to the size of the unit.
Probability Proportional to Size (PPS) Sampling gives a higher probability
of selection to sampling units which are larger in size. This technique was
initially used in the estimation of crop production, fruit production, etc., because
productivity is directly related to the size of the field. In social science
surveys, too, the characteristics of a village population are influenced by the size
of the population. The procedure for selecting the sample is described below.
Suppose you have to select 5 villages from a list of 10 using PPS
sampling. First arrange all villages in ascending or descending order of
population size, as shown in column 2 of Table 1. Then, in the
third column, find the cumulative sum of population size and, in the fourth
column, assign each village a range of serial numbers, as shown in the table.
Table 1: Village Population Size
Serial    Village       Cumulative Sum    Cumulative Population
Number    Population    of Population     Size Interval
          Size          Size
(1)       (2)           (3)               (4)
1         200           200               0001 - 0200
2         250           450               0201 - 0450
3         300           750               0451 - 0750
4         350           1100              0751 - 1100
5         400           1500              1101 - 1500
6         450           1950              1501 - 1950
7         500           2450              1951 - 2450
8         550           3000              2451 - 3000
9         600           3600              3001 - 3600
10        650           4250              3601 - 4250
Total     4250
Please notice that the total population of all villages in the target population
is a four-digit number (4250). Therefore, a four-digit random number
that is less than or equal to the total population of all villages
(4250) is selected from the random number table. Suppose it is 0331,
which falls in the interval of serial number 2. The next random number is 4320,
which is greater than 4250; therefore, it is discarded. The next number selected
is 1296, which corresponds to serial number 5. The next random numbers may be
1553, 2402 and 3640, which correspond to serial numbers 6, 7, and
10 respectively. In this way, the selected villages will be serial numbers 2,
5, 6, 7, and 10.
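The cumulative-total procedure can be checked with a short Python sketch. It is an illustration only: the function name `pps_select` is hypothetical, and skipping a village that is hit a second time (sampling without replacement) is an assumption the text does not state. With the intervals of Table 1, the random number 2402 falls in 1951 - 2450, which is serial number 7.

```python
from bisect import bisect_left
from itertools import accumulate

def pps_select(sizes, random_numbers, how_many):
    """Pick units with probability proportional to size using cumulative totals."""
    cumulative = list(accumulate(sizes))         # 200, 450, 750, ...
    total = cumulative[-1]
    selected = []
    for r in random_numbers:
        if r < 1 or r > total:
            continue                             # discard numbers outside 1..total
        # first interval whose upper bound is at least r
        serial = bisect_left(cumulative, r) + 1
        if serial not in selected:               # assumption: no repeats
            selected.append(serial)
        if len(selected) == how_many:
            break
    return selected

sizes = [200, 250, 300, 350, 400, 450, 500, 550, 600, 650]   # column 2 of Table 1
draws = [331, 4320, 1296, 1553, 2402, 3640]                  # random numbers from the text
villages = pps_select(sizes, draws, 5)
print(villages)
```

Because larger villages own wider intervals, a random number is more likely to land on them, which is exactly what "probability proportional to size" means.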
5. Cluster Sample
Cluster sampling is a sampling technique used when natural groupings
are evident in a statistical population. It is often used in marketing research.
In this technique, the total population is divided into these known groups
(or clusters) and a sample of the groups is selected. Then the required
information is collected from the elements within each selected group. This
may be done for every element in these groups, or a sub sample of elements
may be selected within each of these groups. The technique works best
when most of the variation in the population is within the groups, not
between them.
Briefly, the procedure for selecting a cluster sample is given below.
• The population is divided into N groups, called clusters.
• The researcher randomly selects n clusters to include in the sample.
• The number of observations within each cluster is known:
$M = M_1 + M_2 + M_3 + \cdots + M_N$
• Each element of the population can be assigned to one, and only one,
cluster.
Cluster sampling should be used only when it is economically justified,
that is, when the reduced costs outweigh the losses in precision.
This is most likely to occur in the following situations.
• Constructing a complete list of population elements is difficult, costly,
or impossible. For example, it may not be possible to list all elementary
units of the population, such as all households in a village or block.
However, it would be possible to randomly select a subset of
villages or blocks (stage 1 of cluster sampling) and then interview the
head of the family in a house of the selected cluster (stage 2).
• The population is concentrated in natural clusters (city blocks, schools,
hospitals, etc.). For example, to conduct personal interviews of
operating room nurses, it might make sense to randomly select a sample
of hospitals (stage 1 of cluster sampling) and then interview all of
the operating room nurses at each selected hospital. Using cluster sampling, the
interviewer could conduct many interviews in a single day at a single
hospital. Simple random sampling, in contrast, might require the
interviewer to spend all day travelling to conduct a single interview
at a single hospital.
As discussed above, in the cluster sampling method the primary sampling
unit is not a household but a cluster of households. The clusters may be
natural, viz., hamlets in villages, or created, viz., schools, malls, etc.
The first-stage list of clusters may be selected using the SRS or the PPS sampling
techniques. Then, from each selected cluster, all units or some of the units
may be selected as per the required sample size, using Stratified Random
Sampling or Systematic Random Sampling techniques.
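The two stages can be sketched in Python as follows. This is a simplified illustration, not the procedure prescribed in the unit: it uses simple random sampling at both stages (the text suggests SRS or PPS at stage 1 and stratified or systematic sampling at stage 2), and the function name, the hamlet names and the household labels are all hypothetical.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, units_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units inside each chosen cluster."""
    rng = random.Random(seed)
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)   # stage 1: SRS of clusters
    sample = {}
    for name in chosen_clusters:
        units = clusters[name]
        k = min(units_per_cluster, len(units))    # take every unit if the cluster is small
        sample[name] = rng.sample(units, k)       # stage 2: SRS within the cluster
    return sample

# Hypothetical frame: 6 hamlets with 20 numbered households each.
hamlets = {f"hamlet_{i}": [f"h{i}_{j}" for j in range(1, 21)] for i in range(1, 7)}
sample = two_stage_cluster_sample(hamlets, n_clusters=3, units_per_cluster=5, seed=1)
for hamlet, households in sample.items():
    print(hamlet, households)
```

Only the three selected hamlets need to be visited, which is where the saving in travel time and cost comes from.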
This sampling technique is quite popular in evaluation surveys in health,
where it is also called the 30 Cluster Sampling Technique. It is also a rapid
method of data collection, as the researcher can collect more data in less
time because travel time is lower than with other
sampling techniques.
4.3.2 Non-Probability Sampling
A non-probability sample is one in which a case is chosen
in such a manner that it gives you information for the sample itself and
makes it possible to generalize the findings to the population with a certain
degree of precision. Such a sample is also called a purposive sample. This
kind of sampling is primarily used in market surveys
to learn the attitudes, opinions, behaviour, and reactions of individuals. There
are many types of non-probability samples, including convenience,
purposive/judgment, quota, and snowball sampling.
1. Convenience Sample
The convenience sample is so called because it is relatively easy to
obtain and contact. In this method the investigators are usually asked
to select the people for the interview in accordance with the instructions
from the researcher. The benefit of a convenience sample is that the
interviewer can usually get interviews done quickly and cheaply.
Convenience sampling is appropriate for exploratory research.
2. Judgment Sample
A judgment sample is similar to a convenience sample. In a
judgment sample, the researcher selects samples that are believed to
represent the population. The selection of samples is based on the
knowledge of the population and the characteristics which the sample
is to represent. It is less costly and very useful for forecasting.
3. Quota Sample
Quota sampling is like stratified sampling. In quota sampling, the
population is categorized into several strata, each with an
expected size, and the samples are considered to be important for the
population they represent. The advantages of a quota sample are that
it takes little time, is less costly, and gives moderate
representation to a heterogeneous population.
4. Snowball Sample
This is one of the important types of non-probability sampling. In
snowball sampling, the investigator encourages the respondents to give
the names of other acquaintances, and the sample continues growing in size and
chains until the research purpose is achieved. It is, therefore, also
known as the networking, chain, or referral sampling method. It is very
useful in the study of networks and is less costly.
A comprehensive overview of the various types of sampling can be seen
in Figure 4.1
Fig. 4.1: Types of Sample
$$\text{Standard error} = \sqrt{\frac{p(1-p)}{n}} \qquad (2)$$
Here, p represents the proportion of successes (favourable responses, those
who received the benefits), q = (1 - p) represents the proportion of failures
(those who did not receive the benefits), and n is the total number of
respondents. The standard error of a statistic is greatest when p and (1 - p)
are equal, which occurs when each is 0.50, or 50%, of the sample.
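The behaviour of Equation 2 can be explored with a brief Python sketch. It assumes the usual square-root form of the standard error of a proportion, and the sample size of 400 is a hypothetical value used only for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion p based on n respondents."""
    return math.sqrt(p * (1 - p) / n)

# Compare several proportions for the same (hypothetical) sample of 400 respondents.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(standard_error(p, 400), 4))
```

The printed values rise towards p = 0.5 and fall symmetrically on either side, which is why 50% is treated as the most conservative assumption when nothing is known about p in advance.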
(ii) Non-Sampling Error
Before discussing how to determine sample size, we will briefly review
other sources of error in surveys. When you read a news article that reports
the results of a national poll, the error listed for the estimates is,
generally speaking, derived from Equation 2. However, experienced survey
researchers know that errors due to other sources are typically greater than
the error due to sampling alone. Following are some other types of errors.
• Measurement errors, caused by poorly written questions, poorly
designed questionnaires, respondent errors in completing questionnaires,
and so on.
• Non-response errors, caused when the respondents are not a
representative subset of the population.
• Data coding errors, caused by errors in coding and entering the data.
Of these error sources, the first two are typically more severe. In mail
surveys, non-response error is often the most serious problem.
There are two critical characteristics of these non-sampling errors. First,
as mentioned above, their sum is often greater than the sampling error.
Second, and more insidious, these errors are often impossible to estimate
for any one survey, especially measurement and non-response errors.
Consequently, using Equation 1 and Equation 2 to estimate the error in
a statistic often provides a false sense of security.
Experienced survey researchers take this fact into account by being more
cautious in discussing survey results than the sampling error alone would
indicate, and you should do the same. Ideally, the other sources of error
would balance themselves out so that errors in one direction negate errors
in the other direction, but you cannot assume that this is the case.
In this section you studied errors in sampling and the determination of sample
size. Now, answer the questions given in Check Your Progress-3.
Check Your Progress 3
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. What is sampling error?
....................................................................................................................................................
....................................................................................................................................................
....................................................................................................................................................
....................................................................................................................................................
2. How is the sample size determined using the formula?
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
Appendix 1
17147 19519 22497 16857 42426 84822 92598 49186 88247 39967
13748 4742 92460 85801 53444 65626 58710 55406 17173 69776
87455 14813 50373 28037 91182 32786 65261 11173 34376 36408
8999 57409 91185 10200 61411 23392 47797 56377 71635 8601
78804 81333 53809 32471 46034 36306 22498 19239 85428 55721
82173 26921 28472 98958 7960 66124 89731 95069 18625 92405
98594 25168 89178 68190 5043 17407 48201 83917 11413 72920
73881 67176 93504 42636 38233 16154 96451 57925 29667 30859
46071 22912 90326 42453 88108 72064 58601 32357 90610 32921
44492 19686 12495 93135 95185 77799 52441 88272 22024 80631
31984 72170 37722 55794 14636 5148 54505 50113 21119 25228
51574 90692 43339 65689 76539 27909 5467 21727 51141 72949
35350 76132 90925 92124 92634 35681 43690 89136 35599 84138
46943 36502 1172 46045 46991 33804 80006 35542 61056 75666
22665 87226 33304 57975 3985 21566 65796 82915 81466 89205
39437 97957 11838 10433 21564 51570 73558 27495 34533 57808
77082 47784 40098 97962 89845 28392 78187 6112 8169 11861
24544 25649 43370 28007 6779 72402 62632 53956 24709 6978
27503 15558 37738 24849 70722 71859 83736 6016 94397 12529
24590 24545 6435 52758 45685 90151 45616 49644 92686 84870
48155 86226 40359 28723 15364 69125 12609 57171 86857 31702
20226 53752 90648 24367 83314 14 19207 69413 97016 86290
70176 73444 38790 53626 93780 18626 68766 24371 74639 30782
10169 41465 51935 5711 9799 79077 88159 33437 68519 3040
81084 3701 28598 70013 63794 53169 97054 60303 23259 96196
69202 20777 21727 81511 51887 16175 53746 46516 70339 62727
80561 95787 89426 93325 86412 57479 54194 52153 19197 81877
8199 26703 95128 48599 9333 12584 24374 31232 61782 44032
98883 28220 39358 53720 80161 83371 15181 11131 12219 55920
84568 69286 76054 21615 80883 36797 62845 39139 90900 18172
4269 35173 95745 53893 86022 77722 52498 84193 22448 22571
10538 13124 36099 13193 37706 44562 57179 44693 67877 1549
77843 24955 25900 69843 95029 93859 93634 20205 66294 41218
12034 94636 49455 76362 83532 31062 69903 91186 65768 55949
10524 72823 47641 93315 80875 28090 97728 52560 34937 79548
68935 76632 46984 61772 92786 22651 7086 89754 44143 97687
83450 65665 29190 43709 11172 34481 95977 47535 25658 73898
90696 20451 24211 97310 60446 73530 62865 96574 13829 72226
49006 32047 93086 112 20470 17136 28255 86328 7293 38809
74591 87025 52368 59416 34417 70557 86745 55809 53628 12000
6315 17012 77103 968 7235 10728 42189 33292 51487 64443
62386 9184 62092 46617 99419 64230 96034 85481 7857 42510
86848 82122 4028 36959 87827 12813 8627 80699 13345 51695
65643 69480 46598 4501 40403 91408 32343 48130 49303 90689
11084 46534 78957 77353 39578 77868 22970 84349 9184 10603
$$n_1 + n_2 + \cdots + n_k = n$$
If a simple random sample is drawn in each stratum, the whole
procedure is described as stratified random sampling.
2. Cluster sampling is a sampling technique used when natural groupings
are evident in a statistical population. It is often used in marketing
research. In this technique, the total population is divided into these
known groups (or clusters) and a sample of the groups is selected.
Then, the required information is collected from the elements within
each selected group. This may be done for every element in these
groups or a sub sample of elements may be selected within each of
these groups. The technique works best when most of the variation
in the population is within the groups, not between them.
3. Quota sampling is like stratified sampling. In quota sampling, the
population is categorized into several strata, each with an
expected size, which are considered to be important for the
population they are supposed to represent. The advantages of the quota
sample are that it takes less time, is less costly, and gives moderate
representation to a heterogeneous population.
Check Your Progress 3
1. By definition, when you have collected a sample from a population,
you have less than complete information about the population. This,
in turn, means that there is a chance that the sample statistics you
calculate (for example, the mean of a variable, a frequency distribution,
etc.) may differ from the corresponding population parameters. This
error is called sampling error.
2. The calculation of the sample size is concerned with the number of
respondents required. To determine the number to select for the
sample drawn from the sampling frame, you must estimate the
non-response rate. The actual sample size to be drawn is:
$$\text{Sample size} = \frac{\text{Number of respondents}}{\text{Response rate}}$$
So, if a survey organization decides that it needs 700 respondents,
and the expected response rate from the population is 50%, then 700/0.50,
or 1400, customers must be drawn from the sampling frame.
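The same calculation can be written as a small Python function. The name `sample_to_draw` is hypothetical, and rounding up to the next whole person is an added assumption for cases where the division does not come out even.

```python
import math

def sample_to_draw(respondents_needed, response_rate):
    """Number of units to draw from the sampling frame for a given response rate."""
    return math.ceil(respondents_needed / response_rate)

# The example from the text: 700 respondents needed, 50% expected response rate.
needed = sample_to_draw(700, 0.50)
print(needed)
```

A lower expected response rate raises the number that must be drawn, so an honest estimate of non-response is worth making before fieldwork begins.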
MDV-106
Research Methodology
in Development Studies
Indira Gandhi National Open University
School of Extension and Development Studies
Block
4
DATA COLLECTION
UNIT 1
Quantitative Data Collection Methods and Devices
UNIT 2
Qualitative Data Collection Methods and Devices
UNIT 3
Data Sources
PROGRAMME DESIGN COMMITTEE
Prof. Amita Shah Prof. P. Radhakrishan
Gujarat Institute of Development Research, Madras Institute of Development Studies, Chennai
Ahmedabad Prof. Ramashray Roy (Rtd)
Prof. S. K. Bhati Centre for Study of Developing Societies, Delhi
Jamia Millia Islamia, New Delhi Prof. R. P. Singh ( Rtd)
Prof. J. S. Gandhi (Rtd) Ex-Vice-Chancellor, MPUAT, Udaipur
Jawaharlal Nehru University, New Delhi Prof. K. Vijayaraghavan
Prof. Gopal Krishnan (Rtd) Indian Agriculture Research Institute, New Delhi
Punjab University, Chandigarh Dr. Nilima Shrivastava
Prof. S. Janakrajan IGNOU, New Delhi
Madras Institute of Development Studies, Prof. B. K. Pattanaik
Chennai IGNOU, New Delhi
Prof. Kumar B. Das Dr. Nehal A. Farooquee
Utkal University, Bhubaneswar IGNOU, New Delhi
Prof. Nadeem Mohsin (Rtd) Dr. P.V.K.Sasidhar
A.N.Sinha Institute of Social Sciences, Patna IGNOU, New Delhi
PROGRAMME DESIGN COMMITTEE (REVISED)
Prof. T.S. Papola Prof. Nadeem Mohsin (Rtd)
Institute for Studies in Industrial Development, A.N.Sinha Institute of Social Sciences, Patna
New Delhi Prof. Rajesh
Prof. S. Janakrajan University of Delhi, New Delhi
Madras Institute of Development Studies, Prof. B. K. Pattanaik
Chennai. IGNOU, New Delhi
Prof. S. K. Bhati Prof. Nehal A. Farooquee
Jamia Millia Islamia, New Delhi IGNOU, New Delhi
Prof. Preet Rustagi Prof. P.V.K. Sasidhar
Institute for Human Development, Delhi IGNOU, New Delhi
Prof. Gopal Iyer (Rtd) Dr. Pradeep Kumar
Panjab University, Chandigarh IGNOU, New Delhi
Dr. S. Srinivasa Rao Dr. Nisha Varghese
Jawaharlal Nehru University, New Delhi IGNOU, New Delhi
Dr. S. Rubina Naqvi Dr. Grace Don Nemching
Hindu College, University of Delhi, Delhi IGNOU, New Delhi
COURSE PREPARATION TEAM
Unit Writers:
Mr. P. Shukla, Saket, New Delhi (Unit 1)
Dr. V. Sailaja, S. V. Agricultural College, Tirupati (Unit 2)
Dr. Nisha Varghese, SOEDS, IGNOU (Unit 3)
Editing and Proof Reading:
Prof. V.K.Jain (Rtd) (Content Editor), NCERT, New Delhi
Mr. Praveer Shukla (Language Editor), New Delhi
Dr. Nisha Varghese, IGNOU
Prof. B. K. Pattanaik, IGNOU
Prof. Nehal A. Farooquee, IGNOU
Prof. P. V. K. Sasidhar, IGNOU
UNIT 1 QUANTITATIVE DATA COLLECTION METHODS AND DEVICES
Structure
1.1 Introduction
1.2 Primary Data Collection: Meaning and Methods
1.3 Questionnaire Methods of Data Collection
1.4 Interview Schedule
1.5 Secondary Data Methods
1.6 Let Us Sum Up
1.7 References and Selected Readings
1.8 Check Your Progress - Possible Answers
1.1 INTRODUCTION
There are two types of primary research: one is done through quantitative
data collection and the other through qualitative data collection. Customarily,
quantitative data collection means using numbers to assess information. As
you are aware, some kinds of information are numerical in nature, for
example, a person’s age or annual income. The answers to such questions
are in numbers.
Quantitative data is used for testing hypotheses and drawing inferences.
Quantitative data is collected by using the following two sets of data
sources:
i. Primary data
ii. Secondary data.
In this unit, we will discuss in detail the methods of collecting primary and
secondary data, along with the advantages and disadvantages of the
methods.
After reading this unit, you should be able to
• explain the primary data collection methods
• discuss the questionnaire and interview methods of data collection
• describe secondary methods of data collection
Sample Questionnaire
Indira Gandhi National Open University
School of Extension and Development Studies
MA in Development Studies
Title: Functioning of Primary School in Bhilwara village
1. Name of the State .........................................................................
2. Name of the District .....................................................................
3. Name of the Block........................................................................
4. Name of the Village ......................................................................
5. Name of the Teacher (Respondent) .............................................
6. Sex: Male/Female…………….
7. Age ………………………..
8. Educational Qualification ...............................................................
9. Caste………………..
10. Marital Status …………………………
11. Years of Teaching ..........................................................................
12. Training received, if any...............................................................
13. If yes, write the subjects taught in the training programme ...
14. Subject you are teaching
Mathematics ....................................................................................
Science ............................................................................................
Literature .........................................................................................
Any other; specify .........................................................................
15. Medium in which you are teaching
English .............................................................................................
Hindi ................................................................................................
Any other, specify..........................................................................
16. In your opinion, which students were performing better in the
class:
General Caste .................................................................................
SC ....................................................................................................
ST ....................................................................................................
Girls .................................................................................................
Boys ................................ ………………………………………….
17. Your interaction with the
Categories Frequent Occasional Not at all
Parents .............................................................................................
Father ...............................................................................................
Mother .............................................................................................
18. The role of Panchayat in your school management
Good ................................................................................................
Average ............................................................................................
Poor .................................................................................................
19. Functioning of the Village Education Committee
Good ................................................................................................
Average ............................................................................................
Poor .................................................................................................
Areas of their involvement ...........................................................
20. In your opinion, who are the real beneficiaries of rural education?
Economically Poor .........................................................................
Girl Children ..................................................................................
Socially backward ..........................................................................
All ....................................................................................................
21. Write the main problems of your School
1 .......................................................................................................
2 .......................................................................................................
3 .......................................................................................................
22. What are your suggestions for improvement of the school
conditions?
1 .......................................................................................................
2 .......................................................................................................
3 .......................................................................................................
In this section, you studied about quantitative data collection and the
questionnaire method of data collection. Now, answer the questions given
in Check Your Progress-1.
Check Your Progress 1
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end
of the unit.
1. What is primary data?
.................................................................................................................
.................................................................................................................
.................................................................................................................
2. What are the advantages of a questionnaire?
.................................................................................................................
.................................................................................................................
.................................................................................................................
Marital Status
Occupation
Diseases
This unit describes in detail the various sources and methods of collection
of quantitative data. It also deals with the methods of collection of primary
data and secondary data. The advantages and disadvantages of various
methods of data collection have also been discussed in the unit. The unit
also narrates the precautions one has to take while collecting the primary
and secondary data.
UNIT 2 QUALITATIVE DATA
COLLECTION METHODS AND
DEVICES
Structure
2.1 Introduction
2.2 Qualitative Data - Meaning and Concept
2.3 Methods and Techniques of Qualitative Data Collection
2.4 Features of Qualitative and Quantitative Research
2.5 Let Us Sum Up
2.6 Key Words
2.7 References and Selected Readings
2.8 Check Your Progress – Possible Answers
2.1 INTRODUCTION
Data Collection is an important aspect of any type of research study. Data
collection techniques allow us to systematically collect information about
the subject of our study (people, objects, phenomena), and about the
environment. In the collection of data we have to be systematic. If data
are collected haphazardly, it will be difficult to answer our research
questions in a conclusive way. Inaccurate data collection can impact the
results of a study and ultimately lead to invalid results.
After studying this unit, you should be able to
• discuss the meaning and concept of qualitative data;
• describe the features of various methods and devices used for
qualitative data collection; and
• state the uses and limitations of various qualitative data collection
methods.
Qualitative data are descriptive in nature and can be statistically analyzed
only after processing and after having them classified into some appropriate
categories. Qualitative data can, however, facilitate in-depth analysis of a
social situation. There are certain situations where qualitative research alone
can provide the researcher with all insights needed to make decisions and
take actions; while in some other cases quantitative research might be
needed as well.
Qualitative research                    Quantitative research

It cannot be known how true the         It may be possible to estimate how
findings are of the population from     reliable the findings are. It
which the respondents are drawn         depends on which sampling method is
                                        used

Data collection is usually handled      Usually done by trained
by research professionals               interviewers or through
                                        self-completion questionnaires

A qualitative project cannot be         Can usually be replicated, because
repeated exactly, because every data    every interview in the project
collection event in a project is        follows the same procedure
different

The findings can rarely be expressed    Findings are expressed in numbers
in statistical form                     and can be analysed using
                                        statistical techniques

Analysis and conclusion rely heavily    Because statistical procedures are
on the researcher’s perceptions and     used, the analysis is less likely to
interpretation skills                   be disputed
Source: John Boyce, Marketing Research, McGraw Hill Pvt Ltd, Australia,
2005.
In this section you have read about Focus Group Discussion, Content
Analysis and other qualitative data collection tools. You also read about
features of qualitative and quantitative research. Now try and answer the
questions given in Check Your Progress-2.
Check Your Progress 2
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end
of the unit.
1. Differentiate between multiple choice and open ended questions.
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. What is the importance of pre-testing the interview schedule or
questionnaire?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
ANSWERS
Check Your Progress 1
1. In the interview method, the questions are presented to the respondents
in a face-to-face situation as oral-verbal stimuli, and the researcher,
or personnel trained for the purpose (interviewers, enumerators) note
down oral-verbal responses. In the questionnaire method, the questions
are delivered (generally mailed) to the respondents, who note down
their responses on it and send them back to the researcher.
2. (i) The assumption of uniformity in basic human nature, in spite of
the fact that human behaviour may vary according to situations.
(ii) The assumption of studying the natural history of the unit
concerned.
(iii) The assumption of comprehensive study of the unit concerned.
Check Your Progress 2
1. There are two principal forms of questions: multiple choice and open
ended. For a multiple choice question, the researcher prepares a set
of probable replies (generally after pre-testing) and presents it to the
respondents. The respondents select a reply or replies considered most
appropriate for each of them. In the case of open ended questions,
the respondents reply in their own words, which are then recorded.
2. Pre-testing means testing the interview schedule/questionnaire in
advance to find out whether it is capable of eliciting appropriate
responses from the respondents. Pre-testing helps to find out whether
the questions are properly understood by the respondents, whether
they are logically organized, whether the replies can be properly
recorded in the space provided, and whether there is any scope for
further improvement. On the basis of the responses obtained,
appropriate changes in the interview schedule/questionnaire
should be made.
UNIT 3 DATA SOURCES
Structure
3.1 Introduction
3.2 Sources of Data
3.3 Process of Sourcing Data
3.4 Qualities of Data Source
3.5 Data Sources for Agriculture
3.6 Data Sources for Infrastructure
3.7 Data Sources for Service Sector
3.8 Other Data Sources
3.9 Global Data Sources
3.10 Let Us Sum Up
3.11 References and Selected Readings
3.12 Check Your Progress – Possible Answers
3.1 INTRODUCTION
Data can be defined as the quantitative or qualitative values of an aspect
that one is studying. The aspect can take a quantitative or numerical
value; for example, the average household income of India was $1670 in
2016. Data can also be relative or subjective; for example, the statement
that people in Africa are, on average, dark skinned. Data is the plural
form of the word datum, which literally means 'something given'. Data are
regarded as the lowest unit of information, from which other measurements
and analyses can be made. Data can be images, words, figures, facts or
ideas. Data cannot be understood in themselves; to obtain information from
data, one must interpret the values. Beyond interpretation, data vary
considerably with the type of data collected and the source from which
they are collected. The source of data is important because it provides
the full information on the aspect one is studying. Data can also vary
with the methodology used to bring out the information. At times, data
give anomalous values that are difficult to explain and turn out to be
unreliable. Thus, the source of data is critical for any economic, social
or political study.
After reading this unit you will be able to
• discuss the data sources for agricultural data
• describe the data sources for infrastructure data
• elaborate on the data sources for service sector data
• discuss the global data sources
3.3 PROCESS OF SOURCING DATA
Information is present everywhere as data; what really matters is how
these data are organized. Several components must be known before data can
be of any use. The main components are the process of collecting the data,
who collects the data, and the time of collection. These three things
are important for data assimilation and further analysis. The first aspect
of data sourcing is the process involved in collecting the data. If a
person goes out on the street, asks an individual directly for particular
information and records it, this is called primary sourcing of
information. Here the primary source is the individual from whom the
information is obtained; primary data are collected directly from the
respondents. If the information collected passes on to another stage, in
which it is changed on the basis of parameters and categorization, it
becomes secondary sourcing of information. Information that is collected
periodically over a time interval and recorded is also secondary
information. If the information is analyzed and presented in reports with
a further categorization and sequence, it becomes tertiary sourcing of
information. The second aspect, as suggested earlier, is who collects
the information: whether the collector has the authority to collect the
information and can represent the source. This aspect is important, as
discussed in the earlier section, for the authenticity of the data and the
reliability of the information. The third aspect, the time of collection,
is crucial because it gives the data relevance and comparability. Sourcing
of data thus involves many aspects, such as how the information is
collated, when it is collected and at what level. Another important aspect
is the quality of the data source, which is strongly interlinked with the
process of sourcing.
3.3.1 Forms of Sourcing
As distinguished earlier, there are two main sources of data, official and
private, each of which can be classified as primary, secondary or
tertiary. Official sourcing is the formal source of data and is most often
regarded as the most authentic. Data sources can also be distinguished by
the way the sourcing is done, that is, by the methodology used to collect
the data.
(i) Statistical Survey or sample survey
This form of sourcing collects information using an inductive approach, in
which we move from specific observations to a general statement about the
whole. In simple words, the responses of a specific set of people or
respondents are used to generalize to the whole population. Data are
collected from a sample of the population, and the characteristics of the
population are estimated by the systematic use of a methodology.
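The idea of estimating a population characteristic from a sample can be sketched in Python. This is only an illustrative sketch: the population of household incomes is randomly generated (hypothetical data), and the function name estimate_population_mean is an assumption made for this example, not something defined in this unit.

```python
import random
import statistics

def estimate_population_mean(population, sample_size, seed=0):
    """Draw a simple random sample (without replacement) and return
    the sample together with its mean, used as the population estimate."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sample, statistics.mean(sample)

# Hypothetical population: monthly incomes of 10,000 households
income_rng = random.Random(42)
population = [income_rng.randint(2000, 20000) for _ in range(10_000)]

sample, estimate = estimate_population_mean(population, sample_size=400)
true_mean = statistics.mean(population)
print(f"Sample estimate: {estimate:.1f}; population mean: {true_mean:.1f}")
```

Only 400 of the 10,000 households are observed here, which is where the saving in cost and time comes from. The price is that the estimate will generally differ somewhat from the population mean.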
Advantages of sample survey source
The sample survey takes a small set of observations, so it is more
relevant for a particular time and can be used where complete enumeration
is not possible. It saves cost and time, as it involves only a few
observations and trained staff are not required. Because the sample is
drawn from selected observations, the process clearly indicates its
limitations at the outset, which keeps errors to a minimum; enumerator
bias is avoided and reliability can be assessed easily. A sample survey is
usually conducted to validate a complete enumeration such as a Census.
Disadvantages of sample survey source
The sample survey follows a fixed design that cannot be changed during
the process of data gathering. A sample survey may not work well for
controversial issues, such as religious issues, because of difficulties of
recall. Sometimes the questions in the survey are standardized and may not
be appropriate.
(ii) Census
A Census is the process of complete enumeration of a population or group
at a point in time with respect to well-defined characteristics
(population, production). Data are collected for a specific reference
point of time. A Census is usually taken at regular intervals of time so
that the data are comparable. Most Census enumerations are conducted
every five to ten years. Data are collected through questionnaires mailed
to respondents, via the internet, or by an enumerator who visits the
respondents or contacts them by telephone. The enumerators are trained
to handle such enumeration, which involves huge costs.
Advantages of Census source
Census enumeration is the most accurate and complete information
gathering mechanism; because every unit is covered, it is not subject to
sampling error. It covers a wide spectrum of people, spanning the complete
population. It involves meticulous planning, and the enumerators are
adequately trained to gather the desired information.
Disadvantages of Census source
A Census involves huge cost, as it is sent out to the whole population. It
is very time consuming, and many a time the data are out of date by the
time they have been collected. It sometimes excludes people who are
temporarily not available during the period. For populations whose units
cannot be individually identified, such as fish, a census cannot be
carried out. It is sometimes limited in geographical area.
(iii) Register Sourcing
A register is a collection of information that is updated continuously
for a specific purpose and from which statistics can be collected and
produced. It contains information on a complete group of units. It is
maintained by official organizations like the National Sample Survey
Organisation or the National Household Survey.
Advantages of Register sourcing
An advantage of register sourcing is total coverage at a low cost of
collection and processing. It allows more detailed statistics to be
produced than surveys do. Different registers can be combined and linked
together on the basis of defined keys (personal identification codes,
business identification codes, address codes, etc.). Moreover, individual
administrative registers are usually of high quality and very detailed.
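Linking registers on a defined key can be sketched as follows. The two registers, their fields and the identification codes are all hypothetical, made up only to show the mechanics; real registers would be far larger and would need proper safeguards for personal data.

```python
# Hypothetical registers, each keyed by a personal identification code
population_register = {
    "P001": {"village": "Bhilwara", "age": 34},
    "P002": {"village": "Bhilwara", "age": 51},
    "P003": {"village": "Rampur", "age": 27},
}
employment_register = {
    "P001": {"occupation": "Teacher"},
    "P003": {"occupation": "Farmer"},
}

def link_registers(left, right):
    """Combine the records of units that appear in both registers."""
    return {key: {**left[key], **right[key]} for key in left.keys() & right.keys()}

def unmatched_keys(left, right):
    """List units present in the left register but missing from the right."""
    return sorted(left.keys() - right.keys())

linked = link_registers(population_register, employment_register)
missing = unmatched_keys(population_register, employment_register)
print(linked["P001"])
print(missing)
```

The list of unmatched keys shows which units are covered by one register but not by the other.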
Disadvantages of Register sourcing
A disadvantage is the possible under-coverage, which can arise if the
incentive or the cultural tradition of registering events and changes is
weak, if the classification principles of the register are not clearly
defined, or if the classifications do not correspond to the needs of the
statistical production to be derived from them.
There are different types of registers:
Administrative registers or records, like those of the National Sample
Survey Organization (NSSO) in India, help in collecting data. Using
existing administrative data for statistical production may be approved by
the public because it can be seen as a cost-efficient method; individuals
and enterprises are less burdened by requests for responses; and data
security is better, as fewer people handle the data and the data are in
electronic format.
Private registers such as registers operated by insurance companies and
employer organizations can also be used in the production process of official
statistics, providing there is an agreement or legislation on this.
Statistical registers are frequently based on combined data from different
administrative registers or other data sources.
For businesses, it is often a legal requirement to be registered in the
business register of their country, a system that makes the collection of
business information easier.
Agricultural registers and registers of dwellings also exist.
Even though different types of data collection exist, the best estimates are
based on a combination of different sources, which builds on the strengths
and reduces the weaknesses of each individual source.
http://dahd.nic.in/documents/statistics/livestock-census
http://www.cmfri.org.in/marine-fisheries-census/8/2017
https://data.gov.in/catalog/land-use-statistics-lus
http://dahd.nic.in/Division/statistics/animal-husbandry-statistics-division
www.hudcoindia.com
www.urbanindia.nic.in
http://financialservices.gov.in/data-statistics/banking-statistics
http://www.indiantradeportal.in/index.jsp
http://apeda.gov.in/apedawebsite/#
http://www.civilaviation.gov.in/
http://www.dot.gov.in/#
http://tourism.gov.in/
http://shipping.nic.in/
http://www.mospi.gov.in/statistical-year-book-india/2016/188
https://[...]view_section.jsp?lang=0&id=0,1,304,366,554
https://www.indiabudget.gov.in/rec.asp
http://www.mospi.gov.in/national-data-bank
http://rchiips.org/nfhs/
http://censusindia.gov.in/2011-common/AHSurvey.html
http://www.fao.org/faostat/en/#data
https://comtrade.un.org/
https://unstats.un.org/unsd/statcom/
http://unctad.org/en/Pages/statistics.aspx
https://datacatalog.worldbank.org/
https://www.oecd-ilibrary.org/statistics
MDV-106: RESEARCH METHODOLOGY
IN DEVELOPMENT STUDIES
(6 CREDITS)
BLOCK   UNIT NOS.   UNIT TITLES
1                   Fundamentals of Social Science Research
        1           Social Science Research: An Overview
        2           Components of Social Science Research
        3           Research Designs
        4           Research Project Formulation
2                   Development Research
        1           Basics of Development Research
        2           Methods of Development Research
        3           Development Research Applications
3                   Measurement and Sampling
        1           Measurement
        2           Scales and Tests
        3           Reliability and Validity
        4           Sampling
4                   Data Collection
        1           Quantitative Data Collection Methods and Devices
        2           Qualitative Data Collection Methods and Devices
        3           Data Sources
5                   Data Analysis
        1           Overview of Statistical Tools
        2           Use of Computer in Data Analysis
        3           Data Processing and Analysis
        4           Report Writing
Block
5
DATA ANALYSIS
UNIT 1
Overview of Statistical Tools
UNIT 2
Use of Computer in Data Analysis
UNIT 3
Data Processing and Analysis
UNIT 4
Report Writing
COURSE PREPARATION TEAM
Unit Writers:
Prof. V. K. Tiwari, National Institute of Health and Family Welfare, New Delhi (Units 1, 3 and 4)
Dr. Nisha Varghese, SOEDS, IGNOU (Unit 2)
Editors:
Prof. V. K. Jain (Rtd) (Content Editor), NCERT, New Delhi
Mr. Praveer Shukla (Language Editor), New Delhi
Dr. Nisha Varghese, IGNOU
Prof. B. K. Pattanaik, IGNOU
Prof. Nehal A. Farooquee, IGNOU
Prof. P. V. K. Sasidhar, IGNOU
Course Coordinator: Dr. Nisha Varghese, E-mail: [email protected]
Programme Coordinators: Prof. P.V.K. Sasidhar, Prof. B.K. Pattanaik,
Prof. Nehal A. Farooquee
PRINT PRODUCTION
Mr. S. Burman, Deputy Registrar (Publication), MPDD, IGNOU, New Delhi
Mr. KN Mohanan, Asst. Registrar (Publication), MPDD, IGNOU, New Delhi
Mr. Babu Lal Rewadia, Section Officer (Publication), MPDD, IGNOU, New Delhi
May, 2018
Indira Gandhi National Open University, 2018
ISBN: 987-
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other
means, without permission in writing from the Copyright holder.
Further information on the IGNOU courses may be obtained from the University’s office at Maidan
Garhi, New Delhi or the official website of IGNOU at www.ignou.ac.in
Printed and published on behalf of IGNOU, New Delhi by Registrar, MPDD, IGNOU, New Delhi.
Laser Typeset by Rajshree Computers, V-166A, Bhagwati Vihar, (Near Sec. 2, Dwarka), New Delhi
Printed at:
BLOCK 5 DATA ANALYSIS
Block 5 on ‘Data Analysis’ with four units gives an overview of various
tools and techniques of data analysis needed for conducting development
research.
Unit 1 on ‘Overview of Statistical Tools’ provides information about
various measures of central tendency and dispersion. It also discusses
correlation, regression, as well as hypothesis testing.
Unit 2 on ‘Use of Computer in Data Analysis’ discusses the use of MS
Excel and SPSS in data analysis. The use of these software packages for
working out the mean, standard deviation, correlation and regression, and
for testing hypotheses, is discussed.
Unit 3 on ‘Data Processing and Analysis’ discusses data processing,
particularly tabulation and graphical presentation. It also briefly covers
data coding, editing and feeding.
Unit 4 on ‘Report Writing’ discusses the various types of research reports
and details the various components of a research report.
UNIT 1 OVERVIEW OF STATISTICAL
TOOLS
Structure
1.1 Introduction
1.2 The Data: Meaning and Types
1.3 Frequency Distributions
1.4 Measures of Central Tendency
1.5 Measures of Dispersion
1.6 Hypothesis Testing and Inferential Statistics
1.7 Statistical Tests
1.8 Chi-Square Test
1.9 F-Test
1.10 Z-Test
1.11 t-Test
1.12 Correlation Analysis
1.13 Regression Analysis
1.14 Let Us Sum Up
1.15 Keywords
1.16 References and Selected Readings
1.17 Check Your Progress – Possible Answers
1.1 INTRODUCTION
Statistical tools are the pillars on which data analysis for all types of
developmental programmes stands. Researchers also need some understanding
of statistical analysis to be able to produce a meaningful research
report. With the availability of several user-friendly software packages,
performing statistical analysis has now become feasible even for
non-statisticians, provided they are computer literate and understand
the basic principles of statistical analysis.
There are two types of statistics:
(i) Descriptive statistics which include techniques for organizing,
summarizing, and presenting data using tables, graphs, or single numbers
(ii) Inferential statistics which consist of statistical methods for making
inferences about a population based on information obtained from a
sample.
This unit aims to make you conversant with the basic statistical tools
applicable in developmental research.
After studying this unit, you should be able to
• describe various measures of central tendency and dispersion;
• discuss the applicability of various tests involved in hypothesis testing;
• explain the use of correlation in data analysis; and
• describe the use of regression in data analysis.
Note
One should be cautious when calculating and interpreting percentages
where the total number is small because one unit more or less would
make a big difference in terms of percentages. As a general rule,
percentages should not be used when the total is less than 30. Therefore,
it is recommended that the number of observations, or total cases
studied, should always be given together with the percentage.
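The caution about small totals can be illustrated with a short Python sketch. The function names are illustrative assumptions for this example and the counts are made up. The sketch shows how much a percentage shifts when a single case moves into a category, and how a percentage can be reported together with the number of cases.

```python
def percentage(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total

def shift_from_one_unit(count, total):
    """Change, in percentage points, if one more case falls in the category."""
    return percentage(count + 1, total) - percentage(count, total)

def report(count, total):
    """Format a percentage together with the total number of cases."""
    return f"{percentage(count, total):.1f}% (n = {total})"

# One extra case matters far more when the total is small
for total in (10, 30, 200):
    print(total, round(shift_from_one_unit(total // 2, total), 2))

print(report(3, 10))
```

With a total of 10, one case moves the figure by 10 percentage points, while with a total of 200 it moves it by only half a point. This is why the total should accompany every percentage.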
After having gone through the concept of data, answer the following questions
given in Check Your Progress-1.
Check Your Progress - 1
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end
of the unit.
1. What are the different types of data?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. What are the two types of variables?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
Note
Note 1: In the case of the grouped frequency data given in Table 1.1,
the midpoint of each interval becomes X_i, and the same procedure
for computing the mean can be followed as shown above.
Note 2: However, in the case of open interval data, such as ‘less than
19’ (denoted as <19) or ‘more than 159’ (denoted as > 159), it is not
possible to fix a midpoint, and, therefore, the mean cannot be calculated.
As a consequence, we use the median in place of the mean, which
is explained in the next section.
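As a rough illustration of Notes 1 and 2, the mean of grouped data can be sketched in Python. The class limits and frequencies are those of the malaria clinic distribution (Table 1.1, reproduced as Table 1.2). The function name grouped_mean and the use of None to mark an open-ended limit are assumptions made for this sketch, and the midpoint is taken simply as the average of the two stated class limits.

```python
# (lower limit, upper limit, number of clinics)
classes = [
    (0, 19, 5), (20, 39, 8), (40, 59, 10), (60, 79, 11),
    (80, 99, 19), (100, 119, 10), (120, 139, 9), (140, 159, 8),
]

def grouped_mean(classes):
    """Mean of grouped data, using each class midpoint as X_i."""
    total_frequency = 0
    weighted_sum = 0.0
    for lower, upper, frequency in classes:
        if lower is None or upper is None:
            # Open interval such as '<19' or '>159': no midpoint can be fixed
            raise ValueError("mean cannot be calculated for open intervals")
        midpoint = (lower + upper) / 2
        weighted_sum += midpoint * frequency
        total_frequency += frequency
    return weighted_sum / total_frequency

print(grouped_mean(classes))  # 83.75
```

Under these assumptions the mean works out to 6700/80 = 83.75 patients per clinic. If any class is open-ended, the function raises an error instead, which matches the advice to use the median in that case.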
1.4.2 Median
The median is the value that divides a distribution into two equal halves.
The median is useful when the data are on an ordinal scale, or when some
measurements are much bigger or much smaller than the other measurement
values. The mean of such data will be biased toward these extreme values.
Thus, the mean is not a good measure of central tendency in this case. The
median is not influenced by extreme values. The median value, also called
the central or halfway value, is obtained in the following way:
• List the observations in order of magnitude (from the lowest to the
highest value, or vice versa).
• Count the number of observations = n.
• The median value is the middle value. If n is odd, it is the
((n+1)/2)th value; if n is even, it is the mean of the two middle
values, i.e., the (n/2)th value and the next value.
Example 3:
Case 1: The weights of 7 women are:
S.No. Weight of women (kg)
1 40
2 41
3 42
4 43
5 44
6 47
7 72
The median value is the value belonging to observation number (7 + 1)/2,
which is the fourth value: 43 kg.
Case 2: If there are 8 observations:
S.No. Weight of women (kg)
1 40
2 41
3 42
4 43
5 44
6 47
7 49
8 72
The median would be the average of the (n/2)th value, i.e., the 4th value
(43 kg), and the next value (44 kg); the median in this case would be
(43 + 44)/2 = 43.5 kg.
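The procedure used in Example 3 can be sketched in Python as a check on the two cases. The function name median is chosen for this illustration. Python's standard library also provides statistics.median, which could be used instead.

```python
def median(values):
    """Median of ungrouped data: sort, then take the middle value
    (n odd) or the mean of the two middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]                       # the ((n+1)/2)th value
    return (ordered[middle - 1] + ordered[middle]) / 2  # (n/2)th and next value

case_1 = [40, 41, 42, 43, 44, 47, 72]
case_2 = [40, 41, 42, 43, 44, 47, 49, 72]

print(median(case_1))             # 43
print(median(case_2))             # 43.5
print(sum(case_1) / len(case_1))  # 47.0
```

In Case 1 the mean is 47 kg, pulled upward by the single weight of 72 kg, while the median stays at 43 kg.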
Calculation of Median for Grouped Data
We can use the grouped data of Table 1.1 of section 1.3.1, ‘Distribution
of clinics according to number of patients treated for malaria in one
month’, for calculation of the median. The data are given below.
Table 1.2: Distribution of clinics according to number of patients
treated for malaria in one month
Number of patients    Number of clinics    Cumulative frequency
0 - 19                5                    5
20 - 39               8                    13
40 - 59               10                   23
60 - 79               11                   34
80 - 99               19                   53
100 - 119             10                   63
120 - 139             9                    72
140 - 159             8                    80
Total                 80
Step1: The total of frequency is first divided by 2, i.e., 80/2 (=40). The
cumulative frequency 40 will correspond to the class interval (80-99). This
is called the median interval.
Step2: The formula is: $\text{Median} = L + \dfrac{(N/2 - F) \times d}{f}$
Step3: Record all values of symbol variables from the table as given
below:
L (=80) is the lower limit of the median interval,
N (=80) is the total frequency,
F (=34) is the cumulative frequency of the class preceding the
median class,
d (=20) is the width of the class interval,
f (=19) is the frequency of the median class.
Step4: Replace the symbols with the numeric values noted in Step3
in the formula.
Therefore, Median = 80 + [(40-34) x 20] / 19 = 80 + 6.32 = 86.32 patients.
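The four steps above can also be sketched in Python. This is only an illustration under stated assumptions: the function name grouped_median and the representation of each class as a (lower limit, frequency) pair are hypothetical choices made here, and all classes are assumed to have the same width.

```python
# Classes of Table 1.2 as (lower limit, number of clinics); class width d = 20
classes = [(0, 5), (20, 8), (40, 10), (60, 11),
           (80, 19), (100, 10), (120, 9), (140, 8)]
class_width = 20


def grouped_median(classes, width):
    """Median = L + ((N/2 - F) * d) / f for equal-width grouped data."""
    total = sum(freq for _, freq in classes)   # N, the total frequency
    half = total / 2                           # N/2
    cumulative = 0                             # F, cumulative frequency before the current class
    for lower, freq in classes:
        # The median class is the first class whose cumulative frequency reaches N/2
        if cumulative + freq >= half:
            return lower + (half - cumulative) * width / freq
        cumulative += freq
    raise ValueError("frequency table is empty")


median_patients = grouped_median(classes, class_width)
print(round(median_patients, 2), "patients")
```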
1.4.3 Mode
The mode is the most frequently occurring value in a set of observations.
The mode is not very useful for numerical data that are continuous. It is
most useful for numerical data that have been grouped. The mode is usually
used to find the norm among populations and is calculated when the
calculation of mean and median is inappropriate, e.g., the typical shoe size
of the Indian population, standard birth weight, etc. The mode can also be
used for categorical data, whether they are nominal or ordinal.
In Example 1 (height of 7 girls) the mode is 141.
In Example 2 the mode is 144, as 4 persons have this value.
Calculation of Mode for Grouped Data
We will again use the grouped data given in table 1.1 to calculate the mode.
The steps for computing the mode are
Step1: Formula for calculating mode: $\text{Mode} = L + \dfrac{f_m - f_1}{2 f_m - f_1 - f_2} \times h$
Step2: Find the class interval with the largest frequency, which is also
called the modal class. In this case, the class interval ‘80-99’ has
the maximum frequency, equal to 19.
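Step3: Record all values of the symbol variables used in the formula.
L (=80) is the lower limit of the modal class,
fm (=19) is the frequency of the modal class,
f1 (=11) is the frequency of the class preceding the modal class,
f2 (=10) is the frequency of the class following the modal class,
h (=20) is the width of the class interval.
Mode = 80 + [(19 - 11) / (2 x 19 - 11 - 10)] x 20
= 80 + 9.41 = 89.41 patients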
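As a rough check of the grouped-mode calculation, the same formula can be written in Python. The sketch below assumes equal class widths and a single modal class; the function name grouped_mode and the (lower limit, frequency) layout are illustrative choices, not part of the unit.

```python
# Classes of Table 1.1/1.2 as (lower limit, number of clinics); class width h = 20
classes = [(0, 5), (20, 8), (40, 10), (60, 11),
           (80, 19), (100, 10), (120, 9), (140, 8)]
class_width = 20


def grouped_mode(classes, width):
    """Mode = L + (fm - f1) / (2*fm - f1 - f2) * h for equal-width grouped data."""
    freqs = [freq for _, freq in classes]
    m = freqs.index(max(freqs))                      # position of the modal class
    f_m = freqs[m]                                   # frequency of the modal class
    f_1 = freqs[m - 1] if m > 0 else 0               # frequency of the preceding class
    f_2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # frequency of the following class
    lower = classes[m][0]                            # L, lower limit of the modal class
    return lower + (f_m - f_1) / (2 * f_m - f_1 - f_2) * width


mode_patients = grouped_mode(classes, class_width)
print(round(mode_patients, 2), "patients")
```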
1.4.4 Relationship between Mean, Median, and Mode
In normally distributed data, the mean, median, and mode are the same. However, in a moderately
skewed (non-normal) distribution, the approximate relationship is Mode = 3 Median - 2 Mean.
In summary, the mean, the median, and the mode are all measures of central
tendency or measures of location. The mean is most widely used. It contains
more information because the value of each observation is taken into account
in its calculation. However, the mean is strongly affected by values far from
the centre of the distribution, while the median and the mode are not. The
calculation of the mean forms the beginning of more complex statistical
procedures, like, correlation and regression, etc., to describe and analyze
data. In general, as the skewness increases, the mean and median move away
from the mode. If the mean is less than the median, the data is skewed
to the left; and if it is greater than median, the data is skewed to right.
The choice of central tendency, therefore, depends upon the type and
distribution of data. Nowadays, with the easy availability of scientific
calculators and computers (Excel and other statistical software), the
calculation of mean, median, mode, etc., has become very simple.
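Deviation values $(X_i - \bar{X})$ in a data set are calculated by subtracting the mean, $\bar{X}$,
of the data from each observation $X_i$. Variance is the sum of
squares of all the deviation scores of the data, divided by (n - 1). Mathematically this is written
as follows.
Variance = $s^2 = \dfrac{\sum (X_i - \bar{X})^2}{n - 1}$
An easy to calculate formula is: $s^2 = \dfrac{1}{n - 1}\left[\sum X_i^2 - \dfrac{\left(\sum X_i\right)^2}{n}\right]$
Standard deviation: $sd = \sqrt{\text{Variance}}$
where $\bar{X}$ = Mean
n = Number of observations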
Large values of variance and standard deviation represent higher variability
in the data and vice versa. If the value of variance is equal to ‘zero’, it
represents no variability in the data. To obtain the standard deviation of
a set of measurements one has to carry out the following steps:
i) Calculate the mean of all the measurements.
ii) Calculate the difference between each individual measurement and the
mean.
iii) Square all these differences.
iv) Take the sum of all squared differences.
v) Divide this sum by the number of measurements minus one.
vi) Finally take the square root of the value obtained (in order to get back
to the same unit of measurement).
Example 6: Suppose you need to calculate the standard deviation of 2, 4,
6, 8, 10 and 12.
S. No.   X    $(X - \bar{X})$   $(X - \bar{X})^2$
1        2    -5                25
2        4    -3                9
3        6    -1                1
4        8    1                 1
5        10   3                 9
6        12   5                 25
Sum      42                     70
Variance = $s^2 = \dfrac{70}{6 - 1} = \dfrac{70}{5} = 14$
Standard deviation (sd): $\sqrt{14} = 3.74$.
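Steps (i) to (vi) can be followed line by line in a short Python sketch. It assumes only the standard-library statistics module, whose variance and stdev functions also use the (n - 1) divisor; the variable names are illustrative.

```python
from statistics import mean, stdev, variance

data = [2, 4, 6, 8, 10, 12]                           # observations of Example 6

data_mean = mean(data)                                # step i: the mean
deviations = [x - data_mean for x in data]            # step ii: differences from the mean
squared = [d ** 2 for d in deviations]                # step iii: squared differences
sum_squared = sum(squared)                            # step iv: sum of squared differences
variance_manual = sum_squared / (len(data) - 1)       # step v: divide by n - 1
sd_manual = variance_manual ** 0.5                    # step vi: square root

print("Variance:", variance_manual)
print("Standard deviation:", round(sd_manual, 2))

# The library functions should agree with the manual steps
print("Library check:", variance(data), round(stdev(data), 2))
```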
Fortunately many pocket calculators can do this calculation for us, but it
is still important to understand what it means. In the case of grouped data,
the mid value of the interval may be taken as observation value and the
above procedure can be followed.
In the above sections you studied about the measures of central tendency
and the measures of dispersion. Now try and answer the questions in Check
Your Progress-2.
Check Your Progress - 2
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end
of the unit.
1. What are the different measures of central tendency?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
2. What are the different measures of dispersion?
................................................................................................................
................................................................................................................
................................................................................................................
................................................................................................................
1.6 HYPOTHESIS TESTING AND
INFERENTIAL STATISTICS
1.6.1 Understanding True Difference
The analysis and interpretation of the results of our study must be related
to the objectives of study. It is important to tabulate the data in univariate
and/ or bi-variate or multivariate tables appropriate to the research
objectives. We may find some interesting results. For example, in a study
on nutrition, we find that 30% of the women included in the sample are
anaemic as compared to only 20% of the men. How should we interpret
this result?
• The observed difference of 10% might be a true difference, which also
exists in the total population from which the sample was drawn.
• The difference might also be due to chance; in reality there is no
difference between men and women, but the sample of men just
happened to differ from the sample of women. One can also say that
the observed difference is due to sampling variation.
• A third possibility is that the observed difference of 10% is due to
defects in the study design (also referred to as bias). For example,
we only used male interviewers, or omitted a pre-test, so we did not
discover that anaemia is a very important topic for women, which requires
a female investigator.
If we feel confident that an observed difference between two groups cannot
be explained by bias, we would like to find out whether this difference
can be considered as a true difference. We can conclude this only if
we can rule out chance (sampling variation) as an explanation. We
accomplish this by applying a test of significance. A test of significance
estimates the likelihood that the observed result (e.g., a difference between
two groups) is due to chance rather than being real. In other words, a significance test
is used to find out whether a study result, which is observed in a sample,
can be considered as a result which indeed exists in the study population
from which the sample was drawn.
1.6.2 Tests of Significance
Different sets of data require different tests of significance. Throughout this
section, two major sets of data will be distinguished.
• Two (or more) groups, which will be compared to detect differences
(e.g., men and women, compared to detect differences in anaemia).
• Two (or more) variables, which will be measured in order to detect
if there is an association between them (e.g., between anaemia and
income).
(i) How Tests of Significance Work
The reasoning behind significance tests is the same, no matter whether a
researcher is comparing two groups for differences or whether s/he is
measuring two variables to detect possible associations.
We will first concentrate on the comparison of groups.
• Suppose you observed a difference between two groups in your sample.
• You want to know whether this observed difference between the two
groups represents a real difference in the study population from which
the sample was drawn, or whether it just occurred by chance (due
to sampling variation).
• To find out, you determine how likely it is that this difference could have
occurred by chance, if in the population no difference exists between
the two groups.
In any study that is looking for differences between groups or associations
between variables, the likelihood or probability (p) of observing a certain
result by chance has to be calculated by statistical tests.
(ii) How to state Null (Ho) and Alternative (H1) Hypothesis:
In statistical terms the assumption that no real difference exists between
groups in the total study (target) population (or, that no real association exists
between variables) is called the Null Hypothesis (Ho). The Alternative
Hypothesis (H1) is that there exists a difference between groups or that a
real association exists between variables.
The Null Hypothesis is also known as the hypothesis of no difference. Examples
of null hypotheses are:
• There is no difference in the incidence of measles between vaccinated
and non-vaccinated children.
• There is no difference between the alcohol consumption of males and
females.
• There is no association between families’ income and malnutrition in
their children.
However, if by testing the hypothesis we find that the result is statistically
significant, we reject the Null Hypothesis (Ho) and accept the Alternative
Hypothesis (H1) that there is a real difference between two groups, or a real
association between two variables. Examples of alternative hypotheses (H1)
are:
• There is a difference in the incidence of measles between vaccinated
and non-vaccinated children.
• Males drink more alcohol than females.
• There is an association between families’ income and malnutrition in
their children.
Be aware that ‘statistically significant’ does not mean that a difference or
an association is of practical importance. The tiniest and most irrelevant
difference will turn out to be statistically significant if a large enough sample
is taken. On the other hand, a large and important difference may fail to
reach statistical significance if too small a sample is used.
(iii) The Concept of Type I and Type II Error
The testing of hypothesis will lead the investigator to take one of the
following decisions:
Chi-square = $\chi^2_{df} = \sum \dfrac{(O_i - E_i)^2}{E_i}$
The summation sign ($\sum$, also called sigma) directs you to add together
each value of $(O_i - E_i)^2 / E_i$ for all the ‘k’ cells of the table.
Where, $O_i$ and $E_i$ are the observed and expected frequency of each
cell and ‘df’ is the degree of freedom.
The alternative simplified formula for calculating chi-square is:
$\chi^2 = \sum \dfrac{O_i^2}{E_i} - N$, where N is the total of observed frequency.
For a table with four cells, the alternative formula is: $\chi^2_1 = \dfrac{O_1^2}{E_1} + \dfrac{O_2^2}{E_2} + \dfrac{O_3^2}{E_3} + \dfrac{O_4^2}{E_4} - N$
Using a Chi-square table
The calculated chi-square value has to be compared with a theoretical chi-
square value in order to determine whether the null hypothesis is rejected
or not. Annex I contains a table of theoretical chi-square values.
(1) First you must decide what significance, or alpha, level you want to
use (α value). We usually take 0.05.
(2) Then, the degrees of freedom have to be calculated. With the chi-square
test the number of degrees of freedom is related to the number of cells,
i.e., the number of groups you are comparing. The number of degrees
of freedom is found as:
Calculating the Chi-square ($\chi^2$) value
First the expected frequencies for each cell are computed for calculating
the Chi-square value as follows:
Distance from        Used health                  Did not use                  Total
health care centre   care centre                  health care centre
Less than 10km       E1 = (86 x 80)/155 = 44.4    E2 = (69 x 80)/155 = 35.6    80
10km or more         E3 = (86 x 75)/155 = 41.6    E4 = (69 x 75)/155 = 33.4    75
Total                86                           69                           155
Note that the expected frequencies refer to the values we would have
expected, given the total numbers of 80 and 75 women in the two groups,
if the null hypothesis, there is no difference between the two groups, were
true. Now the chi-square value can be calculated:
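Chi-square = $\chi^2 = \dfrac{(O_1 - 44.4)^2}{44.4} + \dfrac{(O_2 - 35.6)^2}{35.6} + \dfrac{(O_3 - 41.6)^2}{41.6} + \dfrac{(O_4 - 33.4)^2}{33.4}$
= 0.98 + 1.22 + 1.05 + 1.30 = 4.55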
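The expected frequencies used in this calculation depend only on the row and column totals. As a minimal sketch in Python (the dictionary layout and names are illustrative assumptions, and only the totals printed in the table are used), each expected value is the row total multiplied by the column total, divided by the grand total:

```python
# Marginal totals taken from the table of expected frequencies
row_totals = {"Less than 10km": 80, "10km or more": 75}
column_totals = {"Used health care centre": 86,
                 "Did not use health care centre": 69}
grand_total = 155

# Expected frequency of a cell = (row total x column total) / grand total
expected = {
    (row, column): row_total * column_total / grand_total
    for row, row_total in row_totals.items()
    for column, column_total in column_totals.items()
}

for (row, column), value in expected.items():
    print(f"{row:15s} | {column:31s} | {value:.1f}")
```

The expected frequencies in each row add up to the row total, which is a quick way to check the arithmetic.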
With the alternative formula, the value of chi-square is the same and is given
below.
8 9
= 37.848
1.10 Z-TEST
To test the hypothesis about a population mean or two population means
when the sample size is large (>30) and population variances are known,
we use the Z-test.
1.10.1 Testing of Significance of Difference between Two
Proportions (Two Large Samples)
Let there be two large samples drawn from one population or from two
populations having the same variance. The populations are distributed normally.
p1 and p2 are the proportions of an event in the
two samples (n > 30), which are compared for their difference. We may apply
the formula: $z = \dfrac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $Q = 1 - P$.
P is the mean (pooled) proportion of success of the two proportions, p1 and
p2, and n1 and n2 are the respective sample sizes.
The above calculated value of ‘z’ is compared with the tabulated value of
‘z-normal variate’ for ‘α = 0.05’, i.e., at 5% level of significance, which
is 1.96; the value of z for ‘α = 0.01’, i.e., at 1% level of significance, is
2.58.
Example 8: To test the conjecture of the management that 60% of employees
prefer the new bonus scheme, a sample of 150 employees was drawn, and
their opinion was taken, whether they favoured it or not. Only 55 employees
out of 150 favoured the new bonus scheme.
Thus, we test the hypothesis, H0: P = 0.60
H1: P ≠ 0.60
Calculating the test statistic:
Since p = 55/150 = 0.367,
$Z = \dfrac{0.367 - 0.60}{\sqrt{\dfrac{0.60 \times 0.40}{150}}} = \dfrac{-0.233}{0.04}$
Z ≈ - 5.83
| Z | = 5.83
The Table value
At α = 0.01, Ztab = 2.58.
Interpreting the results
As |Zcal| > Ztab, H0 is rejected. It means that the proportion of employees
who favour the new bonus scheme is not 60 percent.
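The arithmetic of Example 8 can be verified with a few lines of Python. This is a sketch that assumes the one-sample form of the statistic used in the example, Z = (p - P0) / sqrt(P0 Q0 / n); the variable names are illustrative. Evaluated this way, the statistic is about -5.83, and its absolute value exceeds 2.58.

```python
from math import sqrt

n = 150                    # sample size
favoured = 55              # employees who favoured the scheme
p0 = 0.60                  # proportion stated under H0
q0 = 1 - p0

p_hat = favoured / n                       # sample proportion, about 0.367
standard_error = sqrt(p0 * q0 / n)         # standard error under H0
z = (p_hat - p0) / standard_error          # test statistic

z_table = 2.58                             # tabulated value at the 1% level
print("Z =", round(z, 2))
print("Reject H0:", abs(z) > z_table)
```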
1.10.2 Testing of Significance of Difference between Means
of Two Large Samples (Continuous Data)
In case of large samples where population variances are known, we test
the equality of two population means using z – test.
Hypothesis:
H0: µ = µ0
H1: µ ≠ µ0
Calculating the test statistic:
$Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
where $\bar{x}$ is the sample mean, σ is the standard deviation, and n is the sample size.
The Table value
The value of z (α) for comparing the calculated test statistic is taken as
1.96 at 5% level of significance (α = 0.05); and it is 2.58 at 1% level of significance
(α = 0.01). Here, the sample size is treated as large and, therefore, the degrees
of freedom play no role, unlike in the t-test.
Example 9: The table below gives the total income in thousand rupees per
year of 36 persons selected randomly from a particular class of people.
Income (thousand Rs)
6.5 10.5 12.7 13.8 13.2 11.4
5.5 8.0 9.6 9.1 9.0 8.5
4.8 7.3 8.4 8.7 7.3 7.4
5.6 6.8 6.9 6.8 6.1 6.5
4.0 6.4 6.4 8.0 6.6 6.2
4.7 7.4 8.0 8.3 7.6 6.7
On the basis of the sample data, can it be concluded that the mean income
of a person in this class of people is Rs. 10,000 per year?
We have to test the hypothesis H0 : µ = 10,000
H1 : µ ≠ 10,000
Calculating the test statistic:
Since the sample size is 36, we will use a normal test for which the test
statistic is
$Z = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Now we compute $\bar{x}$ and σ (both in thousand Rs, so that µ0 = 10).
$\bar{x}$ = 280.7/36 = 7.80
σ² = (1/35){2368.75 - (280.7)²/36}
= 5.14
σ = 2.27
$Z = \dfrac{7.80 - 10}{2.27 / \sqrt{36}}$
= - 5.81
| Z | = 5.81
The Table value: Ztab = 1.96
Interpretation: Since |Zcal| > Ztab, reject H0. It means that the average annual
income differs from ten thousand rupees; the sample mean indicates that it is lower.
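Example 9 can be reproduced from the raw data. The sketch below assumes Python's standard-library statistics module, whose stdev function uses the (n - 1) divisor as in the hand calculation; the variable names are illustrative. Because no intermediate rounding takes place, the statistic comes out near -5.83 rather than exactly -5.81, which does not change the conclusion.

```python
from math import sqrt
from statistics import mean, stdev

# Annual incomes (thousand Rs) of the 36 persons in Example 9
incomes = [
    6.5, 10.5, 12.7, 13.8, 13.2, 11.4,
    5.5, 8.0, 9.6, 9.1, 9.0, 8.5,
    4.8, 7.3, 8.4, 8.7, 7.3, 7.4,
    5.6, 6.8, 6.9, 6.8, 6.1, 6.5,
    4.0, 6.4, 6.4, 8.0, 6.6, 6.2,
    4.7, 7.4, 8.0, 8.3, 7.6, 6.7,
]
mu_0 = 10.0                               # hypothesised mean (thousand Rs)

n = len(incomes)
x_bar = mean(incomes)                     # sample mean
sigma = stdev(incomes)                    # sample standard deviation (n - 1 divisor)
z = (x_bar - mu_0) / (sigma / sqrt(n))    # test statistic

print("mean =", round(x_bar, 2), " sd =", round(sigma, 2), " Z =", round(z, 2))
print("Reject H0 at the 5% level:", abs(z) > 1.96)
```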
Assumption for use of z statistics: The assumption for using the z statistics
is that the parent population, from where samples have been drawn, should
be normal. The z statistics presumes that the population variances ($\sigma_1^2$ and $\sigma_2^2$)
of the parent populations are known. For testing the difference between two sample means, we
use the z statistics $z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{sd_1^2}{n_1} + \dfrac{sd_2^2}{n_2}}}$. In z statistics, we refer to the tables of normal
area curve, where degrees of freedom do not play any role. The computation
of z statistics is almost the same except for the computing of the standard error (SE),
which is given as $\sqrt{\dfrac{sd_1^2}{n_1} + \dfrac{sd_2^2}{n_2}}$.
1.11 t-Test
1.11.1 Testing the Significance of Independent Samples
from Two Groups for Continuous Data
Example 10: It has been observed that in a certain province the proportion
of women, joining the army, is very high. A study is, therefore, conducted
to discover why this is the case. The height of women is supposed to be
the contributory factor; the researcher may want to find out if there is a
difference between the mean height of women in this province who preferred
joining army and of those who opted for other services. The null hypothesis
would be that there is no difference between the mean heights of the two
groups of women. Suppose the following results were found:
Table 1.4: Mean heights of women as per type of service
Type of Service   Sample size   Mean height in cm   Standard deviation
The mean height for each of the two samples was calculated and compared,
using the t-test, to determine whether there was a difference.
These are the steps to follow, in determining whether the difference is
statistically significant:
Hypothesis:
Ho: There is no significant difference between the heights of the women
joining army and other services.
H1: There is difference between the heights of the women joining army
and other services.
Test Statistic: A t-test would be the appropriate way to determine whether
the observed difference of 2 cm can be considered statistically significant.
$t_{df} = \dfrac{\bar{X}_1 - \bar{X}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, with degrees of freedom $df = n_1 + n_2 - 2$
Pooled Variance: $S^2 = \dfrac{(n_1 - 1)\,sd_1^2 + (n_2 - 1)\,sd_2^2}{n_1 + n_2 - 2}$
= 2
(2) Calculate the standard deviation (square root of variance S²) of all
observations pooled together for both the samples.
In case the standard deviations for each of the study groups are given
or have been calculated, then compute the pooled variance of samples
(S²) using the pooled variance formula.
(3) Calculate the standard error (SE) of the difference:
$SE = S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} = 2.96 \times 0.1895 = 0.5608$
(4) Finally, divide the difference between the means by the standard error
of the difference. The value now obtained is called the t-value:
$t = \dfrac{2}{0.5608} = 3.57$
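The last two steps can be retraced in Python from the summary figures quoted in the text: sample sizes of 60 and 52, a pooled standard deviation of 2.96 and a difference of 2 cm between the means. This is a sketch with illustrative variable names, not a full t-test on raw data.

```python
from math import sqrt

n1, n2 = 60, 52             # sample sizes of the two groups
pooled_sd = 2.96            # S, pooled standard deviation
mean_difference = 2.0       # difference between the two mean heights (cm)

# Standard error of the difference between the two means
standard_error = pooled_sd * sqrt(1 / n1 + 1 / n2)

# t-value and its degrees of freedom
t_value = mean_difference / standard_error
degrees_of_freedom = n1 + n2 - 2

print("SE =", round(standard_error, 4))
print("t =", round(t_value, 2), "with", degrees_of_freedom, "degrees of freedom")
```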
The Table value
Once the t-value has been calculated, you will have to refer to a t-table,
from which you can determine whether the null hypothesis is rejected or
not. Annex II contains a t-table.
(1) First, decide which significance level (α value) you want to use. Usually
we choose a significance level of 0.05.
(2) Second, determine the number of degrees of freedom for the test being
performed. For Student’s t-test the number of degrees of freedom is
calculated as the sum of the two sample sizes minus 2. Thus, for
Example 10, it is calculated as follows:
The number of degrees of freedom is: d.f. = 60 + 52 - 2 = 110
(3) Third, the t-value belonging to the α value (the significance level we
chose) and the degrees of freedom are located in the table.
In our example we look up the t-value belonging to α = 0.05 and d.f.
= 120 (nearest to 110 in the table) and we find it is 1.98.
$t_{df} = \dfrac{\text{mean of differences}}{\text{standard error of differences}} = \dfrac{\bar{d}}{sd(d)/\sqrt{n}}$
Table 1.5: Results of quality control exercise during a nutritional survey
Mean difference = 21.1/20 = 1.05
ii) Calculate the standard deviation of the differences and the Standard Error.
Standard deviation = 1.77, and the standard error: $SE = \dfrac{1.77}{\sqrt{20}} = 0.40$
iii) The value of ‘t’ is the mean difference divided by the standard error:
$t = \dfrac{1.05}{0.40} = 2.62$
The degrees of freedom are the sample size (the number of pairs of
observations) minus 1, which in this case is (20 – 1 = 19). The tabled
t-value at 19 degrees of freedom (α = 0.05) is 2.09.
The Interpretation
tcal (2.62) > ttab (2.09)
If the calculated t-value (ignoring the sign) is larger than the value indicated
in the table, the null hypothesis, stating that there is no difference, is rejected,
and it can be concluded that there is a significant difference in the result
of your study.
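The paired t-test steps can be retraced from the summary figures given above (mean difference 1.05, standard deviation of the differences 1.77, 20 pairs). The following Python sketch uses illustrative variable names. Carried through without rounding the standard error to 0.40, t comes out at about 2.65, slightly above the hand-calculated 2.62; the conclusion is the same.

```python
from math import sqrt

n_pairs = 20                 # number of pairs of observations
mean_difference = 1.05       # mean of the paired differences
sd_difference = 1.77         # standard deviation of the paired differences

standard_error = sd_difference / sqrt(n_pairs)   # SE of the mean difference
t_value = mean_difference / standard_error       # paired t statistic
degrees_of_freedom = n_pairs - 1

t_table = 2.09               # tabled t-value for 19 degrees of freedom
print("SE =", round(standard_error, 2), " t =", round(t_value, 2))
print("Reject H0:", abs(t_value) > t_table)
```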
Note: Computers are helpful when dealing with large data sets. A variety
of software, including Excel and SPSS, provide options for various statistical
tests.
In this section you have read about the various tests of significance. Now
try and answer the questions given in Check Your Progress-4.
Check Your Progress - 4
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end
of the unit.
1. What are the important considerations that one should keep in mind
while applying the chi-square test?
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
2. What is a paired t-test and when is it used?
.................................................................................................................
.................................................................................................................
.................................................................................................................
.................................................................................................................
1.12 CORRELATION ANALYSIS
When exploring associations between variables we have to distinguish
between nominal, ordinal, and continuous data. In this unit, we will examine
associations between continuous data where a linear relationship is suspected.
1.12.1 Understanding Correlation
Correlation is the relationship between two sets of continuous data; for
example, the relationship between height and body weight. Correlation
statistics are used to determine the extent to which two variables
are related and can be expressed by a measure called the ‘coefficient of
correlation’. The correlation coefficient may be positive or negative and
may vary from ‘-1’ to ‘+1’. Positive correlation means that values
of two different variables increase and decrease together. For example, height
and weight correlate positively. Negative correlation means that if the value
of one variable decreases then the value of the other variable increases
(inverse relationship). For example, literacy and number of children in family
may correlate negatively.
The strength of a correlation is determined by the absolute value of the
correlation coefficient; the closer the value to 1, the stronger the correlation.
For example, a correlation of -0.9 indicates an inverse relationship between
two variables and shows a stronger relationship than that associated with
a correlation of +0.2 or -0.5. Correlation between two variables is shown
by scatter plot (Fig. 1.1).
Figure 1.1: Scatter Diagram showing relation between two variables
(Scatter diagrams with X on the horizontal axis and Y on the vertical axis, both scaled from 0 to 25. Panel titles include C: Example of Positive Low Correlation and D: Example of Negative Low Correlation.)
The slopes of both the lines are identical in these two examples, but the
scatter around the line is much greater in the second. Clearly the relationship
between variables y and x is much closer in the first diagram.
If we are interested only in measuring the association between the two
variables, then Pearson’s Correlation Coefficient (r) gives us an estimate
of the strength of the linear association between two numerical variables.
Pearson’s Correlation Coefficient can either be calculated using a calculator
or on computer using various analytical software. Note that in case there
is a curvilinear relationship, the value of r may be close to zero even
though the two variables are related. The
correlation coefficient has the following properties:
1. For any data set, r lies between ‘-1’ and ‘+1’.
2. If r = +1, or -1, the linear relationship is perfect, that is, all the points
lie exactly on a straight line. If most of the points lie on the line, then
it is very strong relationship and r is near to 1. If r = +1, variable
y increases as x increases (i.e., the line slopes upwards). (See Diagram
A.) If r = -1, variable y decreases as x increases (i.e., the line slopes
downward). (See Diagram B.)
3. If r lies between 0 and +1, the line slopes upwards, but the points
are scattered about the line. (See Diagram C.) The same is true of
negative values of r, between 0 and -1, but in this case the line slopes
downward. (See Diagram D.)
4. If r = 0, there is no linear relationship between y and x. This
may mean that there is no relationship at all between the two variables
(i.e., knowing x tells us nothing about the value of y). (See Diagram
E.)
1.12.2 Calculation of the Pearson’s Correlation Coefficient
In a nutrition study in a large rural district, a sample of 20 children 5 years
of age were weighed and their family incomes estimated. The results were
as follows:
Table 1.6: Weights and family incomes of 20 children 5 years of age
Serial Family Weight Serial Family Weight
Number Income in in kg Number Income in in kg
$ Per year $ Per year
1. 130 15.5 11. 225 18.1
2. 200 19.8 12. 95 17.4
3. 345 21.5 13. 130 17.9
4. 245 18.8 14. 330 17.0
5. 155 12.8 15. 295 18.7
6. 300 18.8 16. 170 16.0
7. 360 18.1 17. 250 18.2
8. 105 18.7 18. 355 16.4
9. 80 13.1 19. 220 15.4
10. 275 20.1 20. 175 17.6
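$r = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$
or, in computational form,
$r = \dfrac{\sum X_i Y_i - \left(\sum X_i \sum Y_i\right)/n}{\sqrt{\left[\sum X_i^2 - \dfrac{\left(\sum X_i\right)^2}{n}\right]\left[\sum Y_i^2 - \dfrac{\left(\sum Y_i\right)^2}{n}\right]}}$
$r = \dfrac{79416.5 - \dfrac{4440 \times 349.9}{20}}{\sqrt{\left[1141550 - \dfrac{19713600}{20}\right]\left[6210.77 - \dfrac{122430}{20}\right]}}$
r = 0.466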
which is positive (indicating an upward sloping line), but away from 1
(indicating that there is plenty of scatter around the line) showing that there
is a positive relation between ‘family income’ and ‘weight of five-year-
olds’, however the relationship is not strong.
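The value of r can be recomputed from Table 1.6 with the computational form of the formula. The Python sketch below uses only the standard library; the list and variable names are illustrative assumptions.

```python
from math import sqrt

# Table 1.6: family income ($ per year) and weight (kg) of 20 children
income = [130, 200, 345, 245, 155, 300, 360, 105, 80, 275,
          225, 95, 130, 330, 295, 170, 250, 355, 220, 175]
weight = [15.5, 19.8, 21.5, 18.8, 12.8, 18.8, 18.1, 18.7, 13.1, 20.1,
          18.1, 17.4, 17.9, 17.0, 18.7, 16.0, 18.2, 16.4, 15.4, 17.6]

n = len(income)
sum_x = sum(income)                                   # sum of X
sum_y = sum(weight)                                   # sum of Y
sum_xy = sum(x * y for x, y in zip(income, weight))   # sum of X*Y
sum_x2 = sum(x * x for x in income)                   # sum of X squared
sum_y2 = sum(y * y for y in weight)                   # sum of Y squared

numerator = sum_xy - sum_x * sum_y / n
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator

print("Sum X =", sum_x, " Sum Y =", round(sum_y, 1))
print("r =", round(r, 3))
```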
1.12.3 The Significance of the Correlation Coefficient
The value of r was calculated from a sample of just 20 children. The result
is, therefore, subject to sampling error, and is unlikely to be equal to the
true value of r, which we would obtain if we measured all 5-year-old
children in this district. The question arises as to whether there really is
any relationship at all between weight and income. Perhaps, in the entire
population of 5-year-old children, the scatter diagram would look like
diagram ‘E’, above, (no relationship between y and x) and the positive
relationship in our sample occurred by chance. To assess whether this is
the case we do a significance test on r. The null and alternative hypotheses are
H0: There is no relationship between X and Y
H1: There is a significant relationship between X and Y
To do the test we calculate
$t_{n-2} = r\sqrt{\dfrac{n - 2}{1 - r^2}}$
We compare this value of t to tables of the t distribution with (n - 2) degrees
of freedom, where n is the number of observations.
In our example: n = 20, r = 0.466, $t_{18} = 0.466\sqrt{\dfrac{18}{1 - 0.466^2}} = 2.23$
Therefore, using an α value of 0.05, the t-table value for 18 degrees of freedom
(t18; 0.05) = 2.10. Thus the calculated t-value is more than the table value;
therefore we reject the null hypothesis and accept the alternative hypothesis
that the linear relationship is statistically significant.
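The significance test on r is a one-line calculation once r and n are known. As a minimal Python sketch (variable names are illustrative; the table value 2.10 is the one quoted in the text):

```python
from math import sqrt

n = 20          # number of children in the sample
r = 0.466       # Pearson's correlation coefficient from the sample

degrees_of_freedom = n - 2
t_value = r * sqrt(degrees_of_freedom / (1 - r ** 2))   # t = r * sqrt((n - 2) / (1 - r^2))

t_table = 2.10  # tabled t-value for 18 degrees of freedom at the 0.05 level
print("t =", round(t_value, 2), "with", degrees_of_freedom, "degrees of freedom")
print("Reject H0:", t_value > t_table)
```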
Where α = the intercept of the regression line
β = the slope of the regression line
which would provide a best fit for the data points. Here the best fit will
be understood as a line that minimizes the sum of squared residuals of the
linear regression model.
By using calculus, it can be shown that the values of α and β that minimize
the objective function Q are
Student   $x_i$   $y_i$   $(x_i - \bar{x})$   $(y_i - \bar{y})$   $(x_i - \bar{x})^2$   $(y_i - \bar{y})^2$   $(x_i - \bar{x})(y_i - \bar{y})$
3         80      70      2                   -7                  4                     49                    -14
4         70      65      -8                  -12                 64                    144                   96
5         60      70      -18                 -7                  324                   49                    126
Sum       390     385                                             730                   630                   470
Mean      78      77
The regression equation is a linear equation of the form: $\hat{y} = \alpha + \beta x$. To
conduct a regression analysis, we need to solve for α and β. The
computations are shown below.
$\beta = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \dfrac{470}{730} = 0.644$
$\alpha = \bar{y} - \beta \bar{x} = 77 - (0.644)(78) = 26.768$
Therefore, the regression equation is: $\hat{y} = 26.768 + 0.644x$.
1.13.3 How to Use the Regression Equation
Once you have the regression equation, using it is easy. Choose a value
for the independent variable (x), perform the computation, and you have
an estimated value ($\hat{y}$) for the dependent variable. In our example, the
independent variable is the student’s score on the aptitude test. The dependent
variable is the student’s statistics grade. If a student made an 80 on the
aptitude test, the estimated statistics grade would be
$\hat{y}$ = 26.768 + 0.644x = 26.768 + 0.644 * 80 = 26.768 + 51.52 = 78.288
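The slope, intercept and prediction can be recomputed in Python from the column totals of the worked table (470, 730) and the two means (78, 77). This is a sketch with illustrative names. Because the slope is not rounded to 0.644 before the intercept is computed, the intercept comes out near 26.78 rather than 26.768; the predicted grade for a score of 80 is still about 78.29.

```python
# Summary values from the worked regression table
sum_cross_products = 470      # sum of (x - mean x) * (y - mean y)
sum_squares_x = 730           # sum of (x - mean x) squared
mean_x = 78                   # mean aptitude test score
mean_y = 77                   # mean statistics grade

slope = sum_cross_products / sum_squares_x      # least-squares slope
intercept = mean_y - slope * mean_x             # least-squares intercept


def predict(aptitude_score):
    """Estimated statistics grade for a given aptitude test score."""
    return intercept + slope * aptitude_score


predicted_grade = predict(80)
print("slope =", round(slope, 3), " intercept =", round(intercept, 2))
print("Predicted grade for a score of 80:", round(predicted_grade, 2))
```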
1.13.4 Difference between Regression and Correlation
S.No.  Correlation                               Regression
1      Correlation quantifies the degree to      Regression finds out the
       which two variables are related.          best fit line for a given set
       You simply are computing a                of variables.
       correlation coefficient (r) that tells
       you how much one variable tends
       to change when the other one does.
Annexure-II: Table of the values
1.15 KEYWORDS
Independent variable: the characteristic being observed or measured which
is hypothesized to influence an event or outcome (dependent variable), and
is not influenced by the event or outcome, but may cause it, or contribute
to its variation.
Dependent variable: a variable whose value is dependent on the effect
of other variables (independent variables) in the relationship being studied.
Mean: the mean (or, arithmetic mean) is also known as the average. It is
calculated by totaling the results of all the observations and dividing by
the total number of observations.
Median: the median is the value that divides a distribution into two equal
halves. The median is useful when the data are on an ordinal scale, or when
some measurements are much bigger or much smaller than the rest.
Mode: the mode is the most frequently occurring value in a set of
observations. The mode is not very useful for numerical data that are
continuous. It is most useful for numerical data that have been grouped. The
mode is usually used to find the norm among populations.
Range: this can be represented as the difference between maximum and
minimum value or, simply, as maximum and minimum values.
Percentiles: percentiles are points that divide all the measurements into 100
equal parts. The 30th percentile (P30) is the value below which 30% of the
measurements lie. The 50th percentile (P50), or the median, is the value
below which 50% of the measurements lie.
Mean Deviation: this is the average of the absolute deviations from the arithmetic mean.
Standard Deviation: this denotes (approximately) the extent of variation of
values from the mean.
Parametric statistical test is a test whose model specifies certain conditions
about the parameters of the parent population from which the sample was
drawn.
Non-parametric statistical test is a test whose model does not specify
conditions about the parameters of the parent population from which the sample
was drawn.
Normal Distribution: The normal distribution is symmetrical around the
mean. The mean, median, and mode assume the same value if the observations
(data) follow a normal distribution.
Sampling Variation: any value of a variable obtained from a randomly
selected sample (e.g., a sample mean) will generally differ from the true value in the
population. This difference is called sampling variation.
Test of Significance: a test of significance estimates the likelihood that an
observed study result (e.g., a difference between two groups) is due to chance
rather than to a real effect.
UNIT 2 USE OF COMPUTER IN DATA
ANALYSIS
Structure
2.1 Introduction
2.2 Need for the Use of Computer in Data Analysis
2.3 Working with Excel
2.4 SPSS: An Overview
2.5 Data Management
2.6 Using Descriptive Statistics to Describe Your Data
2.7 Testing of Hypothesis Using SPSS
2.8 Multiple Regression Using SPSS
2.9 Let Us Sum Up
2.10 References and Selected Readings
2.11 Check Your Progress – Possible Answers
2.1 INTRODUCTION
Data analysis is an integral part of any research, and particularly of
development research. It is the process of applying statistical or logical
techniques to describe, illustrate, condense or evaluate data. In this unit we
will deal with data analysis in MS Excel and SPSS. MS Excel allows
the creation, analysis and revision of data. Excel is the spreadsheet programme
in the Microsoft Office system and is used for data analysis. Excel offers a variety
of functions, such as tracking data, building models for analyzing data, writing
formulas to perform calculations on the data, pivoting the data in numerous
ways and presenting the data in a number of chart forms. SPSS, or Statistical
Package for Social Sciences, is a Windows-based programme that can be
used for data entry and analysis and also to create tables and graphs. SPSS
is a comprehensive and flexible statistical analysis and data management
solution. SPSS can take data from almost any type of file and use them
to generate tabulated reports, charts, plots of distributions and trends, and
descriptive statistics, and to conduct complex statistical analyses. You will find
SPSS customers in virtually every industry, including telecommunications,
banking, finance, insurance, healthcare, manufacturing, retail, consumer
packaged goods, higher education, government, and market research.
After reading this unit you will be able to:
• enter data in MS Excel and work out the mean and standard deviation;
• prepare a chart using MS Excel;
• enter your research data in SPSS;
• manage your data and do the basic transformations needed for your research in SPSS;
• calculate descriptive statistics for your data in SPSS;
• perform testing of hypotheses in SPSS;
• work out correlation in SPSS; and
• work out regression analysis in SPSS.
2.2 NEED FOR THE USE OF COMPUTER
IN DATA ANALYSIS
In research, you are required to handle a lot of data. Handling a small data set
on a piece of paper with the use of a calculator is easy, but imagine
calculating the mean of a variable with 200 or more observations. Using a calculator
in this case would not just be time consuming; you would also be in danger
of making mistakes. You may just punch in a wrong number and not notice
it. Such mistakes can lead to erroneous results. Wrong results can render
your research meaningless and hence you need to be very cautious in your
data analysis and interpretation. Using a computer to analyse your data will
not just speed up the process of data analysis but also help in reducing
errors. Some of the advantages of the use of computers in data analysis
are given below:
1. It helps in reducing/eliminating errors in calculation
2. It helps in better data management as you can easily add or delete
variables and observations, code and recode values etc.
3. It is very easy to depict the results in graphs and charts
4. Several users can work on the same set of data simultaneously by
sharing the data file
5. It is a faster and more efficient way of handling data
In order to select a cell, click in the cell. In order to select a group
of cells, click in a corner cell and, with the left mouse button held down,
drag the cursor horizontally and/or vertically until all of the cells you want
selected are outlined in black. To select multiple cells that are not contiguous,
press and hold the Ctrl key while clicking in the desired cells. To select
every cell in the worksheet, click in the upper left corner of the worksheet,
to the left of “A”.
2.3.2 Entering Data in a Worksheet
To enter data in a cell, you just need to click in that cell and type. The
contents in a cell can be edited by double clicking in the desired cell and
then making the needed correction. To insert a new row in a worksheet,
right click on the row number and click insert. A new row will be added
above the row that was clicked. To delete a row, right click on the row
number and click delete. In the same way, to insert a new column, right
click on a column head and click insert. A new column will be inserted
to the left of the column that was clicked. To delete a column, right click
on the column letter and click delete.
Worksheet tabs are found at the bottom of the workbook, towards the left.
If you want to open any particular worksheet within a workbook, you just
need to click on that particular worksheet tab. It is also possible to rename
the worksheet. In order to rename a worksheet, you need to right click on
the worksheet tab and click on the rename option. The new name can then
be entered. In order to insert an additional worksheet, right click on the
worksheet tab and click insert. A new worksheet will be inserted towards
the left. Similarly, to delete an existing worksheet, right click on that
worksheet tab and click delete. The positions of the worksheet can also
be changed by left clicking on the worksheet tab and then dragging it to
the desired position.
2.3.3 Basic Calculations on Excel
Calculating Mean in Three Easy Steps
We take a simple example in which we have the weights of 8 male and
8 female students. We need to calculate the average weight of male and
female students.
Step 1: Click on the ‘Formulas’ tab at the top and then click ‘Insert Function’.
Step 2: Select the option ‘All’ in the Select a category option so that all
the available functions are displayed in the box under the heading Select
a function. Select the ‘AVERAGE’ function in the box and click OK.
Step 3: Another box titled Function Arguments opens. Within this box are
options ‘Number 1’ and ‘Number 2’. In ‘Number 1’ you need to select the
numbers for which you want the average. This can be done by left clicking
on the first number (40) and then dragging down to the last number, which is
50 in the case of male students. Then press the OK button. It will give the
average weight of eight male students as 50.625.
Calculating Standard Deviation in Three Easy Steps
Step 1: Click on the ‘Formulas’ tab at the top and then click ‘Insert Function’.
Step 2: Select the option ‘All’ in the Select a category option so that all
the available functions are displayed in the box under the heading Select
a function. Select the ‘STDEV’ function in the box and click OK.
Step 3: Another box titled Function Arguments opens. Within this
box are options ‘Number 1’ and ‘Number 2’. In ‘Number 1’ you need to
select the numbers for which you want the standard deviation. This can be done by
left clicking on the first number (40) and then dragging down to the last number,
which is 50 in the case of male students. Then press the OK button. It will
give the standard deviation of the weight of eight male students, which is
2.67. Similarly, the standard deviation of female students works out to be
2.96.
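The same two calculations can be cross-checked outside Excel. The following is a minimal sketch in Python, assuming a small set of hypothetical weights (the list `weights` is invented for illustration and is not the data in the worksheet pictured in this unit). Excel's STDEV function gives the sample standard deviation (dividing by n - 1), and `statistics.stdev` does the same.

```python
import statistics

# Hypothetical weights (kg) of eight students; NOT the worksheet data.
weights = [48, 52, 50, 47, 53, 51, 49, 54]

# Equivalent of Excel's =AVERAGE(range)
mean_weight = statistics.mean(weights)

# Equivalent of Excel's =STDEV(range): sample standard deviation (n - 1)
sd_weight = statistics.stdev(weights)

print(f"Mean: {mean_weight:.3f}")
print(f"Standard deviation: {sd_weight:.3f}")
```

If you need the population standard deviation (dividing by n), Python offers `statistics.pstdev`, which corresponds to Excel's STDEVP.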
Calculating Correlation
Step 1: Click on the ‘formula’ tab on the top and then click ‘insert function’.
Step 2: Select the option ‘All’ in the Select a category option so that all
the available functions are displayed in the box under the heading Select
a function. Select ‘CORREL’ function in the box and click OK.
Step 3: Another box titled ‘Function Arguments’ opens which has two boxes
‘Array1’ and ‘Array2’. Select the data on temperature for ‘Array1’ and the
data on ice cream sales for ‘Array2’. Then press the OK button. It will
give the correlation between temperature and the corresponding ice cream
sales as .976 indicating that there is a very high correlation between the
temperature levels and sale of ice creams.
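To see what the CORREL function computes, the Pearson correlation coefficient can be worked out step by step. The sketch below is only an illustration: the function name `pearson_r` and the temperature and sales figures are assumptions made for this example, not the data used in the worksheet above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: daily temperature (degrees C) and ice cream sales
temperature = [20, 22, 25, 27, 30, 32, 35]
ice_cream_sales = [110, 125, 160, 170, 210, 220, 260]

r = pearson_r(temperature, ice_cream_sales)
print(f"r = {r:.3f}")
```

A value of r close to +1 indicates that the two variables rise together, as in the ice cream example.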
Activity
Feed the data of the weight of Female Students in the example shown
in the picture above in Excel and work out the mean and standard
deviation.
Excel can also be used to work out correlations and regressions, to carry out tests
of significance and for several other purposes. In this section you have seen
how to work out the basic calculations and charts. You can explore all the
other options in Excel, as the more you use it the more you learn.
i. The Data View: In the data view, the data are laid in a rectangular
format of rows and columns in which the columns represent the variables
and each row represents a unit of observation. The data in a column
should be of the same type, i.e., either numeric (numbers) or string
(characters). Data values can be added by typing the values directly
in the data view.
ii. The Variable View: You can easily switch between data view and
variable view by clicking on the relevant button on the lower left corner.
In the variable view you can edit the information that defines a variable.
Each row of the variable view describes a column of the data view.
The first attribute or the Name is how the data column is named or
identified. The Name may contain characters, numerals, punctuation
marks etc. However, space is considered as an illegal character while
naming a variable. The Type of data is generally a numeric or a string
but SPSS also provides with other options (comma, dot, scientific
notation, currency etc.) by which you may define the type of your data.
Label may be considered as a longer description of the variable name.
Values allow you to create a list of value labels; for example, you may
use 1 and 2 for male and female gender. If a researcher wants SPSS
to designate certain data values as missing values, this can be done
conveniently using the Missing attribute. For instance, the
researcher may assign a number, say ‘8’, to all the responses in which the
respondent says “I don’t know” in response to a question. The researcher
can then have SPSS treat all 8s in that variable as missing data.
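Why missing-value codes matter can be seen with a small calculation. The sketch below is an illustration in Python, not SPSS syntax; the response list and the name `MISSING_CODE` are assumptions made for this example. It shows how the mean is distorted if the code 8 is treated as a real answer.

```python
import statistics

MISSING_CODE = 8  # assumed code for "I don't know"

# Hypothetical responses on a 1-5 scale; 8 marks "I don't know"
responses = [3, 4, 8, 2, 5, 8, 4, 3]

# Keep only the valid answers, as SPSS does for user-missing values
valid = [r for r in responses if r != MISSING_CODE]
n_missing = len(responses) - len(valid)

print("Valid cases:", len(valid), "Missing cases:", n_missing)
print("Mean of valid answers:", statistics.mean(valid))
print("Mean if 8 is wrongly kept:", statistics.mean(responses))
```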
2. Working with Output Viewer:
All the statistical tables and graphs produced by analyses run from the data editor can
be viewed in the output viewer. The output viewer consists of two
sections: the one on the left is the outline pane and the one on the
right is the tables pane. Whenever the output for any analysis is
generated, it is added to the tables pane as an object. Selection of an
object can be done by clicking on it in the tables pane or clicking
the corresponding entry in the outline pane. It is also possible to edit
the objects in the tables pane by double clicking on the object. To
delete an object, simply select the object and press the Delete key.
Double clicking an object in the outline pane can be used to hide an
object from the tables pane. To rearrange or reposition any object in
either pane, you just need to select the object and drag and drop the
object at the place where you want it to be. You can also export your
output by clicking File → Export in the output viewer. A new dialogue
box will open and you can select the options regarding the portion of
output and the type of file in which you want it to be saved.
It is important to note here that the rows are used to enter information on
respondents or cases and columns are used to enter variable information.
For example, if you have to enter information on area, production and yield
of a particular crop in 19 districts in Rajasthan, the districts
will be represented in rows and ‘area, production and yield’ in columns.
SPSS data editor will have two views: Data view and Variable view and
you can select the relevant view by clicking at either of the option given
at the bottom left of the window. To proceed logically, you would first click
on the Variable View. Here you will define your variables.
Under Name, you would be required to type the desired variable name.
While naming a variable, you need to take care that the first character is
a letter and that no spaces appear in the variable name. The next step
is to define the type of the variable under the Type option. The functions
and corresponding default values of each of these columns are given below:
Characteristics of Variables and their Default Values:
The picture below gives the details of the variable view for our dataset.
The next step would be to enter the data. The variable information entered
in the variable view can be altered at a later stage. Next, click the Data View
at the bottom left of the window. Once the variable information is entered in
the variable view, you will be able to see all the variable names that
you entered in the Data View. Now type the names of the districts and the
data pertaining to area, production and yield. Once you are done with entering
the data, you need to save the data by clicking the File → Save option
at the top left corner of the window. You can give a relevant name to the
file when the dialogue box opens. The SPSS files are saved with .sav as
suffix.
Some important tips in data management in SPSS:
• You can open an existing data file by clicking File → Open → Data
in sequence.
• You can insert an additional variable/column by clicking on the column
next to the place where you want to insert the variable, going
to the edit option at the top and clicking insert variable. A new variable can
be inserted by following these steps in both data and variable view.
• In order to insert a new row, click on
the row above which you want to insert the additional row. Then click
the edit option at the top and click insert case. A blank row will appear just
above the selected row.
So far you have read about working with Excel, an overview of SPSS and
data management in SPSS. Now answer the questions given in Check Your
Progress-1.
Check Your Progress - 1
Note: (a) Write your answer in about 50 words.
(b) Check your answer with possible answers given at the end of
the unit.
1. Explain the data editor in SPSS.
................................................................................................................
................................................................................................................
................................................................................................................
2. How will you include data labels in a bar graph in Excel?
................................................................................................................
................................................................................................................
................................................................................................................
ii. The gender has been recoded as 1 for Males and 2 for Females. The
results will be displayed in the output file. The results of working out
frequencies of the variable gender using SPSS will be as under
Genderrecode
               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  1.00    4          40.0     40.0           40.0
       2.00    6          60.0     60.0           100.0
       Total   10         100.0    100.0
iii. This shows that in the given data there are 4 males and 6 females.
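The frequency, percent and cumulative percent columns can be reproduced with a few lines of code. This is a minimal sketch in Python rather than SPSS; the list `gender_recode` simply contains four 1s and six 2s to match the counts in the table, and the order of the individual cases is an assumption.

```python
from collections import Counter

# 1 = Male, 2 = Female; 4 males and 6 females as in the table above
gender_recode = [1] * 4 + [2] * 6

counts = Counter(gender_recode)
n = len(gender_recode)

rows = []
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / n
    cumulative += percent  # running total gives the cumulative percent
    rows.append((value, counts[value], percent, cumulative))

print("Value  Frequency  Percent  Cumulative Percent")
for value, freq, percent, cum in rows:
    print(f"{value:<6} {freq:<10} {percent:<8.1f} {cum:.1f}")
```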
2. Descriptive Statistics
Descriptive statistics includes calculation of various measures of central
tendency and measures of dispersion. These can be calculated by
following the steps given below:
i. Click Analyze, Descriptive Statistics and Descriptives. This will open
a dialogue box in which you move the variables whose descriptives
need to be calculated by shifting them to the Variable(s) text box. In
this example we have shifted the daily income and monthly income to
the variable box.
ii. Click on the Options button to get another dialogue box. Then select
the desired statistics by clicking the appropriate check boxes. In this
example we have checked the Mean and Standard Deviation. Click
Continue to go back to the previous dialogue box and then click OK.
iii. The results will be displayed in the output file. The Mean and Standard
Deviation for daily income and monthly income as calculated using
SPSS are as given:
Descriptive Statistics
                    N   Mean      Std. Deviation
Dailyincome         10  548.60    373.613
Monthlyincome       10  16458.00  11208.383
Valid N (listwise)  10
3. Crosstabs
The Crosstabs command is used to get information on the intersection
of two or more categorical variables. Information on each level
of one categorical variable can be obtained by working out frequencies,
but frequencies do not provide information on the intersection of two variables.
This can be obtained using the Crosstabs command as follows:
i. For example in a given data set we want to cross tabulate migrant
education and their occupation.
ii. Click Analyze, Descriptive Statistics and Crosstabs. A dialogue
box opens. Select Mig_Edu and move it to the Rows list box and
select Occupation and move it to the Columns list box.
iii. Click Cells button which opens another dialogue box. This dialogue
box allows you to select additional information you want to
compute for the selected variables. Select the relevant information
and click Continue button to go to the previous dialogue box and
then click OK.
User      1     2     3     4     5     6     7     8     9     10
CFL life  2789  2800  2300  3100  2750  2600  2895  3175  2500  2400
For the given data, one sample t test can be done as follows:
i. Enter the data in SPSS data editor.
ii. Click Analyze, Compare Means, One Sample T Test. A dialogue
box will open. Move the variable CFLlife to the Test Variable(s) box and
enter 3000 as the Test Value.
iii. Click Options to set the confidence interval and then press Continue
and then OK.
The test results on SPSS would give the following two sets of tables:
One-Sample Statistics
         N   Mean       Std. Deviation  Std. Error Mean
CFLlife  10  2730.9000  285.68104       90.34028

One-Sample Test (Test Value = 3000)
                                                       95% Confidence Interval
                                                       of the Difference
         t       df  Sig. (2-tailed)  Mean Difference  Lower      Upper
CFLlife  -2.979  9   .015             -269.10000       -473.4639  -64.7361
The first table, One-Sample Statistics, gives the summary measures
and the second table, One-Sample Test, gives the test values. The
significance value, or p value, shows the risk of a Type-1 error. Here the
p value (.015) is less than .05, indicating that there is a significant difference between
the hypothesised mean and the sample mean. Hence we reject the null hypothesis,
and with it the claim of the company that the average life of the CFL tubes
manufactured by them is 3000 hours.
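The figures in the two output tables can be checked by hand. The sketch below, written in Python as an illustration, uses the ten CFL life values given above and computes t = (sample mean - test value) / standard error. The critical value 2.262 is the two-tailed 5% value for 9 degrees of freedom taken from a standard t table; the variable names are chosen for this example only.

```python
import math
import statistics

# CFL life (hours) reported by the ten users
cfl_life = [2789, 2800, 2300, 3100, 2750, 2600, 2895, 3175, 2500, 2400]
test_value = 3000  # life claimed by the company

n = len(cfl_life)
mean = statistics.mean(cfl_life)
sd = statistics.stdev(cfl_life)       # sample standard deviation (n - 1)
se = sd / math.sqrt(n)                # standard error of the mean
t = (mean - test_value) / se
df = n - 1

T_CRITICAL = 2.262  # two-tailed, 5% level, df = 9 (from a t table)

print(f"mean = {mean:.4f}, sd = {sd:.5f}, se = {se:.5f}")
print(f"t = {t:.3f}, df = {df}")
print("Reject H0" if abs(t) > T_CRITICAL else "Do not reject H0")
```

Because the absolute value of t exceeds the critical value, the conclusion agrees with the one reached from the p value.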
2. Independent Sample t-Test
This test is employed to study the difference in two groups. Some of the
examples where independent sample t-test can be employed are to study
the difference in productivity of high yielding varieties (HYV) between
irrigated and unirrigated farms, gender differences in IQ levels etc.
Example: In a physical education class, the scores of males and females are
compared. We need to find out if the mean score of men in the class differs
significantly from the mean score of women in the class.
Scores
Male    82  80  85  78  87  82  77  81  76  84
Female  75  76  80  77  80  77  73  81  72  78
In order to carry out an independent sample t-test on these data, the
following steps need to be followed:
i. Open the data file and click on Analyze from the command menu.
ii. Select Compare Means and Independent Sample T-test from the
Analyze menu. The Test Variable is Scores and the Grouping Variable
is Gender.
iii. To define groups, click on define groups to get a new dialogue box.
Assign code 1 to group 1 (males) and 2 to group 2 (females).
iv. Click Options button to set the confidence interval of 95%. Click
Continue and then OK to execute the command.
The output will have the following tables:
Group Statistics
        Gender  N   Mean     Std. Deviation  Std. Error Mean
Scores  1.00    10  81.2000  3.55278         1.12349
        2.00    10  76.9000  2.99815         .94810
The group statistics table displays the summary measures of the selected
variables. The second table displays the results of the t test. The first step
is to look at the Levene’s test to decide if you need to assume equal variance
or not. The null and alternative hypothesis for Levene’s test for equality
of variances is as follows:
H0: Variances of two groups are equal
H1: Variances of two groups are unequal.
In this example, since the significance of the F value is .571, which is higher
than .05, we do not reject the null hypothesis of equal variances. Thus we need to
interpret the results that assume equal variances. The next step
is to interpret the t test. In this example, the t value is 2.925 and
the significance level is .009, which is less than .05. Hence we reject the
null hypothesis of equality of means. In other words, there is a significant
difference between the scores of males and females.
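The t value under the assumption of equal variances can be verified from the raw scores. The following Python sketch is an illustration of the pooled-variance formula, t = (mean1 - mean2) / sqrt(sp² (1/n1 + 1/n2)), using the scores given above. The critical value 2.101 is the two-tailed 5% value for 18 degrees of freedom from a standard t table; the variable names are assumptions for this example.

```python
import math
import statistics

male = [82, 80, 85, 78, 87, 82, 77, 81, 76, 84]
female = [75, 76, 80, 77, 80, 77, 73, 81, 72, 78]

n1, n2 = len(male), len(female)
mean1, mean2 = statistics.mean(male), statistics.mean(female)
var1, var2 = statistics.variance(male), statistics.variance(female)

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (mean1 - mean2) / se_diff
df = n1 + n2 - 2

T_CRITICAL = 2.101  # two-tailed, 5% level, df = 18 (from a t table)

print(f"means: {mean1:.1f} and {mean2:.1f}")
print(f"t = {t:.3f}, df = {df}")
print("Reject H0" if abs(t) > T_CRITICAL else "Do not reject H0")
```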
3. Paired Sample t-Test
This test is employed to test the difference in means of the dependent
samples. Here, the observations are recorded on the same subject but at
different points of time. For example, comparing the performance of the
students before and after vacations, sales of a product before and after
advertisement etc. Because these samples are very closely matched, the test is
also called the dependent sample t-test.
Suppose 20 students were given a test before and after studying a module.
We want to find out if in general, reading our module, leads to improvement
in student’s knowledge levels. This can be done using paired t test. The
scores of students before and after reading the module are given:
Student            1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20
Pre-module score   18  21  16  22  19  24  17  21  23  18  14  16  16  19  18  20  12  22  15  17
Post-module score  22  25  17  24  16  29  20  23  19  20  15  15  18  26  18  24  18  25  19  16
The null and alternative hypothesis for examining the difference in scores
of students before and after reading the module is given as:
H0: There is no difference in the marks scored before and after reading
the module.
H1: There is difference in the marks scored before and after reading the
module.
This can be done using SPSS by following these steps:
i. Open the data file. Click on Analyze, Compare Means and Paired
Samples t-test. A dialogue box opens.
ii. Select the two variables Pre-module Score and Post-module Score
and shift them to Paired Variable list box. Click on Options button
and set the confidence interval, click Continue and then OK.
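The paired t-test works on the difference between the two scores of each student. As a minimal sketch in Python (an illustration, not SPSS output), the t value can be computed from the scores above as t = mean difference / (standard deviation of the differences / sqrt(n)). The critical value 2.093 is the two-tailed 5% value for 19 degrees of freedom from a standard t table; the variable names are assumptions for this example.

```python
import math
import statistics

pre = [18, 21, 16, 22, 19, 24, 17, 21, 23, 18,
       14, 16, 16, 19, 18, 20, 12, 22, 15, 17]
post = [22, 25, 17, 24, 16, 29, 20, 23, 19, 20,
        15, 15, 18, 26, 18, 24, 18, 25, 19, 16]

# Difference for each student (post minus pre)
diffs = [b - a for a, b in zip(pre, post)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)          # sample SD of the differences
t = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

T_CRITICAL = 2.093  # two-tailed, 5% level, df = 19 (from a t table)

print(f"mean difference = {mean_diff:.2f}")
print(f"t = {t:.3f}, df = {df}")
print("Reject H0" if abs(t) > T_CRITICAL else "Do not reject H0")
```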
You would like to know if the choice of commercial was related to the
gender of the child.
The null and the alternative hypothesis would be defined as follows:
H0: There is no association between gender and choice of commercial
H1: There is association between gender and choice of commercial
Procedure to conduct Chi square test is as follows:
i. Click Analyze, Descriptive Statistics and Crosstabs. A dialogue box will open.
ii. Select Gender and move it into the Row list box and Breakfast into
the Column list box.
iii. Click Statistics button to get another dialogue box. Select Chi square
check box and then click Continue and then OK.
You will get the following output tables:
Case Processing Summary
                    Cases
                    Valid          Missing       Total
                    N    Percent   N   Percent   N    Percent
Gender * Breakfast  125  98.4%     2   1.6%      127  100.0%

Gender * Breakfast Crosstabulation
Count
              Breakfast            Total
              1.00   2.00   3.00
Gender  1.00  30     29     16     75
        2.00  12     33     5      50
Total         42     62     21     125

Chi-Square Tests
                              Value   df  Asymp. Sig. (2-sided)
Pearson Chi-Square            9.098a  2   .011
Likelihood Ratio              9.254   2   .010
Linear-by-Linear Association  .136    1   .712
N of Valid Cases              125
a. 0 cells (.0%) have expected count less than 5. The minimum expected
count is 8.40.
The Case Processing Summary table gives the summary information on the
variables.
Crosstabulation of gender and breakfast is given in the second table showing
30 boys and 12 girls prefer the first option, 29 boys and 33 girls prefer
the second option and 16 boys and 5 girls prefer the third option.
The Chi square test table gives the test results. The value of Pearson Chi
Square is 9.098 and the associated significance is .011 which is less than
.05. Therefore the null hypothesis is rejected and we can say that there is
association between gender and breakfast commercial preference.
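The Pearson Chi-Square value can be checked from the counts in the crosstabulation. The sketch below, in Python, is an illustration: it computes the expected count of each cell as (row total × column total) / N and sums (observed - expected)² / expected over all cells. The shortcut used for the p value, exp(-χ²/2), holds only when the degrees of freedom equal 2, as they do here; the variable names are assumptions for this example.

```python
import math

# Observed counts: rows = Gender (1, 2), columns = Breakfast (1, 2, 3)
observed = [[30, 29, 16],
            [12, 33, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
min_expected = float("inf")
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form is valid ONLY for df = 2
p_value = math.exp(-chi_square / 2)

print(f"Chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.3f}")
print(f"Minimum expected count = {min_expected:.2f}")
```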
Model Summary
Model  R      R Square  Adjusted R Square  Std. Error of the Estimate
1      .600a  .360      .343               6.162
a. Predictors: (Constant), X3, X1, X2

ANOVAb
Model          Sum of Squares  df   Mean Square  F       Sig.
1  Regression  2473.601        3    824.534      21.716  .000a
   Residual    4404.365        116  37.969
   Total       6877.967        119
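The Model Summary and ANOVA tables are linked: R Square, Adjusted R Square, the standard error of the estimate and F can all be derived from the sums of squares and degrees of freedom. The following Python sketch is an illustration of these relationships using the figures in the ANOVA table above; the variable names are chosen for this example.

```python
import math

# Figures taken from the ANOVA table
ss_regression, df_regression = 2473.601, 3
ss_residual, df_residual = 4404.365, 116

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

ms_regression = ss_regression / df_regression   # mean square (regression)
ms_residual = ss_residual / df_residual         # mean square (residual)

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * df_total / df_residual
f_value = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
print(f"F = {f_value:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

An R Square of .360 means that the three predictors together explain about 36 per cent of the variation in the dependent variable.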
UNIT 3 DATA PROCESSING AND
ANALYSIS
Structure
3.1 Introduction
3.2 Data Measurement and Its Types
3.3 Tabulation and Interpretation of Data
3.4 Let Us Sum Up
3.5 Keywords
3.6 References and Selected Readings
3.7 Check Your Progress – Possible Answers
3.1 INTRODUCTION
The purpose of data analysis is to identify whether research assumptions
were correct or not, and to highlight possible new views on the problem
under study. The ultimate purpose of analysis is to answer the research
questions outlined in the objectives with the collected data. However, before
we look at how variables may be affecting one another, we need to
summarize the information obtained on each variable in simple, tabular form,
or, in a figure.
Some of the variables may produce numerical (continuous) data, while other
variables produce categorical data. In analyzing our data, it is important,
first, to determine the type of data that we are dealing with. This is crucial
because the type of data used largely determines the type of statistical
techniques that should be used to analyze the data. Once the data are processed
and tables and graphs are prepared, the report writing work may be initiated.
After studying this unit, you should be able to:
• define data and describe various types and nature of data;
• describe techniques of data processing, tabulation, and presentation; and
• describe and interpret data from tables that have been generated.
In the master chart you can enter the data of 14 sample respondents. Likewise
you can expand the number of respondents in the columns and variables
in the rows. It is always better to enter codes (numbers) in the
master chart.
(iv) Entering Data into the Computer
Computers are widely used for the analysis of data. They make calculation
much faster. Excel and the SPSS package can be used in social
science research. The following steps are used in the analysis of data
using the SPSS package.
While preparing grouped frequency distributions, the following points have
to be taken into consideration.
• The groups must not overlap, otherwise there will be confusion about
which group a measurement belongs to.
• There must be continuity from one group to the next, which means that
there must be no gaps. Otherwise, some measurements may not fit in
a group.
• The groups must range from the lowest measurement to the highest
measurement so that all of the measurements have a group to which
they can be assigned.
• The groups should normally be of an equal width, so that the counts
in different groups can easily be compared.
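The four points above can be sketched in a short program. The example below is an illustration in Python under stated assumptions: the function name `grouped_frequency`, the group width of 10 and the list of ages are all invented for this example. Each group includes its lower limit and excludes its upper limit, so the groups neither overlap nor leave gaps, they are of equal width, and together they cover the range from the lowest to the highest measurement.

```python
def grouped_frequency(values, width):
    """Count values in equal-width groups [start, start + width)."""
    # Start at a multiple of the width at or below the lowest measurement
    start = (min(values) // width) * width
    highest = max(values)
    groups = []
    while start <= highest:
        end = start + width
        # Lower limit included, upper limit excluded: no overlap, no gaps
        count = sum(1 for v in values if start <= v < end)
        groups.append((start, end, count))
        start = end
    return groups

# Hypothetical ages of 14 respondents
ages = [23, 35, 41, 18, 29, 52, 47, 33, 38, 26, 44, 59, 31, 20]

groups = grouped_frequency(ages, 10)
for start, end, count in groups:
    print(f"{start} to under {end}: {count}")
```

Every measurement is counted exactly once, so the group counts add up to the number of respondents.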
ii. Construction of Cross Tabulation
So far, we have made tables containing frequency distributions for one
variable at a time, in order to partially describe our data. Depending on
the objectives of our study, and the study type, we may have to examine
the relationship between several of our variables at the same time. For this
purpose it is appropriate to construct cross tabulation of data. Depending
on the objectives and the type of study, different kinds of cross tabulations
may be required. The examples of cross tabulation are given below. Here,
three different types of cross tabulation of data have been given.
Example 1: A study was carried out on the degree of job satisfaction among
doctors and nurses in rural and urban areas. To describe the sample, a cross
tabulation was constructed which included the sex and the residence (rural,
or urban) of the doctors and nurses interviewed. This was useful, because
in the analysis, the opinions of male and female staff had to be compared
separately for rural and urban areas.
Table 3.3: Type of teachers by residence
Interpretation: it can be concluded from Table 3.4 that there are more male
teachers serving in rural areas than females.
To obtain an overview of the distribution of principals and teachers by
gender in rural and urban areas, we can construct the following two-by-
four cross-table.
Table 3.5: Residence and sex of principals and teachers
Teaching staff        Residence                 Total
                      Rural       Urban
Principals  Males     8 (10%)     35 (21%)      43 (18%)
            Females   2 (3%)      16 (10%)      18 (7%)
Teachers    Males     46 (58%)    36 (22%)      82 (34%)
            Females   23 (29%)    77 (47%)      100 (41%)
Total                 79 (100%)   164 (100%)    243 (100%)
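The percentages in Table 3.5 are column percentages: each count is divided by the total of its column (rural or urban). A minimal sketch in Python, using the counts from the table, shows how they are obtained; the dictionary layout and variable names are assumptions made for this example.

```python
# Counts from Table 3.5: (staff type, sex) -> counts by residence
counts = {
    ("Principals", "Males"):   {"Rural": 8,  "Urban": 35},
    ("Principals", "Females"): {"Rural": 2,  "Urban": 16},
    ("Teachers", "Males"):     {"Rural": 46, "Urban": 36},
    ("Teachers", "Females"):   {"Rural": 23, "Urban": 77},
}

# Column totals for each residence category
column_totals = {
    area: sum(row[area] for row in counts.values())
    for area in ("Rural", "Urban")
}

# Column percentage of each cell, rounded to whole numbers
percentages = {
    key: {area: round(100 * row[area] / column_totals[area])
          for area in column_totals}
    for key, row in counts.items()
}

for (staff, sex), row in counts.items():
    cells = ", ".join(
        f"{area}: {row[area]} ({percentages[(staff, sex)][area]}%)"
        for area in column_totals
    )
    print(f"{staff} {sex}: {cells}")
print("Column totals:", column_totals)
```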
3. Histograms
Numerical data are often presented in histograms, which are very similar
to the bar charts which are used for categorical data. An important difference,
however, is that in a histogram the bars are connected (as long as there
is no gap between the data), whereas in a bar chart the bars are not
connected, as the different categories are distinct entities. An example of
histogram is given in Figure 3.3.
Figure 3.4: Daily number of malaria patients at the health centers in District X
Note: It is important that all figures presented in your research report have
numbers, clear titles, and are clearly labeled (or keyed).
6. Maps
In addition to the figures above, the use of maps may be considered to present
information. For instance, the area, where a study was carried out, can be
shown in a map. If the study explored the epidemiology of cholera, a map
could be produced showing the geographical distribution of cholera cases,
together with the distribution of protected water sources, thus illustrating
that there is an association. If the study related to vaccination coverage,
a map could be developed to indicate the clinic sites and the vaccination
coverage among under-fives in each village, perhaps showing that home-
clinic distance is an important factor associated with vaccination status.
In this section, we discussed the tabulation and interpretation of data.
Now answer the questions given in Check Your Progress - 2.
Check Your Progress-2
Note: a) Write your answer in about 50 words.
b) Check your answer with possible answers given at the end of
the unit
1. What is meant by coding of data?
.................................................................................................................
.................................................................................................................
.................................................................................................................
2. What is a pie chart and where is it used?
.................................................................................................................
.................................................................................................................
.................................................................................................................
3.5 KEYWORDS
Data Measurement: measurement is the process of observing and recording
the observations that are collected as part of a research effort.
Type of Data: broadly, there are two types of data, (i) quantitative and
(ii) qualitative, which can be further classified as categorical, nominal and
continuous data.
Data Quality: quality data can be characterized as (i) precise, (ii)
unambiguous, (iii) free from errors, (iv) valid, (v) reliable, and (vi)
practical.
Data Processing: means the generation of frequency distribution and cross
tabulation and calculation of other statistical measures.
Frequency Distribution: preparation of tables which distribute respondents
according to a particular characteristic of sample, or research outcome.
Cross Tabulation: this is a process of generating tables giving the outcome
of interest in columns, and various characteristics of respondents, or factors
affecting outcomes in rows.
Data Interpretation: is drawing valid and meaningful conclusions from the
tables generated with the help of collected data.
Report Preparation: is the process of documenting the entire research
effort, whether it was conducted to identify a problem, to establish
relationships, or to demonstrate the success of programme-related activities.
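The keywords Frequency Distribution and Cross Tabulation can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the respondent records, the category labels, and the helper name cell are hypothetical and serve only to show how the two kinds of table are built.

```python
from collections import Counter

# Hypothetical respondent records: (characteristic, outcome). Illustrative only.
respondents = [
    ("female", "vaccinated"),
    ("female", "not vaccinated"),
    ("male", "vaccinated"),
    ("male", "vaccinated"),
    ("female", "vaccinated"),
    ("male", "not vaccinated"),
]

# Frequency distribution: respondents counted by a single characteristic.
frequency = Counter(sex for sex, _ in respondents)

# Cross tabulation: characteristic in rows, outcome of interest in columns.
cross_tab = Counter(respondents)
rows = sorted({sex for sex, _ in respondents})
columns = sorted({outcome for _, outcome in respondents})


def cell(row, column):
    """Return the number of respondents in one cell of the cross table."""
    return cross_tab[(row, column)]


print("Frequency distribution:", dict(frequency))
print("".ljust(10) + "".join(c.ljust(16) for c in columns))
for r in rows:
    print(r.ljust(10) + "".join(str(cell(r, c)).ljust(16) for c in columns))
```

The frequency distribution uses one variable at a time, whereas the cross table counts each combination of two variables, which is what makes it useful for examining possible associations.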
UNIT 4 REPORT WRITING
Structure
4.1 Introduction
4.2 Types of Report
4.3 Writing the Research Report
4.4 The Preliminary Pages of Research Report
4.5 Main Components or Chaptering the Research Report
4.6 Style and Layout of the Report
4.7 Common Weaknesses in Report Writing and Finalizing the Text
4.8 Let Us Sum Up
4.9 References and Selected Readings
4.10 Check Your Progress – Possible Answers
4.1 INTRODUCTION
A research report is a major component of any research study, as the research
remains incomplete until the report has been written and presented. No matter
how good a research study is, and how meticulously it has been conducted, its
findings are of little value unless they are effectively documented and
communicated to others. The research results must invariably enter the general
store of knowledge. Writing a report is the last step in a research study and
requires a set of skills somewhat different from those called for in actually
conducting the research.
After reading this unit you will be able to:
describe the various steps involved in writing a research report;
explain the various components of a research report; and
identify common mistakes committed while writing a research report.
The cover page should contain the title, the names of the authors with their
designations, the institution that is publishing the report with its logo (e.g.,
Health Systems Research Unit, Ministry of Health), and the month and year
of publication. The title could consist of a challenging statement or question,
followed by an informative subtitle covering the content of the study and
indicating the area where the study was implemented. However, this is only
a suggestion and should not be considered a standard. It is advisable to have
the cover page designed by an expert in computer graphics, who may be asked
to include in the background an important photograph related to the identity
of the organization, the problem under study, or the field. Design software
may be used. An example of a title of a research report is given in the box
below.
(ii) Foreword
Foreword
(iii) Preface
Preface
The present study was conducted in the three states of Bihar, Uttar
Pradesh, and Punjab to study various aspects of labour migration and
its impact on the rural economy in the Indo-Gangetic plains of India.
The study focused on labour out-migration from two states of the
Indo-Gangetic region and in-migration into Punjab. The results of this
study would help researchers, policy makers and planners, as well as
development agencies, in addressing various issues of labour migration
and its implications in India.
(iv) Acknowledgements
Acknowledgements
I take this opportunity to thank the (Name of the funding agency) for
providing funds and facilities for the project. I offer my sincere thanks
to the (Name of your employer) for his encouragement and support
for pursuing this study. I am also grateful to the head, (Name of your
department) for providing all needed support, encouragement, and
technical guidance. All the Research Associates, Senior Research
Fellows and technical assistants working under the project deserve
special appreciation for their hard work and sincere efforts in
completing this project.
Contents
S. No. Contents Pages
1 Introduction
2 Review of Literature
3 Methodology
3.1 Data
3.2 Analytical Tools
3.3 Profile of Area Under Study
4 Research Findings
4.1 Macro-Level Evidence
4.2 Evidence from Field Survey
5 Discussion
6 Conclusions and Policy Implications
7 References
Appendix
Review of Literature
Singh (2008) conducted a study on labour out-migration from the
Indo-Gangetic plains of India. The study provides sufficient evidence of
the effect of male out-migration on the rural economy of the
Indo-Gangetic plains of India. Male out-migration has resulted in gender
role reversal in terms of decision making on important household and
farm issues. Besides, the women of the migrant households had to take
up many male-specific activities, like land preparation, seed selection,
broadcasting, irrigation, and herbicide application. The study also
showed that the crop returns of non-migrant households were significantly
higher than those of migrant households for both rice and wheat
cultivation. The technical, allocative and economic efficiencies of
non-migrant households were also much higher than those of migrant
households in both rice and wheat cultivation.
Methodology
3.1 Data Collection / Sample
A micro-level study based on primary cross-sectional data was designed
to attain the objectives of this project. The survey was conducted in
three states: Bihar, Uttar Pradesh and Punjab. A systematic interview
schedule was used to collect information on various aspects of labour
migration and its impact on the rural economy of the Indo-Gangetic
plains of India. Data were collected from 200 families with migrant
members and 200 families without migrant members.
3.2 Analytical Tools
Various statistical tools were used in the analysis of the data, namely
mean, standard deviation, correlation, t-test, and regression.
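The analytical tools named in the sample methodology above can each be computed in a few lines of Python. The following is a minimal sketch under stated assumptions: all figures are invented for illustration and are not survey data, the helper names are hypothetical, and Welch's form of the t statistic is assumed because the sample text does not say which t-test was applied.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical per-household values, for illustration only (not survey data).
migrant_yield = [3.1, 2.8, 3.4, 3.0, 2.7]      # crop yield, t/ha
non_migrant_yield = [3.6, 3.9, 3.3, 3.8, 3.5]  # crop yield, t/ha
farm_size = [1.0, 1.5, 2.0, 2.5, 3.0]          # ha, non-migrant households


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def welch_t(x, y):
    """Welch's t statistic for the difference between two sample means."""
    se = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / se


def least_squares(x, y):
    """Slope and intercept of the simple regression line y = a + b*x."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = cov / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


print("Mean yield (migrant):", round(mean(migrant_yield), 2))
print("Std. deviation (migrant):", round(stdev(migrant_yield), 2))
print("Correlation (farm size, yield):",
      round(pearson_r(farm_size, non_migrant_yield), 2))
print("t statistic (migrant vs non-migrant):",
      round(welch_t(migrant_yield, non_migrant_yield), 2))
print("Regression (slope, intercept):",
      least_squares(farm_size, non_migrant_yield))
```

In an actual report, the t statistic would be compared with the critical value for the chosen significance level, and the methodology section should state which variant of each test was used.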
Include only those tables and figures that present main findings and need more
elaborate discussion in the text. Others may be put in annexes, or, if they don’t
reveal interesting points, be omitted.
It is advisable to involve a statistician/data analyst from the very beginning,
at each stage of the research, so that he/she may provide meaningful tables and
help remove irrelevant findings.
Note: Never start writing without an outline. Make sure that all
sections carry the headings and numbers consistent with the outline
before they are word-processed. Have the outline visible on the wall
so that everyone will be aware immediately of any additions or
changes, and of progress made.