Assessment Handout
PROFESSIONAL EDUCATION
BASIC CONCEPTS
Test – An instrument designed to measure any quality, ability, skill, or knowledge.
It is composed of test items covering the area it is designed to measure.
Measurement – A broader term than test, because there are ways of measuring other than by a test (e.g., observation, use of checklists, and rating scales).
A process of quantifying the degree to which someone/something possesses a given trait,
quality, characteristic or feature.
Process of determining the degree and boundaries of specific traits and characteristics
being assessed.
Process of assigning a numerical value to the trait or characteristic in question.
Aspect of evaluation that tells us “how much” or “how often”.
Assessment – A broader term than measurement; it involves interpreting or placing such information in context.
A process of gathering and organizing data into an interpretable form to have a basis for
decision-making.
It is a prerequisite to evaluation. It provides the information which enables evaluation to
take place.
Evaluation – A process of systematic collection and analysis of both qualitative and quantitative data in order to make some judgment or decision.
It involves judgment about the desirability of changes in students as a result or manifestation that learning has taken place.
Process of measuring a range of student attributes, abilities, and interests and of making professional judgments based on the results of measurements.
Involves collecting data from a variety of sources, forming opinions, and making comparisons with which to guide students and others in educational and career decisions.
Process of summing up the results of measurements or tests and giving them some meaning based on value judgment (Hopkins, 1981).
Types of Assessment
1. Traditional Assessment – It refers to the pen-and-paper mode of assessing any quality, ability, skill, or knowledge (e.g., standardized and teacher-made tests).
2. Authentic Assessment – It refers to alternative modes of assessing learning through actual demonstration of skills or creation of products of learning (e.g., performance-based and portfolio assessment).
Purposes of Classroom Assessment
1. Assessment FOR Learning – this includes three types of assessment done before or during
instruction.
a. Placement – done prior to instruction
Its purpose is to assess the needs of the learners in order to have a basis for planning relevant instruction.
Teachers use this assessment to know what their students are bringing into the learning
situation and use this as a starting point for instruction.
The results of this assessment place students in specific learning groups to facilitate teaching and learning.
b. Formative – done during instruction
Its purpose is to monitor the students' learning progress, provide immediate feedback, and serve as a basis for improving or adjusting instruction.
c. Diagnostic – done during instruction
Its purpose is to identify the students' strengths, weaknesses, and recurring learning difficulties, and to search for their underlying causes.
2. Assessment OF Learning – this is done after instruction. This is usually referred to as the
SUMMATIVE ASSESSMENT.
It is used to certify what students know and can do within a level of proficiency or competency.
Its results reveal whether or not instruction has successfully achieved the curricular outcomes.
The information from assessment of learning is usually expressed as marks or letter grades, and the results are communicated to the students, parents, and other stakeholders for decision-making.
It is also a powerful factor that could pave the way for educational reforms.
3. Assessment AS Learning – this is done for teachers to understand and perform well their role of assessing FOR and OF learning. It requires teachers to undergo training on how to assess learning and to be equipped with the competencies needed in performing their work as assessors.
TYPES OF TESTS
According to what it measures:
Educational test – measures the results of instruction.
Example: Achievement test – measures what the students have achieved at the end of the instruction.
Psychological test – measures the intangible aspects of an individual.
Examples: Aptitude test – measures the area where the students will likely succeed.
Personality test – measures the students' personality traits.
1. Selective Test
a. Multiple Choice – consists of a stem that presents a problem and a list of options or alternatives from which the correct or best answer is selected.
b. Alternative Response – consists of declarative statements that one has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises from which a match is sought, and Column B, the column of responses from which the selection is made.
2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, a phrase, a number, or
a symbol.
b. Completion Test – consists of an incomplete statement.
3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the topic.
b. Extended Response – allows the students to select any factual information that they think is
pertinent and to organize their answers in accordance with their best judgment.
PERFORMANCE-BASED ASSESSMENT
Performance-based assessment is a process of gathering information about students' learning through actual demonstration of essential and observable skills and creation of products that are grounded in real-world contexts and constraints. It is an assessment that is open to many possible answers and is judged using multiple criteria or standards of excellence that are pre-specified and public.
1. Develop a scoring rubric reflecting the criteria, levels of performance and the scores.
PORTFOLIO ASSESSMENT
Types of Portfolios
1. The working portfolio is a collection of a student’s day-to-day works which reflect his/her learning.
2. The show portfolio is a collection of a student’s best works.
3. The documentary portfolio is a combination of a working and a show portfolio.
DEVELOPING RUBRICS
A rubric is a measuring instrument used in rating performance-based tasks. It is the “key to corrections” for assessment tasks designed to measure the attainment of learning competencies that
require demonstration of skills or creation of products of learning. It offers a set of guidelines or
descriptions in scoring different levels of performance or qualities of products of learning. It can be used in
scoring both the process and the products of learning.
1. Checklist
Presents the traits or characteristics of a work or performance and is marked simply to show whether each trait is observed or not.
2. Rating Scale
Measures the extent or degree to which a trait has been satisfied by one’s work or
performance
Offers an overall description of the different levels of quality of a work or a performance
Uses 3 or more levels to describe the work or performance although the most common
rating scales have 4 or 5 performance levels.
Below is a Venn Diagram that shows the graphical comparison of rubric, rating scale and
checklist.
[Venn diagram: a checklist shows the observed traits of a work or performance; a rating scale shows the degree of quality of a work or performance; a rubric encompasses features of both.]
Types of Rubrics
1. Holistic Rubric – describes the overall quality of a performance or product and yields a single score.
2. Analytic Rubric – describes the quality of a performance or product along each criterion separately, so each criterion receives its own score.
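To make this concrete, below is a minimal sketch of an analytic rubric for a short essay; the criteria, descriptors, and point values are hypothetical and are given only for illustration.
Content (4 = ideas accurate and fully developed; 3 = mostly accurate with minor gaps; 2 = several inaccuracies or thin development; 1 = largely inaccurate or off-topic)
Organization (4 = clear and logical structure throughout; 3 = mostly clear structure; 2 = weak or inconsistent structure; 1 = no evident structure)
Mechanics (4 = virtually error-free; 3 = minor errors that do not hinder meaning; 2 = frequent errors; 1 = errors impede understanding)
Highest possible score: 12. A holistic version would instead describe each of the four levels once for the essay as a whole and give a single score.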
CHARACTERISTICS OF A GOOD TEST
1. VALIDITY
Validity refers to the degree to which a test measures what it intends to measure.
1.1. Rational Validity – This depends upon professional judgment alone, i.e., the judgment of competent teachers, usually three or more experts in the field.
1.1.1. Content / Curricular Validity – Validity established by comparing the content of the test with a particular type of curriculum, textbook, course of study, or outline.
Example:
A teacher made a test in biology. Her test has curricular validity if the content is biology and
not history or geography.
1.1.2. Concept / Construct Validity – Validity established by analyzing the activities and
processes that correspond to a particular concept.
Example:
Analysis of the scientific method, of critical thinking, and efficient skill in writing.
1.2. Statistical / Empirical / Criterion-Related Validity – Validity established by correlating the
results of test with an outside criterion or against an outside valid criterion.
1.2.1. Congruent Validity – Validity established when a test is correlated with an existing measure that has a similar function.
Example:
A group intelligence test is valid if it correlates reasonably well with another intelligence test of known high validity, such as the Otis intelligence test.
1.2.2. Concurrent Validity – Validity established by correlating the test with another measure obtained at about the same time.
Example:
Correlate the reading test results with the pupils’ average grades in reading given by the teacher.
1.2.3. Predictive Validity – Validity established by correlating the test with another measure that can foretell later success in school, on the job, or in life.
Example:
The entrance examination scores of a freshman class at the beginning of the school year are correlated with their average grades at the end of the school year (a short computation sketch is given at the end of this validity section).
1.3. Logical and Psychological Validity – Validity is established through subjective analysis of the
test by experts in the field. This is usually done if the test cannot be statistically measured.
Example: Artistic works.
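To make criterion-related validity concrete, here is a minimal sketch (not part of the original handout) of how a validity coefficient can be computed as a Pearson correlation between test scores and an outside criterion; the score lists and names used here are invented purely for illustration.

# Minimal sketch: estimating predictive validity as the Pearson correlation
# between entrance-exam scores and end-of-year average grades.
# All data below are invented solely for illustration.
import math

entrance_scores = [78, 85, 62, 90, 71, 88, 67, 95, 80, 74]   # test being validated
final_grades    = [81, 88, 70, 92, 75, 86, 72, 94, 83, 77]   # outside criterion

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

print(f"Validity coefficient (Pearson r): {pearson_r(entrance_scores, final_grades):.2f}")

The closer the coefficient is to 1.00, the stronger the evidence that the entrance examination predicts later achievement; a coefficient near 0 would indicate little predictive validity.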
2. RELIABILITY
Reliability refers to consistency and accuracy of test results, the degree to which two or more forms
of the test will yield the same results under uniform conditions.
Increasing the length of the test may raise the reliability of the test. Clear and concise directions
would also increase the reliability of the test.
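As a standard point of reference (not stated in the handout itself), the effect of lengthening a test on reliability is commonly estimated with the Spearman-Brown prophecy formula, where r_11 is the reliability of the original test and k is the factor by which the test is lengthened:

$$ r_{kk} = \frac{k\,r_{11}}{1 + (k-1)\,r_{11}} $$

For example, doubling a test whose reliability is 0.60 gives 2(0.60) / (1 + 0.60) = 0.75.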
4. Split-Half (measure of internal consistency) – Give the test once, then correlate the scores obtained on the two halves of the test. Statistical tool: Spearman-Brown Formula.
5. Kuder-Richardson (measure of internal consistency) – Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. Statistical tool: Kuder-Richardson Formula 20 and 21.
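For reference, the standard forms of the formulas named above are sketched here (the symbols are defined in this note, not in the handout): r_hh is the correlation between the two half-tests, k is the number of items, p_i is the proportion of students passing item i, q_i = 1 - p_i, X-bar is the mean total score, and sigma squared is the variance of the total scores.

$$ r_{tt} = \frac{2\,r_{hh}}{1 + r_{hh}} \qquad \text{(Spearman-Brown correction for the split-half method)} $$
$$ KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum p_i q_i}{\sigma^2}\right) \qquad KR_{21} = \frac{k}{k-1}\left(1 - \frac{\bar{X}\,(k - \bar{X})}{k\,\sigma^2}\right) $$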
3. OBJECTIVITY
The degree to which no personal judgment, opinion, or bias will affect the scoring of the test. This
can be secured by wording the statements of items in the test in such a way that only one answer is
possible.
* A test should be such that different teachers can similarly score the test and arrive at the same
scores. In other words, the more objective the test is, the greater is its reliability.
4. USABILITY / PRACTICABILITY
The degree to which a test can be used by teachers and administrators without unnecessary waste
of time, money, and effort.
It involves the following factors:
Specific Suggestions
2. Matching Type
a. Use only homogenous material in a single matching exercise.
b. Include an unequal number of responses and premises, and instruct the pupils that
responses may be used once, more than once or not at all.
c. Keep the list of items to be matched brief, and place the shorter responses at the right.
d. Arrange the list of responses in logical order.
e. Indicate in the directions the basis for matching the responses and premises.
f. Place all the items for one matching exercise on the same page.
3. Multiple- Choice
a. The stem of the item should be meaningful by itself and should present a definite
problem.
b. The stem should be free from irrelevant material.
c. Use a negatively stated stem only when significant learning outcomes require it.
d. Highlight negative words in the stem for emphasis.
e. All the alternatives should be grammatically consistent with the stem of the item.
f. An item should have only one correct or clearly best answer.
g. Items used to measure understanding should contain some novelty, but not too much.
h. All distracters should be plausible.
i. Verbal associations between the stem and the correct answer should be avoided.
j. The relative length of the alternatives should not provide a clue to the answer.
k. The alternatives should be arranged logically.
l. The correct answer should appear in each answer position approximately an equal number of times, but in random order.
m. Use of special alternatives such as “none of the above” or “all of the above” should be
done sparingly.
n. Do not use multiple choice items when other types are more appropriate.
o. Always have the stem and alternatives on the same page.
p. Break any of these rules when you have a good reason for doing so.