The International Journal of Conformity Assessment | 2022, Volume 1, Issue 1
be considered an incorrect answer (D). For each
question, examiners should count the number of
total test takers that obtained marks in the A, B, C,
and D categories.
Two indices, the facility value (FV) and the discrimination index (DI), are calculated using the following formulas.
Facility Value (FV): This is the percentage of the group answering a question correctly. Facility value, also called a difficulty index, measures a question's level of ease or difficulty. The higher the FV, the easier the question.
It is denoted as:

FV = [(HAG + LAG) / N] × 100

where:
HAG = number of test takers in the high-ability group who answered the question correctly
LAG = number of test takers in the low-ability group who answered the question correctly
N = total number of considered test takers

The FV value is expressed as a percentage. Its range is 0-100. Its recommended value is 45-60, and its acceptable value is 25-75.

Discrimination Index (DI): This index indicates the ability of a question to discriminate between test takers with higher and lower abilities.

It is denoted as:

DI = (HAG - LAG) / (N / 2)

The DI value is expressed as a fraction, and its value can extend from -1.00 to +1.00. Its maximum value of +1.00 indicates an ideal question with perfect discrimination between the high-ability group (HAG) and the low-ability group (LAG). A negative value, also called negative discrimination, means that more test takers in the lower group answer an item correctly than test takers in the higher group.

Recommended value: > 0.25
Acceptable with revision: 0.15-0.25
Discard the question: < 0.15

This item analysis helps to detect specific technical flaws in the questions and provides information for improvement. It also strengthens the item-writing skills of examiners. There are no clear-cut guidelines for formulating the item analysis; however, regular practice of this analysis would contribute to a personnel certification body's formulation of appropriate questions.

Step 3. Develop the Exam
After the job analysis survey is evaluated, the results are used to develop valid certification exams. Specifications for certification exams are based on the results of the job analysis and reflect how often a task, knowledge, skill, or ability is needed in practice and how much impact it has on effective job performance.

Step 4. Establish the Passing (Cut) Score
The cut score is defined as the minimum score required to pass an exam. Defining the cut score required for certification is one of the most important but difficult aspects of the validation process.

Setting the Passing (Cut) Score of an Exam
Standard setting is the process used to select a passing score for an exam. Of all the steps in the test development process, the standard-setting phase may be the one most like art rather than science; while statistical methods are often used in conducting a standard setting, the process is also greatly influenced by judgment and policy.
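Returning to the item analysis indices defined earlier, the FV and DI calculations can be sketched in a few lines of Python; the group sizes and correct-answer counts below are hypothetical, invented purely for illustration.

```python
def facility_value(hag_correct: int, lag_correct: int, n: int) -> float:
    """Facility value (FV): percentage of the combined high- and
    low-ability groups answering the item correctly (range 0-100)."""
    return (hag_correct + lag_correct) / n * 100

def discrimination_index(hag_correct: int, lag_correct: int, n: int) -> float:
    """Discrimination index (DI): difference between high- and low-ability
    correct counts, scaled by half the total (range -1.00 to +1.00)."""
    return (hag_correct - lag_correct) / (n / 2)

# Hypothetical item: 20 test takers split into two ability groups of 10.
fv = facility_value(hag_correct=9, lag_correct=3, n=20)        # 60.0, within the recommended 45-60 band
di = discrimination_index(hag_correct=9, lag_correct=3, n=20)  # 0.6, above the 0.25 threshold: keep the item
```

An item with `di` below 0.15 would instead be discarded under the thresholds given above.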
The passing score (also known as the passing
point, cutoff score, or cut-score) is used to classify
examinees as either masters or non-masters. An examinee's score must be equal to or greater than the passing point in order for the examinee to be classified as a master, that is, to pass the test. If an
examinee is misclassified, that is referred to as a
classification error.
Typically, the passing score is set at a score point
on the exam that the judges determine reflects the
minimum level of competency to protect the public
from harm or to provide minimal competency at the
occupational level being assessed. For the standard
setting to be conducted successfully, the panel
of judges should be carefully selected and then
thoroughly prepared and trained for their task.
There are a number of approaches to standard
setting, including informed judgment, conjectural,
and contrasting groups methods. All of these
methods require the insight of a representative panel
of competent practitioners representing appropriate
demographics and experience, ranging from those
who have recently entered the profession to those
who have competently practiced for many years.
The passing score for a test should be set in accordance with the purposes of the exam and with consideration of the relative risks to the public from incompetent practice. It should not be set arbitrarily, but rather should be carefully determined by a panel of judges who are familiar with the content of the exam as well as the characteristics of the occupation concerned.
Types of Classification Error: Two types of classification errors can occur when the passing score is applied.
One type of misclassification is termed a false-positive (i.e., an error of acceptance). An example
of a false-positive error would be an examinee who
was not minimally competent, but who passed the
test.
The second type of misclassification is termed a
false-negative (i.e., an error of rejection). In this type
of misclassification, an examinee who actually has
the level of competence fails the test.
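To make the two error types concrete, the following Python sketch classifies examinees against a cut score and tallies each kind of misclassification; the scores, competence labels, and cut score are hypothetical.

```python
# Hypothetical (score, truly_competent) pairs and a hypothetical cut score.
examinees = [(85, True), (78, True), (76, False), (65, False), (74, True)]
cut_score = 75

# False-positive: passed the test but is not minimally competent.
false_positives = sum(1 for score, competent in examinees
                      if score >= cut_score and not competent)

# False-negative: minimally competent but failed the test.
false_negatives = sum(1 for score, competent in examinees
                      if score < cut_score and competent)

# Here the examinee scoring 76 is a false-positive and the one
# scoring 74 is a false-negative: one error of each type.
```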
Depending upon the nature of the exam program,
one of these types of errors may be far more
problematic than the other. Awareness of these
potential consequences may be used to influence
the determination of the final passing score, after
the panel of judges has made its recommendation.
Policymakers of the exam program may adjust that
recommended passing point based on other factors,
and possibly include operational test score data
when it becomes available.
Methods for Standard Setting
Informed Judgment Method: The informed judgment
method is a test-based approach. A panel of judges,
or stakeholders, reviews the overall test and its
content. Based on their holistic reviews, the judges
then each suggest a percentage of items on the test
that ought to be correctly answered by a minimally
competent examinee. This percent-correct score
on the total test can be viewed as each judge’s
recommended passing score. These recommended
passing scores from the panel, along with possible
additional information, can be used to set the final
passing score. The informed judgment method
might be difficult to rationally defend when it is used
in isolation. However, it may be a very appropriate
method for use in combination with other methods,
particularly the contrasting groups method.
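The arithmetic behind the informed judgment method is simply an average of the judges' holistic recommendations; a minimal Python sketch, with hypothetical panel ratings:

```python
# Hypothetical percent-correct recommendations from a five-judge panel,
# each judge's holistic estimate for a minimally competent examinee.
judge_recommendations = [70, 65, 75, 68, 72]

# The panel's recommended passing score is the mean recommendation.
passing_score_percent = sum(judge_recommendations) / len(judge_recommendations)  # 70.0
```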
Conjectural (Modified-Angoff) Method: The
modified-Angoff method is the most commonly
used of the conjectural methods, all of which are
item-based approaches to standard setting. A
panel of judges is assembled and asked to review
the test, one item at a time. For each item, each
judge gives an estimate of the probability that a
minimally competent examinee would be likely
to respond correctly. (Alternatively, the judges
may be asked to imagine a hypothetical group
of minimally competent examinees and then to
indicate the percentage of the group that would be
likely to respond to the given item correctly.) When
judges are not in agreement regarding the pass/fail standard, those with disparate ratings are given the opportunity to explain their reasoning, and the rating process is repeated to build consensus; typically, one or more additional rounds of review are undertaken. Each judge's item estimates are then summed to yield that judge's recommended passing score, and these per-judge scores are averaged to arrive at the full panel's recommended final passing score.
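The modified-Angoff arithmetic described above can be sketched as follows; the per-item probability estimates are hypothetical, invented for illustration.

```python
# Hypothetical ratings: each row is one judge's probability estimates that
# a minimally competent examinee answers each of 5 items correctly.
ratings = [
    [0.9, 0.6, 0.7, 0.5, 0.8],  # judge 1
    [0.8, 0.5, 0.7, 0.6, 0.9],  # judge 2
    [0.9, 0.7, 0.6, 0.5, 0.8],  # judge 3
]

# Each judge's expected raw score is the sum of their item estimates.
judge_scores = [sum(judge) for judge in ratings]

# The panel's recommended passing score is the mean of the judges' sums,
# here approximately 3.5 of 5 items (i.e., 70% correct).
passing_score = sum(judge_scores) / len(judge_scores)
```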
Contrasting Groups Method: The contrasting groups
method is an examinee-based approach to standard
setting. This method in particular requires that the
panel of judges be highly familiar with the target
test population. The panel of judges identifies a
set of examinees who are clearly non-masters
and another set of examinees who are clearly
masters; borderline examinees are not included.
It is especially important that the non-masters be
carefully selected. While the non-master examinees
would not yet be considered minimally competent
in the occupational area, they should nevertheless
be members of the target test population. If, instead,
the examinees identified as non-masters are
completely unknowledgeable in the exam’s content
area, the passing score may be set at an artificially
low point. After the two groups of examinees have
been identified, the test is administered to them.
The two resulting test score frequency distributions
are plotted on the same continuum. The passing
score can be set at the intersection point of the
two distributions; or, alternatively, the final passing
score can be adjusted somewhat, based on the
relative cost of false-positive and false-negative
classification errors. While the contrasting groups
method can be used independently, it may also be
used as a complement to the informed judgment method or another standard-setting method.
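The contrasting groups logic can be sketched in Python by scanning candidate cut scores and choosing the one with the fewest combined classification errors, which corresponds to the intersection point of the two distributions; the two score lists below are hypothetical.

```python
# Hypothetical score distributions for clearly identified masters and non-masters.
master_scores = [78, 82, 85, 88, 90, 92]
non_master_scores = [55, 60, 64, 70, 74, 80]

def total_misclassified(cut: int) -> int:
    """Combined errors at a given cut: masters who would fail
    (false-negatives) plus non-masters who would pass (false-positives)."""
    return (sum(1 for s in master_scores if s < cut)
            + sum(1 for s in non_master_scores if s >= cut))

# Scan every candidate cut score and take the one minimizing total error.
candidates = range(min(non_master_scores), max(master_scores) + 1)
passing_score = min(candidates, key=total_misclassified)  # 75
```

Weighting the two error counts differently inside `total_misclassified` would implement the adjustment, mentioned above, for the relative cost of false-positive versus false-negative errors.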
For example, consider an exam with 20 participants
that contains 45 multiple-choice questions. A list
can be created, including the descending order of
scores of experienced test takers (pictured below in
blue) and the ascending order of scores of other test