Sampling Precision and Sample Size
Because TIMSS is fundamentally a study of mathematics and science achievement among fourth and eighth grade students, the precision of survey estimates of student achievement and characteristics was of primary importance. However, TIMSS also reports extensively on school, teacher, and classroom characteristics, so it is necessary to have sufficiently large samples of schools and classes. The TIMSS standards for sampling precision require that all student samples have an effective sample size of at least 400 students for the main criterion variable, which is mathematics and science achievement. In other words, all student samples should yield sampling errors that are no greater than would be obtained from a simple random sample of 400 students.
Given that sampling error, when using simple random sampling, can be expressed as $SE_{SRS} = S / \sqrt{n}$, where $S$ gives the population standard deviation and $n$ the sample size, a simple random sample of 400 students would yield a 95 percent confidence interval for an estimate of a student-level mean of ±10 percent of its standard deviation ($1.96 \cdot S / \sqrt{400}$). Because the TIMSS achievement scale has a standard deviation of 100 points, this translates into a confidence limit of ±10 points (or a standard error estimate of approximately 5 points). Similarly, sample estimates of student-level percentages would have a confidence interval of approximately ±5 percentage points.
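The arithmetic behind these limits can be checked with a short sketch. It assumes the usual normal-approximation formulas for a mean and a proportion under simple random sampling; the function names are hypothetical, and the worst-case proportion of 0.5 is an assumption not stated in the text.

```python
import math

Z_95 = 1.96  # normal critical value for a two-sided 95 percent interval


def srs_mean_half_width(sd: float, n: int, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a mean under simple random sampling."""
    return z * sd / math.sqrt(n)


def srs_proportion_half_width(p: float, n: int, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a proportion p under simple random sampling."""
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Scale standard deviation of 100 points and n = 400, as in the text.
    print(round(srs_mean_half_width(100.0, 400), 1))  # 9.8 (points)
    # p = 0.5 maximizes p * (1 - p); this worst case is an assumption.
    print(round(100 * srs_proportion_half_width(0.5, 400), 1))  # 4.9 (percentage points)
```

Both results round to the ±10 points and ±5 percentage points quoted above.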
Notwithstanding these precision requirements, TIMSS required that all student sample sizes be no less than 4,000 students. This was necessary to ensure adequate sample sizes for analyses where the student population was broken down into many subgroups. For countries involved in the previous TIMSS cycle in 2003, this minimum student sample size was set to 5,150 students in order to compensate for participation in the TIMSS 2007 Bridging Study.
Furthermore, since TIMSS planned to conduct analyses at the school and classroom level in addition to the student level, all school sample sizes were required to be not less than 150 schools, unless a complete census failed to reach this minimum. Under simple random sampling assumptions, a sample of 150 schools yields a 95 percent confidence interval for an estimate of a school-level mean that is ±16 percent of a standard deviation.
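As a quick check of the ±16 percent figure, the same simple random sampling expression can be evaluated for 150 schools, writing $S$ for the standard deviation of the school-level variable and assuming the same 1.96 critical value as for the student-level limit:

```latex
\[
1.96 \cdot \frac{S}{\sqrt{150}} \approx 1.96 \cdot 0.0816\,S \approx 0.16\,S
\]
```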
Although the TIMSS sampling precision requirements are such that they would be satisfied by a simple random sample of 400 students, sample designs such as the TIMSS 2007 school-and-class design typically require much larger student samples to achieve the same level of precision. Because students in the same school, and even more so in the same class, tend to be more like each other than like other students in the population, sampling a single class of 30 students will yield less information per student than a random sample of students drawn from across all students in the population. TIMSS uses the intraclass correlation, a statistic indicating how much students in a group are similar on an outcome measure, and a related measure known as the design effect to adjust for this “clustering” effect in planning sample sizes.
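The size of the clustering effect can be illustrated with a minimal sketch. It assumes the common approximation of the design effect for equal-sized clusters, 1 + ρ(m − 1), where m is the cluster size and ρ the intraclass correlation. The function names are hypothetical, and ρ = 0.3 is only an illustrative value.

```python
def design_effect(cluster_size: float, rho: float) -> float:
    """Approximate design effect for equal-sized clusters: 1 + rho * (m - 1)."""
    return 1.0 + rho * (cluster_size - 1.0)


def effective_sample_size(n_students: float, cluster_size: float, rho: float) -> float:
    """Number of simple-random-sample students giving the same precision as n clustered students."""
    return n_students / design_effect(cluster_size, rho)


if __name__ == "__main__":
    rho = 0.3  # illustrative intraclass correlation (an assumption)
    # One intact class of 30 students.
    print(round(design_effect(30, rho), 2))              # 9.7
    print(round(effective_sample_size(30, 30, rho), 1))  # 3.1
```

Under these assumptions, a single class of 30 students carries roughly as much information about the population mean as three students drawn at random.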
For countries taking part in TIMSS for the first time in 2007, the following mathematical formulas were used to estimate how many schools should be sampled to achieve an acceptable level of sampling precision:
\[
Var_{PPS} = Deff \cdot Var_{SRS} = Deff \cdot \frac{S^2}{n} \cong \left[ 1 + \rho\,(mcs - 1) \right] \cdot \frac{S^2}{n} \cong \left[ 1 + \rho\,(mcs - 1) \right] \cdot \frac{S^2}{a \cdot mcs}
\]
where $Deff$ is a compensation factor for using a sample selection method that differs from a simple random sample (also called design effect), $S^2$ gives the variance of the population, $\rho$ measures the intraclass correlation between clusters, $mcs$ corresponds to the average number of sampled students per class, and $a$ gives the number of schools to sample.
Incorporating the precision requirements described earlier into this equation, which translates into $Var_{PPS} = (0.05)^2 \cdot S^2$, gives the number of schools required as:
\[
a = 400 \cdot \frac{1 + \rho\,(mcs - 1)}{mcs} \tag{1}
\]
For planning purposes, the intraclass correlation coefficient usually was set to 0.3 if no other information was available. For example, with an $mcs$ of 20 students and an intraclass correlation $\rho$ of 0.3, equation (1) gives 134 schools.
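Equation (1) is simple enough to evaluate directly. The sketch below reproduces the worked example and shows how the result changes with class size. The function name and the alternative class sizes are hypothetical. The default of 400 is the effective sample size required by the precision standard.

```python
def schools_required(mcs: float, rho: float, effective_n: float = 400.0) -> float:
    """Number of schools from equation (1): effective_n * [1 + rho * (mcs - 1)] / mcs."""
    return effective_n * (1.0 + rho * (mcs - 1.0)) / mcs


if __name__ == "__main__":
    # Worked example from the text: 20 sampled students per class, rho = 0.3.
    print(round(schools_required(20, 0.3)))  # 134

    # Illustrative class sizes (assumptions) at the planning value rho = 0.3.
    for mcs in (15, 20, 25, 30):
        print(mcs, round(schools_required(mcs, 0.3), 1))
```

With ρ fixed, larger classes reduce the number of schools only slowly, because each additional student in an already sampled class adds little independent information.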
Equation (1) is a model for determining how many schools were required for the TIMSS 2007 sample under the assumption that the standard error of the criterion variable (student mathematics and science achievement) reflects only sampling variance, which is the usual situation in sample surveys. However, because of its complex matrix-sampling assessment design, standard errors in TIMSS include an imputation error component in addition to the usual sampling error component (see Chapter 11). To keep the standard error within the prescribed precision limits, the number of schools determined by equation (1) has to be increased, as shown in equation (2):
(2) $a_{irt} = (400 \cdot 0.5) / mcs$