Reliability as the degree of accuracy with which a characteristic is measured.

More on the subject of quality criteria

  • objectivity
  • validity


The reliability of a measurement procedure is defined as the degree of accuracy with which a characteristic is measured. A measurement is considered reliable if the value obtained contains only a small amount of error, regardless of whether the test actually measures what it claims to measure. (That second question is a matter of validity.)

Deficiencies in reliability

The following measurement deficiencies can reduce reliability:

  1. Deficiencies in instrumental consistency
  2. Deficiencies in the constancy of characteristics
  3. Deficiencies in the constancy of conditions

1. Deficiencies in instrumental consistency

Deficiencies in instrumental consistency are errors that either lie in the measuring device itself or arise from incorrect operation of the device.

  • Errors in the measuring device (measurement in the narrower sense, e.g. missing calibration, errors in lactate measuring devices, manual vs. electronic timing)
  • Errors in operating the device (measurement in the broader sense, e.g. incorrect use of a stopwatch, errors in the evaluation)

2. Deficiencies in the constancy of characteristics

Deficiencies in the constancy of characteristics are most pronounced when athletes or test persons do not achieve approximately the same result in repeated measurements.
For example, when one athlete runs several 10 m sprints, the measured time is never exactly the same on every run, even under constant external conditions. The question is then: which time corresponds to the true value?

Note: The more demanding the task in terms of coordination, the greater the deficiency in the constancy of characteristics (example: basketball free throws vs. sprint performance).

Note further: The higher the athlete's skill level, the smaller the deficiencies in the constancy of characteristics (the constancy of characteristics increases).

3. Deficiencies in the constancy of conditions

If external conditions change, this almost always distorts the measurement results. This is referred to as a fluctuation of conditions (material-specific, environment-specific, psychophysical).

Examples:

  • Ball throw leather vs. rubber
  • Jumping power on sprung floor vs. asphalt
  • Running on tartan or asphalt
  • Fitness test at different temperatures or wind conditions

Reliability values for practice

To work with sufficiently reliable data, the following minimum values of the reliability coefficient r are recommended in practice. At these values the measurement error is still within an acceptable range.

  • r ≥ .50 for group comparisons
  • r ≥ .70 in research generally
  • r ≥ .90 for individual diagnosis
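
The recommended values above can be written down as a small lookup. This is only an illustrative sketch: the function name `acceptable_uses` and the dictionary layout are hypothetical, and only the three thresholds come from the text.

```python
# Hypothetical helper; only the three threshold values are taken from the text.
THRESHOLDS = {
    "group comparison": 0.50,
    "research": 0.70,
    "individual diagnosis": 0.90,
}

def acceptable_uses(r):
    """Return the purposes for which a reliability coefficient r is sufficient."""
    return [purpose for purpose, minimum in THRESHOLDS.items() if r >= minimum]

# A coefficient of .82 is enough for group comparisons and research,
# but not for individual diagnosis.
print(acceptable_uses(0.82))
```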

Methods for determining reliability

The following methods are used in practice to determine reliability:

  1. Retest method (test takers complete the same test twice under the same conditions)
  2. Parallel test method (the raw scores of two tests are correlated with each other)
  3. Test halving method (a test is divided into two equivalent halves, and the two halves are correlated with each other)
  4. Consistency analysis (a test is carried out once on a sample and broken down into as many parts as there are items; the items are then correlated with one another)

1. Retest method

A test and its retest are carried out at different times under identical conditions. Changing the experimenter between the two makes it possible to assess objectivity and reliability at the same time.

  • Question: How much time should elapse between the two tests?
  • Problem: In a retest, experience gained in the first test can influence the result (e.g. learning effects, practice effects, but also fatigue and motivation effects).
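
As a minimal sketch of how a retest coefficient could be computed, the following correlates the results of a test and its retest. It assumes that the Pearson product-moment correlation is used as the reliability coefficient, which the text does not state explicitly for this method. The sprint times and the names `pearson_r`, `test`, `retest` and `r_tt` are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Invented 10 m sprint times in seconds for five athletes (not real data).
test = [1.82, 1.95, 1.76, 2.04, 1.88]
retest = [1.80, 1.97, 1.79, 2.01, 1.90]

r_tt = pearson_r(test, retest)  # retest reliability estimate
print(round(r_tt, 2))
```

A correlation of this kind only reflects how well the rank order and spacing of the athletes are preserved; a uniform shift of all results between the two dates (for example through a learning effect) would not lower it.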

2. Parallel test method

Two different tests with the same aim (identical scope) are carried out on the same sample, and their results are correlated (parallel test reliability).

Examples:

  • Crouch start vs. standing start
  • Medicine ball throw vs. medicine ball put

Note: Not all tests can be regarded as parallel tests.

3. Test halving method

The test halving method (also known as the split-half method) requires that the test can be divided into two equivalent halves (e.g. 20 free throws from the free throw line in basketball).

In some tests, halving is not possible (e.g. squats).

The scores within each half are summed, and the two half scores are correlated with each other.

Options for halving the test:

  • Halving by even and odd item numbers
  • Halving at random
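
The odd/even halving can be sketched as follows, assuming each athlete's attempts are recorded as 1 (hit) or 0 (miss) and that the Pearson correlation is used. The data are invented and use 10 attempts per athlete to keep the example short. The Spearman-Brown step is not mentioned in the text; it is a commonly used correction, added here as an assumption, that estimates the reliability of the full-length test from the correlation of its two halves.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half(scores):
    """scores: one list per athlete, one 0/1 entry per attempt."""
    odd = [sum(person[0::2]) for person in scores]   # attempts 1, 3, 5, ...
    even = [sum(person[1::2]) for person in scores]  # attempts 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    # Spearman-Brown correction (assumption, not taken from the text).
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Invented free-throw records for five athletes (1 = hit, 0 = miss).
free_throws = [
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
]

r_half, r_full = split_half(free_throws)
print(round(r_half, 2), round(r_full, 2))
```

Random halving would work the same way, except that the attempt indices for each half are drawn at random instead of being taken alternately.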

4. Consistency analysis

In consistency analysis, the test is broken down into as many parts as there are tasks. The measure of internal consistency is Cronbach's alpha coefficient.
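
Cronbach's alpha is usually computed as alpha = k / (k - 1) * (1 - sum of the item variances / variance of the total score), where k is the number of tasks. The sketch below applies this standard formula to invented scores; the function name `cronbach_alpha` and the data layout (one row per person, one column per task) are assumptions made for illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per person, one column per task (item)."""
    k = len(scores[0])                      # number of tasks
    items = list(zip(*scores))              # transpose: one tuple per task
    item_var = sum(variance(item) for item in items)
    total_var = variance([sum(person) for person in scores])
    return k / (k - 1) * (1 - item_var / total_var)

# Invented scores of five persons on four tasks (not real data).
scores = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

The sample variance is used for both the items and the total score; using the population variance for both would give the same alpha, since the factor cancels in the ratio.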