Psychometric Theories: TCT And IRT

Patient evaluation is often strongly influenced by the results of the tests administered. But how are these tests constructed, and how can their effectiveness be verified?

Tests are psychological assessment tools. Just as we use a meter to measure length, we can use a test to measure a person’s intelligence, memory or attention. However, tests are not easy to design or to apply, and psychometric theories deal with precisely this problem.

Just as a single measurement does not let us precisely determine the volume of an object, a single test does not let us formulate a precise diagnosis or propose an exact intervention. Tests are therefore useful assessment tools, but they are not decisive on their own.

This is where the psychologist comes in, whose task is to combine the test data with information from other sources to reach a more precise assessment. In other words, the specialist has to integrate the results from the different sources, which requires both training and years of experience.

Brief history of psychometric theories

The origin of psychometric tests is traced back to the examinations that Chinese emperors administered to officials in their service around 3000 BC. The aim was to assess their professional skills (1).

Modern tests have more recent origins, such as those carried out by Galton (1822-1911) in his laboratory. It was James Cattell, however, who first used the term “mental test”, in 1890.

These early tests said little about the subject’s cognitive abilities, which is why Binet and Simon (1905) introduced cognitive tests to evaluate aspects such as judgment, comprehension and reasoning.

This paved the way for a tradition of individual rating scales. In addition to cognitive tests, there has been considerable progress in the area of personality testing.

What are psychometric theories for?

As testing flourished as an evaluation tool, theories for evaluating the tests themselves also developed. Psychometrics arose from the need to build instruments with the smallest possible margin of error. Psychometric theories demand validity and reliability of any test or measurement instrument.

Reliability means the stability or consistency of the measurements when the measurement is repeated. In other words, a test is more reliable the more closely it reproduces the same results on different occasions.
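One common way to put a number on this idea is the test-retest approach: give the same test twice to the same people and correlate the two sets of scores. The sketch below assumes that approach; the scores and the helper name `pearson_r` are hypothetical and only illustrate the calculation.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of six people on two administrations of the same test.
first_session = [12, 15, 9, 20, 17, 11]
second_session = [13, 14, 10, 19, 18, 10]

# A value close to 1 means the test ranks people consistently across occasions.
retest_reliability = pearson_r(first_session, second_session)
print(round(retest_reliability, 3))
```

A coefficient near 1 indicates that the two administrations order people in almost the same way; values well below that suggest that a large share of the score is measurement error.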

On the other hand, validity refers to the degree to which empirical evidence and theory support the interpretation of the test score (2).

There are two psychometric theories (or approaches) for analyzing these instruments: classical test theory (abbreviated here as TCT; much of the English-language literature uses CTT) and item response theory (IRT).

Psychometric theories

Classical test theory (TCT)

This is the dominant theory in test construction and analysis. Its origins date back to the work of Spearman at the beginning of the 20th century. Later, in 1968, Lord and Novick reformulated the theory and paved the way for a new approach, IRT.

The theory is based on the classical linear model proposed by Spearman. According to this model, the score a person obtains on a test, called the empirical score and usually denoted X, is made up of two elements (2).

On the one hand there is the subject’s true score on the test (V), and on the other the error (e). This is expressed with the formula X = V + e.

Spearman adds three postulates to this theory:

The true score (V) is defined as the mathematical expectation of the empirical score: it is the score a person would obtain on average if they took the test an infinite number of times.

There is no relationship between the true score and the size of the errors.

The measurement errors of one test are not related to the measurement errors of a different test.
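The model and its three postulates can be written compactly as follows. The symbols E (expectation), ρ (correlation), σ² (variance) and the subscripts j and k for two different tests are notation introduced here for illustration; the last two lines are consequences that follow from the postulates rather than additional assumptions.

```latex
\begin{align}
X &= V + e                     && \text{(linear model)} \\
V &= E(X)                      && \text{(postulate 1)} \\
\rho(V, e) &= 0                && \text{(postulate 2)} \\
\rho(e_j, e_k) &= 0,\; j \neq k && \text{(postulate 3)} \\
E(e) &= E(X) - V = 0           && \text{(from the model and postulate 1)} \\
\sigma^2_X &= \sigma^2_V + \sigma^2_e && \text{(from the model and postulate 2)}
\end{align}
```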

Finally, Spearman defines parallel tests as tests that measure the same variable using different items.
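Parallel tests matter because the true score can never be observed directly. Under the usual formal reading, which is stronger than the verbal definition above and is assumed here, two parallel forms share the same true scores and have equal error variance; their correlation then approximates the share of score variance that is true-score variance. The simulation below is a sketch with made-up parameters, not real data.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is reproducible
TRUE_SD, ERROR_SD, N = 10.0, 5.0, 5000  # illustrative values only

# Each simulated person has one true score V and an independent error on each form.
true_scores = [rng.gauss(100.0, TRUE_SD) for _ in range(N)]
form_a = [v + rng.gauss(0.0, ERROR_SD) for v in true_scores]  # X_a = V + e_a
form_b = [v + rng.gauss(0.0, ERROR_SD) for v in true_scores]  # X_b = V + e_b

parallel_forms_r = pearson_r(form_a, form_b)
# Ratio of true-score variance to total variance implied by the chosen SDs.
theoretical_reliability = TRUE_SD**2 / (TRUE_SD**2 + ERROR_SD**2)
print(round(parallel_forms_r, 2), theoretical_reliability)
```

With these values the theoretical ratio is 0.8, and the simulated correlation between the two forms should land close to it.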

Psychometric theories: limits of the classical approach

The first limitation of the classical theory is that measurements are not invariant with respect to the instrument used. This means that if a psychologist evaluated the intelligence of three people with a different test for each, the results would not be directly comparable. But why?

Because the results of the three instruments are not expressed on the same scale. To compare the intelligence of the three people, the scores have to be converted to a common scale.

The problem is that this conversion assumes that the normative samples used to build the scales of the different tests are comparable (same mean, same standard deviation), which is difficult to guarantee in practice (1).

The IRT approach represented a great advance in this regard: it allows results obtained with different instruments to be expressed on the same scale.

The second limitation is that the properties of a test are not invariant with respect to the subjects used to estimate them. For example, an item’s difficulty index depends on the ability of the sample that answered it. IRT offers a partial solution to this problem.
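The conversion described above is typically done through standard scores. The sketch below assumes a z-score transformation onto a scale with mean 100 and standard deviation 15; the function name, the three tests and their norms are hypothetical. The result is only comparable across tests if the norm groups really are equivalent, which is exactly the assumption questioned here.

```python
def to_common_scale(raw, norm_mean, norm_sd, target_mean=100.0, target_sd=15.0):
    """Convert a raw score to a common scale via its z-score in the norm group."""
    z = (raw - norm_mean) / norm_sd
    return target_mean + target_sd * z

# Hypothetical results: (raw score, norm-group mean, norm-group SD) for each test.
results = {
    "person_1_test_A": (30, 25.0, 5.0),
    "person_2_test_B": (62, 50.0, 8.0),
    "person_3_test_C": (110, 120.0, 20.0),
}

common = {name: to_common_scale(*values) for name, values in results.items()}
for name, score in common.items():
    print(name, score)
```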

The Item Response Theory (IRT)

Item response theory (IRT) was born as a complement to classical test theory. IRT offers a more refined tool; the drawback is that it is more costly and requires specialized personnel.

IRT comprises several models, but the underlying idea is the same: for any item there is a functional relationship between the level of the variable the item measures and the probability of answering the item correctly. This function is called the item characteristic curve (ICC). What conclusions can we draw?

The curve describes how each item behaves at every ability level, something TCT does not specify. For example, the most difficult items are those that only the most able subjects answer correctly. Conversely, an item that every subject answers correctly would be useless, because it would have no discriminating value; that is, it would provide no information.
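To make the idea of an item characteristic curve concrete, the sketch below assumes the logistic three-parameter form, one of several models used in IRT; the text itself does not commit to a particular function. The function name `icc_3pl` and the item values are illustrative. Here a is the discrimination (steepness of the curve), b the difficulty (location on the ability scale) and c the pseudo-guessing parameter (lower asymptote).

```python
from math import exp

def icc_3pl(theta, a, b, c=0.0):
    """Probability of a correct response at ability theta (logistic 3PL form)."""
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))

abilities = [-3, -2, -1, 0, 1, 2, 3]

# Two hypothetical items: an easy one, and a hard one that can be guessed.
easy_item = [round(icc_3pl(t, a=1.0, b=-1.0), 2) for t in abilities]
hard_item = [round(icc_3pl(t, a=1.0, b=2.0, c=0.2), 2) for t in abilities]

print(easy_item)
print(hard_item)
```

Both curves rise with ability; the hard item only reaches a high probability of success toward the top of the ability range, and it never drops below its guessing level c.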
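The claim that an item everyone answers correctly offers no information can be checked with the item information function. The sketch below assumes a two-parameter logistic item, for which the information at ability theta is a² · P · (1 − P); the item parameters are invented for illustration.

```python
from math import exp

def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

abilities = [-2, -1, 0, 1, 2]

# Hypothetical items: one so easy that practically everyone passes it,
# and one that is hard enough to separate the most able subjects.
trivial_item = [item_information(t, a=1.0, b=-8.0) for t in abilities]
hard_item = [item_information(t, a=1.0, b=2.0) for t in abilities]

print([round(i, 4) for i in trivial_item])
print([round(i, 4) for i in hard_item])
```

The trivial item contributes almost nothing anywhere in this ability range, while the hard item is most informative for subjects whose ability is close to its difficulty.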

To better observe the differences between the psychometric theories described, we can take as a reference the table by José Muñiz (2010):

Table 1. Differences between TCT and IRT (Muñiz, 2010)

Aspect	TCT	IRT
Model	Linear	Nonlinear
Assumptions	Weak (easy for the data to satisfy)	Strong (hard for the data to satisfy)
Invariance of measurements	No	Yes
Invariance of test properties	No	Yes
Score scale	Between 0 and the maximum test score	Between −∞ and +∞
Emphasis	Test	Item
Item-test relationship	Not specified	Item characteristic curve
Item description	Difficulty and discrimination indices	Parameters a, b, c
Measurement errors	Standard error of measurement, common to the whole sample	Information function (varies with ability level)
Sample size	Can work well with samples of roughly 200 to 500 subjects	More than 500 subjects recommended

Although the two theories are almost contemporary, IRT clearly arose as a response to the limitations of TCT. In any case, psychometric research still has a long way to go.
