The Thurstone Scale is among the earliest formal instruments for the quantitative measurement of attitudes, devised by the American psychometrician Louis Leon Thurstone in a sequence of papers published between 1928 and 1931, the foundational statement being his 1928 article "Attitudes Can Be Measured" in the American Journal of Sociology. Thurstone's central claim — radical for its time — was that subjective dispositions toward an object, institution, or social group could be located on a numerical continuum possessing the properties of an interval scale, where the distances between points are psychologically equal. He developed the method jointly with Ernest J. Chave, and their 1929 monograph The Measurement of Attitude applied it to attitudes toward the church. The technique drew on Thurstone's "law of comparative judgment," which provided the psychophysical justification for treating verbal opinions as measurable magnitudes rather than mere categories.
The classical procedure, known as the method of equal-appearing intervals, proceeds in clearly defined steps. First, the researcher assembles a large pool of opinion statements — frequently between 100 and 200 — expressing every shade of feeling toward the attitude object, from extreme hostility to extreme favourability. Second, a panel of judges, ideally numbering 50 or more, independently sorts each statement into 11 piles representing equal psychological intervals, where pile 1 denotes the most unfavourable position, pile 11 the most favourable, and pile 6 a neutral midpoint. Crucially, judges sort statements according to the degree of favourability the statement expresses, not according to their own personal agreement. Third, for each statement the researcher computes the median pile value across all judges as its scale value, and the interquartile range as a measure of ambiguity. Fourth, statements with high dispersion — those judges could not agree on — are discarded, and a final instrument of roughly 20 to 22 items is selected so that their scale values are spread evenly across the continuum.
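The third and fourth steps can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the judge sorts are invented data, the function names are hypothetical, and the cut-off of 2.0 pile-widths for "high dispersion" is an assumption chosen only for the example (the procedure described above does not fix a threshold).

```python
from statistics import median, quantiles

def scale_value_and_ambiguity(piles):
    """Return (median pile, interquartile range) for one statement's judge sorts."""
    q1, _, q3 = quantiles(piles, n=4, method="inclusive")
    return median(piles), q3 - q1

def select_items(sorts, max_iqr=2.0):
    """Keep statements whose IQR is at most max_iqr, ordered by scale value.

    max_iqr is an illustrative threshold, not a value prescribed by the method.
    """
    kept = []
    for statement, piles in sorts.items():
        value, iqr = scale_value_and_ambiguity(piles)
        if iqr <= max_iqr:
            kept.append((statement, value, iqr))
    return sorted(kept, key=lambda item: item[1])

# Invented data: each list holds the pile (1-11) chosen by each of 8 judges.
sorts = {
    "S1": [1, 1, 2, 2, 2, 3, 1, 2],
    "S2": [6, 5, 6, 7, 6, 6, 5, 6],
    "S3": [2, 9, 4, 11, 6, 1, 8, 10],   # judges disagree widely
    "S4": [10, 11, 10, 9, 10, 11, 10, 10],
}
selected = select_items(sorts)
for statement, value, iqr in selected:
    print(statement, value, iqr)
```

In this toy data the widely scattered statement S3 is dropped as ambiguous, while S1, S2, and S4 survive with scale values near the unfavourable end, the midpoint, and the favourable end respectively. A real instrument would additionally pick survivors so that their scale values are spaced evenly across the continuum.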
Once the instrument is constructed, administration is simple: respondents are presented with the final statements, usually in random order and stripped of their scale values, and asked merely to check those with which they agree. The respondent's overall attitude score is the median (or mean) of the scale values of the items endorsed. Thurstone developed two principal variants beyond the method of equal-appearing intervals: the method of paired comparisons, in which judges compare every statement against every other, and the method of successive intervals, in which the widths of the intervals are estimated from the judges' sorting data rather than assumed equal. The paired-comparison approach yields more precise scaling but becomes combinatorially unwieldy as the statement pool grows: n statements require n(n-1)/2 comparisons from each judge, so a pool of 100 statements already demands 4,950 judgments. This is why equal-appearing intervals became the dominant practical form.
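The scoring rule and the growth in paired comparisons can both be illustrated briefly. The sketch below assumes a hypothetical five-item instrument with invented scale values; the names SCALE_VALUES, attitude_score, and paired_comparisons are illustrative only.

```python
from statistics import median

# Hypothetical final instrument: statement id -> scale value from the judging stage.
SCALE_VALUES = {"A": 1.5, "B": 3.0, "C": 5.5, "D": 8.0, "E": 10.5}

def attitude_score(endorsed):
    """Median scale value of the statements the respondent checked."""
    if not endorsed:
        raise ValueError("respondent endorsed no statements")
    return median(SCALE_VALUES[s] for s in endorsed)

def paired_comparisons(n_statements):
    """Number of distinct pairs each judge must compare: n(n-1)/2."""
    return n_statements * (n_statements - 1) // 2

score = attitude_score(["C", "D", "E"])   # 8.0, toward the favourable end
pairs_20 = paired_comparisons(20)         # 190
pairs_100 = paired_comparisons(100)       # 4950
print(score, pairs_20, pairs_100)
```

Going from 20 to 100 statements multiplies the pool by five but the judging workload by roughly twenty-six, which shows why paired comparisons suit only small statement pools.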
In contemporary social science the Thurstone procedure survives primarily in specialised and clinical instruments rather than mass surveys. Health-services researchers have used Thurstone scaling to construct quality-of-life and pain-assessment indices; marketing and public-opinion firms occasionally employ it where interval-level precision matters. Within Indian civil-services preparation, the scale is studied under General Studies Paper IV (Ethics, Integrity and Aptitude), introduced by the Union Public Service Commission in the 2013 examination reform, whose syllabus lists attitude, including its content, structure, and function, as a topic; attitude measurement is typically covered under that heading. Candidates are expected to distinguish Thurstone's approach from the Likert and Bogardus methods and to understand its judge-based logic.
The Thurstone Scale is most usefully understood against its adjacent alternatives. The Likert scale, introduced by Rensis Likert in 1932, dispenses with judges entirely: respondents rate each statement on an agreement continuum (typically five points from "strongly disagree" to "strongly agree"), and scores are summed, which is why Likert scaling is also called the method of summated ratings. Likert responses are, strictly speaking, ordinal, and the scales are far cheaper to build, which accounts for their overwhelming dominance today. The Bogardus social distance scale, devised by Emory Bogardus in 1925, measures specifically the willingness to admit members of a group to varying degrees of social intimacy; it is cumulative in the sense later formalised by Louis Guttman. The Guttman scale (scalogram analysis), developed in the 1940s, seeks unidimensionality through a deterministic cumulative pattern in which endorsing a stronger item implies endorsing all weaker ones. The semantic differential of Charles Osgood and colleagues (1957) measures connotative meaning along bipolar adjective pairs. Thurstone's distinctive contribution is its claim to genuine interval measurement secured through the external judging panel.
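For contrast with Thurstone scoring, a summated rating can be sketched as follows. The numeric coding 1 to 5 and the three intermediate labels are assumptions for illustration, since only the two endpoint labels are given above, and the sketch leaves out refinements such as reverse-scored items.

```python
# Assumed five-point coding; only the endpoint labels come from the text above.
AGREEMENT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

def likert_score(responses):
    """Summated rating: add the numeric codes of the respondent's answers."""
    return sum(AGREEMENT[r] for r in responses)

total = likert_score(["agree", "strongly agree", "neutral", "disagree"])
print(total)  # 4 + 5 + 3 + 2 = 14
```

No judge-derived scale values appear anywhere in the calculation: every item carries the same weight, and the respondent's own ratings supply all the information.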
The method's principal controversies concern the judges. Critics, beginning with Leon Festinger and later Carolyn Sherif, argued that judges' own attitudes contaminate their sorting: a strongly committed judge may displace statements toward the extremes, a phenomenon explored in social judgment theory's assimilation-contrast effects. The labour of recruiting and managing a large judging panel also makes the scale costly and slow. A further difficulty is that identical respondent scores can arise from endorsing quite different combinations of items, which undermines the assumption of a single underlying dimension. These practical and interpretive weaknesses, more than any flaw in the underlying theory of measurement, explain why summated-rating methods displaced it. Modern psychometrics has partly absorbed Thurstone's insights into item response theory, which formalises the relationship between latent traits and item endorsement that Thurstone anticipated intuitively.
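The point about identical scores arising from different endorsement patterns is easy to demonstrate. The sketch below uses invented scale values and a hypothetical attitude_score helper that takes the median of the endorsed items.

```python
from statistics import median

# Invented scale values for a five-item instrument (illustration only).
SCALE_VALUES = {"A": 1.5, "B": 3.0, "C": 5.5, "D": 8.0, "E": 10.5}

def attitude_score(endorsed):
    """Median scale value of the endorsed statements."""
    return median(SCALE_VALUES[s] for s in endorsed)

moderate = attitude_score(["C"])              # endorses only the neutral item
ambivalent = attitude_score(["A", "C", "E"])  # endorses both extremes and the midpoint
print(moderate, ambivalent)
```

Both respondents receive 5.5, yet one holds a consistently neutral view while the other agrees with strongly hostile and strongly favourable statements alike. The single number hides that difference.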
For the working practitioner — whether a policy researcher designing a perception survey, a desk officer interpreting public-opinion data, or a civil-services aspirant — the Thurstone Scale matters less as a tool to deploy than as a conceptual benchmark. It demonstrates the difference between ordinal and interval measurement, clarifies why attitude scores are constructions rather than direct readings, and supplies the vocabulary — scale value, equal-appearing intervals, item ambiguity — that recurs across survey methodology. Understanding its judge-based construction sharpens critical reading of any attitude statistic, reminding the analyst that the meaning of a "favourability score" depends entirely on how its underlying instrument was built.
Example
In their 1929 monograph The Measurement of Attitude, L.L. Thurstone and E.J. Chave applied the method of equal-appearing intervals to construct a scale measuring American attitudes toward the church, using a panel of several hundred student judges.
Frequently asked questions
How does the Thurstone Scale differ from the Likert Scale?
The Thurstone Scale uses an external panel of judges to assign interval-level scale values to statements before administration, and respondents merely check items they agree with. The Likert Scale dispenses with judges, has respondents rate each item on an agreement continuum, and sums the ratings, producing ordinal rather than interval data.