How does a Likert scale differ from a Thurstone scale?

Thurstone scaling assigns each statement a pre-determined interval value through a panel of expert judges before administration, producing equal-interval data. The Likert method needs no judging panel: respondents' own ratings are scored on fixed category weights and summed, and items are screened by item analysis after a pilot administration. It yields ordinal summated scores, making it faster to construct but less metrically precise.

Should a Likert scale include a neutral midpoint?

Including a midpoint allows genuine ambivalence but invites central-tendency bias and satisficing. Omitting it (a four- or six-point forced-choice scale) compels a directional response and is preferred when the researcher believes neutrality masks evasion. The choice should be justified by the construct and reported explicitly.

Why are some Likert items reverse-worded?

Negatively keyed items are interspersed to detect acquiescence bias—the tendency to agree regardless of content—and to keep respondents attentive. Their scores are reversed before summation so a high composite consistently signals a favourable attitude. Failure to reverse-key inflates or corrupts the aggregate score.

Likert Scale for Attitude Measurement

The Likert scale originates in the 1932 doctoral research of the American social psychologist Rensis Likert, published as A Technique for the Measurement of Attitudes in the journal Archives of Psychology (No. 140). Likert sought a simpler, more reliable alternative to the laborious Thurstone equal-appearing-interval method then dominant in attitude research. His central insight was that a respondent's attitude toward an object—a policy, a social group, an institution—could be inferred by summing responses to a battery of evaluative statements, each rated on a graded continuum of agreement. The instrument is therefore properly termed a summated rating scale, and "Likert scale" strictly denotes the aggregate of items, whereas a single item is a "Likert-type item." For the UPSC General Studies Paper IV (Ethics, Integrity and Aptitude) candidate, the concept anchors the syllabus unit on attitude: content, structure, function and its measurement.

Construction and scoring proceed in defined stages. First, the investigator assembles a large pool of declarative statements expressing clearly favourable or unfavourable positions toward the attitude object, avoiding neutral or factual statements. Second, each statement is presented with a fixed set of ordered response categories, classically five: Strongly Agree, Agree, Undecided/Neutral, Disagree, Strongly Disagree. Third, numerical weights (commonly 1 through 5) are assigned to the categories, with the scoring direction reversed for negatively worded items so that a high score consistently signals a favourable attitude. Fourth, the respondent's item scores are summed (or averaged) to yield a composite attitude score. Fifth, when a new scale is being constructed, the researcher administers the draft pool to a pilot sample and performs item analysis, retaining only statements that correlate highly with the total score and that discriminate between high and low scorers.
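The scoring and item-analysis stages above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the five-point coding, the function names, and the pilot data are all hypothetical, and the item analysis uses the corrected item-total correlation (each item against the sum of the remaining items), which is one common way of operationalising the fifth stage.

```python
from statistics import mean

POINTS = 5  # assumed five-point scale: 1 = Strongly Disagree ... 5 = Strongly Agree


def reverse_key(score, points=POINTS):
    """Flip a score on a 1..points scale (5 becomes 1, 4 becomes 2, and so on)."""
    return points + 1 - score


def keyed_scores(responses, negative_items, points=POINTS):
    """Return one respondent's scores with negatively worded items reversed."""
    return [
        reverse_key(score, points) if index in negative_items else score
        for index, score in enumerate(responses)
    ]


def composite(responses, negative_items, points=POINTS):
    """Summated score: the total of the keyed item scores."""
    return sum(keyed_scores(responses, negative_items, points))


def pearson(x, y):
    """Pearson correlation; fails with ZeroDivisionError if either list is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(rows, negative_items, points=POINTS):
    """Correlate each keyed item with the sum of the other keyed items."""
    keyed = [keyed_scores(row, negative_items, points) for row in rows]
    result = []
    for j in range(len(keyed[0])):
        item = [row[j] for row in keyed]
        rest = [sum(row) - row[j] for row in keyed]
        result.append(pearson(item, rest))
    return result


# Hypothetical pilot data: five respondents, four items.
# Items at index 2 and 3 are negatively worded.
NEGATIVE = {2, 3}
PILOT = [
    [5, 5, 1, 1],
    [4, 4, 2, 3],
    [3, 3, 3, 3],
    [2, 1, 4, 2],
    [1, 2, 5, 4],
]

print(composite([5, 4, 2, 1], NEGATIVE))  # 5 + 4 + 4 + 5 = 18
print([round(r, 2) for r in corrected_item_total(PILOT, NEGATIVE)])
```

Items with a low or negative correlation in the second printout would be candidates for removal from the draft pool. The cut-off itself is a judgment the researcher must state.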

Variants proliferate around the number and labelling of points. A four- or six-point scale omits the midpoint to force a directional choice and curb central-tendency bias; a seven- or nine-point scale increases discrimination at the cost of respondent fatigue. The semantic anchors may shift from agreement to frequency (Never–Always), satisfaction (Very Dissatisfied–Very Satisfied), or importance. A Likert item is sometimes rendered visually as a numbered row or a graphic continuum. Reverse-keyed items are deliberately interspersed to detect acquiescence (the tendency to agree regardless of content). Internal-consistency reliability is commonly assessed through Cronbach's alpha, and validity through correlation with external criteria or factor analysis confirming unidimensionality.
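Cronbach's alpha, mentioned above, is computed from the item variances and the variance of the total score: alpha = k/(k - 1) × (1 - sum of item variances / variance of totals), where k is the number of items. The sketch below is a minimal illustration using only the Python standard library. The function name and the sample data are hypothetical, and the responses are assumed to be reverse-keyed already.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a table of keyed scores (one row per respondent)."""
    k = len(rows[0])  # number of items
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical keyed responses: five respondents, four items.
KEYED = [
    [5, 5, 5, 5],
    [4, 4, 4, 3],
    [3, 3, 3, 3],
    [2, 1, 2, 4],
    [1, 2, 1, 2],
]

print(round(cronbach_alpha(KEYED), 2))
```

Sample variances are used for both the items and the totals, so the ratio is unaffected by the choice between sample and population variance as long as it is applied consistently. What counts as an acceptable value of alpha depends on the purpose of the instrument and should be reported rather than assumed.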

In contemporary practice the instrument is ubiquitous in governmental and diplomatic survey work. The Pew Research Center's Global Attitudes Survey and its annual measurements of favourability toward states and leaders rest on Likert-type batteries. India's NITI Aayog and the National Sample Survey Office deploy Likert items in citizen-satisfaction and governance-perception instruments. The OECD's Government at a Glance trust indicators and the European Commission's Eurobarometer, conducted since 1973, use agree-disagree and satisfaction scales to track public sentiment across member states. The World Values Survey and Edelman Trust Barometer similarly structure cross-national attitude comparison around graded agreement. Foreign ministries commission such surveys to gauge soft-power perception and diaspora sentiment.

The Likert scale must be distinguished from adjacent attitude-measurement techniques. Unlike the Thurstone scale, which assigns each statement a pre-determined interval value through expert judges and treats items as equal-interval, the Likert method needs no judging panel: items are scored directly from respondents' agreement ratings, screened by item analysis, and yield ordinal data. Unlike the Guttman scale (cumulative scalogram), which arranges items in a hierarchy such that agreement with a strong item implies agreement with all weaker ones, the Likert summated score does not presume cumulativeness. The semantic differential of Osgood measures connotative meaning between bipolar adjective pairs rather than agreement with propositions. Whether Likert data should be treated as strictly ordinal or as approximately interval is the single most contested methodological point, bearing directly on which statistics are admissible.

Edge cases and controversies persist. Whether Likert data may be analysed with parametric statistics (means, t-tests, ANOVA) or must be confined to non-parametric methods (median, Mann-Whitney, chi-square) divides methodologists; the prevailing compromise treats a summated multi-item scale as approximately interval while a single item remains strictly ordinal. Response biases (central tendency, acquiescence, social desirability, and extreme responding) threaten validity, and cross-cultural research documents systematic differences in response style between, for instance, East Asian and Western respondents. The presence or absence of a midpoint remains debated, as does the labelling of every point versus only the endpoints.
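The practical difference between the two camps can be seen on a single item compared across two groups. The sketch below is illustrative only: the ratings are invented, and the function name is hypothetical. It reports the median and the Mann-Whitney U statistic, both of which use only the order of the categories, alongside the mean, which assumes equal spacing between them. No p-value is computed; a statistics package would be needed for that.

```python
from statistics import mean, median


def mann_whitney_u(a, b):
    """U for group a: count pairs where a's rating exceeds b's, ties counting one half."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Hypothetical ratings on one five-point item from two groups of respondents.
group_a = [4, 5, 4, 3, 5]
group_b = [2, 3, 3, 1, 4]

u_a = mann_whitney_u(group_a, group_b)
pairs = len(group_a) * len(group_b)

print(median(group_a), median(group_b))  # 4 3
print(u_a, u_a / pairs)                  # 22.0 0.88
print(mean(group_a), mean(group_b))      # interval-level summary, for comparison
```

The ratio of U to the number of pairs is the proportion of cross-group pairs in which a respondent from the first group gives the higher rating, counting ties as half. It has a direct ordinal interpretation and does not depend on the numeric codes chosen for the categories, provided their order is preserved.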

For the working practitioner—the civil servant designing a citizen-feedback mechanism, the policy researcher interpreting a perception survey, or the GS4 aspirant constructing an answer—the Likert scale offers a defensible, reproducible bridge between an intangible attitude and a quantifiable indicator. Its strengths are economy, ease of administration, high reliability, and intuitive interpretability; its limitations are ordinality, vulnerability to response sets, and the inference gap between expressed agreement and behavioural intention. A literate user states the number of points, justifies the midpoint decision, reverse-keys negatively worded items, reports a reliability coefficient, and chooses statistics appropriate to ordinal data—thereby converting a familiar questionnaire into a methodologically rigorous instrument of measurement.

Frequently asked questions

Are Likert data ordinal or interval?

A single Likert item produces strictly ordinal data, because the psychological distance between 'Agree' and 'Strongly Agree' need not equal that between 'Neutral' and 'Agree'. A summated multi-item scale is widely treated as approximately interval, permitting parametric statistics, though this remains methodologically contested.
