Can Likert Scale Data Be Considered as Continuous?

One of the most frequent questions from students doing academic research is whether it is acceptable or legitimate to use parametric statistical tests and procedures on Likert scale data. A typical Likert scale item has 5 points, but may have up to 11, depending on the researcher. For the newbie, common examples of 5-point Likert scales used in survey questionnaires are (1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree) or (1 = Very poor, 2 = Poor, 3 = Average, 4 = Good, 5 = Excellent).

The main issue with the Likert scale is that it is inherently ordinal (categorical), because Likert scale points represent a set of ordered categories, unlike a rating scale, which is simply numerical or interval (metric).
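To make the distinction concrete, here is a minimal Python sketch (the responses are invented for illustration) showing how a 5-point Likert item can be stored as an ordered categorical variable rather than as plain numbers. The integer codes preserve the ordering of the categories, but nothing guarantees that the gaps between them are equal.

```python
import pandas as pd

# Hypothetical 5-point agreement item stored as an *ordered* categorical variable
levels = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]
responses = pd.Series(
    ["Agree", "Neutral", "Strongly agree", "Disagree", "Agree"],
    dtype=pd.CategoricalDtype(categories=levels, ordered=True),
)

print(responses.dtype)           # ordered categorical with 5 levels
print(responses.value_counts())  # frequency of each category
print(responses.cat.codes)       # codes 0-4: order is preserved, equal spacing is not implied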

 

What’s the big deal?

The ongoing debate pits two schools of thought against each other. One group strictly maintains that, because the Likert scale is ordinal, i.e., the intervals or gaps between its points are not necessarily equal, Likert scale data does not lend itself to parametric methods such as the computation of means, correlations and other numerical operations. According to Jamieson (2004), only non-parametric statistics should be used on Likert scale data.

The other group asserts that, since a Likert scale item is ordered, it can validly be used in parametric tests in some situations, and cites two examples in support. Subject to assumptions about skewness, the number of categories and so on, Lubke and Muthen (2004) found that it is possible to recover true parameter values in factor analysis with Likert scale data. Likewise, Glass et al. (1972) found that F tests in ANOVA (Analysis of Variance) could return accurate p-values on Likert items under certain conditions.
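To illustrate what is at stake, the sketch below (invented group data, with SciPy assumed available) runs a parametric one-way ANOVA F test and its rank-based counterpart, the Kruskal-Wallis test, on the same 5-point Likert responses. Which of the two p-values a researcher is entitled to report is precisely what the two camps disagree about.

```python
from scipy import stats

# Hypothetical 5-point Likert responses from three groups of respondents
group_a = [4, 5, 3, 4, 4, 5, 3]
group_b = [3, 3, 2, 4, 3, 2, 3]
group_c = [2, 1, 2, 3, 2, 2, 1]

f_stat, p_param = stats.f_oneway(group_a, group_b, group_c)    # parametric: treats scores as numeric
h_stat, p_nonparam = stats.kruskal(group_a, group_b, group_c)  # non-parametric: uses ranks only

print(f"One-way ANOVA:  F = {f_stat:.2f}, p = {p_param:.4f}")
print(f"Kruskal-Wallis: H = {h_stat:.2f}, p = {p_nonparam:.4f}")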

 

What do we do?

First of all, let me tell you that I belong to the first group, i.e., the one that holds that the intervals or gaps between the points are almost never equal, so we cannot use parametric methods on Likert scale data. You may think I am a tad too rigid, but I prefer to be on the safe side when it comes to the accuracy, reliability and especially generalisability of research findings.

I therefore fully agree with Grace-Martin (2008), who says that we have to make absolutely sure that Likert scale data satisfies the required conditions before accepting that parametric methods can yield reliable results when applied to it.

You must understand the fundamental difference between a Likert-type item and a Likert Scale. A true Likert scale is made up of many items, all of which measure the same attitude. However, the term Likert Scale is too often used to refer to a single item, adding to the confusion about what a Likert Scale actually is and no doubt refuelling the raging debate between the two groups of researchers.
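The distinction is easiest to see in code. In this minimal sketch (the item names and responses are invented), each column is a single Likert-type item, while the row-wise sum of several items measuring the same attitude forms the composite score; it is this composite, not the single item, that the term Likert Scale properly refers to.

```python
import pandas as pd

# Three hypothetical Likert-type items (1-5 responses) measuring the same attitude
items = pd.DataFrame({
    "q1_enjoy_stats":     [4, 2, 5, 3],
    "q2_feel_confident":  [5, 2, 4, 3],
    "q3_would_recommend": [4, 1, 5, 2],
})

# The Likert Scale score is the composite across items, one value per respondent
items["attitude_scale"] = items.sum(axis=1)
print(items)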

Secondly, take the utmost care to research the consequences of applying your chosen procedure to the Likert scale data produced by your study design, and make sure you do not justify its use merely on the basis of consensual validation, i.e., because others have done the same. Accept that there are some circumstances and procedures for which it is more relevant and applicable than others.

On a more technical note, if you really have no choice but to apply parametric methods to Likert scale data:

  • Make it clear to the reader or reviewer that the underlying concept is continuous, and ensure that the item is measured on at least 5 points (7 points or more is definitely better).
  • Provide a strong indication that the intervals between points might be approximately equal.
  • Make sure that other assumptions, such as normality and equal variance of residuals, are met (see the sketch after this list).
  • Cross-check your results by running the non-parametric equivalents of your tests – reaching the same conclusions will increase your confidence in them.
  • Make sure that you have clearly significant results before making any claims whatsoever. For example, you could use more stringent alpha levels such as 0.01 or even 0.005 instead of a relaxed 0.05. When you have “clear-cut” p-values of the order of 0.001 or 0.30, there is little doubt about your conclusion, even if the parameter estimates are slightly biased. However, it is a lot trickier to draw conclusions when p-values are close to 0.05.
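As a concrete illustration of the last three points, here is a hedged sketch (hypothetical composite scale scores, with SciPy assumed available) that checks approximate normality and equality of variances, applies a stricter alpha, and cross-checks a t test against its non-parametric Mann-Whitney equivalent.

```python
from scipy import stats

# Hypothetical composite Likert scale scores (sums of several items) for two groups
group_a = [18, 22, 25, 20, 23, 19, 24, 21]
group_b = [15, 17, 14, 19, 16, 18, 13, 17]

alpha = 0.01  # more stringent than the conventional 0.05

# Assumption checks: large p-values give no strong evidence against the assumption
print("Shapiro-Wilk (group A):", stats.shapiro(group_a))   # approximate normality
print("Shapiro-Wilk (group B):", stats.shapiro(group_b))
print("Levene:", stats.levene(group_a, group_b))            # comparable variances

# Parametric test plus its non-parametric cross-check
t_stat, p_t = stats.ttest_ind(group_a, group_b)
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)

print(f"t test: p = {p_t:.4f}; Mann-Whitney U: p = {p_u:.4f}")
print(f"Significant at alpha = {alpha}? {p_t < alpha and p_u < alpha}")
```

If both tests agree and the p-values are well clear of the threshold, the conclusion is on much firmer ground than a borderline parametric result on its own.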

 

What do I care?

More importantly, consider the consequences of reporting inaccurate results. One of the most important aspects of research is ethics. Conducting research is not only about being awarded a degree, obtaining a certificate and then putting it all away in a drawer somewhere – bear in mind that the idea behind any research is to share your findings with the research community and add a new perspective to your research topic.

Finally, ask yourself the following questions:

  • Will anyone ever read your research dissertation/thesis?
  • Will your research ever be published?
  • Will it help society and academia gain a new, if not better, insight into your topic of investigation?
  • How would you feel if you built your literature review on erroneous findings?

 

References

Glass, GV, Peckham, PD and Sanders, JR (1972) “Consequences of failure to meet assumptions underlying the analyses of variance and covariance”, Review of Educational Research, Vol. 42, No. 3, pp. 237-288.

Grace-Martin, K (2008) “Can Likert Scale Data ever be Continuous?” [online] Available from https://www.theanalysisfactor.com/can-likert-scale-data-ever-be-continuous/

Jamieson, S (2004) “Likert scales: How to (ab)use them”, Medical Education, Vol. 38, No. 12, pp. 1212-1218.

Lubke, GH and Muthen, BO (2004) “Applying Multigroup Confirmatory Factor Models for Continuous Outcomes to Likert Scale Data Complicates Meaningful Group Comparisons”, Structural Equation Modeling, Vol. 11, No. 4, pp. 514-534.