Kerala Journal of Ophthalmology

STATISTICS CORNER
Year : 2022  |  Volume : 34  |  Issue : 2  |  Page : 178--181

Statistical analysis of quantitative and qualitative variables—A quick glimpse


Sandhya Somasundaran 
 Department of Ophthalmology, Govt Medical College Kozhikode, Kerala, India

Correspondence Address:
Dr. Sandhya Somasundaran
Department of Ophthalmology, Govt Medical College Kozhikode, Kerala
India




How to cite this article:
Somasundaran S. Statistical analysis of quantitative and qualitative variables—A quick glimpse. Kerala J Ophthalmol 2022;34:178-181


How to cite this URL:
Somasundaran S. Statistical analysis of quantitative and qualitative variables—A quick glimpse. Kerala J Ophthalmol [serial online] 2022 [cited 2022 Dec 3 ];34:178-181
Available from: http://www.kjophthal.com/text.asp?2022/34/2/178/355036





Karl Pearson, one of the founding fathers of statistics, once said, “Statistics is the grammar of science.” Just as we need good grammar for effective communication, we need proper statistics to advance science.[1],[2]

In previous issues, we dealt with types of variables and risk. In this issue, we will be dealing with the analysis of both quantitative and qualitative variables [Figure 1] and [Figure 2].{Figure 1}{Figure 2}

 Analysis of Quantitative Variables



Normality tests

The distribution of continuous quantitative variables should be checked. This can be performed with charts (histogram, Q-Q plot, or box plot) or with statistical tests (Kolmogorov–Smirnov or Shapiro–Wilk test). In normal distributions, we compare the means using parametric tests. In non-normal distributions, the medians are compared using non-parametric tests.[3]

Descriptive statistics

In descriptive statistics, we describe the baseline parameters of the study groups. We use the mean and standard deviation to describe data that follow a normal distribution. For non-normal distributions, the median and interquartile range are better measures.
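The choice between the two summaries can be illustrated with a minimal Python sketch that uses only the standard library. The intraocular pressure readings are hypothetical values invented for illustration, and the helper name `describe` is not from any package.

```python
import statistics

def describe(values):
    """Return mean/SD and median/IQR for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),       # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical intraocular pressure readings (mmHg) with one extreme value
iop = [12, 14, 15, 15, 16, 17, 18, 42]
summary = describe(iop)
print(summary)
```

In this skewed sample, the single extreme reading pulls the mean above the median, which is why the median and interquartile range describe such data better.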

Analytic statistics[1],[2],[3],[4]

Analytic tests can be carried out for finding:

Difference between two groups [Figure 1]

Correlation between variables.

Parametric tests

Difference between the means

Comparing two groups: T-tests

Comparing more than two groups: analysis of variance (ANOVA)

Both T-tests and ANOVA can be performed only if the data fulfills the following assumptions:

Data is quantitative.

Variables are normally distributed.

Samples are random.

Variance is equal in all samples.[1],[2],[5]

T-test

There are 2 types of T-tests:

Unpaired T-test: two unrelated samples

Paired T-test: two related samples (before and after)

Unpaired T-test

This is sub-divided into one-sample T-test and two-sample T-test.

The one-sample T-test compares the mean of a sample to a known standard value, for example, comparing the mean hemoglobin value of a group of diabetic patients to a standard value of 13.

The two-sample T-test compares the means of two unrelated samples, for example, comparing central foveal thickness in two groups (bevacizumab group and ranibizumab group).

Paired T-test

This test is used when there are two related samples, for example, comparing the intraocular pressure before and after exercise.
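The three T-tests described above differ only in how the standard error is built. The sketch below computes the t statistic and degrees of freedom for each using the Python standard library. All data values are hypothetical, and the function names are illustrative. Converting t to a P value requires the t distribution, which a statistics package would normally supply.

```python
import math
import statistics

def one_sample_t(sample, reference):
    """t statistic and degrees of freedom for a sample mean vs a reference value."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - reference) / standard_error, n - 1

def two_sample_t(a, b):
    """Student's t for two unrelated samples, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error, na + nb - 2

def paired_t(before, after):
    """Paired t: a one-sample test on the within-subject differences."""
    differences = [y - x for x, y in zip(before, after)]
    return one_sample_t(differences, 0.0)

# Hypothetical hemoglobin values compared with a standard value of 13
hemoglobin = [11.8, 12.4, 12.9, 13.1, 12.2, 11.5]
print(one_sample_t(hemoglobin, 13))

# Hypothetical intraocular pressure (mmHg) before and after exercise
iop_before = [14, 16, 15, 18, 17]
iop_after = [12, 15, 13, 15, 16]
print(paired_t(iop_before, iop_after))
```

The paired test reuses the one-sample formula because pairing reduces two related columns to a single column of differences.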

ANOVA

ANOVA is used to detect the difference in means between three or more independent groups. There are three types of ANOVA:

One-way ANOVA: It detects a difference in means between three or more groups, for example, comparing central foveal thickness in three intravitreal injection groups (bevacizumab, ranibizumab, and aflibercept). However, one-way ANOVA only shows whether the means differ; it does not show which groups differ. To find out which pair is different, post-hoc tests such as the Tukey or Scheffé test are used.[1],[4],[6]

Two-way ANOVA: It compares means across two independent categorical variables (factors), for example, comparing central foveal thickness across three injection groups (bevacizumab, ranibizumab, and aflibercept) and also between males and females within these three groups. Here, we have two independent variables: type of injection and gender.

Repeated-measure ANOVA: It is similar to the paired T-test. It is used to find the difference in means when more than two observations are made in the same subject, for example, comparing intraocular pressure before exercise, 1 hour after exercise, and 3 hours after exercise.
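For one-way ANOVA, the F statistic is the ratio of between-group to within-group variance. The following is a minimal sketch of that calculation using the Python standard library, with hypothetical central foveal thickness values for three injection groups. It covers the one-way case only and does not perform post-hoc testing.

```python
import statistics

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for three or more independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical central foveal thickness (micrometers) in three groups
bevacizumab = [310, 295, 320, 305]
ranibizumab = [290, 285, 300, 295]
aflibercept = [270, 280, 275, 265]
print(one_way_anova_f(bevacizumab, ranibizumab, aflibercept))
```

A large F suggests that at least one group mean differs. A P value would be read from the F distribution with the two degrees of freedom returned here.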

Correlation between variables

Correlation tests

Correlation tests are used to find relationships between variables and to quantify this relation. The correlation coefficient ranges from −1 to +1. If the correlation coefficient is zero, there is no correlation. If it is −1, there is perfect negative correlation, and if it is +1, there is perfect positive correlation.

Cohen suggested the following interpretation of the absolute value of correlation:

−0.3 to +0.3: Weak

−0.5 to −0.3 or +0.3 to +0.5: Moderate

−0.9 to −0.5 or +0.5 to +0.9: Strong

−1.0 to −0.9 or +0.9 to +1.0: Very strong
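As an illustration, the sketch below computes a product-moment correlation coefficient with the Python standard library and labels it with the bands listed above. The bands in the text share their end points, so treating each lower bound as inclusive is an assumption made here. The paired measurements are hypothetical.

```python
import math

def correlation_coefficient(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label |r| using the bands above (lower bound inclusive: an assumption)."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size < 0.5:
        return "moderate"
    if size < 0.9:
        return "strong"
    return "very strong"

# Hypothetical paired measurements: axial length (mm) and refractive error (D)
axial_length = [22.8, 23.4, 24.1, 24.9, 25.6, 26.3]
refraction = [1.0, 0.25, -0.75, -2.0, -3.5, -4.0]
r = correlation_coefficient(axial_length, refraction)
print(round(r, 3), strength(r))
```

The sign of the coefficient gives the direction of the relationship, and the label depends only on its absolute value.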

The most common test used for finding correlation is the Pearson correlation coefficient. In this test, both dependent and independent variables should be continuous and normally distributed.

Regression

Regression is the mathematical prediction of a dependent variable using the value of an independent variable. There are different types of regression models.

Simple linear regression is used for continuous variables with one independent and one dependent variable.

Multiple linear regression is used for continuous variables with multiple independent variables and a single dependent variable.
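Simple linear regression can be sketched as an ordinary least-squares fit of a straight line. The example below uses only the Python standard library. The age and lens thickness values are hypothetical, and the helper names `fit_line` and `predict` are illustrative.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, new_x):
    """Predicted value of the dependent variable for a new x."""
    return intercept + slope * new_x

# Hypothetical data: age (years) as the independent variable and
# lens thickness (mm) as the dependent variable
age = [30, 40, 50, 60, 70]
lens_thickness = [3.9, 4.1, 4.4, 4.6, 4.9]
slope, intercept = fit_line(age, lens_thickness)
print(predict(slope, intercept, 55))
```

Multiple linear regression extends the same idea to several independent variables and is usually fitted with a statistics package rather than by hand.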

Non-parametric tests

If the continuous variable is not normally distributed, we should use non-parametric tests. These include:

Mann–Whitney test (Mann–Whitney–Wilcoxon or Wilcoxon rank-sum test or Wilcoxon–Mann–Whitney test): This test is the non-parametric equivalent of the unpaired T-test.

Wilcoxon signed rank test: This test is the non-parametric equivalent of the paired T-test.

Kruskal–Wallis test: This test is the non-parametric equivalent of one-way ANOVA.

Friedman test: This test is the non-parametric equivalent of repeated-measure ANOVA.

Spearman correlation.

Logistic regression.
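To show how a rank-based test avoids the normality assumption, the sketch below computes the Mann–Whitney U statistic by counting pairwise comparisons. Only the order of the values matters, not their size. The visual acuity scores are hypothetical. The sketch returns the U statistics only, since the P value needs exact tables or a normal approximation from a statistics package.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b): counts of pairs in which one sample exceeds the other.

    Each (x, y) pair with x from a and y from b contributes 1 to U_a when
    x > y and 0.5 when the two values are tied.
    """
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return u_a, u_b

# Hypothetical visual acuity scores (letters read) in two unrelated groups
group_1 = [55, 60, 62, 70, 85]
group_2 = [40, 48, 52, 58, 61]
print(mann_whitney_u(group_1, group_2))
```

When the two groups overlap completely, the two U values are close to each other. When one group is consistently higher, one U approaches zero.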

Analysis of qualitative variables

The tests used for categorical data are given in [Table 1].{Table 1}

Pearson's chi square test

It is the most commonly used test for categorical variables. It tests whether proportions differ between two or more independent categorical groups.
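The chi-square statistic compares the observed counts in a contingency table with the counts expected if the groups had equal proportions. The sketch below uses the Python standard library and a hypothetical 2 x 2 table. No continuity correction is applied. The P value helper is valid only for one degree of freedom, where the chi-square distribution reduces to a squared standard normal.

```python
import math

def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for row_total, row in zip(row_totals, table):
        for col_total, observed in zip(col_totals, row):
            expected = row_total * col_total / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

def p_value_one_df(statistic):
    """Upper-tail P value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts: rows are treatment groups, columns are improved / not improved
observed = [[20, 10],
            [10, 20]]
stat, df = chi_square(observed)
print(stat, df, p_value_one_df(stat))
```

For larger tables, the same statistic applies, but the P value must come from the chi-square distribution with the returned degrees of freedom.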

Fisher's exact test is an alternative to the chi-square test. It is preferred when the sample size is small, conventionally when any expected cell count is below 5.
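Fisher's exact test can be sketched for a 2 x 2 table by listing every table that has the same row and column totals and adding up the probabilities of those that are no more likely than the observed one. The example below uses only the Python standard library. The counts are hypothetical, and this way of forming the two-sided P value is one common convention, not the only one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row_1, row_2, col_1 = a + b, c + d, a + c
    n = row_1 + row_2

    def probability(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row_1, x) * comb(row_2, col_1 - x) / comb(n, col_1)

    p_observed = probability(a)
    lowest, highest = max(0, col_1 - row_2), min(row_1, col_1)
    # Small tolerance so that equally likely tables are not lost to rounding
    return sum(probability(x) for x in range(lowest, highest + 1)
               if probability(x) <= p_observed * (1 + 1e-9))

# Hypothetical small study: complication yes / no in two surgical groups
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

Because the probabilities are computed exactly, the test does not depend on the large-sample approximation that the chi-square test needs.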

Chi square test for trend is used when one variable is binary and the other is ordinal. It is used to assess whether the association between variables follows a trend, for example, to know the trend of uveitis among the different age groups.

Chi square goodness-of-fit test

This test is applied to a single categorical variable drawn from a population through random sampling. It determines whether the sample data represents the actual population data.
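A goodness-of-fit calculation compares the observed category counts in one sample with the counts expected from known population proportions. The sketch below uses the Python standard library. The refractive error categories and proportions are hypothetical.

```python
def chi_square_goodness_of_fit(observed_counts, population_proportions):
    """Return (statistic, degrees of freedom) for one categorical variable."""
    n = sum(observed_counts)
    expected_counts = [p * n for p in population_proportions]
    statistic = sum((o - e) ** 2 / e
                    for o, e in zip(observed_counts, expected_counts))
    return statistic, len(observed_counts) - 1

# Hypothetical sample of 200 patients: myopia, hypermetropia, emmetropia
sample_counts = [70, 50, 80]
# Hypothetical proportions of the same categories in the population
population = [0.30, 0.20, 0.50]
print(chi_square_goodness_of_fit(sample_counts, population))
```

A statistic close to zero means the sample proportions are close to the population proportions. The P value is read from the chi-square distribution with the returned degrees of freedom.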

Spearman correlation

Spearman correlation is the non-parametric equivalent of the Pearson correlation test. It measures the strength and direction of the monotonic relationship between two variables. The test statistic varies from −1 (perfect negative correlation) to +1 (perfect positive correlation).
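One way to compute the Spearman coefficient is to replace each value by its rank and then apply the usual product-moment formula to the ranks. The sketch below does this with the Python standard library, giving tied values the average of their ranks. The data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Product-moment correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: duration of diabetes (years) and retinopathy grade (0 to 4)
duration = [1, 3, 5, 8, 12, 20]
grade = [0, 0, 1, 2, 2, 4]
print(spearman_rho(duration, grade))
```

Because only ranks are used, a relationship that rises steadily but not in a straight line can still give a coefficient of +1.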

Logistic regression

This test is similar to linear regression. It is used to predict a categorical (typically binary) outcome from one or more predictor variables. The predictors may be categorical or continuous and need not be normally distributed.
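A minimal sketch of logistic regression with a single predictor is given below, fitted by plain gradient ascent on the log-likelihood using the Python standard library. The data, the learning rate, and the number of steps are assumptions chosen for illustration. Statistical packages use faster fitting methods and also report standard errors and odds ratios.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        grad_0 = grad_1 = 0.0
        for xi, yi in zip(x, y):
            error = yi - sigmoid(b0 + b1 * xi)  # observed minus predicted
            grad_0 += error
            grad_1 += error * xi
        b0 += learning_rate * grad_0 / n
        b1 += learning_rate * grad_1 / n
    return b0, b1

def predict_probability(b0, b1, new_x):
    """Predicted probability of the outcome for a new predictor value."""
    return sigmoid(b0 + b1 * new_x)

# Hypothetical data: standardized HbA1c (predictor) and retinopathy (1 = present)
hba1c_z = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
retinopathy = [0, 0, 0, 1, 0, 1, 1, 1, 1]
b0, b1 = fit_logistic(hba1c_z, retinopathy)
print(predict_probability(b0, b1, 1.0))
```

The fitted slope is on the log-odds scale, so a positive value means that the odds of the outcome rise as the predictor increases.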

Researchers must plan the statistical analysis at the beginning of the study itself. Choosing the statistical test is of paramount importance in a scientific paper. We hope that this article will give an overview of the various analytical tests in statistics.

[Figure 1] and [Figure 2] summarize the statistical tests.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

References

1. Gaddis ML, Gaddis GM. Introduction to biostatistics: Part 6, Correlation and regression. Ann Emerg Med 1990;19:1462-8.
2. Hazra A, Gogtay N. Biostatistics series module 3: comparing groups: numerical variables. Indian J Dermatol 2016;61:251-60.
3. Neideen T, Brasel K. Understanding statistical tests. J Surg Educ 2007;64:93-6.
4. Hazra A, Gogtay N. Biostatistics series module 9: survival analysis. Indian J Dermatol 2017;62:251-7.
5. Hoffman JI. Biostatistics for Medical and Biomedical Practitioners. 2nd ed. Elsevier Science; 2019.
6. Vittinghoff E, Glidden DV, Shiboski SC, McCulloch CE. Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Springer; 2011.