STATISTICS CORNER
Year : 2022  Volume : 34  Issue : 2  Page : 178-181

Statistical analysis of quantitative and qualitative variables: A quick glimpse

Sandhya Somasundaran
Department of Ophthalmology, Govt Medical College Kozhikode, Kerala, India

Correspondence Address:
Analysis of Quantitative Variables

Normality tests

The distribution of continuous quantitative variables should be checked first. This can be done with charts (histogram, Q-Q plot, or box plot) or with statistical tests (Kolmogorov–Smirnov or Shapiro–Wilk test). For normally distributed data, the means are compared using parametric tests. For non-normally distributed data, the medians are compared using nonparametric tests.[3]

Descriptive statistics

In descriptive statistics, we describe the baseline parameters of the study groups. We use the mean and standard deviation to describe normally distributed data. For non-normal distributions, the median and interquartile range are better measures.

Analytic statistics[1],[2],[3],[4]

Analytic tests can be carried out to find:

- Difference between two groups [Figure 1]
- Correlation between variables.

Parametric tests

Difference between the means

- Comparing two groups: t-tests
- Comparing more than 2 groups: analysis of variance (ANOVA)

Both t-tests and ANOVA can be performed only if the data fulfill the following assumptions:

- Data are quantitative
- Variables are normally distributed
- Samples are random
- Variance is equal in all samples[1],[2],[5]

1. t-test

There are 2 types of t-tests:

- Unpaired t-test: two unrelated samples
- Paired t-test: two related samples (before and after)

Unpaired t-test

This is subdivided into the one-sample t-test and the two-sample t-test.

- The one-sample t-test compares the mean of a sample to a standard value, for example, comparing the mean hemoglobin value of a group of diabetic patients to a standard value of 13.
- The two-sample t-test compares the means of two unrelated samples, for example, comparing central foveal thickness in two groups (bevacizumab group and ranibizumab group).

Paired t-test

This test is used when there are two related samples, for example, comparing the intraocular pressure before and after exercise.

2. ANOVA

ANOVA is used to detect the difference in means between three or more independent groups.
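To make the link between the t-tests and ANOVA concrete, the following is a minimal sketch in plain Python (standard library only). The function names `pooled_t_statistic`, `paired_t_statistic`, and `one_way_anova_f`, as well as the sample values, are hypothetical and chosen purely for illustration. The sketch assumes equal variances in all groups, as listed in the assumptions above, and returns only the test statistic and degrees of freedom; turning these into a p-value requires the t or F distribution, which is normally left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev, variance


def pooled_t_statistic(a, b):
    """Two-sample (unpaired) t statistic, assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


def paired_t_statistic(before, after):
    """Paired t statistic on the within-subject differences (after - before).

    Returns (t, degrees_of_freedom).
    """
    diffs = [y - x for x, y in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


def one_way_anova_f(*groups):
    """One-way ANOVA F statistic for two or more independent groups.

    Returns (F, df_between, df_within).
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group variation: spread of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


if __name__ == "__main__":
    # Hypothetical central foveal thickness values (illustrative only).
    bevacizumab = [262, 270, 281, 255, 268]
    ranibizumab = [259, 248, 266, 251, 257]
    aflibercept = [244, 252, 239, 247, 250]

    print(pooled_t_statistic(bevacizumab, ranibizumab))
    print(one_way_anova_f(bevacizumab, ranibizumab, aflibercept))

    # Hypothetical intraocular pressure before and after exercise.
    print(paired_t_statistic([18, 20, 17, 21], [16, 17, 16, 18]))
```

With exactly two groups, the F statistic returned by `one_way_anova_f` equals the square of the pooled t statistic, which is one way to see ANOVA as the extension of the two-sample t-test to three or more groups.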
There are three types of ANOVA:

- One-way ANOVA: It can detect the difference in means between three or more groups, for example, comparing central foveal thickness measurements in three intravitreal injection groups (bevacizumab, ranibizumab, and aflibercept). However, the one-way ANOVA only shows whether the means differ from each other. To know which pair is different, post hoc tests such as the Tukey or Scheffé tests are used.[1],[4],[6]
- Two-way ANOVA: It compares means across groups defined by two independent categorical variables, so that each group can be divided into subgroups, for example, comparing central foveal thickness in three injection groups (bevacizumab, ranibizumab, and aflibercept) and also between males and females in these 3 groups. Here, we have two independent variables: type of injection and gender.
- Repeated-measures ANOVA: It is similar to the paired t-test. It is used to find the difference in means when more than two observations are made in the same subject, for example, comparing intraocular pressure before exercise, 1 hour after exercise, and 3 hours after exercise.

Correlation between variables

Correlation tests

Correlation tests are used to find relationships between variables and to quantify these relationships. The correlation coefficient ranges from −1 to +1. If the correlation coefficient is zero, there is no correlation. If it is −1, there is perfect negative correlation, and if it is +1, there is perfect positive correlation. Cohen suggested the following interpretation of the absolute value of the correlation:

- −0.3 to +0.3: Weak
- −0.5 to −0.3 or +0.3 to +0.5: Moderate
- −0.9 to −0.5 or +0.5 to +0.9: Strong
- −1.0 to −0.9 or +0.9 to +1.0: Very strong

The most common test used for finding correlation is the Pearson correlation coefficient. In this test, both the dependent and the independent variable should be continuous and normally distributed.

Regression

Regression is the mathematical prediction of a dependent variable using the value of an independent variable.
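As a rough illustration of the Pearson correlation coefficient, the strength bands listed above, and prediction by simple linear regression, here is a minimal sketch in plain Python. The function names `pearson_r`, `strength_label`, and `simple_linear_regression` are hypothetical, the data are invented for illustration, and the handling of values that fall exactly on a band boundary (0.3, 0.5, 0.9) is an assumption, since the bands as quoted overlap at their edges.

```python
from math import sqrt


def _sums_of_squares(x, y):
    """Return (mean_x, mean_y, sxx, syy, sxy) for paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient, a value between -1 and +1."""
    _, _, sxx, syy, sxy = _sums_of_squares(x, y)
    return sxy / sqrt(sxx * syy)


def strength_label(r):
    """Map |r| to the bands quoted in the text.

    Assumption: a value exactly on a boundary goes to the higher band.
    """
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.5:
        return "moderate"
    if magnitude < 0.9:
        return "strong"
    return "very strong"


def simple_linear_regression(x, y):
    """Least-squares line y = intercept + slope * x.

    Returns (slope, intercept).
    """
    mean_x, mean_y, sxx, _, sxy = _sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical paired measurements (illustrative only).
    age_years = [42, 50, 55, 61, 68, 73]
    measurement = [14.1, 15.0, 15.2, 16.4, 16.9, 17.8]

    r = pearson_r(age_years, measurement)
    slope, intercept = simple_linear_regression(age_years, measurement)
    print(r, strength_label(r))
    # Predicted value of the dependent variable for a new x value.
    print(intercept + slope * 65)
```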
There are different types of regression models.

- Simple linear regression is used for continuous variables with one independent and one dependent variable.
- Multiple linear regression is used for continuous variables with multiple independent variables and a single dependent variable.

Nonparametric tests

If the continuous variable is not normally distributed, we should use nonparametric tests. These include:

- Mann–Whitney test (also called the Mann–Whitney–Wilcoxon, Wilcoxon rank-sum, or Wilcoxon–Mann–Whitney test): This is the nonparametric equivalent of the unpaired t-test.
- Wilcoxon signed-rank test: This is the nonparametric equivalent of the paired t-test.
- Kruskal–Wallis test: This is the nonparametric equivalent of one-way ANOVA.
- Friedman test: This is the nonparametric equivalent of repeated-measures ANOVA.
- Spearman correlation.
- Logistic regression.

Analysis of qualitative variables

The tests used for categorical data are given in [Table 1].

{Table 1}

Pearson's chi-square test

It is the most commonly used test for categorical variables. It determines the difference in proportions between two or more independent categorical groups.

Fisher's exact test is an alternative to the chi-square test. It is used for small sample sizes.

The chi-square test for trend is used when one variable is binary and the other is ordinal. It assesses whether the association between the variables follows a trend, for example, the trend of uveitis across different age groups.

Chi-square goodness-of-fit test

This test is applied to a single categorical variable drawn from a population through random sampling. It determines whether the sample data represent the actual population data.

Spearman correlation

Spearman correlation is the nonparametric equivalent of the Pearson correlation test. It assesses how well the relationship between two variables can be described by a monotonic function, that is, whether one variable tends to rise or fall as the other increases. The test statistic varies from −1 (perfect negative correlation) to +1 (perfect positive correlation).
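For the chi-square test on categorical data described in this section, the calculation can be sketched in plain Python as follows. The function names `chi_square_independence` and `has_small_expected_counts` and the example counts are hypothetical. The sketch returns the test statistic, degrees of freedom, and expected counts without a continuity correction and without a p-value, which requires the chi-square distribution from a statistics package. The threshold of 5 for flagging small expected counts is a commonly quoted rule of thumb for switching to Fisher's exact test, not something stated in this article.

```python
def chi_square_independence(table):
    """Pearson's chi-square statistic for an r x c table of observed counts.

    `table` is a list of rows, each a list of counts.
    Returns (statistic, degrees_of_freedom, expected_counts).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under no association: row total * column total / N.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def has_small_expected_counts(expected, threshold=5):
    """True if any expected count is below `threshold`.

    Assumption: threshold=5 is a rule of thumb, not taken from the article.
    """
    return any(value < threshold for row in expected for value in row)


if __name__ == "__main__":
    # Hypothetical 2 x 2 table (illustrative only):
    # rows = treatment group, columns = outcome present / absent.
    observed = [
        [18, 12],
        [9, 21],
    ]
    statistic, df, expected = chi_square_independence(observed)
    print(statistic, df)
    if has_small_expected_counts(expected):
        print("Small expected counts: consider Fisher's exact test.")
```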
Logistic regression

This test is similar to linear regression. It is used to predict a categorical (typically binary) outcome from one or more predictor variables. The predictor variables may be categorical or continuous, including continuous variables with a skewed distribution.

Researchers must plan the statistical analysis at the very beginning of the study. Choosing the appropriate statistical test is of paramount importance in a scientific paper. We hope that this article gives an overview of the various analytical tests in statistics. The figures below summarize the statistical tests.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

References


