In certain cases, even when the use of parametric methods is justified, non-parametric methods may be easier to use. Due both to this simplicity and to their greater robustness, non-parametric methods are seen by some statisticians as leaving less room for improper use and misunderstanding. The wider applicability and increased robustness of non-parametric tests come at a cost: in cases where a parametric test would be appropriate, non-parametric tests have less power. In other words, a larger sample size may be required to draw conclusions with the same degree of confidence.
Non-parametric or distribution-free inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the probability distributions of the variables being assessed. When our data is normally distributed, the mean is equal to the median and we use the mean as our measure of center. However, if our data is skewed, then the median is a much better measure of center. Therefore, just like the Z, t and F tests made inferences about the population mean(s), nonparametric tests make inferences about the population median(s).
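To make the mean-versus-median point concrete, here is a small Python sketch on invented numbers: for a roughly symmetric sample the two measures of center coincide, while a single extreme value in a skewed sample drags the mean away from the median.

```python
from statistics import mean, median

# Roughly symmetric hypothetical sample: mean and median are both 6 here.
symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
print(mean(symmetric), median(symmetric))

# Right-skewed hypothetical sample: the single outlier (80) pulls the
# mean up to 14, while the median stays at 6 with the bulk of the data.
skewed = [4, 5, 5, 6, 6, 6, 7, 7, 80]
print(mean(skewed), median(skewed))
```

This is why, for skewed data, inferences about the median are usually more meaningful than inferences about the mean.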
Given below are the various nonparametric tests:
* Chi-square (χ²) test
* Kolmogorov–Smirnov test
* Median test
* Kruskal–Wallis one-way analysis of variance by ranks
* Friedman two-way analysis of variance by ranks
* Kuiper's test
* Mann–Whitney U test
* Wilcoxon signed-rank test
* Wilcoxon matched-pairs test
* Wald–Wolfowitz runs test
The details of some of the commonly used nonparametric tests are given below.
The Sign test (for 2 repeated/correlated measures)
The sign test is one of the simplest nonparametric tests. It is for use with 2 repeated (or correlated) measures (see the example below), and measurement is assumed to be at least ordinal.
For each subject, subtract the 2nd score from the 1st, and write down the sign of the difference. (That is, write "−" if the difference score is negative, and "+" if it is positive.) The usual null hypothesis for this test is that there is no difference between the two treatments. If this is so, then the number of + signs (or − signs, for that matter) should have a binomial distribution with p = .5 and N = the number of subjects. In other words, the sign test is just a binomial test with + and − in place of Head and Tail (or Success and Failure).
Large sample sign test
The sampling distribution used in carrying out the sign test is a binomial distribution with p = q = .5. The mean of a binomial distribution is equal to Np, and the variance is equal to Npq. As N increases, the binomial distribution converges on the normal distribution (especially when p = q = .5). When N is large enough (i.e., greater than 30 or 50, depending on how conservative one is), it is possible to carry out a z-test version of the sign test as follows:

z = (X − Np) / √(Npq)

z² is equal to χ² with df = 1. Therefore

χ² = z² = (X − Np)² / (Npq)

This formula can be expanded with what Howell (1997) calls "some not-so-obvious algebra" to yield:

χ² = (X − Np)² / (Np) + ((N − X) − Nq)² / (Nq)
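As a sketch of both versions of the sign test, the following Python code applies the exact binomial computation and the large-sample z approximation to the 14 difference scores used in the example below. Note that N = 14 is well under the large-sample guideline, so the z value here is shown for illustration only.

```python
from math import comb, sqrt

def sign_test(diffs):
    """Exact binomial sign test plus its large-sample z approximation."""
    signs = [d for d in diffs if d != 0]        # zero differences are dropped
    n = len(signs)
    x = sum(1 for d in signs if d > 0)          # number of "+" signs
    # Exact two-tailed p-value: binomial with p = q = .5
    k = min(x, n - x)
    p_exact = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Large-sample approximation: z = (X - Np) / sqrt(Npq)
    z = (x - n * 0.5) / sqrt(n * 0.25)
    return x, p_exact, z

diffs = [-20, -7, -14, -13, -26, 5, -17, -9, -10, -9, -7, 3, 2, -17]
x, p_exact, z = sign_test(diffs)
print(x, round(p_exact, 4), round(z, 4))  # 3 0.0574 -2.1381
```

With only 3 plus signs out of 14, the exact two-tailed p-value is about .057, just short of the conventional .05 criterion.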
Note that X equals the observed number of p-events, and Np equals the expected number of p-events under the null hypothesis. Similarly, N − X equals the observed number of q-events, and Nq equals the expected number of q-events under the null hypothesis. Therefore, we can rewrite the equation in a more familiar-looking format as follows:

χ² = Σ (O − E)² / E

where O and E are the observed and expected frequencies in each of the two categories.
Wilcoxon Signed-Ranks Test (for 2 repeated/correlated measures)
One obvious problem with the sign test is that it discards a lot of information about the data. It takes into account the direction of the difference, but not the magnitude of the difference between each pair of scores.
The Wilcoxon signed-ranks test is another nonparametric test that can be used for 2 repeated (or correlated) measures when measurement is at least ordinal. But unlike the sign test, it does take into account (to some degree, at least) the magnitude of the difference. Let us return to the data used to illustrate the sign test. The 14 difference scores were: -20, -7, -14, -13, -26, +5, -17, -9, -10, -9, -7, +3, +2, -17. If we sort these on the basis of their absolute values (i.e., disregarding the sign), we get the results shown in Table 3.2.
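A minimal Python sketch of the signed-ranks computation for these data follows (tied absolute differences receive average ranks; the large-sample z at the end is for illustration only, since N = 14 is below the N > 50 guideline discussed below):

```python
from math import sqrt

def rank_with_ties(values):
    """Rank values 1..n, assigning average ranks to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2   # average of positions i+1..j
        i = j
    return ranks

def wilcoxon_T(diffs):
    diffs = [d for d in diffs if d != 0]        # zero differences are dropped
    ranks = rank_with_ties([abs(d) for d in diffs])
    pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(pos, neg)                        # T is the smaller rank sum

diffs = [-20, -7, -14, -13, -26, 5, -17, -9, -10, -9, -7, 3, 2, -17]
T = wilcoxon_T(diffs)
print(T)   # 6.0, matching the text

# Large-sample z ratio (illustration only at N = 14)
N = len(diffs)
z = (T - N * (N + 1) / 4) / sqrt(N * (N + 1) * (2 * N + 1) / 24)
print(round(z, 4))  # -2.9191
```

The three positive differences (+5, +3, +2) receive ranks 3, 2 and 1, so the positive rank sum, and hence T, is 6.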
The statistic T is found by calculating the sum of the positive ranks and the sum of the negative ranks. T is the smaller of these two sums. In this case, therefore, T = 6. If the null hypothesis is true, the sum of the positive ranks and the sum of the negative ranks are expected to be roughly equal. But if H0 is false, we expect one of the sums to be quite small, and therefore T is expected to be quite small. The most extreme outcome favourable to rejection of H0 is T = 0.
Large-sample Wilcoxon signed ranks test
The following are known to be true about the sampling distribution of T, the statistic used in the Wilcoxon signed ranks test: its mean is N(N + 1)/4 and its standard deviation is √(N(N + 1)(2N + 1)/24).
If N > 50, then the sampling distribution of T is for practical purposes normal, and so a z-ratio can be computed as follows:

z = (T − N(N + 1)/4) / √(N(N + 1)(2N + 1)/24)

Friedman ANOVA
This test is sometimes called the Friedman two-way analysis of variance by ranks. It is for use with k repeated (or correlated) measures where measurement is at least ordinal. The null hypothesis states that all k samples are drawn from the same population, or from populations with equal medians. It may be useful at this point to consider what kinds of outcomes are expected if H0 is true.
H0 states that all of the samples (columns) are drawn from the same population, or from populations with the same median. If so, then the sums (or means) of the ranks for each of the columns should all be roughly equal, because the ranks 1, 2, and 3 would be expected by chance to appear equally often in each column. In this example, the expected rank sum R_j for each treatment would be 10 if H0 is true. (In general, the expected sum of ranks for each treatment is N(k + 1)/2.) The Friedman ANOVA assesses the degree to which the observed rank sums depart from the expected rank sums.
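As a sketch, the standard Friedman statistic, Fr = [12/(Nk(k + 1))] Σ R_j² − 3N(k + 1), can be computed on invented data for N = 5 subjects and k = 3 treatments (chosen so that the expected rank sum per treatment is 10, as in the example):

```python
def rank_with_ties(values):
    """Rank 1..n, assigning average ranks to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    return ranks

def friedman_Fr(data):
    """data: one row of scores per subject, one column per treatment."""
    N, k = len(data), len(data[0])
    R = [0.0] * k
    for row in data:                      # rank within each subject (row)
        for j, r in enumerate(rank_with_ties(row)):
            R[j] += r                     # accumulate rank sums by column
    return 12 / (N * k * (k + 1)) * sum(r * r for r in R) - 3 * N * (k + 1)

# Hypothetical scores: 5 subjects under 3 treatments
data = [
    [10, 12, 18],
    [ 8, 11, 15],
    [ 9, 13, 12],
    [ 7, 10, 14],
    [11,  9, 16],
]
fr = friedman_Fr(data)
print(round(fr, 4))  # 6.4; compare with chi-square, df = k - 1 = 2
```

Here the rank sums are 6, 10 and 14 rather than the expected 10, 10 and 10, and Fr = 6.4 exceeds the .05 chi-square critical value of 5.99 for 2 df.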
If the departure is too extreme (i.e., not likely due to chance), one concludes by rejecting H0. The Fr statistic is calculated as follows:

Fr = [12 / (Nk(k + 1))] Σ R_j² − 3N(k + 1)

where N is the number of subjects, k is the number of treatments, and R_j is the sum of ranks for the j-th treatment; under H0, Fr is referred to the chi-square distribution with k − 1 degrees of freedom.
TESTS OF GOODNESS OF FIT
The purpose of a test of goodness of fit is the comparison of the distribution form (shape) of two features in one population, or of one feature in two populations. The solution of many statistical problems is "simpler" if the analysed feature has a normal distribution. Many statistical analyses require the assumption that the analysed variable is normally distributed (Student's t-tests, analysis of variance, analysis of regression, canonical analysis, etc.). That is why we must first verify the character of the distribution every time we want to apply statistical analyses requiring data with a particular distribution. The circumstances for applying non-parametric tests of goodness of fit can thus be as follows:
* as a starting point for applying some specific parametric tests (verification of the mean value, variances of the variable distribution, etc.),
* as one element of verifying the correctness of a mathematical model's structure, for example when modelling a real estate market (verification of the normality of the model residuals),
* a comparison of distributions in two different populations in order to draw conclusions about their similarity,
* other practical issues, like verification of the symmetry of a die.
Among the non-parametric tests of goodness of fit we distinguish, among others, the following:
* the Pearson chi-square test,
* the Kolmogorov test,
* the Kolmogorov–Smirnov test,
* the Kolmogorov–Lilliefors test,
* the Shapiro–Wilk test,
* the Wilcoxon test.
Chi-square test of goodness of fit
The chi-square test of goodness of fit requires a large database because of its low power.
We can apply it, for example, to examine the distribution of prices of a given type of real estate over a time interval, in order to verify the assumptions of a selected parametric test used to check, for example, the basic distribution parameters of this variable on a given local market (mean price, its dispersion and the like). The procedure of this test can be described as follows:
• Classification of the values of the feature X: x1, x2, x3, …, xn gathered in a random sample (creation of a distributive series).
• Formulation of the null hypothesis H0: the cumulative distribution function of the examined feature is the function F0(x); if the hypothesis H0 is true, the probability pi that the variable X takes a value belonging to the i-th class (gi−1, gi) is: pi = F0(gi) − F0(gi−1).
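Putting these steps together, here is a minimal Python sketch of the Pearson chi-square goodness-of-fit statistic, Σ (ni − Npi)² / (Npi), applied to invented dice-roll counts (the die-symmetry example mentioned above):

```python
def chi_square_gof(observed, probs):
    """Pearson chi-square: sum of (n_i - N*p_i)^2 / (N*p_i) over classes."""
    N = sum(observed)
    stat = 0.0
    for n_i, p_i in zip(observed, probs):
        expected = N * p_i                      # theoretical class size N*p_i
        stat += (n_i - expected) ** 2 / expected
    return stat

# Hypothetical check of die symmetry: 120 rolls, H0: p_i = 1/6 per face
observed = [25, 17, 15, 23, 24, 16]
stat = chi_square_gof(observed, [1 / 6] * 6)
print(round(stat, 4))  # 5.0; compare with chi-square critical value, df = 5
```

With 6 classes, the statistic is compared with the chi-square critical value for 6 − 1 = 5 degrees of freedom (11.07 at the .05 level), so here H0 of a symmetric die would not be rejected.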
The statistic in this test has the form:

χ² = Σ (ni − Npi)² / (Npi)

It is a measure of the differences between the experimental sizes ni and the theoretical sizes Npi of the individual classes, and it has a chi-square distribution; thus, we compare it with the critical values read off from the tables of this distribution.
KRUSKAL-WALLIS TEST
When we can assume that our data is normally distributed and that the population standard deviations are equal, we can test for a difference among several populations by using the one-way ANOVA F test.
However, when our data is not normal, or we aren't sure if it is, we can use the nonparametric Kruskal-Wallis test to compare more than two populations, as long as our data come from a continuous distribution. In the one-way ANOVA F test, we are testing to see if our population means are equal. Since our data might not necessarily be symmetric in the nonparametric setting, it is better to use the median as the measure of center, and so in the Kruskal-Wallis test we are testing to see if our population medians are equal.
Recall the analysis of variance idea: we write the total observed variation in the responses as the sum of two parts, one measuring variation among the groups (sum of squares for groups, SSG) and one measuring variation among individual observations within the same group (sum of squares for error, SSE). The ANOVA F test rejects the null hypothesis that the mean responses are equal in all groups if SSG is large relative to SSE. The idea of the Kruskal-Wallis rank test is to rank all the responses from all groups together and then apply one-way ANOVA to the ranks rather than to the original observations. If there are N observations in all, the ranks are always the whole numbers from 1 to N. The total sum of squares for the ranks is therefore a fixed number no matter what the data are. So we do not need to look at both SSG and SSE. Although it isn't obvious without some unpleasant algebra, the Kruskal-Wallis test statistic is essentially just SSG for the ranks. When SSG is large, that is evidence that the groups differ. Draw independent SRSs of sizes n1, n2, …, nI from I populations. There are N observations in all. Rank all N observations and let Ri be the sum of the ranks for the i-th sample. The Kruskal-Wallis statistic is

H = [12 / (N(N + 1))] Σ (Ri² / ni) − 3(N + 1)
When the sample sizes ni are large and all I populations have the same continuous distribution, H has approximately the chi-square distribution with I − 1 degrees of freedom. The Kruskal-Wallis test rejects the null hypothesis that all populations have the same distribution when H is large. So, like the Wilcoxon rank sum statistic, the Kruskal-Wallis test statistic is based on the sums of the ranks for the groups we are comparing. The more different these sums are, the stronger is the evidence that responses are systematically larger in some groups than in others.
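A minimal Python sketch of the Kruskal-Wallis computation on invented samples from three populations follows (average ranks are assigned to tied observations, as noted below):

```python
def rank_with_ties(values):
    """Rank pooled values 1..N, assigning average ranks to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in range(i, j):
            ranks[order[k]] = (i + 1 + j) / 2
        i = j
    return ranks

def kruskal_wallis_H(groups):
    """H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), ranks pooled over groups."""
    pooled = [x for g in groups for x in g]
    ranks = rank_with_ties(pooled)
    N = len(pooled)
    ssq, start = 0.0, 0
    for g in groups:
        R = sum(ranks[start:start + len(g)])   # rank sum for this group
        ssq += R * R / len(g)
        start += len(g)
    return 12 / (N * (N + 1)) * ssq - 3 * (N + 1)

# Hypothetical samples from I = 3 populations; note the tied 14s,
# which each receive the average rank 8.5
groups = [[6, 8, 10], [7, 11, 12, 14], [9, 14, 15]]
H = kruskal_wallis_H(groups)
print(round(H, 4))  # 3.4159; compare with chi-square, df = I - 1 = 2
```

Here the rank sums are 9, 23.5 and 22.5, and H ≈ 3.42 falls below the .05 chi-square critical value of 5.99 for 2 df, so these invented groups do not differ significantly.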
As usual, we again assign average ranks to tied observations.
Advantages of nonparametric tests
Siegel and Castellan (1988, p. 35) list the following advantages of nonparametric tests:
1. If the sample size is very small, there may be no alternative to using a nonparametric statistical test unless the nature of the population distribution is known exactly.
2. Nonparametric tests typically make fewer assumptions about the data and may be more relevant to a particular situation. In addition, the hypothesis tested by the nonparametric test may be more appropriate for the research investigation.
3. Nonparametric tests are available to analyze data which are inherently in ranks as well as data whose seemingly numerical scores have the strength of ranks. That is, the researcher may only be able to say of his or her subjects that one has more or less of the characteristic than another, without being able to say how much more or less. For example, in studying such a variable as anxiety, we may be able to state that subject A is more anxious than subject B without knowing at all exactly how much more anxious A is. If data are inherently in ranks, or even if they can be categorized only as plus or minus (more or less, better or worse), they can be treated by nonparametric methods, whereas they cannot be treated by parametric methods unless precarious and, perhaps, unrealistic assumptions are made about the underlying distributions.
4. Nonparametric methods are available to treat data which are simply classificatory or categorical, i.e., are measured in a nominal scale. No parametric technique applies to such data.
5. There are suitable nonparametric statistical tests for treating samples made up of observations from several different populations. Parametric tests often cannot handle such data without requiring us to make seemingly unrealistic assumptions or requiring cumbersome computations.
6. Nonparametric statistical tests are typically much easier to learn and to apply than are parametric tests. In addition, their interpretation often is more direct than the interpretation of parametric tests.
Disadvantages of nonparametric tests
Nonparametric tests do have at least two major disadvantages in comparison to parametric tests. First, nonparametric tests are less powerful, because parametric tests use more of the information available in a set of numbers. Parametric tests make use of information consistent with interval scale measurement, whereas nonparametric tests typically make use of ordinal information only. As Siegel and Castellan (1988) put it, "nonparametric statistical tests are wasteful." Second, parametric tests are much more flexible, and allow you to test a greater range of hypotheses.
For example, factorial ANOVA designs allow you to test for interactions between variables in a way that is not possible with nonparametric alternatives. There are nonparametric techniques to test for certain kinds of interactions under certain circumstances, but these are much more limited than the corresponding parametric techniques. Therefore, when the assumptions for a parametric test are met, it is generally (but not necessarily always) preferable to use the parametric test rather than a nonparametric test.