10 Wilcoxon Signed Rank Test I

Mr Taranga Mukherjee

epgp books

1 Wilcoxon Signed Rank Test

 

This is another nonparametric alternative to Student's t test, requiring the additional assumption of symmetry. The test was developed by Frank Wilcoxon (1945) and popularized by Sidney Siegel (1956). The procedure uses the signed ranks of the observations and provides a distribution-free test for location.

 

1.1  Assumptions & the hypothesis

 

Suppose $X_1, X_2, \ldots, X_n$ are iid observations from a symmetric location family of distributions, so that their common distribution function has the form $F(x - \theta)$, where $\theta = \theta(F)$ is a location parameter. $F$ is assumed to be continuous and symmetric about zero, that is, $F(x) + F(-x) = 1$ for all $x$. Under symmetry $\theta$ is, therefore, the median of the distribution of the observations. Our objective is to provide a test for $H_0: \theta(F) = \theta_0$ against the usual one- or two-sided alternatives. Without any loss of generality we can take $\theta_0 = 0$, since otherwise we may work with the shifted observations $X_i - \theta_0$.

 

1.2  Signed Rank

 

The continuity assumption ensures that $P(X_i = 0) = 0$ for all $i$ and that the absolute values of the observations are distinct with probability one. Suppose the observations are arranged in order of absolute value and ranked accordingly, and let $R_i^+$ be the rank of $|X_i|$ among $\{|X_1|, |X_2|, \ldots, |X_n|\}$. The signed rank of an observation is the rank of its absolute value multiplied by the sign of the original observation. If $Z_i = I(X_i > 0)$, then the positive signed rank of the $i$th observation is $Z_i R_i^+$, $i = 1, 2, \ldots, n$, which equals $R_i^+$ for a positive observation and zero otherwise. The signed-rank sum $T$ is defined as the sum of these positive signed ranks, i.e.

$$T = \sum_{i=1}^{n} Z_i R_i^+ .$$

 

Then the signed rank sum is T = 37.
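The computation of $T$ described in this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `signed_rank_sum` and the sample values are hypothetical and are not the data behind the value quoted above. It assumes no zeros and no ties among the absolute values, as the continuity assumption guarantees with probability one.

```python
def signed_rank_sum(xs):
    """Return T, the sum of the ranks of |x_i| over the positive observations.

    Assumes no zeros and no ties among the absolute values.
    """
    abs_sorted = sorted(abs(x) for x in xs)
    # rank of each absolute value among all absolute values (1 = smallest)
    rank = {a: r for r, a in enumerate(abs_sorted, start=1)}
    return sum(rank[abs(x)] for x in xs if x > 0)


# Hypothetical data, chosen only for illustration.
sample = [1.2, -0.5, 3.1, -2.4, 0.8]
print(signed_rank_sum(sample))  # ranks 3, 5 and 2 are positive, so T = 10
```

For this sample the absolute values 0.5, 0.8, 1.2, 2.4, 3.1 receive ranks 1 to 5, and the positive observations carry ranks 3, 5 and 2.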

 

1.3  T as a test statistic

 

Note that $X_i$ is expected to be larger under $\theta > 0$ than under $\theta = 0$. Thus a large (small) value of $T$ implies that most of the large deviations from 0 are positive (negative). Therefore, a large (small) $T$ is an indicator of positive (negative) $\theta$. It then seems reasonable to reject the null hypothesis against $H_a: \theta > 0$ ($H_a: \theta < 0$) if $T$ tends to be too large (too small). Similarly, both too large and too small values of $T$ indicate possible rejection of the null hypothesis against $H_a: \theta \neq 0$.

 

2 T is distribution free!

 

Now we shall show that the distribution of $T$ does not depend on $F$ under the null hypothesis. Before we proceed further, we introduce the concept of antirank or inverse rank. If $R = (R_1, R_2, \ldots, R_n)$ is a rank vector, the antirank vector is $D = (D_1, D_2, \ldots, D_n)$, provided $R \circ D = (R_{D_1}, R_{D_2}, \ldots, R_{D_n}) = (1, 2, \ldots, n)$.
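The antirank definition can be checked numerically. The following is a minimal sketch with a hypothetical helper `antirank` and a hypothetical rank vector; it uses 1-based ranks and positions, matching the notation of the text.

```python
def antirank(ranks):
    """Return D such that ranks[D[i] - 1] == i + 1 (ranks and D are 1-based)."""
    d = [0] * len(ranks)
    for position, r in enumerate(ranks, start=1):
        d[r - 1] = position  # the observation at `position` carries rank r
    return d


R = [2, 5, 1, 4, 3]  # hypothetical rank vector
D = antirank(R)
print(D)                      # [3, 1, 5, 4, 2]
print([R[d - 1] for d in D])  # [1, 2, 3, 4, 5], i.e. R o D
```

In words, $D_k$ is the index of the observation whose rank is $k$, so composing $R$ with $D$ returns the ranks in their natural order.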

 

Consider an example with $n = 5$. Suppose $R = (3, 2, 4, 1, 5)$; then $D = (4, 2, 1, 3, 5)$, since $R_4 = 1$, $R_2 = 2$, $R_1 = 3$, $R_3 = 4$ and $R_5 = 5$.

 

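Under $H_0$, the symmetry of $F$ about zero implies that $Z_i$ and $|X_i|$ are independent for each $i$. Since $Z_i$ and $|X_j|$ are also independent for any $i \neq j$, the desired independence of $(Z_1, \ldots, Z_n)$ and $(|X_1|, \ldots, |X_n|)$ follows. Now, being a function of $|X_i|$, $i = 1, 2, \ldots, n$, the rank vector $R^+$ is independent of $Z_i$, $i = 1, 2, \ldots, n$, and the antirank vector $D$ of $R^+$ is in turn a function of $R^+$. Thus $Z_i$, $i = 1, 2, \ldots, n$, and $D_i$, $i = 1, 2, \ldots, n$, are independently distributed.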

 

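Here $D$ is a random permutation taking values in $\mathcal{P}$, the set of all $n!$ permutations of $\{1, 2, \ldots, n\}$. Again, under $H_0$, $Z_i \sim \text{Bernoulli}(\tfrac{1}{2})$ for every $i$. Thus, for any $z_1, \ldots, z_n \in \{0, 1\}$, using the independence of the $Z_i$ and $D$,

$$P(Z_{D_1} = z_1, \ldots, Z_{D_n} = z_n) = \sum_{d \in \mathcal{P}} P(Z_{d_1} = z_1, \ldots, Z_{d_n} = z_n)\, P(D = d) = \left(\tfrac{1}{2}\right)^n \sum_{d \in \mathcal{P}} P(D = d) = \left(\tfrac{1}{2}\right)^n .$$

Thus $Z_{D_i}$, $i = 1, 2, \ldots, n$, are iid random variables with the Bernoulli($\tfrac{1}{2}$) distribution.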

 

Hence $T = \sum_{i=1}^{n} i\, Z_{D_i}$ is a weighted sum of the random variables $Z_{D_i}$, $i = 1, 2, \ldots, n$, because the observation with antirank index $D_i$ has rank $i$. Under symmetry about the origin, $Z_{D_i}$, $i = 1, 2, \ldots, n$, are iid Bernoulli($\tfrac{1}{2}$) variables. Thus under $H_0: \theta = 0$, the distribution of the $Z_{D_i}$, and hence the distribution of $T$, does not depend on $F$. Thus $T$ is exactly distribution free under the null hypothesis, and tests based on $T$ are exactly nonparametric.
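The weighted-sum form of $T$ can be verified on data. The sketch below assumes the representation $T = \sum_i i\, Z_{D_i}$, where $D$ is the antirank vector of the ranks of the absolute values; the helper names and the sample are hypothetical and serve only as an illustration.

```python
def ranks_and_antiranks(xs):
    """Return (R_plus, D): ranks of |x_i| and the corresponding antiranks (1-based)."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    r_plus = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r_plus[i] = rank
    d = [i + 1 for i in order]  # D_k = index of the observation with rank k
    return r_plus, d


def t_direct(xs):
    """T as the sum of Z_i * R_i^+."""
    r_plus, _ = ranks_and_antiranks(xs)
    return sum(r for x, r in zip(xs, r_plus) if x > 0)


def t_antirank(xs):
    """T as the weighted sum of k * Z_{D_k}."""
    _, d = ranks_and_antiranks(xs)
    return sum(k for k, dk in enumerate(d, start=1) if xs[dk - 1] > 0)


sample = [1.2, -0.5, 3.1, -2.4, 0.8]  # hypothetical data
print(t_direct(sample), t_antirank(sample))  # 10 10
```

Both forms agree because summing rank times indicator over observations is the same as summing over ranks and looking up the sign of the observation holding each rank.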

 

3 Exact distribution of T

 

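Let $b(t; n)$ denote the number of subsets of $\{1, 2, \ldots, n\}$ whose elements sum to $t$. Since each of the $2^n$ sign patterns is equally likely under $H_0$, $P_{H_0}(T = t) = b(t; n)/2^n$, and conditioning on whether $n$ belongs to the subset gives the recursion relation $b(t; n) = b(t; n-1) + b(t-n; n-1)$, with $b(t; n) = 0$ for $t < 0$. For $n = 2$ we have $b(0; 2) = b(1; 2) = b(2; 2) = b(3; 2) = 1$.

Next assume $n = 3$; the possible values of $T$ are then $0, 1, 2, \ldots, 6$. Clearly $b(0; 3) = b(1; 3) = 1$. Now using the recursion relation and the values of $b(\cdot; 2)$, we get $b(2; 3) = b(2; 2) + b(-1; 2) = 1 + 0 = 1$, $b(3; 3) = b(3; 2) + b(0; 2) = 1 + 1 = 2$, $b(4; 3) = b(4; 2) + b(1; 2) = 0 + 1 = 1$, $b(5; 3) = b(5; 2) + b(2; 2) = 0 + 1 = 1$ and $b(6; 3) = b(6; 2) + b(3; 2) = 0 + 1 = 1$. Then we have the following distribution:

t          :  0    1    2    3    4    5    6
P(T = t)   : 1/8  1/8  1/8  2/8  1/8  1/8  1/8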
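The recursion for $b(t; n)$ translates directly into code. The sketch below assumes that $b(t; n)$ counts the subsets of $\{1, \ldots, n\}$ with sum $t$ and that $b(t; n) = b(t; n-1) + b(t-n; n-1)$, as used in the computations for $n = 3$; the function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def b(t, n):
    """Number of subsets of {1, ..., n} whose elements sum to t."""
    if t < 0:
        return 0
    if n == 0:
        return 1 if t == 0 else 0  # only the empty subset, with sum 0
    # either n is left out of the subset, or n is included
    return b(t, n - 1) + b(t - n, n - 1)


def null_counts(n):
    """Counts b(t; n) for t = 0, 1, ..., n(n+1)/2."""
    return [b(t, n) for t in range(n * (n + 1) // 2 + 1)]


print(null_counts(3))  # [1, 1, 1, 2, 1, 1, 1]
```

Dividing each count by $2^n$ gives the null probabilities of $T$; for $n = 3$ the counts sum to $8 = 2^3$.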

3.1 Symmetry of the distribution of T

 

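If we look at the distribution for $n = 3$, we can observe symmetry about $n(n + 1)/4 = 3$. Define $\mu_0 = n(n + 1)/4$; then $P_{H_0}(T = \mu_0 + t) = P_{H_0}(T = \mu_0 - t)$ for every $t$. This holds because replacing each $Z_{D_i}$ by $1 - Z_{D_i}$ leaves the null distribution unchanged while mapping $T$ to $n(n + 1)/2 - T$.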

4  Different Tests

 

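$T$ has a discrete distribution and hence exact size $\alpha$ tests based on it will, in general, be randomized. We list below the tests for different alternatives. For the alternative $H_a: \theta > 0$, a size $\alpha$ test can be expressed as $\phi_0 = I(T > T_\alpha) + a\, I(T = T_\alpha)$, where $T_\alpha$ and $a \in [0, 1)$ are chosen so that $P_{H_0}(T > T_\alpha) + a\, P_{H_0}(T = T_\alpha) = \alpha$.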
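A minimal sketch of how the cut-off and the randomization constant of the upper-tailed test could be computed from the exact null distribution. It assumes the size condition $P_{H_0}(T > T_\alpha) + a\, P_{H_0}(T = T_\alpha) = \alpha$; the function names `null_pmf` and `upper_randomized_test` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def null_pmf(n):
    """Exact null pmf of T as a list indexed by t = 0, ..., n(n+1)/2."""
    counts = [1]  # n = 0: only the empty sum
    for k in range(1, n + 1):
        new = counts + [0] * k
        for t in range(len(counts)):
            new[t + k] += counts[t]  # subsets that include k
        counts = new
    return [Fraction(c, 2 ** n) for c in counts]


def upper_randomized_test(n, alpha):
    """Return (T_alpha, a) with P(T > T_alpha) + a * P(T = T_alpha) = alpha."""
    pmf = null_pmf(n)
    alpha = Fraction(alpha)
    tail = Fraction(0)  # running value of P(T > c)
    for c in range(len(pmf) - 1, -1, -1):
        if tail + pmf[c] > alpha:
            return c, (alpha - tail) / pmf[c]
        tail += pmf[c]
    return -1, Fraction(0)  # alpha >= 1: reject for every value of T


print(upper_randomized_test(3, Fraction(1, 5)))  # (5, Fraction(3, 5))
```

For $n = 3$ and $\alpha = 1/5$, the test rejects when $T = 6$ and rejects with probability $3/5$ when $T = 5$, giving size $1/8 + (3/5)(1/8) = 1/5$.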

 

 

4.1 p values

 

Suppose $T_{obs}$ is the observed value of $T$. Then for the alternative $H_a: \theta > 0$, the one-sided p value is $P_{H_0}(T \geq T_{obs})$. We accept the null hypothesis if this p value exceeds $\alpha$. For the alternative $H_a: \theta < 0$, the corresponding one-sided p value is $P_{H_0}(T \leq T_{obs})$, and we accept the null hypothesis if this p value exceeds $\alpha$. For the alternative $H_a: \theta \neq 0$, the two-sided p value is $2 \min\{P_{H_0}(T \leq T_{obs}), P_{H_0}(T \geq T_{obs})\}$, and we reject the null hypothesis if this p value does not exceed $\alpha$.
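The three p values can be computed exactly from the null distribution of $T$. The following sketch uses hypothetical function names and assumes no zeros or ties; capping the two-sided value at 1 is an extra convention adopted here, not something stated in the text.

```python
from fractions import Fraction


def null_pmf(n):
    """Exact null pmf of T as a list indexed by t = 0, ..., n(n+1)/2."""
    counts = [1]
    for k in range(1, n + 1):
        new = counts + [0] * k
        for t in range(len(counts)):
            new[t + k] += counts[t]  # subsets that include k
        counts = new
    return [Fraction(c, 2 ** n) for c in counts]


def p_values(t_obs, n):
    """Return (upper, lower, two_sided) exact p values for observed T = t_obs."""
    pmf = null_pmf(n)
    upper = sum(pmf[t_obs:])      # P(T >= t_obs), for H_a: theta > 0
    lower = sum(pmf[:t_obs + 1])  # P(T <= t_obs), for H_a: theta < 0
    # Assumption: cap at 1 so the result is a valid probability.
    two_sided = min(Fraction(1), 2 * min(upper, lower))
    return upper, lower, two_sided


print(p_values(5, 3))  # (Fraction(1, 4), Fraction(7, 8), Fraction(1, 2))
```

With $n = 3$ and $T_{obs} = 5$, the upper-tail p value is $P(T \geq 5) = 2/8$, so even the most extreme outcomes cannot reach conventional significance levels for such a small sample.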

 

4.2 Presence of ties

 

The presence of zero and tied observations undermines the validity of the test. Pratt (1959) recommended a modification of $T$ in such a situation. He suggested retaining the zeros when ranking the absolute values and using average ranks for the tied observations in the modified statistic.

 

To be specific, suppose $0 < u_1 < u_2 < \ldots < u_m$ are the distinct nonzero absolute magnitudes of the data $X_1, X_2, \ldots, X_n$. Suppose $f_0$ is the frequency of zeros in the data, $f_i^+$ ($f_i^-$) is the frequency of the value $u_i$ ($-u_i$), and $f_i = f_i^+ + f_i^-$ is the total frequency of the magnitude $u_i$ in the data. Then Pratt suggested using the statistic $T^* = \sum_{i=1}^{m} w_i f_i^+$, where $w_j = f_0 + f_1 + \ldots + f_{j-1} + (f_j + 1)/2$ is the average rank shared by the observations with magnitude $u_j$. However, only large sample tests are suggested, based on the asymptotic normality of the standardised statistic.
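The modified statistic can be computed directly from the frequencies. The sketch below follows the weights $w_j = f_0 + f_1 + \ldots + f_{j-1} + (f_j + 1)/2$ given above; the function name `pratt_statistic` and the data are hypothetical, and only the statistic itself is computed, not its large-sample standardisation.

```python
from collections import Counter


def pratt_statistic(xs):
    """Sum of w_i * f_i^+ with average ranks, zeros kept in the ranking."""
    f0 = sum(1 for x in xs if x == 0)             # frequency of zeros
    freq = Counter(abs(x) for x in xs if x != 0)  # f_i for each magnitude u_i
    freq_pos = Counter(x for x in xs if x > 0)    # f_i^+ for each magnitude u_i
    below = f0  # f_0 + f_1 + ... + f_{j-1}
    total = 0.0
    for u in sorted(freq):
        w = below + (freq[u] + 1) / 2  # average rank of the tied group at u
        total += w * freq_pos[u]
        below += freq[u]
    return total


# Hypothetical data with one zero and two tied groups.
data = [0, 1, -1, 2, 3, 3, -4]
print(pratt_statistic(data))  # 17.5
```

Here the weights are 2.5, 4, 5.5 and 7 for the magnitudes 1, 2, 3 and 4, and the positive frequencies are 1, 1, 2 and 0, which gives $2.5 + 4 + 11 = 17.5$. When there are no zeros and no ties, every $f_i = 1$ and the statistic reduces to the usual $T$.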
