Robust Hypothesis Testing via Lq-Likelihood
Yichen Qin and Carey E. Priebe




Abstract:
In this article, we introduce a robust testing procedure --- the Lq-Likelihood Ratio test (LqLR) --- and show that, for the special case of testing the location parameter of a symmetric distribution in the presence of gross error contamination, our test dominates the Wilcoxon-Mann-Whitney test at all levels of contamination.


Keywords:
Lq-likelihood, relative efficiency, robustness, hypothesis testing.


Our Test:

Given a data set $x_1, \dots, x_n$ following a distribution $f(x; \theta)$, in order to test the hypotheses $H_0: \theta \in \Theta_0$ versus $H_1: \theta \in \Theta \setminus \Theta_0$, we propose our Lq-likelihood ratio (LqLR) test statistic to be

$$D_q(x) = 2 \sup_{\theta \in \Theta} \sum_{i=1}^{n} L_q(f(x_i; \theta)) - 2 \sup_{\theta \in \Theta_0} \sum_{i=1}^{n} L_q(f(x_i; \theta)),$$

where $L_q(u) = (u^{1-q} - 1)/(1 - q)$ with $q < 1$.  In the article, we have proved the robustness properties of the proposed LqLR test statistic via analysis of its asymptotic distribution.  By adaptively selecting q via the methodology described in the paper, the LqLR test protects both size and power when gross error contamination is introduced into the data.
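
To make the construction concrete, here is a minimal Python sketch of an Lq-likelihood ratio statistic for a simple null value theta0. It rests on several assumptions that are not taken from the paper: the model is normal with known unit variance, q is fixed by the caller rather than selected adaptively, and the maximizer is found by a simple reweighting fixed-point iteration started at the sample median (which finds a local maximizer only). The function names are hypothetical, and the sketch does not cover how critical values are obtained.

```python
import math
import statistics

def lq(u, q):
    """Lq function: (u^(1-q) - 1) / (1 - q); equals log(u) when q = 1."""
    if q == 1.0:
        return math.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)

def normal_pdf(x, theta, sigma=1.0):
    """Density of N(theta, sigma^2) at x."""
    z = (x - theta) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def lq_likelihood(xs, theta, q, sigma=1.0):
    """Sum of Lq(f(x_i; theta)) over the sample."""
    return sum(lq(normal_pdf(x, theta, sigma), q) for x in xs)

def mlqe_location(xs, q, sigma=1.0, iters=200, tol=1e-10):
    """Local maximizer of the Lq-likelihood in theta (assumed known sigma).

    Setting the derivative to zero gives a weighted mean with weights
    f(x_i; theta)^(1-q), which is iterated from the sample median.
    """
    theta = statistics.median(xs)
    for _ in range(iters):
        w = [normal_pdf(x, theta, sigma) ** (1.0 - q) for x in xs]
        new = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta

def lqlr_statistic(xs, theta0, q, sigma=1.0):
    """Twice the gap between the maximized and the null Lq-likelihood."""
    theta_hat = mlqe_location(xs, q, sigma)
    return 2.0 * (lq_likelihood(xs, theta_hat, q, sigma)
                  - lq_likelihood(xs, theta0, q, sigma))

if __name__ == "__main__":
    data = [-0.5, -0.2, 0.0, 0.1, 0.4, 50.0]  # one gross outlier
    print(mlqe_location(data, q=0.9), statistics.mean(data))
    print(lqlr_statistic(data, theta0=0.0, q=0.9))
```

With q = 1 the weights are all equal, the estimate is the sample mean, and the statistic reduces to the usual log-likelihood ratio for this model. With q < 1, observations with low density under the current fit are downweighted, which is the source of the robustness.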


Our Main Results:

Suppose we want to test hypotheses about a location parameter $\theta$, with null value $\theta = 0$.  We conduct the following experiment to show the advantage of the LqLR.  With a sample size of n = 50, we simulate data from a gross error model $h(x)$, a two-component mixture with contamination proportion $\epsilon$ in which $\varphi$ is a normal distribution with mean $\theta$; the first component in $h(x)$ is our "idealized" model, and the second component in $h(x)$ is the contamination.

At different levels of contamination $\epsilon$, we first set $\theta$ to 0 and generate 3000 data sets to calculate the sizes of four tests: (1) the Lq-likelihood ratio test (LqLR), (2) the t test (i.e., the log-likelihood ratio test), (3) the Wilcoxon test, and (4) the sign test.  We then change $\theta$ to 0.34 and repeat the procedure to calculate the powers of these tests.  The results are displayed in the following figure.
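
The Monte Carlo procedure described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the code behind the reported results: the contaminating component is taken to be normal with the same mean and a standard deviation of 10 (the value `contam_sd=10.0` is hypothetical), the tests are two-sided at level 0.05, and only the t test and the sign test are included, since the LqLR additionally requires the selection of q and the critical values described in the paper. All function names are hypothetical.

```python
import math
import random

def sample_gross_error(n, theta, eps, contam_sd, rng):
    """Draw n points from (1 - eps) N(theta, 1) + eps N(theta, contam_sd^2)."""
    return [rng.gauss(theta, contam_sd if rng.random() < eps else 1.0)
            for _ in range(n)]

def t_test_rejects(xs, theta0=0.0, crit=2.0096):
    """Two-sided one-sample t test.

    crit is the 0.975 quantile of the t distribution with 49 degrees of
    freedom, so this default is only valid for n = 50.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return abs(mean - theta0) / math.sqrt(var / n) > crit

def sign_test_rejects(xs, theta0=0.0, alpha=0.05):
    """Two-sided exact sign test based on the Binomial(n, 1/2) distribution."""
    n = len(xs)
    k = sum(x > theta0 for x in xs)
    tail = sum(math.comb(n, j) for j in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail) <= alpha

def rejection_rate(test, theta, eps, n=50, reps=3000, contam_sd=10.0, seed=0):
    """Fraction of simulated data sets on which the test rejects theta0 = 0."""
    rng = random.Random(seed)
    hits = sum(test(sample_gross_error(n, theta, eps, contam_sd, rng))
               for _ in range(reps))
    return hits / reps

if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.2):
        for name, test in (("t", t_test_rejects), ("sign", sign_test_rejects)):
            size = rejection_rate(test, theta=0.0, eps=eps)    # theta = 0: size
            power = rejection_rate(test, theta=0.34, eps=eps)  # theta = 0.34: power
            print(f"eps={eps:.1f} {name:>4}: size={size:.3f} power={power:.3f}")
```

Calling `rejection_rate` with theta = 0 estimates the size and with theta = 0.34 estimates the power, mirroring the two steps of the experiment; the numbers it produces depend on the assumed contamination scale and should not be read as a reproduction of the figure.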

[Figure: Comparison of the sizes and powers of the four tests across contamination levels.]

First note that the sizes of all tests are successfully controlled at 0.05.

At zero contamination (i.e., $\epsilon = 0$), the t test (log-likelihood ratio) has the highest power.  The LqLR has almost the same power (only slightly less than the t test).  The Wilcoxon and sign tests have the third and fourth highest powers, which are not comparable to those of the two likelihood ratio tests.  As the contamination becomes more serious (i.e., $\epsilon$ increases away from 0), the t test degrades the fastest: its power quickly drops below that of all the other tests.  The Wilcoxon test and the sign test both show good robustness, and their powers degrade at much slower rates.  The LqLR, however, shows remarkable robustness.  It degrades more slowly than the Wilcoxon test (i.e., the blue curve is flatter than the green curve), and only slightly faster than the sign test (i.e., the blue curve is steeper than the maroon curve).

Since the power of the LqLR at $\epsilon = 0$ is above that of the Wilcoxon test and the sign test, the power of the LqLR dominates both the Wilcoxon test and the sign test at all levels of contamination!  This implies that not only does the LqLR preserve efficiency almost perfectly at $\epsilon = 0$, but it also attains robustness comparable to that of these nonparametric tests, which are known to be very robust.  We conclude that, by giving up a little efficiency at $\epsilon = 0$, we gain great robustness at $\epsilon > 0$.  Our LqLR can be considered a combination of the log-likelihood ratio test (at $\epsilon = 0$) and the nonparametric tests (at $\epsilon > 0$).  The reason our test beats the nonparametric tests uniformly is that we can control the amount of information to use by selecting q, whereas the Wilcoxon test always uses the rank information, and the sign test always uses only the information about whether each data point is below or above the hypothesized mean.


Conclusions:
We have introduced a robust testing procedure --- the Lq-likelihood ratio test (LqLR) --- and demonstrated its advantage over the traditional likelihood ratio test (the t test), the Wilcoxon test, and the sign test in the context of the gross error model.

To the extent that the robustness of the Wilcoxon test (minimum asymptotic relative efficiency (ARE) of the Wilcoxon test vs the t test is 0.864) suggests that the Wilcoxon test should be the default test of choice (rather than “use Wilcoxon if there is evidence of non-normality,” the default position should be “use Wilcoxon unless there is good reason to believe the normality assumption”), these new results suggest that the LqLR test should become the new default go-to test for practitioners everywhere!
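
For reference, the 0.864 figure quoted above is the classical lower bound on the asymptotic relative efficiency of the Wilcoxon test relative to the t test. Writing $f$ for the underlying density and $\sigma^2$ for its variance (notation introduced here for illustration), the standard result for location problems is

$$\mathrm{ARE}(W, t) = 12 \sigma^2 \left( \int f(x)^2 \, dx \right)^2 \ge \frac{108}{125} = 0.864,$$

and at the normal distribution, where $\int f(x)^2 \, dx = 1/(2\sigma\sqrt{\pi})$, the value is $3/\pi \approx 0.955$. In other words, the Wilcoxon test gives up at most about 14% efficiency against the t test in the worst case and less than 5% under exact normality.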

