Given N ordered data points Y1, Y2, …, YN, the empirical distribution function is defined as EN = ni/N, where ni is the number of points less than Yi and the Yi are ordered from the smallest to the largest value. This is a step function that increases by 1/N at the value of each ordered data point. The null hypothesis is that the dataset follows a specified distribution, while the alternative hypothesis is that the dataset does not follow the specified distribution. The hypothesis is tested using the KS statistic, defined as

$$KS = \max_{1 \le i \le N} \left( F(Y_i) - \frac{i-1}{N},\ \frac{i}{N} - F(Y_i) \right)$$

where F is the theoretical cumulative distribution function of the continuous distribution being tested, which must be fully specified (i.e., location, scale, and shape parameters cannot be estimated from the data).
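The calculation can be illustrated with a minimal Python sketch. It assumes the common form of the statistic, in which the gap between F and the step function is checked on both sides of each ordered data point and the largest gap is kept. The function names `normal_cdf` and `ks_statistic`, the choice of a standard normal as the hypothesized distribution, and the sample values are all illustrative assumptions, not part of the original text.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a fully specified normal distribution (parameters fixed in advance)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(data, cdf):
    """Largest distance between the empirical step function and the given CDF.

    For each ordered point Y_i the step function jumps from (i - 1)/N to i/N,
    so both sides of the jump are compared against F(Y_i).
    """
    ys = sorted(data)  # order from smallest to largest
    n = len(ys)
    if n == 0:
        raise ValueError("data must contain at least one point")
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = cdf(y)
        d = max(d, f - (i - 1) / n, i / n - f)
    return d


if __name__ == "__main__":
    # Made-up sample, used only to show the call; not data from the text.
    sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
    print(ks_statistic(sample, normal_cdf))
```

Passing the CDF as a function keeps the requirement visible that the distribution is fully specified before the test: the parameters are arguments chosen in advance, not values fitted to `data`.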
The hypothesis regarding the distributional form is rejected if the test statistic, KS, is greater than the critical value obtained from the table below. Note that critical values of about 0.03 to 0.05 are the most common (at the 1%, 5%, and 10% significance levels). A calculated KS statistic below the applicable critical value means that the null hypothesis is not rejected, which is taken to indicate that the distribution is a good fit. There are several variations of these tables that use somewhat different scaling for the KS test statistic and critical regions. These alternative formulations should be equivalent, but the test statistic must be calculated in a way that is consistent with how the critical values were tabulated. As a rule of thumb, however, a KS test statistic less than 0.03 or 0.05 indicates a good fit.
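The decision rule can be sketched in Python as follows. Because the table itself is not reproduced here, the sketch substitutes the standard large-sample approximation for the critical value, sqrt(-ln(alpha/2)/2) / sqrt(N), which is an assumption and may differ slightly from tabulated values, especially for small N. The function names `ks_critical_value` and `reject_null` are hypothetical.

```python
import math


def ks_critical_value(n, alpha=0.05):
    """Approximate critical value for the unscaled KS statistic.

    Assumption: large-sample approximation c(alpha) / sqrt(n), with
    c(alpha) = sqrt(-ln(alpha / 2) / 2). This is not the table from the text.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha / math.sqrt(n)


def reject_null(ks, n, alpha=0.05):
    """True if the hypothesized distribution is rejected at level alpha."""
    return ks > ks_critical_value(n, alpha)


if __name__ == "__main__":
    # Show how the approximate critical value changes with sample size.
    for n in (100, 1000, 2000):
        for alpha in (0.10, 0.05, 0.01):
            print(n, alpha, round(ks_critical_value(n, alpha), 4))
```

Under this approximation the critical value shrinks as the sample size grows, so values in the 0.03 to 0.05 range correspond to samples on the order of a thousand points or more; for much smaller samples the applicable critical value is larger.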