KS-Disc is a Python library for the discrete version of the Kolmogorov-Smirnov test. For reasons hidden from us lowly mortals, it is not included in SciPy, so I put together a simple version of it.
It runs on Python 3, and is installed using pip:
pip install ksdisc
It can perform a 1-sample test, in which a sample is compared against an analytical function. It can also perform a 2-sample test, in which two samples are compared against each other using a permutation test.
The 1-sample test takes an observed sample and a CDF. It returns the p-value for the null hypothesis that the sample was drawn from the given distribution.
ks_disc(y, cdf)
y is an n-length array, where each element is a sampled number drawn from a distribution.
cdf is a function that takes a number as input and returns the cumulative probability at that number. Note that it must be non-decreasing, with values between zero and one.
from ksdisc import ks_disc
from random import randint
# 1-sample test
y = [randint(1, 3) for _ in range(20)] # Sample: discrete uniform on {1, 2, 3}
_cdf = lambda x: 0.0 if x < 0 else min(0.25*x, 1.0) # Reference CDF: discrete uniform on {1, 2, 3, 4}, evaluated at integer x
out = ks_disc(y, _cdf)
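For distributions that are not as easy to write as a one-line lambda, the CDF can be built from a probability mass function. The following is a minimal sketch: make_cdf is a hypothetical helper and not part of ksdisc, and it assumes the distribution is given as a dict that maps each value to its probability.

```python
from bisect import bisect_right
from itertools import accumulate


def make_cdf(pmf):
    """Build a step-function CDF from a {value: probability} dict.

    Hypothetical helper for illustration; it is not part of ksdisc.
    """
    values = sorted(pmf)
    if abs(sum(pmf.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Running totals: cumulative[i] = P(X <= values[i])
    cumulative = list(accumulate(pmf[v] for v in values))

    def cdf(x):
        # Count how many support points are <= x
        i = bisect_right(values, x)
        if i == 0:
            return 0.0
        if i == len(values):
            return 1.0  # guard against floating-point drift in the last total
        return cumulative[i - 1]

    return cdf


# Discrete uniform distribution on {1, 2, 3, 4}
uniform_1_to_4 = make_cdf({k: 0.25 for k in range(1, 5)})
```

The returned function is non-decreasing and stays between zero and one, so it should be usable as the cdf argument described above.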
The 2-sample test takes two observed samples. It returns the p-value for the null hypothesis that both samples were drawn from the same distribution.
ks_disc_2sample(samples1, samples2)
samples1 is an n-length array, where each element is a sampled number drawn from a distribution.
samples2 is an m-length array, where each element is a sampled number drawn from a distribution.
from ksdisc import ks_disc_2sample
from random import randint, random
# 2-sample test
samples1 = [randint(1, 15) for _ in range(1000)]
samples2 = [randint(1, 15) if random()<0.95 else 3 for _ in range(1000)]
out = ks_disc_2sample(samples1, samples2)
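To give an idea of what a permutation test for two samples involves, here is a rough standalone sketch. It is not necessarily how ksdisc implements the test: ks_statistic, permutation_pvalue, n_permutations and seed are hypothetical names chosen for illustration. The idea is to compute the largest gap between the two empirical CDFs, then check how often a random relabelling of the pooled data gives a gap at least as large.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    support = sorted(set(a_sorted) | set(b_sorted))
    return max(
        abs(bisect_right(a_sorted, v) / len(a_sorted)
            - bisect_right(b_sorted, v) / len(b_sorted))
        for v in support
    )


def permutation_pvalue(a, b, n_permutations=1000, seed=None):
    """Estimate the p-value by shuffling the pooled samples (illustrative only)."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Split the shuffled pool into two groups of the original sizes
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_permutations + 1)
```

Because the p-value is estimated from random shuffles, it varies slightly between runs unless a seed is fixed, and its resolution is limited by the number of permutations.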