class documentation

class TestHistogramOptimBinNums: (source)


Provide test coverage for the provided estimators of the optimal number of histogram bins.

Method test_empty Undocumented
Method test_incorrect_methods Check that a ValueError is raised when an unknown estimator string is passed in
Method test_limited_variance Check that when the IQR is 0 but the variance is nonzero, the sturges value is returned rather than the fd value.
Method test_novariance Check that methods handle data with no variance. Primarily relevant for Scott and FD, as the SD and IQR are both 0 in this case.
Method test_outlier Check the FD, Scott and Doane estimators with outliers.
Method test_scott_vs_stone Verify that Scott's rule and Stone's rule converge for normally distributed data.
Method test_signed_integer_data Undocumented
Method test_simple Straightforward testing with a mixture of linspace data (for consistency). All test values have been precomputed and the values shouldn't change.
Method test_simple_range Straightforward testing with a mixture of linspace data (for consistency). Adding in a 3rd mixture that will then be completely ignored. All test values have been precomputed and the values shouldn't change.
Method test_simple_weighted Check that weighted data raises a TypeError.
Method test_small Smaller datasets have the potential to cause issues with the data-adaptive methods, especially the FD method. All bin numbers have been precalculated.
def test_empty(self): (source)

Undocumented

def test_incorrect_methods(self): (source)

Check that a ValueError is raised when an unknown estimator string is passed in.
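The behaviour under test can be sketched with `np.histogram_bin_edges` directly; the estimator name `'mean'` below is just an arbitrary string that is not a supported estimator:

```python
import numpy as np

data = np.arange(100)
try:
    # 'mean' is not one of the supported estimator names
    # ('auto', 'fd', 'doane', 'scott', 'stone', 'rice', 'sturges', 'sqrt')
    np.histogram_bin_edges(data, bins='mean')
    raised = False
except ValueError:
    raised = True

assert raised  # unknown estimator strings are rejected
```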

def test_limited_variance(self): (source)

Check that when the IQR is 0 but the variance is nonzero, the sturges value is returned rather than the fd value.
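The scenario can be reproduced with mostly-constant data whose IQR is zero but whose variance is not; the exact array below is illustrative, not necessarily the one the test uses:

```python
import numpy as np

# mostly-constant data: the 25th and 75th percentiles coincide (IQR = 0),
# but the few extreme values give a nonzero variance
data = np.ones(1000)
data[:3] = 0
data[-4:] = 100

# 'fd' degenerates when the IQR is 0, so 'auto' should fall back to 'sturges'
edges_auto = np.histogram_bin_edges(data, bins='auto')
edges_sturges = np.histogram_bin_edges(data, bins='sturges')
assert len(edges_auto) == len(edges_sturges)
```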

def test_novariance(self): (source)

Check that methods handle data with no variance. Primarily relevant for Scott and FD, as the SD and IQR are both 0 in this case.
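For constant data every bin-width estimate degenerates to zero, and NumPy falls back to a single bin. A minimal sketch (the choice of estimators looped over here is mine):

```python
import numpy as np

# constant data: standard deviation and IQR are both 0
data = np.ones(100)
for estimator in ('fd', 'scott', 'rice', 'sturges', 'doane', 'sqrt', 'auto'):
    hist, edges = np.histogram(data, bins=estimator)
    assert len(hist) == 1  # every estimator falls back to a single bin
```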

def test_outlier(self): (source)

Check the FD, Scott and Doane estimators with outliers.

The FD estimates a smaller binwidth since it's less affected by outliers. Since the range is so (artificially) large, this means more bins, most of which will be empty, but the data of interest usually is unaffected. The Scott estimator is more affected and returns fewer bins, despite most of the variance being in one area of the data. The Doane estimator lies somewhere between the other two.
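The FD-versus-Scott contrast can be sketched with a synthetic sample; the seed, sample size, and outlier values below are arbitrary choices, not the test's actual data:

```python
import numpy as np

rng = np.random.RandomState(42)
# a tight normal cluster plus two far-away outliers that inflate the range
data = np.concatenate([rng.normal(0, 1, size=1000), [50.0, 60.0]])

n_fd = len(np.histogram_bin_edges(data, bins='fd')) - 1
n_scott = len(np.histogram_bin_edges(data, bins='scott')) - 1

# FD's IQR-based width ignores the outliers, so over the artificially wide
# range it yields many (mostly empty) bins; Scott's SD-based width is
# inflated by the outliers and yields fewer bins
assert n_fd > n_scott
```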

def test_scott_vs_stone(self): (source)

Verify that Scott's rule and Stone's rule converge for normally distributed data.
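A rough illustration of the convergence on a single large normal sample; the seed, sample size, and the loose factor-of-two tolerance are all arbitrary assumptions of this sketch:

```python
import numpy as np

rng = np.random.RandomState(0)
data = rng.normal(loc=0.0, scale=2.0, size=5000)

n_scott = len(np.histogram_bin_edges(data, bins='scott')) - 1
n_stone = len(np.histogram_bin_edges(data, bins='stone')) - 1

# for large normal samples the two rules should give similar bin counts
# (the factor-of-two tolerance is deliberately loose)
assert 0.5 < n_stone / n_scott < 2.0
```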

@pytest.mark.parametrize('bins', ['auto', 'fd', 'doane', 'scott', 'stone', 'rice', 'sturges'])
def test_signed_integer_data(self, bins): (source)

Undocumented

def test_simple(self): (source)

Straightforward testing with a mixture of linspace data (for consistency). All test values have been precomputed and the values shouldn't change.
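Precomputed values are easy to sanity-check for the estimators that depend only on the sample size; for Sturges' rule the bin count is ceil(log2(n) + 1) regardless of the data's shape. A sketch (the particular n and linspace data are my choices):

```python
import numpy as np

n = 5000
data = np.linspace(0, 10, n)

n_sturges = len(np.histogram_bin_edges(data, bins='sturges')) - 1
# Sturges' bin count depends only on n: ceil(log2(n) + 1)
assert n_sturges == int(np.ceil(np.log2(n) + 1))
```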

def test_simple_range(self): (source)

Straightforward testing with a mixture of linspace data (for consistency). Adding in a 3rd mixture that will then be completely ignored. All test values have been precomputed and the values shouldn't change.

def test_simple_weighted(self): (source)

Check that weighted data raises a TypeError.
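A minimal sketch of the behaviour: combining a string estimator with the `weights` argument is rejected, because automated bin estimation is not defined for weighted data. The particular arrays below are illustrative:

```python
import numpy as np

data = np.arange(10)
weights = np.full(10, 0.5)
try:
    # estimating the bin count from a string is not supported with weights
    np.histogram(data, bins='auto', weights=weights)
    raised = False
except TypeError:
    raised = True

assert raised
```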

def test_small(self): (source)

Smaller datasets have the potential to cause issues with the data-adaptive methods, especially the FD method. All bin numbers have been precalculated.
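The extreme small-data case is a single-element dataset, where every width estimate degenerates and NumPy collapses to one bin; a sketch (the estimator list is my choice):

```python
import numpy as np

# a single-element dataset: every estimator should give exactly one bin
for estimator in ('fd', 'scott', 'rice', 'sturges', 'doane', 'sqrt', 'auto'):
    hist, edges = np.histogram(np.array([1.0]), bins=estimator)
    assert len(hist) == 1
```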