class documentation

class TestHistogramOptimBinNums: (source)


Provide test coverage for the provided estimators of the optimal number of histogram bins.

Method test_empty Undocumented
Method test_incorrect_methods Check that a ValueError is raised when an unknown estimator string is passed in
Method test_limited_variance Check that when the IQR is 0 but the variance is nonzero, the sturges value is returned rather than the fd value.
Method test_novariance Check that methods handle data with no variance. Primarily relevant for Scott and FD, as the SD and IQR are both 0 in this case.
Method test_outlier Check the FD, Scott and Doane estimators with outliers.
Method test_scott_vs_stone Verify that Scott's rule and Stone's rule converge for normally distributed data.
Method test_signed_integer_data Undocumented
Method test_simple Straightforward testing with a mixture of linspace data (for consistency). All test values have been precomputed and the values shouldn't change.
Method test_simple_range Straightforward testing with a mixture of linspace data (for consistency). Adding in a 3rd mixture that will then be completely ignored. All test values have been precomputed and the values shouldn't change.
Method test_simple_weighted Check that weighted data raises a TypeError.
Method test_small Smaller datasets have the potential to cause issues with the data-adaptive methods, especially the FD method. All bin numbers have been precalculated.
def test_empty(self): (source)

Undocumented

def test_incorrect_methods(self): (source)

Check that a ValueError is raised when an unknown estimator string is passed in.
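The behaviour under test can be sketched with `np.histogram_bin_edges` directly; the estimator name `'mean'` below is just an arbitrary string that is not a supported estimator:

```python
import numpy as np

data = np.arange(100)
try:
    # 'mean' is not one of the supported estimator names
    # ('auto', 'fd', 'doane', 'scott', 'stone', 'rice', 'sturges', 'sqrt')
    np.histogram_bin_edges(data, bins='mean')
    raised = False
except ValueError:
    raised = True

assert raised  # unknown estimator strings are rejected
```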

def test_limited_variance(self): (source)

Check that when the IQR is 0 but the variance is nonzero, the sturges value is returned rather than the fd value.
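The scenario can be reproduced with mostly-constant data whose IQR is zero but whose variance is not; the exact array below is illustrative, not necessarily the one the test uses:

```python
import numpy as np

# mostly-constant data: the 25th and 75th percentiles coincide (IQR = 0),
# but the few extreme values give a nonzero variance
data = np.ones(1000)
data[:3] = 0
data[-4:] = 100

# 'fd' degenerates when the IQR is 0, so 'auto' should fall back to 'sturges'
edges_auto = np.histogram_bin_edges(data, bins='auto')
edges_sturges = np.histogram_bin_edges(data, bins='sturges')
assert len(edges_auto) == len(edges_sturges)
```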

def test_novariance(self): (source)

Check that methods handle data with no variance. Primarily relevant for Scott and FD, as the SD and IQR are both 0 in this case.
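For constant data every bin-width estimate degenerates to zero, and NumPy falls back to a single bin. A minimal sketch (the choice of estimators looped over here is mine):

```python
import numpy as np

# constant data: standard deviation and IQR are both 0
data = np.ones(100)
for estimator in ('fd', 'scott', 'rice', 'sturges', 'doane', 'sqrt', 'auto'):
    hist, edges = np.histogram(data, bins=estimator)
    assert len(hist) == 1  # every estimator falls back to a single bin
```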

def test_outlier(self): (source)

Check the FD, Scott and Doane estimators with outliers.

The FD estimates a smaller binwidth since it's less affected by outliers. Since the range is so (artificially) large, this means more bins, most of which will be empty, but the data of interest usually is unaffected. The Scott estimator is more affected and returns fewer bins, despite most of the variance being in one area of the data. The Doane estimator lies somewhere between the other two.
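The FD-versus-Scott contrast can be sketched with a synthetic sample; the seed, sample size, and outlier values below are arbitrary choices, not the test's actual data:

```python
import numpy as np

rng = np.random.RandomState(42)
# a tight normal cluster plus two far-away outliers that inflate the range
data = np.concatenate([rng.normal(0, 1, size=1000), [50.0, 60.0]])

n_fd = len(np.histogram_bin_edges(data, bins='fd')) - 1
n_scott = len(np.histogram_bin_edges(data, bins='scott')) - 1

# FD's IQR-based width ignores the outliers, so over the artificially wide
# range it yields many (mostly empty) bins; Scott's SD-based width is
# inflated by the outliers and yields fewer bins
assert n_fd > n_scott
```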

def test_scott_vs_stone(self): (source)

Verify that Scott's rule and Stone's rule converge for normally distributed data.
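A rough illustration of the convergence on a single large normal sample; the seed, sample size, and the loose factor-of-two tolerance are all arbitrary assumptions of this sketch:

```python
import numpy as np

rng = np.random.RandomState(0)
data = rng.normal(loc=0.0, scale=2.0, size=5000)

n_scott = len(np.histogram_bin_edges(data, bins='scott')) - 1
n_stone = len(np.histogram_bin_edges(data, bins='stone')) - 1

# for large normal samples the two rules should give similar bin counts
# (the factor-of-two tolerance is deliberately loose)
assert 0.5 < n_stone / n_scott < 2.0
```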

@pytest.mark.parametrize('bins', ['auto', 'fd', 'doane', 'scott', 'stone', 'rice', 'sturges'])
def test_signed_integer_data(self, bins): (source)

Undocumented

def test_simple(self): (source)

Straightforward testing with a mixture of linspace data (for consistency). All test values have been precomputed and the values shouldn't change.
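Precomputed values are easy to sanity-check for the estimators that depend only on the sample size; for Sturges' rule the bin count is ceil(log2(n) + 1) regardless of the data's shape. A sketch (the particular n and linspace data are my choices):

```python
import numpy as np

n = 5000
data = np.linspace(0, 10, n)

n_sturges = len(np.histogram_bin_edges(data, bins='sturges')) - 1
# Sturges' bin count depends only on n: ceil(log2(n) + 1)
assert n_sturges == int(np.ceil(np.log2(n) + 1))
```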

def test_simple_range(self): (source)

Straightforward testing with a mixture of linspace data (for consistency). Adding in a 3rd mixture that will then be completely ignored. All test values have been precomputed and the values shouldn't change.

def test_simple_weighted(self): (source)

Check that weighted data raises a TypeError.
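A minimal sketch of the behaviour: combining a string estimator with the `weights` argument is rejected, because automated bin estimation is not defined for weighted data. The particular arrays below are illustrative:

```python
import numpy as np

data = np.arange(10)
weights = np.full(10, 0.5)
try:
    # estimating the bin count from a string is not supported with weights
    np.histogram(data, bins='auto', weights=weights)
    raised = False
except TypeError:
    raised = True

assert raised
```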

def test_small(self): (source)

Smaller datasets have the potential to cause issues with the data-adaptive methods, especially the FD method. All bin numbers have been precalculated.
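The extreme small-data case is a single-element dataset, where every width estimate degenerates and NumPy collapses to one bin; a sketch (the estimator list is my choice):

```python
import numpy as np

# a single-element dataset: every estimator should give exactly one bin
for estimator in ('fd', 'scott', 'rice', 'sturges', 'doane', 'sqrt', 'auto'):
    hist, edges = np.histogram(np.array([1.0]), bins=estimator)
    assert len(hist) == 1
```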