During a Computational Vision lab, while comparing histograms, I stumbled upon a peculiar behavior. The histograms' pairwise kernel matrix (just a fancy name for the matrix holding the correlation of each histogram with every other one) did not have ones on the diagonal. This means that a histogram was not fully correlated with itself, which is weird.
The histograms were generated with calls along these lines:
    bins_num = 35
    red, binsred = np.histogram(np.ndarray.flatten(I[:,:,0]), bins_num, density=True)
    green, binsgreen = np.histogram(np.ndarray.flatten(I[:,:,1]), bins_num, density=True)
    blue, binsblue = np.histogram(np.ndarray.flatten(I[:,:,2]), bins_num, density=True)
    h = np.concatenate((red, green, blue), axis=0)
So I started digging into the code I was provided to find where the issue lay. I noticed that sum(h) was not 1, and that was the real problem: even taken as a whole, a histogram did not sum to what it was supposed to.
Looking at the doc for numpy.histogram, I found this remark for the density parameter:
If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen.
Indeed, with 256/3 as the number of bins, sum(h) = 1 and the pairwise matrix had ones on the diagonal. I first tried 256 bins, which seemed like a more natural choice, but that gave sum(h) = 3, because of the RGB decomposition: each of the three channel histograms sums to one on its own.
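The effect is easy to reproduce. The snippet below is only a sketch: the image is synthetic random data, the helper name `rgb_histogram` is made up for illustration, and `range=(0, 256)` is passed explicitly (which the lab code did not do) so that the bin width is exactly 256 / bins_num.

```python
import numpy as np

def rgb_histogram(image, bins_num):
    """Concatenate per-channel density histograms (hypothetical helper)."""
    channels = []
    for c in range(3):
        # Fixed range is an assumption; by default NumPy uses the data's min and max.
        hist, _ = np.histogram(image[:, :, c].ravel(), bins=bins_num,
                               range=(0, 256), density=True)
        channels.append(hist)
    return np.concatenate(channels)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3))  # synthetic stand-in for I

for bins_num in (35, 85, 256):
    h = rgb_histogram(image, bins_num)
    bin_width = 256 / bins_num
    # density=True normalizes the integral of each channel to 1, so the
    # plain sum depends on the bin width while sum * width stays at 3
    # (one per channel).
    print(bins_num, h.sum(), h.sum() * bin_width)
```

The plain sum changes with the number of bins, while the sum multiplied by the bin width does not. Under this fixed range, 85 bins give a sum of 3 · 85/256 ≈ 0.996 rather than exactly 1; the exact value depends on the range NumPy uses, which defaults to the minimum and maximum of the data.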
A better solution, independent of the number of bins, is to slightly change the hik metric function so that it accounts for the bin width.
In fact, instead of simply doing
    hi += min(x[i], y[i])
one can do
    hi += min(x[i], y[i]) * (256 / len(x))
so that the output is scaled according to the number of bins (here len(x) is the length of the concatenated histogram, three times the number of bins). The intersection of a histogram with itself is just its sum, so with this factor the diagonal of the kernel matrix is one regardless of the number of bins chosen!
And here is the final metric definition: