I have a Hashtable<String, Integer> ht.
How can I efficiently find the median of the values (the Integers) in this hashtable?
There is no meaningful ordering in a hash table: the whole point of a hash table is to scatter entries uniformly across buckets according to the hash of their key. Finding an element given its key is very fast, near constant time (i.e. O(1)), but inequality-based queries, say finding all the elements e such that key(e) < K for a given key value K, in general require a full table scan, which is O(N).
Since you want the median of the values, you can copy all the values (and only the values) into an array and then use an O(N) selection algorithm to find the median. If you also need the key that goes with the median value, that takes one more O(N) pass over the entries, because a hash table can only be looked up by key, not by value.
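As a baseline for the "copy into an array" step, here is a minimal sketch that sorts the copied values instead of using an O(N) selection algorithm, so it runs in O(N log N). The class name ValueMedian and the convention of averaging the two middle values for an even count are my own assumptions, not something the question specifies.

```java
import java.util.Arrays;
import java.util.Hashtable;

class ValueMedian {
    // Returns the median of the table's values.
    // Assumption: for an even number of values, the two middle ones are averaged.
    static double median(Hashtable<String, Integer> ht) {
        if (ht.isEmpty()) {
            throw new IllegalArgumentException("empty table has no median");
        }
        // Copy only the values; the keys play no role in the median.
        int[] values = new int[ht.size()];
        int i = 0;
        for (int v : ht.values()) {
            values[i++] = v;
        }
        Arrays.sort(values); // O(N log N); a selection algorithm would be O(N)
        int mid = values.length / 2;
        if (values.length % 2 == 1) {
            return values[mid];
        }
        // Widen to double before adding to avoid int overflow.
        return (values[mid - 1] + (double) values[mid]) / 2.0;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> ht = new Hashtable<>();
        ht.put("a", 5);
        ht.put("b", 1);
        ht.put("c", 9);
        System.out.println(median(ht)); // prints 5.0
    }
}
```

For a table of a few thousand entries the sort is unlikely to matter in practice; the O(N) selection only pays off for large tables or frequent calls.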
Note that O(N) is provably the best you can do to find the median of an unordered set, since every element has to be examined at least once. If you need to find the median often, then an ordered representation, e.g. one based on balanced trees, is the way to go. Red-black trees are normally used to implement such ordered maps. Key lookups will be O(log(N)), which is slower than O(1) but still pretty fast, and because the set is already ordered, finding the median is easy; some implementations provide it as a built-in operation. Keep in mind that an ordered map is ordered by key, so for the median of the values it is the values that have to be kept in the ordered structure.
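To illustrate the ordered-representation idea, here is a rough sketch that keeps the values in a java.util.TreeMap (a red-black tree) used as a sorted multiset of value counts. The class SortedValueCounts is hypothetical. Note the limitation: TreeMap has no rank query, so this sketch walks the entries in order and the median costs O(D) for D distinct values rather than O(log N); getting O(log N) would need an order-statistic tree, which the standard library does not provide.

```java
import java.util.Map;
import java.util.TreeMap;

class SortedValueCounts {
    // value -> number of occurrences, kept in ascending value order
    private final TreeMap<Integer, Integer> counts = new TreeMap<>();
    private int size = 0;

    // O(log D) insertion, where D is the number of distinct values.
    void add(int value) {
        counts.merge(value, 1, Integer::sum);
        size++;
    }

    // Returns the k-th smallest value (0-based) by walking in ascending order.
    private int kth(int k) {
        int seen = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            seen += e.getValue();
            if (k < seen) {
                return e.getKey();
            }
        }
        throw new IllegalStateException("k out of range");
    }

    // Assumption: for an even count, the two middle values are averaged.
    double median() {
        if (size == 0) {
            throw new IllegalStateException("no values");
        }
        int upper = kth(size / 2);
        if (size % 2 == 1) {
            return upper;
        }
        return (kth(size / 2 - 1) + (double) upper) / 2.0;
    }

    public static void main(String[] args) {
        SortedValueCounts s = new SortedValueCounts();
        for (int v : new int[] {5, 1, 9, 2}) {
            s.add(v);
        }
        System.out.println(s.median()); // prints 3.5
    }
}
```

You would call add alongside every put on the hashtable (and add a matching remove if entries can be deleted or overwritten), so the two structures stay in sync.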
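The fast median-finding algorithm I know, Quickselect, is based on the same pivoting strategy as Quicksort: partition around a pivot, then continue only in the side that contains the middle position. Here is another one I just found: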
http://www.cs.cornell.edu/courses/cs2110/2009su/Lectures/examples/MedianFinding.pdf
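For reference, the Quicksort-style pivoting approach could look roughly like the sketch below. It is not taken from the linked PDF; the class name QuickSelect, the random pivot, and the Lomuto partition scheme are my own choices. It runs in O(N) on average, but can degrade to O(N^2) on unlucky pivots or on inputs with many equal values.

```java
import java.util.Random;

class QuickSelect {
    private static final Random RNG = new Random();

    // Returns the k-th smallest element (0-based) of a. Reorders a in place.
    static int select(int[] a, int k) {
        if (k < 0 || k >= a.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (k == p) {
                return a[k];
            }
            // Unlike Quicksort, continue only in the side that contains k.
            if (k < p) {
                hi = p - 1;
            } else {
                lo = p + 1;
            }
        }
        return a[lo];
    }

    // Lomuto partition around a random pivot; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        swap(a, lo + RNG.nextInt(hi - lo + 1), hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < pivot) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Assumption: for an even count, the two middle values are averaged.
    static double median(int[] a) {
        int n = a.length;
        int upper = select(a, n / 2); // throws if the array is empty
        if (n % 2 == 1) {
            return upper;
        }
        int lower = select(a, n / 2 - 1);
        return (lower + (double) upper) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[] {9, 2, 5, 1})); // prints 3.5
    }
}
```

To use it with the hashtable, copy ht.values() into an int[] first and pass that array to median; the array gets reordered, the table is left untouched.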
You can use the Apache Commons Mathematics Library.
It provides an API for most of the mathematical tools you might need, such as the median, mean, standard deviation, etc.
Hope that helped.