I was assigned a lab in which I need to create a hash function and count the number of collisions that occur when hashing a file of up to 30000 elements. Here is my code so far:

    #include <iostream>
    #include <fstream>
    #include <string>
    using namespace std;

    long hashcode(string s){
        long seed = 31;
        long hash = 0;
        for(int i = 0; i < s.length(); i++){
            hash = (hash * seed) + s[i];
        }
        return hash % 10007;
    }

    int main(int argc, char* argv[]){
        int count = 0;
        int collisions = 0;
        fstream input(argv[1]);
        string x;
        int array[30000];
        //File stream
        while(!input.eof()){
            input>>x;
            array[count] = hashcode(x);
            count++;
            for(int i = 0; i<count; i++){
                if(array[i]==hashcode(x)){
                    collisions++;
                }
            }
        }
        cout<<"Total Input is " <<count-1<<endl;
        cout<<"Collision # is "<<collisions<<endl;
    }

I am just not sure how to count the number of collisions. I tried storing every hashed value in an array and then searching that array, but it reported about 12000 collisions when there were only 10000 elements. Any advice on how to count the collisions, or on whether my hash function could be improved, would be appreciated. Thank you.