Questions tagged [hash-collision]

A hash collision is a situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest.

212 questions
11 votes, 1 answer

How was the hash collision issue in ASP.NET fixed (MS11-100)?

As reported by Slashdot, MS issued an update to ASP.NET to fix the hash collision attack today. (Listed as "Collisions in HashTable May Cause DoS Vulnerability - CVE-2011-3414" on the linked Technet page.) The problem is that the POST data are…
svick
  • 214,528
  • 47
  • 357
  • 477
11 votes, 2 answers

Chance of a duplicate hash when using first 8 characters of SHA1

If I have an index of URLs, and ID them by the first 8 characters of a SHA1 hash, what is the probability of two different URLs having identical IDs?
zino
  • 772
  • 2
  • 8
  • 24
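
For the truncated-SHA1 question above, a rough estimate follows from the birthday bound. This is a minimal sketch that assumes the first 8 hex characters (32 bits) behave like uniformly random values; the function name `collision_probability` and the sample sizes are illustrative and not taken from the question.

```python
import math

def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that at least two of n_items share a bits-wide ID.

    Uses the birthday approximation 1 - exp(-n(n-1) / 2^(bits+1)),
    which assumes IDs are uniformly distributed.
    """
    space = 2 ** bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

# 8 hex characters of SHA1 correspond to 32 bits of ID space.
for n in (1_000, 10_000, 100_000):
    print(n, collision_probability(n, 32))
```

Under this approximation, the probability reaches about one half at roughly 77,000 URLs, so an 8-character prefix is only comfortable for small indexes.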
11 votes, 5 answers

Are hash collisions with different file sizes just as likely as same file size?

I'm hashing a large number of files, and to avoid hash collisions, I'm also storing a file's original size - that way, even if there's a hash collision, it's extremely unlikely that the file sizes will also be identical. Is this sound (a hash…
SqlRyan
  • 30,939
  • 32
  • 109
  • 190
11 votes, 1 answer

Horrific collisions of adler32 hash

When using adler32() as a hash function, one should expect rare collisions. We can do the exact math of the collision probability, but roughly speaking, since it is a 32-bit hash function, there should not be many collisions on a sample set of a few…
Paul Oyster
  • 943
  • 8
  • 21
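
One way to see why Adler-32 can collide far more often than a 32-bit width suggests is to build a collision by hand on short inputs. The sketch below uses Python's standard `zlib.adler32`; the helper name `collides` and the two sample strings are illustrative assumptions, not data from the question.

```python
import zlib

# Adler-32 keeps two running sums modulo 65521:
#   A = 1 + sum of all bytes
#   B = sum of the successive A values
# Adjusting bytes so that both sums stay the same gives a collision.
first = b"abc"    # bytes (97, 98, 99)  -> A = 295, B = 589
second = b"b`d"   # bytes (98, 96, 100) -> A = 295, B = 589

def collides(x: bytes, y: bytes) -> bool:
    """True if x and y are different inputs with the same Adler-32 value."""
    return x != y and zlib.adler32(x) == zlib.adler32(y)

print(collides(first, second))  # True
```

Because both sums stay small for short inputs, the checksum uses only a fraction of the 32-bit range there, which is one plausible source of the behaviour described in the excerpt.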
10 votes, 5 answers

CHECKSUM() collisions in SQL Server 2005

I've got a table of 5,651,744 rows, with a primary key made of 6 columns (int x 3, smallint, varchar(39), varchar(2)). I am looking to improve the performance with this table and another table which shares this primary key plus an additional column…
Cade Roux
  • 83,561
  • 38
  • 170
  • 259
9 votes, 6 answers

MD5 hash collisions

If counting from 1 to X, where X is the first number to have an MD5 collision with a previous number, what number is X? I want to know if I'm using MD5 for serial numbers, how many units I can expect to be able to enumerate before I get a collision.
John Lewis
  • 712
  • 7
  • 15
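
The exact X asked about above is not something a formula can produce, but the expected order of magnitude can be estimated. This is a hedged sketch that assumes MD5 digests of sequential numbers behave like independent random 128-bit values; the function name `expected_items_before_collision` is illustrative.

```python
import math

def expected_items_before_collision(bits: int) -> float:
    """Approximate expected number of random draws until the first repeat.

    Uses the birthday estimate sqrt(pi / 2 * 2^bits).
    """
    return math.sqrt(math.pi / 2) * 2.0 ** (bits / 2)

# MD5 produces 128-bit digests.
print(f"{expected_items_before_collision(128):.3e}")
```

Under that assumption the estimate is on the order of 10^19 serial numbers, far more than any counter is likely to enumerate.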
8 votes, 4 answers

Looking for a good 64 bit hash for file paths in UTF16

I have a Unicode / UTF-16 encoded path. The path delimiter is U+005C '\'. The paths are null-terminated, root-relative Windows file system paths, e.g. "\windows\system32\drivers\myDriver32.sys". I want to hash this path into a 64-bit unsigned…
Dominik Weber
  • 699
  • 5
  • 13
8 votes, 2 answers

Understanding cyclic polynomial hash collisions

I have code that uses a cyclic polynomial rolling hash (Buzhash) to compute hash values of n-grams of source code. If I use small hash values (7-8 bits), then there are some collisions, i.e. different n-grams map to the same hash value. If I…
csprajeeth
  • 227
  • 2
  • 10
8 votes, 5 answers

How to handle a dict variable with 2^50 elements?

I have to find SHA256 hashes of 2^25 random strings and then look for collisions (using the birthday paradox for the last, say, 50 bits of the hash only). I am storing the string:hash pairs in a dict variable. Then sorting the variable by values (not…
ritratt
  • 1,334
  • 3
  • 19
  • 33
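
A collision search on truncated digests does not need sorting if the dict is keyed by the truncated hash itself. The following is a small-scale sketch of that idea, not the asker's code: the helper names, the counter-based inputs (`msg-0`, `msg-1`, ...) used in place of random strings, and the 16-bit demo width are all assumptions chosen so the example finishes quickly.

```python
import hashlib

def truncated_hash(data: bytes, bits: int) -> int:
    """Return the last `bits` bits of the SHA-256 digest as an int."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest & ((1 << bits) - 1)

def find_collision(bits: int):
    """Return two distinct inputs whose truncated hashes match.

    Terminates within 2**bits + 1 inputs by the pigeonhole principle.
    """
    seen = {}  # truncated hash -> input that produced it
    counter = 0
    while True:
        msg = f"msg-{counter}".encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
        counter += 1

a, b = find_collision(16)
print(a, b, hex(truncated_hash(a, 16)))
```

Keying by an integer rather than storing full hex strings also keeps memory per entry lower, which matters as the number of stored hashes grows.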
7 votes, 3 answers

Recursive MD5 and probability of collision

I wonder if it is 'safe' to hash a bunch of MD5 hash values together to create a new hash or whether this will in any way increase the probability of collisions. The background: I have a couple of files with dependencies. Each file has an associated…
Janick Bernet
  • 17,974
  • 2
  • 24
  • 53
7 votes, 1 answer

How unique are the first 8-12 characters of SHA256 hashes?

Take this hash for example: ba7816bf 8f01cfea 414140de 5dae2223 b00361a3 96177a9c b410ff61 f20015ad It's too long for my purposes so I intend to use a small chunk from it, such as: ba7816bf8f01 ba7816bf Or similar. My intended use case: Video…
7 votes, 3 answers

Open Addressing vs. Separate Chaining

Which hashmap collision handling scheme is better when the load factor is close to 1 to ensure minimum memory wastage? I personally think the answer is open addressing with linear probing, because it doesn't need any additional storage space in case…
user191776
7 votes, 2 answers

How can I evenly distribute distinct keys in a hashtable?

I have this formula: index = (a * k) % M which maps a number 'k', from an input set K of distinct numbers, into its position in a hashtable. I was wondering how to write a non-brute-force program that finds such 'M' and 'a' so that 'M' is minimal,…
Pavel
  • 1
  • 2
  • 14
  • 43
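
The formula index = (a * k) % M in the question above can be checked for collisions on a given key set with a few lines. This sketch only verifies a candidate pair; it is not the non-brute-force search the question asks for, and the function name `is_collision_free` and the sample keys are illustrative assumptions.

```python
def is_collision_free(keys, a: int, m: int) -> bool:
    """True if index = (a * k) % m sends every distinct key to its own slot."""
    distinct_keys = set(keys)
    slots = {(a * k) % m for k in distinct_keys}
    return len(slots) == len(distinct_keys)

keys = [3, 10, 17]
print(is_collision_free(keys, a=1, m=7))  # False: all three keys are 3 mod 7
print(is_collision_free(keys, a=1, m=3))  # True: slots 0, 1 and 2
```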
7 votes, 5 answers

Why isn't randomized probing more popular in hash table implementations?

According to various sources, such as Wikipedia and various .edu websites found by Google, the most common ways for a hash table to resolve collisions are linear or quadratic probing and chaining. Randomized probing is briefly mentioned but not…
dsimcha
  • 64,236
  • 45
  • 196
  • 319
7 votes, 6 answers

What does it mean by "the hash table is open" in Java?

I was reading the Java API docs on the Hashtable class and came across several questions. In the doc, it says "Note that the hash table is open: in the case of a "hash collision", a single bucket stores multiple entries, which must be searched…
derrdji
  • 10,619
  • 20
  • 61
  • 76
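
The behaviour quoted above, where one bucket holds several entries that are searched in turn, is commonly called separate chaining. The class below is a minimal Python sketch of that idea for illustration only; it is not Java's `Hashtable` implementation, and the name `ChainedTable` and its methods are hypothetical.

```python
class ChainedTable:
    """Toy hash table where each bucket is a list of (key, value) pairs."""

    def __init__(self, n_buckets: int = 8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Keys whose hashes land on the same index share one bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # collision: extend the chain

    def get(self, key, default=None):
        # Sequential search through the entries in the bucket.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

table = ChainedTable(n_buckets=1)  # a single bucket forces every key to collide
table.put("a", 1)
table.put("b", 2)
print(table.get("a"), table.get("b"))  # 1 2
```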