Can you show me two actual, non-trivial strings that produce the same MD5 or SHA1 hash?


Developer, https://www.devze.com, 2023-03-29 02:52. Source: Internet


...and if not, why not?

So here's the question behind the question.

I understand that the likelihood of accidental collisions in MD5 and SHA1 is small (though less likely in SHA1 than in MD5). I also understand that deliberate collisions are theoretically possible. Is it practically possible? Could I go through some process to deliberately generate two messages with the same hash, in either of these algorithms? What process would I go through?

Collisions necessarily exist for a given hash function, in a mathematical sense: there are more possible inputs than possible outputs, so there must be two inputs which map to the same output. Now proving the existence of a collision, and actually finding one, are two different things. If I drop a diamond in the middle of the ocean, I positively know that there is now a diamond somewhere in the ocean -- but I am quite at a loss if I want to recover it.
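To make the counting argument concrete, here is a minimal sketch. It assumes a toy 8-bit "hash" that I am inventing for illustration (the first byte of MD5; `tiny_hash` and `first_collision` are hypothetical names, not part of any library). With only 256 possible outputs, any 257 distinct inputs must contain a collision, and in practice one turns up much sooner.

```python
import hashlib


def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash (illustrative only): the first byte of the MD5 digest."""
    return hashlib.md5(data).digest()[0]


def first_collision(limit: int = 257):
    """Hash the messages b"0", b"1", ... and return the first colliding pair.

    With 256 possible outputs, 257 distinct inputs are guaranteed to
    contain at least one collision (pigeonhole principle).
    """
    seen = {}  # maps hash value -> first message that produced it
    for i in range(limit):
        msg = str(i).encode()
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg
    return None  # cannot happen when limit >= 257


if __name__ == "__main__":
    a, b, h = first_collision()
    print(a, b, "both hash to", h)
```

The same argument applies to the full 128-bit MD5 output; the only difference is that the "ocean" is 2¹²⁸ values wide instead of 256.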

For a "generic" hash function with an output of n bits, there are generic methods to find a collision, with average cost 2^(n/2) evaluations of the function (see this page). Depending on n, this can range from the easy to the totally infeasible. MD5 has an output of 128 bits, and 2⁶⁴ is "quite high": you can do it, but it will require a few thousand machines and months of computation.
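The generic 2^(n/2) cost can be observed on a scaled-down hash. The sketch below is one simple way to do it (a table-based birthday search, which is my choice for illustration and not necessarily the method the linked page describes). It assumes MD5 truncated to its top `bits` bits as the stand-in function; `truncated_md5` and `birthday_collision` are hypothetical helper names.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of the MD5 digest as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def birthday_collision(bits: int):
    """Hash distinct messages until two share the same truncated hash.

    Returns (first_message, second_message, evaluations). The number of
    evaluations is typically on the order of 2^(bits/2).
    """
    seen = {}  # maps truncated hash -> message that produced it
    for i in count():
        msg = b"message-%d" % i
        h = truncated_md5(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


if __name__ == "__main__":
    for bits in (16, 24, 32):
        a, b, evaluations = birthday_collision(bits)
        print(bits, "bits:", a, b, "after", evaluations, "evaluations")
```

Each extra 8 bits of output multiplies the work by roughly 16, which is why this approach is instant at 32 bits and out of reach for a single machine at 128 bits. The dictionary also grows with the number of evaluations, so a real attack at that scale would need a memory-efficient variant.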

Now there are known weaknesses in MD5, i.e. some internal structure which can be exploited to produce collisions much more easily. The best attack on MD5 known so far requires a bit less than 2²¹ function invocations, so this is a matter of a few seconds (at most) on a basic PC. @Omri's response points to a great example of an MD5 collision, in which the colliding messages are actually executable files with widely different behaviors.

For SHA-1, the output size is 160 bits. This means that a generic collision attack has cost about 2⁸⁰, which is not attainable with existing technology (well, mankind could do it, but certainly not discreetly: it should be doable with, say, the equivalent of one year of budget for the whole US Army). However, SHA-1, like MD5, has known weaknesses. Right now, these weaknesses are still theoretical, in that they lead to a collision attack with cost 2⁶¹, which is too expensive for any single crypto research lab, and thus has not been fully carried out yet (there was an announced attack with cost 2⁵¹, but it seems to have been a dud: the analysis was flawed). So there is no actual collision to show (but researchers are pretty sure that the 2⁶¹ attack is correct and would work, if someone found the budget).

With SHA-256, there is no known weakness, and the 256-bit output size implies a generic cost of 2¹²⁸, far beyond what is doable with today's and tomorrow's technology.
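For reference, the generic figures quoted above (2⁶⁴, 2⁸⁰, 2¹²⁸) follow directly from the output sizes. Here is a small sketch that derives them from Python's `hashlib`; `generic_collision_cost_log2` is a hypothetical helper name, and the numbers cover only the generic birthday bound, not the structural attacks on MD5 and SHA-1.

```python
import hashlib


def generic_collision_cost_log2(name: str) -> int:
    """Return log2 of the generic (birthday) collision cost for a hash.

    For an n-bit output the generic cost is about 2^(n/2) evaluations,
    so the function returns n/2.
    """
    n_bits = hashlib.new(name).digest_size * 8
    return n_bits // 2


if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256"):
        n_bits = hashlib.new(name).digest_size * 8
        print(f"{name}: {n_bits}-bit output, generic cost about "
              f"2^{generic_collision_cost_log2(name)}")
```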