Functions, Hash Functions, Cryptography - Discussion

Gbit-lab · June 3, 2020, 6:32am

I came accross a site a while back, that claimed that there had been an algorithm developed, that eliminated the possibility of collisions. Intuitively this seems impossible to me. I don’t remember enough about what the site said to research this, and I can’t find it again.

Any one know anything about this?

Also, what are the probabilities of a public key collision from separate private keys? Have there ever been collisions with any type of asymmetric cryptography, pgp, bitcoin, etc? What, other than probability, keeps public key collisions from happening?

Is there any real way to know that Linus Torvalds and Bill Gates are not, in some accidental way, the same person? Just kidding on that last one obviously. The other questions I really meant.

Mauro · June 3, 2020, 5:57pm

If I understood you correctly, you are saying that there is an infinite amount of inputs that will give us the same output. I would assume that is true, but there is so many possibilities in sha-256 that it does not matter. As long as there is no exploit for a hash function we can say that each file has its own unique fingerprint.

I am trying to understand all your other questions as well, but they are really hard questions. I would suggest maybe trying to post some of your question on a purely cryptography reddit such as this one: https://www.reddit.com/r/crypto/ Also I do recommend the book by Andreas Antonopoulos called “Mastering Bitcoin”. It might have some of the answers you are looking for. Hope that helps out. I will try an ask other moderators to see if they can help.

Mauro · June 3, 2020, 6:13pm

The problem is that hashing is a one way function. Hashing only gives us a fingerprint of the data. If you keep changing you browser history it will be impossible to find that input. We can’t simply hash something and then reverse it back. If we wanted to do something like that we would need to use encryption, which is a 2 way fucntion.

Example:

Hashing a big file (2GB) with SHA255 will result in a 32byte hash. This 32byte hash has no original information of the original input, and can’t be reversed in any way. The only way to get this hash again is to stored the original information somewhere.

Encrypting a big file (2GB) will result in a 2GB file again. But this time, we will have a scrambled file which can be reversed with a private key. All the date is there, it is just mixed.

Due to a hash not storing any original information, you still need a data storage to get that hash again.

I don’t think your idea is possible. If I got something wrong, feel free to correct me.

Mauro · June 3, 2020, 6:28pm

If we make a secure hash algorithm with more possibilities than there are atoms in the whole universe, I don’t see any reason to care about collisions. The only collision we should worry about are the ones that are gotten trough a weakness of the hash function itself…

See: MD5 Hash Weakness

" Originally discovered in late 2008 by a team of security researchers, including independent researcher Alex Sotirov, the flaw exploits a weakness in the MD5 hash algorithm that under special circumstances allows a collision attack in the form of a duplicate fake digital fingerprint."

Example: MD5 Collision

Public keys are generated with the help of the Elliptic curve cryptography. This is a hard question and I can’t really give you the correct answer. Since we encrypt a file using a public key, and a public key is derived from a private key, I would say there is only 1 private key that can decrypt that data. I am not sure if there can be an overflow.

Haha.

FrankB · June 3, 2020, 6:30pm

Thank you, Mauro, for your effort and the additional information. The reason why I was asking this question is basically given in my second question. In order for the birthday-paradox formula to be applicable you need to have, it seems to me, (approximately) equal probabilities for the hash output values, and in order to define these output-value probabilities you need to have some kind of a probability measure on the underlying sample space of all finite strings. I realize, though, that my first two questions are more mathematical in nature. Some of the questions further down (especially 5, 6, 7, and 8) are more important as far as the meaning and use of hash functions and blocks are concerned. I was not sure at all, for example, what you meant by stating (in the reading) that Merkle trees can be used for efficient transaction searches. If you could clarify that issue, then that would be extremely helpful. I was completely unable to make any sense of the pertinent explanation in the reading, and I would really like to understand it. All the help you can offer will be very much appreciated.

coin_digger · June 3, 2020, 7:51pm

Thank you for taking the time to reply!

FrankB · June 3, 2020, 8:13pm

It seems to me that the only way to eliminate collisions in the domain of a hash function is to reduce the number of strings (or input elements) in the domain to a number less than or equal to the number of output hashes. This is actually simple mathematics (trust me—I’m a mathematician —believe it or not—but you really don’t need to be a math guy to understand this): if f is a function from a set A to a finite set B, then f can be a one-to-one function (or an injective function or a function without collisions in the domain or a function for which f(x)=f(y) always implies x=y) only if the number of elements in A is less than or equal to the number of elements in B. Think about it: there can be no one-to-one function from A={1,2,3} to B={1,2} because at least to of the three elements in A must produce the same output in B. So if the site you are referring to claimed that collisions can be eliminated by some kind of an algorithm, then that algorithm must pertain to a hash function that is restricted to a domain the number of elements of which is less than or equal to the number of output hashes in the range. Without restricting the domain, it cannot be done: the set of finite strings (the domain of a hash function) is infinite and can therefore not be mapped one-to-one onto the finite set of output hashes.

monymike · June 5, 2020, 1:48pm

Wondering…can it be used for photos or artwork?
I think probably but would like confirmation
If yes=tokenization of everything. Yes?

Mercurydan · June 10, 2020, 5:11am

I read the block geeks article but it doesn’t talk about whether there is 1 sha 256 algorithm or many. What is the difference between the algorithm and the function? If you broke Sha256 on bitcoin would you automatically be breaking sha256 for banks or anywhere else sha256 is used? How was Sha 256 created?

Lrrr · June 12, 2020, 11:02am

Getting into the hash functions right now, I think I can grasp the topic in general but trying to learn the specifics in a bit… see you later

Gemstar · June 13, 2020, 6:25am

Question: Although hash functions are one-way functions that can’t be reversed, is it possible to recover original input from output if hacked or by brute-force?

Ingridharp · June 14, 2020, 3:47pm

Hallo Ivan, math is a subject I am unfamiliar with… please explain what this sign is ment ^ … In the reading Hash functions I see this …
2^128 … what does it mean?

Mauro · June 14, 2020, 4:05pm

There are many hash functions. Check the table of hash functions on the wiki page:

If you managed to break SHA-256 you would be able to break anything that uses the SHA-256.

Mauro · June 14, 2020, 4:16pm

Not really. Even if you brute forced (which takes way too much time with today’s technology speed) you still can’t be sure if that was the original data or not. Hash functions can have collisions, which means that In theory there can be infinite amount of different inputs that will give the same output. The hash is just a fingerprint. It saves nothing from the input.

Encryption on the other hand is a 2 way function which saves the original data by scrambling it. If you have the key to it, you can unscramble it.

Let’s say we want to hash and encrypt a file that is 2GB

A hash function makes a 32bytes hash out of it. (Sha256) (No way to go back, there is no way near enough information in the 32bytes of information to know what was the original input, even if you guess it, you still can’t be sure.)
A encryption takes that and scrambles it. The size will still be 2GB. (Can easily go back, as all the data is still there. We just need a key to reverse the function back. We can be certain that we got the original input.)

Hope this helps you out.

Mauro · June 14, 2020, 4:21pm

Hi.

2^128 means 2 to the power of 128. That is equal to multiplying the number 2, 128 times by 2. 2x2x2x2x…

Try using https://www.desmos.com/calculator to graph a function 2^x. You will realize that it grows exponentially. Meaning that 2^128 is not double of 2^64. It is exponentially larger.

Hope this help you out. If you have any more questions feel free to message me back. If you are not into math, don’t worry too much about the graph.

Ingridharp · June 14, 2020, 5:08pm

THank you!! I just had no clue what the ^ment… now I understand :))

Naz · June 14, 2020, 5:50pm

Hi.

Replying about the math:

^ means raised to the power.

2^4= 2x2x2x2= 16
So
2^128= 2x2x2x2x2x2x2x… = 340,282,366,920,938,463,463,374,607,431,768,211,456

On a calculator you use the key Xy (the y is raised).

I hope this helps.
Cheers.

Gemstar · June 14, 2020, 6:00pm

Thanks, Mauro. Just wondered since nothing seems to be unhackable nowadays.

Ingridharp · June 14, 2020, 7:20pm

Wow great !! Thank you for explaining me:smiley:

Mercurydan · June 14, 2020, 9:12pm

Mauro,
The way I understand it is that it is industry standard to use SHA 256 for banking and many other secure things. What my question was about, is what if the NSA ALWAYS had the function/algorithm to SHA256, but made sure people never thought that they knew it. They would have to be very careful how they use it, but it would explain how they can easily hack into anything they want. The NSA agents like Edward Snowden wouldn’t know that they were using the SHA256 back door, they just know the program they are using works to crack anything.