I want to use hash() to calculate/find the similarity between two strings.
PHP supports many hashing algorithms; the list can be obtained with hash_algos().
Which algorithm is recommended for this?
Answer
Your question is too ambiguous.
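Firstly, you say you want to calculate the similarity between two strings. This does not require hashing at all. You can just use a comparison, an equality check, the Levenshtein distance, another edit distance, etc. for that. A good hash is in fact designed so that even very similar inputs give unrelated digests, so comparing hashes can only tell you whether two strings are identical, not how close they are.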
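As a rough sketch of the non-hashing options mentioned above, the snippet below uses PHP's built-in levenshtein() and similar_text() next to a plain equality check. The two input strings are only illustrative.

```php
<?php
// Illustrative inputs; any two strings work.
$a = 'kitten';
$b = 'sitting';

// Exact equality: true only when both strings are identical.
$same = ($a === $b);

// Levenshtein distance: the minimum number of single-character
// insertions, deletions and substitutions needed to turn $a into $b.
$distance = levenshtein($a, $b);

// similar_text() returns the number of matching characters and
// writes a similarity percentage into $percent (passed by reference).
$matching = similar_text($a, $b, $percent);

var_dump($same);                  // bool(false)
echo $distance, "\n";             // 3
echo round($percent, 1), "%\n";   // similarity as a percentage
```

A lower Levenshtein distance or a higher similar_text() percentage means the strings are closer; which measure suits you depends on what "similar" should mean for your data.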
Why do we use hashing
If there is some sensitive data that we cannot store in cleartext, and we never need to process, calculate with or modify that data but only need to check it for exact equality, we use hashing.
E.g. storing user passwords, which only need to be compared with the password string the user enters when logging in.
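A minimal sketch of that exact-equality check, assuming the hash was stored at sign-up and PHP 5.6 or later (for hash_equals()). The password string and variable names are hypothetical, and this version is deliberately unsalted to keep the idea visible.

```php
<?php
// Hypothetical values for illustration only.
$storedHash = hash('sha256', 'correct horse battery staple'); // saved at sign-up
$attempt    = 'correct horse battery staple';                 // typed at login

// Hash the attempt the same way and compare the two digests.
// hash_equals() compares in constant time, which avoids timing leaks.
$ok = hash_equals($storedHash, hash('sha256', $attempt));

// A wrong attempt produces a different digest and fails the check.
$wrong = hash_equals($storedHash, hash('sha256', 'wrong password'));

var_dump($ok);    // bool(true)
var_dump($wrong); // bool(false)
```

The cleartext password is never stored; only the digest is kept and compared.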
Parameters
Speed, security (and maybe, popularity)
A few of the most popular hashes include md5, SHA-1, SHA-256 and SHA-512. They are listed here in order of increasing security and decreasing speed.
fast, less secure | md5 < SHA-1 < SHA-256 < SHA-512 | relatively slow, more secure
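To see the four algorithms side by side, this small sketch hashes one illustrative input with each of them through hash() and records the length of the hexadecimal digest. It shows only the output sizes, not the relative speed or security.

```php
<?php
$input   = 'hello world'; // illustrative input
$lengths = [];

foreach (['md5', 'sha1', 'sha256', 'sha512'] as $algo) {
    // hash() returns the digest as a lowercase hexadecimal string.
    $digest = hash($algo, $input);
    $lengths[$algo] = strlen($digest);
    printf("%-7s %3d hex chars\n", $algo, $lengths[$algo]);
}
// md5 gives 32 hex characters, sha1 40, sha256 64 and sha512 128.
```

The algorithm names passed to hash() are the lowercase identifiers that hash_algos() returns.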
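I would recommend using SHA-1 or SHA-256, which are both fast enough and secure enough. Of the two, prefer SHA-256 where security matters, since practical collision attacks on SHA-1 have been demonstrated.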
Also, use a secret salt to increase security manyfold: with a salt, identical inputs no longer produce identical hashes, and precomputed lookup tables (rainbow tables) of common passwords become useless.
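A minimal sketch of salting, assuming PHP 7 or later for random_bytes(). The password and variable names are hypothetical, and prepending the salt to the password is just one common way of combining them.

```php
<?php
$password = 'hunter2'; // hypothetical password

// Generate a random 16-byte salt and hex-encode it (32 characters).
$salt = bin2hex(random_bytes(16));

// Hash the salt together with the password; store both $salt and $hash.
$hash = hash('sha256', $salt . $password);

// At login, recompute with the stored salt and compare in constant time.
$valid = hash_equals($hash, hash('sha256', $salt . $password));

// The same password with a different salt yields a different hash.
$otherSalt = bin2hex(random_bytes(16));
$otherHash = hash('sha256', $otherSalt . $password);

var_dump($valid);               // bool(true)
var_dump($hash === $otherHash); // bool(false)
```

For passwords specifically, PHP's built-in password_hash() generates a salt automatically and password_verify() checks an attempt against the result, so the manual steps above may not be needed.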