SimHash and Hamming distance calculator

A simple, free online tool that estimates how similar two texts are by computing a SimHash fingerprint for each text and taking the Hamming distance between the two fingerprints.

A smaller distance means the texts are more similar.
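As a rough illustration of the distance itself, the Hamming distance between two fingerprints is the number of bit positions in which they differ. The sketch below assumes the fingerprints are plain non-negative integers; the function name `hamming_distance` is hypothetical and not taken from the tool's code.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two integer fingerprints differ."""
    # XOR leaves a 1 in every position where the two inputs disagree.
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Identical fingerprints are at distance 0.
    print(hamming_distance(0b1011, 0b1011))
    # These two values differ in exactly one bit position.
    print(hamming_distance(0b1011, 0b1001))
```

A distance of 0 means the fingerprints are identical; the maximum possible distance equals the fingerprint width in bits.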

Notes

This is a simplified version of what search engines do. There is no stemming, and each word is weighted by its raw term frequency (TF) rather than by a separately calculated weight.
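A minimal sketch of SimHash with term-frequency weights might look like the following. Several details are assumptions rather than the tool's actual behaviour: the tokenization (lowercasing and splitting on `\w+`), the 32-bit fingerprint width, the function name `simhash`, and the `hash_fn` parameter. The default hash here is `zlib.crc32`, used only as a stand-in so the example stays short; any word hash of the chosen width can be passed in instead.

```python
import re
import zlib
from collections import Counter


def simhash(text: str, hash_fn=None, bits: int = 32) -> int:
    """Compute a SimHash fingerprint, weighting each word by its count (TF)."""
    if hash_fn is None:
        # Stand-in word hash (assumption); swap in another hash if desired.
        hash_fn = lambda word: zlib.crc32(word.encode("utf-8"))

    # Assumed tokenization: lowercase, then runs of word characters. No stemming.
    counts = Counter(re.findall(r"\w+", text.lower()))

    # One running total per bit position.
    totals = [0] * bits
    for word, tf in counts.items():
        h = hash_fn(word)
        for i in range(bits):
            # A set bit votes +tf for that position, a clear bit votes -tf.
            if (h >> i) & 1:
                totals[i] += tf
            else:
                totals[i] -= tf

    # Positions with a positive total become 1 bits in the fingerprint.
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


if __name__ == "__main__":
    a = simhash("the quick brown fox jumps over the lazy dog")
    b = simhash("the quick brown fox jumped over the lazy dog")
    # Number of differing bits between the two fingerprints.
    print(bin(a ^ b).count("1"))
```

Because the words are counted before hashing, word order does not affect the fingerprint; only which words occur, and how often, matters.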

The code uses the FNV-1a algorithm to hash individual words.
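For reference, FNV-1a starts from a fixed offset basis and, for each input byte, XORs the byte into the hash and then multiplies by a fixed prime. The sketch below assumes the 32-bit variant and UTF-8 encoded input; the tool may use a different width or encoding, and the name `fnv1a_32` is hypothetical.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(word: str) -> int:
    """Hash a word with 32-bit FNV-1a over its UTF-8 bytes (assumed encoding)."""
    h = FNV_OFFSET_BASIS_32
    for byte in word.encode("utf-8"):
        # FNV-1a order: XOR the byte in first, then multiply by the prime.
        h ^= byte
        # Mask to 32 bits to emulate unsigned overflow.
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF
    return h


if __name__ == "__main__":
    print(hex(fnv1a_32("a")))  # 0xe40c292c
```

The XOR-then-multiply order is what distinguishes FNV-1a from the older FNV-1, which multiplies first.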

Read more about SimHash on Wikipedia.

Feedback

Leave a comment here: @ugnich