Question 1

How does the duplicate detection work?

Accepted Answer

Each image is downscaled to a 16×16 grayscale grid (256 pixels). The mean brightness of the grid is computed, and each pixel is encoded as 1 if it is brighter than the mean and 0 otherwise. The result is a 256-bit fingerprint, the average hash (aHash). Two images are treated as duplicates when the Hamming distance between their hashes (the number of bit positions that differ) is no greater than the threshold.
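The steps above can be sketched in a few lines. This is a minimal Python illustration of the arithmetic, not the tool's actual browser code: the function names, the default threshold of 10, and the choice to encode a pixel exactly equal to the mean as 0 are assumptions, and the 16×16 grid is supplied directly as a flat list rather than produced from a real image.

```python
from typing import Sequence


def average_hash(pixels: Sequence[float]) -> int:
    """Pack grayscale values into an integer fingerprint, one bit per pixel.

    A pixel strictly brighter than the mean becomes 1, anything else 0
    (the tie-breaking rule is an assumption).
    """
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: int, b: int, threshold: int = 10) -> bool:
    """Two hashes match when they differ in at most `threshold` bits."""
    return hamming_distance(a, b) <= threshold


# Synthetic 16x16 grid: left half dark, right half bright (row-major order).
original = [40 if x < 8 else 200 for y in range(16) for x in range(16)]
# The same picture with every pixel 10 levels brighter.
brightened = [v + 10 for v in original]

h1 = average_hash(original)
h2 = average_hash(brightened)
print(hamming_distance(h1, h2))  # 0
print(is_duplicate(h1, h2))      # True
```

Because each pixel is compared against its own image's mean, a uniform brightness shift leaves every bit unchanged, which is why the distance here is 0.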

Question 2

Will it catch resized or recompressed copies?

Accepted Answer

Yes. aHash is robust to resizing and modest recompression because every image is reduced to the same coarse grid and each pixel is compared against that image's own mean brightness. A 4000 px JPEG and its 800 px WebP downscale typically score a Hamming distance under 5. Heavy recompression or aggressive colour grading can push the distance up; if such copies are being missed, increase the threshold.
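To see why resolution drops out, here is a small Python sketch under simplifying assumptions: the "photo" is a synthetic 64×64 gradient, the resize is a plain box average (real resizers and lossy codecs add small errors, which is where the non-zero distances come from), and all function names are hypothetical.

```python
def box_downscale(rows, size):
    """Shrink a square image to size x size by averaging equal square blocks.

    Assumes the side length is an exact multiple of `size`.
    """
    step = len(rows) // size
    return [
        [
            sum(rows[by * step + dy][bx * step + dx]
                for dy in range(step) for dx in range(step)) / (step * step)
            for bx in range(size)
        ]
        for by in range(size)
    ]


def average_hash(rows):
    """One bit per pixel: 1 if strictly brighter than the grid mean."""
    pixels = [v for row in rows for v in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for v in pixels:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a, b):
    return bin(a ^ b).count("1")


# Synthetic 64x64 diagonal gradient standing in for the full-size photo.
large = [[2 * x + y for x in range(64)] for y in range(64)]
# A half-size (32x32) copy of the same picture.
small = box_downscale(large, 32)

# Both versions are normalised to the same 16x16 grid before hashing.
hash_large = average_hash(box_downscale(large, 16))
hash_small = average_hash(box_downscale(small, 16))
print(hamming_distance(hash_large, hash_small))  # 0
```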

Question 3

Will it catch rotated or flipped duplicates?

Accepted Answer

No. Average hash and perceptual hash (pHash) both depend on pixel position, so a copy of the same photo that has been rotated by 90° or mirrored generally produces a completely different hash. Rotation-invariant matching needs feature-based algorithms (SIFT, ORB), which are too heavy for browser use.
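A toy Python example makes the position dependence concrete. The 16×16 grid below is synthetic (left half dark, right half bright), and the helper names are illustrative assumptions rather than anything taken from the tool.

```python
def average_hash(rows):
    """One bit per pixel in row-major order: 1 if brighter than the mean."""
    pixels = [v for row in rows for v in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for v in pixels:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming_distance(a, b):
    return bin(a ^ b).count("1")


# Left half dark, right half bright.
original = [[200 if x >= 8 else 40 for x in range(16)] for y in range(16)]
# 90-degree clockwise rotation: the bright half moves to the bottom.
rotated = [list(row) for row in zip(*original[::-1])]
# Horizontal flip: the bright half moves to the left.
mirrored = [row[::-1] for row in original]

h = average_hash(original)
print(hamming_distance(h, average_hash(rotated)))   # 128
print(hamming_distance(h, average_hash(mirrored)))  # 256
```

With 256 bits, a distance of 128 is what two unrelated images would score on average, and 256 means every bit is inverted. Both are far beyond any sensible threshold.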

Question 4

What threshold should I use?

Accepted Answer

Start at 10. Below 5 catches only near-identical duplicates (different filename, same picture re-saved or recompressed). Above 20 also catches visually similar but distinct shots, such as consecutive burst frames. The best threshold depends on your library, so review a few groups to calibrate.
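One way to think about calibration is to count how many candidate pairs each threshold would accept. The Python sketch below uses made-up pair distances purely for illustration; `matches_at` is a hypothetical helper, not part of the tool.

```python
def matches_at(distances, threshold):
    """Number of image pairs whose Hamming distance is within the threshold."""
    return sum(1 for d in distances if d <= threshold)


# Made-up Hamming distances between pairs of images (illustrative only).
pair_distances = [0, 3, 4, 9, 12, 18, 25, 40]

for threshold in (5, 10, 20):
    print(threshold, matches_at(pair_distances, threshold))
# 5 3
# 10 4
# 20 6
```

If the extra pairs admitted by a higher threshold turn out to be burst frames rather than true copies, that is the signal to step back down.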

Question 5

Why doesn't it just compare file hashes?

Accepted Answer

MD5 / SHA-1 byte hashes catch only byte-for-byte identical files. The moment a file is recompressed, resized, or has its metadata changed, the byte hash changes even though the image looks the same. Perceptual hashing (aHash, pHash, dHash) compares what the image looks like, not its bytes.
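The difference can be shown with a toy "file format", sketched here in Python. The 16-byte comment header followed by 256 grayscale bytes is an invented layout for illustration only, not a real image container, and `average_hash` is the same assumed helper as in the earlier sketch.

```python
import hashlib

HEADER_LEN = 16  # length of the invented metadata header


def average_hash(pixels):
    """One bit per pixel: 1 if strictly brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for v in pixels:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


# Identical pixel payload, different metadata.
pixels = bytes([40] * 128 + [200] * 128)
file_a = b"comment=holiday;" + pixels
file_b = b"comment=HOLIDAY;" + pixels

# Byte hashes cover the whole file, so the metadata change alters them.
sha_a = hashlib.sha1(file_a).hexdigest()
sha_b = hashlib.sha1(file_b).hexdigest()
print(sha_a == sha_b)  # False

# The perceptual hash looks only at the decoded pixels.
ahash_a = average_hash(file_a[HEADER_LEN:])
ahash_b = average_hash(file_b[HEADER_LEN:])
print(ahash_a == ahash_b)  # True
```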

Question 6

Can it delete duplicates for me?

Accepted Answer

No. The tool only reads the images you select; it does not modify or delete anything on disk. It identifies and groups duplicates, and you delete the unwanted copies in your file manager. This is intentional: automated deletion of photos is dangerous and irreversible.

Question 7

Are my files uploaded?

Accepted Answer

No. Hashing runs entirely in your browser using the Canvas API. Neither the images nor the 256-bit hashes leave the page, so the tool is safe for confidential or unreleased imagery.

Question 8

How large a batch can I process?

Accepted Answer

Browsers comfortably handle a few thousand images. The hashing step is O(n), but the grouping step compares every pair of hashes and is O(n²): 10 000 files means roughly 50 million comparisons, so 10 000+ files start to feel slow. For massive libraries, process in folders of 2 000–5 000.
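The quadratic cost comes from comparing every pair. The Python sketch below shows one plausible way to do the grouping, using a small union-find so that matches chain together (if A matches B and B matches C, all three land in one group). Whether the tool groups transitively like this is an assumption, as are the function names and the sample hashes.

```python
from itertools import combinations


def pair_count(n):
    """Number of unordered pairs among n items."""
    return n * (n - 1) // 2


def group_duplicates(hashes, threshold=10):
    """Return lists of indices whose hashes are linked by near matches."""
    parent = list(range(len(hashes)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # The O(n^2) part: one Hamming-distance check per pair.
    for i, j in combinations(range(len(hashes)), 2):
        if bin(hashes[i] ^ hashes[j]).count("1") <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(hashes)):
        groups.setdefault(find(i), []).append(i)
    # Only groups with at least two members are duplicates.
    return [members for members in groups.values() if len(members) > 1]


# Illustrative hashes: 0 and 1 differ by 3 bits, 2 and 3 differ by 5 bits.
base = (1 << 200) - 1
hashes = [0, 0b111, base, base ^ 0b11111]

print(group_duplicates(hashes))  # [[0, 1], [2, 3]]
print(pair_count(10_000))        # 49995000
```

Splitting a library into folders helps because the pair count falls with the square of the batch size: 2 000 files need about 2 million comparisons, not 50 million.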

Free Duplicate Image Finder — Perceptual Hash, Browser-Based
