Hi Roberto,
I can share the method I use for PixInsight's File Explorer, which has proven to be very efficient and secure. This is the member function that computes a cache hash (in C++ but pretty easy to implement in JavaScript):
IsoString FileExplorerCache::Hash( const String& filePath )
{
   const fsize_type blockSize = 256*1024;           // 256 KiB sampling blocks
   const fsize_type halfBlockSize = blockSize >> 1;
   const fsize_type dataSize = 4*blockSize;         // 1 MiB of sampled data
   try
   {
      File file = File::OpenFileForReading( filePath );
      fsize_type n = file.Size();
      ByteArray data;
      if ( n > dataSize )
      {
         // Large file: sample four blocks plus the 64-bit file size.
         data = ByteArray( dataSize + sizeof( fsize_type ) );
         file.Read( data.Begin(), blockSize );                     // beginning of the file
         file.Seek( n/3 - halfBlockSize, SeekMode::FromBegin );
         file.Read( data.At( blockSize ), blockSize );             // block centered near n/3
         file.Seek( 2*n/3 - halfBlockSize, SeekMode::FromBegin );
         file.Read( data.At( 2*blockSize ), blockSize );           // block centered near 2n/3
         file.Seek( n - blockSize, SeekMode::FromBegin );
         file.Read( data.At( 3*blockSize ), blockSize );           // end of the file
         memcpy( data.At( dataSize ), &n, sizeof( fsize_type ) );  // append the file size
      }
      else
      {
         // Small file: hash the entire contents.
         data = ByteArray( n );
         file.Read( data.Begin(), n );
      }
      file.Close();
      return IsoString::ToHex( SHA1().Hash( data ) );
   }
   catch ( ... )
   {
      // Do not propagate filesystem exceptions here.
      return IsoString();
   }
}
For files of 1 MiB or less, the function computes the SHA1 digest of the entire file. For larger files, it computes the SHA1 digest of 1 MiB of data read from four 256 KiB blocks distributed uniformly across the file: one at the beginning, one centered near one third of the file, one centered near two thirds, and one at the end. The last 8 bytes of the hashed data are set to the 64-bit file size, which makes the computed hash depend on the exact file length as well. The probability that two different image files generate the same hash is virtually zero for all practical cases. That has not happened so far, AFAIK.
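In case it helps as a reference while porting, here is a minimal standalone sketch of the same sampling strategy written against the standard C++ library only, with no PCL dependencies. The function name SampledFileHash is made up for the example, and a 64-bit FNV-1a hash stands in for SHA1 just to keep the snippet self-contained; in practice you would plug in a real SHA1 implementation and hex-encode its digest.

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical standalone version of the sampling scheme above.
// FNV-1a is only a placeholder for SHA1.
std::string SampledFileHash( const std::string& filePath )
{
   const std::int64_t blockSize = 256*1024;
   const std::int64_t halfBlockSize = blockSize >> 1;
   const std::int64_t dataSize = 4*blockSize;

   std::ifstream file( filePath, std::ios::binary );
   if ( !file )
      return std::string(); // mirror the no-exception contract of the original

   file.seekg( 0, std::ios::end );
   const std::int64_t n = file.tellg();
   file.seekg( 0, std::ios::beg );

   std::vector<char> data;
   if ( n > dataSize )
   {
      data.resize( std::size_t( dataSize + sizeof( n ) ) );
      file.read( data.data(), blockSize );                     // beginning
      file.seekg( n/3 - halfBlockSize );
      file.read( data.data() + blockSize, blockSize );         // around n/3
      file.seekg( 2*n/3 - halfBlockSize );
      file.read( data.data() + 2*blockSize, blockSize );       // around 2n/3
      file.seekg( n - blockSize );
      file.read( data.data() + 3*blockSize, blockSize );       // end
      std::memcpy( data.data() + dataSize, &n, sizeof( n ) );  // append file size
   }
   else
   {
      data.resize( std::size_t( n ) );
      file.read( data.data(), n );
   }

   // 64-bit FNV-1a over the sampled data (stand-in for SHA1).
   std::uint64_t h = 14695981039346656037ull;
   for ( char c : data )
   {
      h ^= std::uint8_t( c );
      h *= 1099511628211ull;
   }
   char hex[17];
   std::snprintf( hex, sizeof( hex ), "%016llx", (unsigned long long)h );
   return std::string( hex );
}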
Let me know if you need help implementing this in JavaScript, in case you decide to use the same method.