Looking for a VBS script to generate MD5 of a file


Thanks for the program. It works in stock Windows 2000. I've also tested it in NT 4.0, but it doesn't work there (it generates the same output regardless of the switch used).


Thanks Glenn9999. In the interest of smallest possible app size for those that only need md5, would you mind making a version with only md5 supported? I would also love to see the source if you don't mind sharing.

Here is the effective source I used for the *supported* API calls in Delphi to do an MD5 hash of a file. The source in the exec I posted is very close (but a little more generic to call the API with different algorithms).

Thanks for the program. It works in stock Windows 2000. I've also tested it in NT 4.0, but it doesn't work there (it generates the same output regardless of the switch used).

Probably something wrong with the API calls related to what is available on the OS. They should exist or you'd have real nasty errors. I don't do a lot of error-trapping on the API calls (interest of size), but that probably should get addressed now that I think on it more (SHA256, SHA384, SHA512 are only available on Windows XP SP3 and above).

But like I wrote above, I have a (relatively good) non-API implementation of MD5 I can throw into an exec and post which *should* work on any 32-bit Windows OS.

Edited by Glenn9999
C Source code (found on the web not mine) for those who might need it.

/* MD5 routines, after Ron Rivest */
/* Written by David Madore <david.madore@ens.fr>, with code taken in
* part from Colin Plumb. */
/* Public domain (1999/11/24) */

/* Note: these routines do not depend on endianness. */

/* === The header === */

/* Put this in md5.h if you don't like having everything in one big
* file. */

#ifndef _DMADORE_MD5_H
#define _DMADORE_MD5_H

struct md5_ctx {
  /* The four chaining variables */
  unsigned long buf[4];
  /* Count number of message bits */
  unsigned long bits[2];
  /* Data being fed in */
  unsigned long in[16];
  /* Byte position within the current 512-bit block (always between 0 and 63) */
  int b;
};

void MD5_transform (unsigned long buf[4], const unsigned long in[16]);
void MD5_start (struct md5_ctx *context);
void MD5_feed (struct md5_ctx *context, unsigned char inb);
void MD5_stop (struct md5_ctx *context, unsigned char digest[16]);

#endif /* not defined _DMADORE_MD5_H */

/* === The implementation === */

#define F1(x, y, z) (z ^ (x & (y ^ z)))
#define F2(x, y, z) F1(z, x, y)
#define F3(x, y, z) (x ^ y ^ z)
#define F4(x, y, z) (y ^ (x | ~z))

#define MD5STEP(f, w, x, y, z, data, s) \
  { w += f (x, y, z) + data; w = w<<s | (w&0xffffffffUL)>>(32-s); \
    w += x; }

void
MD5_transform (unsigned long buf[4], const unsigned long in[16])
{
  register unsigned long a, b, c, d;

  a = buf[0]; b = buf[1]; c = buf[2]; d = buf[3];
  MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478UL, 7);
  MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756UL, 12);
  MD5STEP(F1, c, d, a, b, in[2] + 0x242070dbUL, 17);
  MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceeeUL, 22);
  MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0fafUL, 7);
  MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62aUL, 12);
  MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613UL, 17);
  MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501UL, 22);
  MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8UL, 7);
  MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7afUL, 12);
  MD5STEP(F1, c, d, a, b, in[10] + 0xffff5bb1UL, 17);
  MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7beUL, 22);
  MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122UL, 7);
  MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193UL, 12);
  MD5STEP(F1, c, d, a, b, in[14] + 0xa679438eUL, 17);
  MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821UL, 22);
  MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562UL, 5);
  MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340UL, 9);
  MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51UL, 14);
  MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aaUL, 20);
  MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105dUL, 5);
  MD5STEP(F2, d, a, b, c, in[10] + 0x02441453UL, 9);
  MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681UL, 14);
  MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8UL, 20);
  MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6UL, 5);
  MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6UL, 9);
  MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87UL, 14);
  MD5STEP(F2, b, c, d, a, in[8] + 0x455a14edUL, 20);
  MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905UL, 5);
  MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8UL, 9);
  MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9UL, 14);
  MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8aUL, 20);
  MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942UL, 4);
  MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681UL, 11);
  MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122UL, 16);
  MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380cUL, 23);
  MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44UL, 4);
  MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9UL, 11);
  MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60UL, 16);
  MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70UL, 23);
  MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6UL, 4);
  MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127faUL, 11);
  MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085UL, 16);
  MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05UL, 23);
  MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039UL, 4);
  MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5UL, 11);
  MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8UL, 16);
  MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665UL, 23);
  MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244UL, 6);
  MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97UL, 10);
  MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7UL, 15);
  MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039UL, 21);
  MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3UL, 6);
  MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92UL, 10);
  MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47dUL, 15);
  MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1UL, 21);
  MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4fUL, 6);
  MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0UL, 10);
  MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314UL, 15);
  MD5STEP(F4, b, c, d, a, in[13] + 0x4e0811a1UL, 21);
  MD5STEP(F4, a, b, c, d, in[4] + 0xf7537e82UL, 6);
  MD5STEP(F4, d, a, b, c, in[11] + 0xbd3af235UL, 10);
  MD5STEP(F4, c, d, a, b, in[2] + 0x2ad7d2bbUL, 15);
  MD5STEP(F4, b, c, d, a, in[9] + 0xeb86d391UL, 21);
  buf[0] += a; buf[1] += b; buf[2] += c; buf[3] += d;
}

#undef F1
#undef F2
#undef F3
#undef F4
#undef MD5STEP

void
MD5_start (struct md5_ctx *ctx)
{
  int i;

  ctx->buf[0] = 0x67452301UL;
  ctx->buf[1] = 0xefcdab89UL;
  ctx->buf[2] = 0x98badcfeUL;
  ctx->buf[3] = 0x10325476UL;
  ctx->bits[0] = 0;
  ctx->bits[1] = 0;
  for ( i=0 ; i<16 ; i++ )
    ctx->in[i] = 0;
  ctx->b = 0;
}

void
MD5_feed (struct md5_ctx *ctx, unsigned char inb)
{
  int i;
  unsigned long temp;

  ctx->in[ctx->b/4] |= ((unsigned long)inb) << ((ctx->b%4)*8);
  if ( ++ctx->b >= 64 )
    {
      MD5_transform (ctx->buf, ctx->in);
      ctx->b = 0;
      for ( i=0 ; i<16 ; i++ )
        ctx->in[i] = 0;
    }
  temp = ctx->bits[0];
  ctx->bits[0] += 8;
  if ( (temp&0xffffffffUL) > (ctx->bits[0]&0xffffffffUL) )
    ctx->bits[1]++;
}

void
MD5_stop (struct md5_ctx *ctx, unsigned char digest[16])
{
  int i;
  unsigned long bits[2];

  for ( i=0 ; i<2 ; i++ )
    bits[i] = ctx->bits[i];
  MD5_feed (ctx, 0x80);
  for ( ; ctx->b!=56 ; )
    MD5_feed (ctx, 0);
  for ( i=0 ; i<2 ; i++ )
    {
      MD5_feed (ctx, bits[i]&0xff);
      MD5_feed (ctx, (bits[i]>>8)&0xff);
      MD5_feed (ctx, (bits[i]>>16)&0xff);
      MD5_feed (ctx, (bits[i]>>24)&0xff);
    }
  for ( i=0 ; i<4 ; i++ )
    {
      digest[4*i] = ctx->buf[i]&0xff;
      digest[4*i+1] = (ctx->buf[i]>>8)&0xff;
      digest[4*i+2] = (ctx->buf[i]>>16)&0xff;
      digest[4*i+3] = (ctx->buf[i]>>24)&0xff;
    }
}

/* === The main program === */

#include <stdio.h>

int
main (int argc, const char *argv[])
{
  int i, j;
  struct md5_ctx context;
  unsigned char digest[16];
  FILE *f;
  const char *bogus_argv[] = { "zoinx", "-" };
  const char hexdigits[17] = "0123456789abcdef";

  /* With no arguments, behave as if "-" (standard input) was given. */
  if ( argc == 1 )
    {
      argc = 2;
      argv = bogus_argv;
    }
  for ( i=1 ; i<argc ; i++ )
    {
      if ( argv[i][0] == '-' && argv[i][1] == 0 )
        f = stdin;
      else
        {
          f = fopen (argv[i], "rb");
          if ( ! f )
            {
              fprintf (stderr, "Error opening %s\n", argv[i]);
              continue;
            }
        }
      MD5_start (&context);
      /* Feed the input to the hash one byte at a time. */
      while ( 1 )
        {
          int ch;

          ch = getc (f);
          if ( ch == EOF )
            break;
          MD5_feed (&context, ch);
        }
      /* Close named files so a long argument list does not leak handles. */
      if ( f != stdin )
        fclose (f);
      MD5_stop (&context, digest);
      /* Print the digest as 32 lowercase hex digits, then the file name. */
      for ( j=0 ; j<16 ; j++ )
        {
          putchar (hexdigits[digest[j]>>4]);
          putchar (hexdigits[digest[j]&0xf]);
        }
      printf (" %s\n", argv[i]);
    }
  return 0;
}

Compiled, stripped, and UPX-packed, this gives that nice 7k .exe.
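As a side note on the "do not depend on endianness" comment in the listing: MD5_feed and MD5_stop move bytes in and out of words with shifts rather than by casting pointers, so the byte order of the host CPU never enters into it. A minimal sketch of that idea, assuming fixed-width types from <stdint.h> (the helper names pack_le32 and unpack_le32 are made up for illustration and are not part of the posted source):

```c
#include <stdint.h>

/* Hypothetical helper: pack four bytes into a 32-bit word, least
 * significant byte first, using shifts only (no pointer casts). */
uint32_t pack_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Hypothetical inverse: split a 32-bit word into four bytes, least
 * significant byte first, the way MD5_stop writes out the digest. */
void unpack_le32(uint32_t w, unsigned char b[4])
{
    b[0] = (unsigned char)(w & 0xff);
    b[1] = (unsigned char)((w >> 8) & 0xff);
    b[2] = (unsigned char)((w >> 16) & 0xff);
    b[3] = (unsigned char)((w >> 24) & 0xff);
}
```

Because the result depends only on arithmetic on values, the same bytes give the same word on little-endian and big-endian machines alike.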

md5sum.7z

Edited by allen2
I went ahead and got my non-API implementation of MD5 and posted it. Tested it on a Windows ME VM. Worked. Put it on a good diet between some changes (along with UPX compressing as I have been). Did the same dietary changes to the other file, reposting that, too.

For the MD5_file one, all you need to do is put the file name you want after it. Works the same as the other one for output. Still thinking on how to handle error-trapping the API calls on hash_file.

(Files Pulled)

Edited by Glenn9999
This one:

still seems the winner (re: size) BUT the one by Glenn9999 is MUCH FASTER :thumbup and actually almost as fast as FCIV.

On a 512,000,000-byte file (times in seconds):

md5file.exe    2.67
md5_file.exe   1.75
fciv.exe       1.58
dsfo.exe       2.03 (but it also shows progress and time)

On a 2,615,529,472-byte file (times in seconds):

md5file.exe    37.49
md5_file.exe   19.92
fciv.exe       19.71
dsfo.exe       35.45
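To compare the two file sizes on equal footing, the figures can be turned into throughput. A tiny sketch, assuming the times above are in seconds and taking 1 MB as 1,000,000 bytes (the function name is made up for illustration):

```c
/* Hypothetical helper: throughput in MB/s, with 1 MB = 1,000,000 bytes. */
double throughput_mb_per_s(double bytes, double seconds)
{
    return bytes / 1e6 / seconds;
}
```

Under those assumptions, fciv.exe works out to roughly 324 MB/s on the smaller file and roughly 133 MB/s on the larger one, which hints that the larger run may be limited by the disk rather than by the hashing itself.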

jaclaz

Edited by jaclaz
still seems the winner (re: size) BUT the one by Glenn9999 is MUCH FASTER :thumbup and actually almost as fast as FCIV.

Cool. So I'm not dreaming on my benchmarking. Of course, I spent a lot of time trying to get it to run quicker. Usually, though, to match what Microsoft has done is pretty good since I'm sure they spend a lot of time on their implementations as well. The problem comes in thinking of ways to improve things (along with reading and finding those standard ways) since Microsoft has many more minds at work. The only thing I could think of after I posted that was to optimize the memory move, and I ended up equalling FCIV after that (on my machine here). But there's the issue of size there, too. I think, though, most people have walled out the MD5 algorithm in terms of that and I would expect Microsoft's work to be representative.

SHA-1 is another kettle of fish, though. I have a non-API IA32 implementation of that here that beats FCIV on IA32 machines by about 200ms on similar size files, but not on SSE3 capable machines.

These kinds of algorithms are great illustrations of understanding exactly what you are doing with programming and spending time to do things both right and efficiently.

Edit: Something I've often found is that code size is often not very reflective of execution speed. Sometimes it is, but you might have to do something to get speed (e.g. unrolling loops or inlining code) which will increase the code size and ultimately increase the executable size. The ironic thing with MD5 is that the most effective change for speed that I found increased the code size by about 2K. The reason for this should be easily seen in looking at what has been posted above. Though I might get my stock MD5 implementation and see what the executable size is, just for fun.
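For contrast with the fully unrolled transform in the C listing posted earlier, here is a rolled-up sketch of the same 64 steps, which trades speed for smaller code. It is only an illustration of the size/speed trade-off, not Glenn9999's implementation (which was not posted); the function names md5_block and md5_buffer are made up, and it assumes the whole input is already in memory. The constants are the same ones that appear in the unrolled listing.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Per-step additive constants, in the order the unrolled listing uses them. */
static const uint32_t K[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
    0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
    0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
    0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
    0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
    0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
    0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
    0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
    0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
};

/* Rotation amounts: four per round, cycled within the round. */
static const unsigned S[16] = { 7, 12, 17, 22,  5, 9, 14, 20,
                                4, 11, 16, 23,  6, 10, 15, 21 };

static uint32_t rotl32(uint32_t x, unsigned s)
{
    return (x << s) | (x >> (32 - s));
}

/* Process one 64-byte block with a loop instead of 64 unrolled steps. */
static void md5_block(uint32_t h[4], const unsigned char p[64])
{
    uint32_t m[16], a = h[0], b = h[1], c = h[2], d = h[3];
    int i;

    for (i = 0; i < 16; i++)
        m[i] = (uint32_t)p[4*i] | ((uint32_t)p[4*i+1] << 8)
             | ((uint32_t)p[4*i+2] << 16) | ((uint32_t)p[4*i+3] << 24);
    for (i = 0; i < 64; i++) {
        uint32_t f, t;
        int g;
        switch (i >> 4) {               /* round selects function and word index */
        case 0:  f = d ^ (b & (c ^ d)); g = i;               break;
        case 1:  f = c ^ (d & (b ^ c)); g = (5*i + 1) & 15;  break;
        case 2:  f = b ^ c ^ d;         g = (3*i + 5) & 15;  break;
        default: f = c ^ (b | ~d);      g = (7*i) & 15;      break;
        }
        t = d; d = c; c = b;
        b = b + rotl32(a + f + K[i] + m[g], S[4*(i >> 4) + (i & 3)]);
        a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d;
}

/* Hash a whole in-memory buffer; digest receives 16 bytes. */
void md5_buffer(const unsigned char *data, size_t len, unsigned char digest[16])
{
    uint32_t h[4] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476 };
    unsigned char tail[128];
    uint64_t bits = (uint64_t)len * 8;
    size_t rem = len % 64, tail_len, i;

    for (i = 0; i + 64 <= len; i += 64)
        md5_block(h, data + i);

    /* Padding: 0x80, zeros, then the 64-bit bit count, low byte first. */
    memset(tail, 0, sizeof tail);
    if (rem)
        memcpy(tail, data + (len - rem), rem);
    tail[rem] = 0x80;
    tail_len = (rem < 56) ? 64 : 128;
    for (i = 0; i < 8; i++)
        tail[tail_len - 8 + i] = (unsigned char)(bits >> (8 * i));
    for (i = 0; i < tail_len; i += 64)
        md5_block(h, tail + i);

    for (i = 0; i < 4; i++) {
        digest[4*i]   = (unsigned char)(h[i]);
        digest[4*i+1] = (unsigned char)(h[i] >> 8);
        digest[4*i+2] = (unsigned char)(h[i] >> 16);
        digest[4*i+3] = (unsigned char)(h[i] >> 24);
    }
}
```

The per-step switch and table lookups are what a compiler (or a programmer) removes by unrolling, which is where the extra code size in a faster build would be expected to come from.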

Edited by Glenn9999
While Glenn9999's solution indeed seems the winner both on speed, (very close on size), and versatility since it works on systems with stock Win2K+, just out of curiosity I wonder how the solution I've been using (the 2Kb one that does not work on stock Win2K systems attached here) compares in speed on your system with the same files, jaclaz? (and it does do folders of files recursively as well which is sometimes handy)

Cheers and Regards

It is EXACTLY as fast as FCIV! :thumbup

A few runs place it within +/- two hundredths of a second of FCIV, well within the possible timing/repeat loop error.

Now, it would be interesting to understand why exactly it doesn't work on 2K. :unsure:

A quick run of BINText seems to imply it uses the procedures:

  • MD5Init
  • MD5Final
  • MD5Update

which possibly belong to Advapi32.dll, compare with:

http://www.autoitscript.com/forum/topic/112711-gui-md5forfile/

Which should mean that blackwingcats' Kernelex:

Happy new year. :thumbup

I planned to customize Windows 2000 kernel32.dll.

....

v15l

....

Added A_SHAFinal, A_SHAInit, A_SHAUpdate, MD5Update, MD5Final, MD5Init in advapi32.dll

....

would do.

I found here:

http://d-h.st/bk4

a "very similar" 2048-byte program, actually I believe it is the same one re-compiled (and possibly "optimized", but it is as fast as the one bphlpt found)

jaclaz

Edited by jaclaz
Which should mean that blackwingcats' Kernelex:

[...]

would do.

Which would explain why:

... It does, of course, work in my modified system ...

I found here:

http://d-h.st/bk4

a "very similar" 2048-byte program, actually I believe it is the same one re-compiled (and possibly "optimized", but it is as fast as the one bphlpt found)

Thanks for finding that, jaclaz! I had searched for hours. You truly are "The Finder"! I don't know what optimization was done, if any (since the code size is exactly the same though the md5 sig is different), but the command syntax and the recursive ability are the same. Does this one work on stock Win2K? If so, I guess we have a new "winner" (speed and size for md5 only that works in stock Win2K+)? If not, I also wonder which calls it/they use that differ from FCIV's, since FCIV does work with stock Win2K, and why the different calls were chosen. Could it just be a matter of how it was compiled, and if so, if we could find the source, might it be re-compiled in such a way as to work with stock Win2K? [I know, I'm asking a bunch of hypothetical questions no one probably knows the answer to. I'm just thinking out loud.]

EDIT: I also noticed that the same source you used also posted a different version of md5sum (28160 bytes) with supposedly "better precision" here -- http://d-h.st/JR6 -- but I don't have any idea what "better precision" can possibly mean.

Cheers and Regards

Edited by bphlpt
Well, the "new" version is actually:

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+

Copyright © 2001-2005 Jem Berkes - http://www.pc-tools.net/

If you do a hex compare of the two 2048 bytes files, it is evident that they are actually the "same", most probably compiled with some different switch or on different machines.
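For anyone without a hex-compare tool at hand, the check can be approximated with a few lines of C. This is only a rough sketch of a byte-by-byte comparison, not the tool used above; the function name and its interface are made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: count the byte positions at which two files differ.
 * If one file is longer, each extra byte counts as a difference.
 * *first_diff receives the offset of the first difference, or -1 if none.
 * Returns -1 if either file cannot be opened. */
long count_diff_bytes(const char *path_a, const char *path_b, long *first_diff)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    long offset = 0, diffs = 0;
    int ca, cb;

    *first_diff = -1;
    if (!fa || !fb) {
        if (fa) fclose(fa);
        if (fb) fclose(fb);
        return -1;
    }
    for (;;) {
        ca = getc(fa);
        cb = getc(fb);
        if (ca == EOF && cb == EOF)
            break;                      /* both files exhausted */
        if (ca != cb) {
            if (diffs == 0)
                *first_diff = offset;   /* remember where they first diverge */
            diffs++;
        }
        offset++;
    }
    fclose(fa);
    fclose(fb);
    return diffs;
}
```

A small count clustered in a header region would fit the "same code, different build" reading; differences spread throughout the file would suggest otherwise.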

Most probably user CNexus simply uploaded a couple MD5 tools that he/she had around.

The existence of two ("same") builds of the same MD5sum.exe should mean that the source code is *somewhere* around, and it would probably be useful to Glenn9999.

jaclaz

The existence of two ("same") builds of the same MD5sum.exe should mean that the source code is *somewhere* around, and it would probably be useful to Glenn9999.

If it's the same MD5Sum I looked at up above, then the real source wouldn't be too useful because it's just API calls, wherein the real work is happening. Not really thinking Microsoft will release the source to that. Though it might be quicker to call those instead of calling the supported API calls to do the MD5 (which the other exec I posted uses).

Edit: Just did some timings myself. YMMV I suppose.

Machine 1 (elapsed time reported by timer):

timer fciv -md5 cddata.dat         9281 ms
timer hash_file -md5 cddata.dat    9359 ms
timer md5_file cddata.dat          9656 ms
timer md5sum cddata.dat           12468 ms

Machine 2 (elapsed time reported by timer):

timer fciv -md5 cddata.dat         2203 ms
timer hash_file -md5 cddata.dat    2218 ms
timer md5_file cddata.dat          2437 ms
timer md5sum cddata.dat            2203 ms
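Since single runs can differ by a few milliseconds for reasons unrelated to the code (disk cache, other processes), it can help to repeat each measurement and look at the spread. A rough sketch of that, assuming a C11 library that provides timespec_get; the names time_call_ms, time_repeated, and the spin workload are all made up for illustration and have nothing to do with the timer program used above:

```c
#include <time.h>

/* Hypothetical stand-in workload: sum the integers below *arg. */
void spin(void *arg)
{
    volatile long sum = 0;
    long i, n = *(long *)arg;

    for (i = 0; i < n; i++)
        sum += i;
}

/* Wall-clock time of one call to fn(arg), in milliseconds. */
double time_call_ms(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);
    fn(arg);
    timespec_get(&t1, TIME_UTC);
    return (double)(t1.tv_sec - t0.tv_sec) * 1e3
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Run fn(arg) several times; report the fastest and slowest run. */
void time_repeated(void (*fn)(void *), void *arg, int runs,
                   double *min_ms, double *max_ms)
{
    int i;

    *min_ms = *max_ms = time_call_ms(fn, arg);
    for (i = 1; i < runs; i++) {
        double ms = time_call_ms(fn, arg);
        if (ms < *min_ms) *min_ms = ms;
        if (ms > *max_ms) *max_ms = ms;
    }
}
```

If the gap between the fastest and slowest run of one tool is larger than the gap between two tools, the tools are effectively tied on that machine.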

Edited by Glenn9999
Thanks for everything :)

I actually thought that FCIV was rather slow but now I can see that it's not true, especially compared to some of the other tools tested by jaclaz which turned out to be very slow.

Well, the "new" version is actually:

MD5sums 1.2 freeware for Win9x/ME/NT/2000/XP+

Copyright © 2001-2005 Jem Berkes - http://www.pc-tools.net/

FWIW, I just tried this one on Machine 2 with a 734,003,200 byte file (as all the other ones). 3824ms. Some of that I'm sure is the fancy progress indicator, but I'm sure a lot of it is that the algorithm is just the main stock one with no optimizations whatsoever.

Edit: In searching, I think the "Md5sum" name is pretty standard because that's how it's appeared in Unix/Linux, etc. That something is named that is probably not indicative of whether it is a specific version of code or not, since just about any implementation is probably being considered as an OS candidate or a clone/replacement/whatever.

Edit2: dsfo.exe on Machine 2: 2796 ms. My stock MD5 implementation, no optimizations whatsoever: 5312 ms. Just interesting to compare.

Edited by Glenn9999
Just for the record, Paul Houle created a very fast (fastest?) implementation of par2, and most of the improvement seems to come from the MD5 part.

Executable and c++ source code are on the link above.

Got a look at it. Can't say for sure the rest of it is good, but the MD5 part looks relatively standard coding. If there's much appreciable difference from different good implementations (i.e. nothing completely stupid is done), it's probably going to be from the degree of quality of the assembler/compiler used (best being full ASM of course). To that end, I do notice a small speed overhead from UPX-packed executables, so what you'll end up with will be *slightly* faster if it's not UPX-compressed.
