Software needed - MD5 CRC32 Hash generator file list with export and compare

Discussion in 'other software & services' started by dscrap, Jan 21, 2009.

Thread Status:
Not open for further replies.
  1. dscrap

    dscrap Registered Member

    Joined:
    Nov 3, 2004
    Posts:
    156
    I manage a lot of different systems on several different networks. I have been looking for a software application that can create an MD5 or CRC32 hash of all files on a particular drive and then export the list, so that I can later import it and compare it with lists from other drives to find and delete duplicates. Does anyone know of an application that can do this? I have tried several duplicate finder programs that compare hashes from different files, but they will only allow me to export a list of the duplicates found. I need something that will generate the hashes of all files on a drive and then be able to compare them with a list of hashes from another drive to find and delete duplicate files. I've tried searching the net, but cannot find a solution. Most recently I have used DiskState, Easy Duplicate Finder, and Duplicate Cleaner. All can find duplicates, but none allow me to export a list of the hashes of all of the files on different drives. I am sure there is something out there, but I just can't find it.

    Thanks in advance...

    DiskState - 29.95 US
    Easy Duplicate Finder - Free
    Duplicate Cleaner - Free
     
  2. pandlouk

    pandlouk Registered Member

    Joined:
    Jul 15, 2007
    Posts:
    2,565
    Check here. ;)

    Panagiotis
     
  3. dscrap

    Thanks Panagiotis... I downloaded the linked software and played with it for a bit. It seems to be able to generate, export, and import hash lists of drives or directories fine, but I am unsure whether it can load 2 directories and find duplicate files or compare hash lists. If it can, can you describe how to do it?

    Thanks!
     
  4. pandlouk

    You are welcome. :)

    It can load more than one directory and/or hash list, but it will not auto-tag the duplicate files. For this you will need a duplicate finder application. The most powerful and accurate ones (at least from my tests) are:

    Duplicate Files Searcher free and OS independent (mentioned here)
    Duplicate File Detector paid
    Duplicate File Detective paid

    hope it helps,
    Panagiotis
     
  5. cet

    cet Registered Member

    Joined:
    Sep 3, 2006
    Posts:
    867
    Location:
    Turkey/İzmir
    Maybe ViceVersa (free or paid) can do what you want. I am using the free version to compare 2 folders.
     
  6. dscrap

    Thanks all for the replies. I am still looking for that all-in-one application that will do what I want it to. I looked at all of the different applications you fellas listed, but none do it all. I am looking for an application that will do the following:

    1. can create a hash, MD5 or CRC-32, for all files in a directory or on a drive
    2. export said hash list
    3. import multiple hash lists and compare them for duplicates
    4. allow me to choose a duplicate file to delete
    5. when a file is marked for deletion, first confirm that the file still exists and is indeed identical by recreating the hash for all duplicates and comparing them again
    6. update the hash list of the directory or drive to be exported again after a file is deleted
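
    Steps 1 and 2 of the list above are simple to script. What follows is a minimal sketch, assuming Python 3 is available on each machine; the function names (md5_of_file, hash_tree, export_hash_list) are made up for illustration and do not come from any of the tools mentioned in this thread. It writes one "hash *path" line per file. CRC-32 could be substituted by using zlib.crc32 in place of hashlib.md5.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Yield (md5, absolute_path) for every regular file under root."""
    for folder, _subdirs, names in os.walk(root):
        for name in sorted(names):
            path = os.path.abspath(os.path.join(folder, name))
            if os.path.isfile(path):
                yield md5_of_file(path), path


def export_hash_list(root, list_path):
    """Write one 'md5 *path' line per file; return the number of files hashed."""
    count = 0
    with open(list_path, "w", encoding="utf-8") as out:
        for md5, path in hash_tree(root):
            out.write(f"{md5} *{path}\n")
            count += 1
    return count
```

    Save the list somewhere outside the directory being scanned, otherwise a re-run will hash the previous list file as well.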

    Actually the lists don't really need to be exported if the application saves the data in a project folder for instance....

    I know, it's a lot to ask of an application, but there are so many applications that do different parts of what I want, and it really isn't that big of a leap to combine them into one application. Hopefully someone has already come up with one and now all I need to do is find it. Please keep this post in mind when looking at different applications and post anything you feel could fit the bill.
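
    Requirement 5 (re-check before deleting) is the part most worth getting right, so here is a hedged Python 3 sketch of it. The name safe_delete and its arguments are hypothetical. It refuses to delete unless both the file to remove and the copy being kept still exist and still hash to the value recorded in the list.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_delete(victim, keeper, expected_md5):
    """Delete victim only if it is verifiably a duplicate of keeper.

    Returns True if the file was deleted, False if any check failed.
    """
    expected = expected_md5.lower()
    # Never delete the copy that is supposed to survive.
    if os.path.abspath(victim) == os.path.abspath(keeper):
        return False
    # Both files must still be present.
    if not (os.path.isfile(victim) and os.path.isfile(keeper)):
        return False
    # Re-hash both files; a stale list entry fails here.
    if md5_of_file(victim) != expected or md5_of_file(keeper) != expected:
        return False
    os.remove(victim)
    return True
```

    For extra caution, a byte-for-byte check with filecmp.cmp(victim, keeper, shallow=False) could be added before the os.remove call.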

    Thanks!
     
  7. MrBrian

    MrBrian Registered Member

    Joined:
    Feb 24, 2008
    Posts:
    6,032
    Location:
    USA
    I don't know of a program that does this, but here's what you can do instead:

    On each computer, run FileVerifier++ (it's a portable app, by the way), create a list of the files you want to check, and save the results as type 'MD5 Hashes'. Uncheck the relative path option. Name the file of md5 hashes after the computer, so you'll know later which computer the results came from. You can save the md5 file to a USB thumb drive if you have one. You should end up with one md5 file for each computer.

    Example file from computer named 'oberon':
    01821ef8637b86f8ac114ecfea82e999 *f:\Brian's Documents\Example1.xls
    01ebab76621c8685b7d7da4b967bfa5e *f:\Brian's Documents\Example2.xls

    Install Notepad++.

    For each md5 file, in Notepad++ do a search and replace, to insert the computer name in each line in between the md5 hash and filename.

    Example file from computer named 'oberon':
    01821ef8637b86f8ac114ecfea82e999 oberon *f:\Brian's Documents\Example1.xls
    01ebab76621c8685b7d7da4b967bfa5e oberon *f:\Brian's Documents\Example2.xls

    Example file from computer named 'mona':
    01ebab76621c8685b7d7da4b967bfa5e mona *f:\Mona's Documents\Ex2.xls
    00d113fc7a860ac82f1557523fe28803 mona *f:\Mona's Documents\Song1.mp3
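
    The search-and-replace step can also be scripted. A minimal Python 3 sketch, assuming each line has the "hash *path" layout shown above; tag_lines is a hypothetical helper name.

```python
def tag_lines(lines, computer):
    """Insert the computer name between the md5 hash and the filename."""
    tagged = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split only on the first space so spaces inside the path survive.
        md5, rest = line.split(" ", 1)
        tagged.append(f"{md5} {computer} {rest}")
    return tagged


oberon = tag_lines(
    [
        r"01821ef8637b86f8ac114ecfea82e999 *f:\Brian's Documents\Example1.xls",
        r"01ebab76621c8685b7d7da4b967bfa5e *f:\Brian's Documents\Example2.xls",
    ],
    "oberon",
)
for line in oberon:
    print(line)
```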

    Make a new file, combining each one of the md5 files you generated above. Sort the entire resulting file by selecting all text then using TextFX->TextFX Tools->Sort lines case insensitive. You'll end up with a file sorted by md5 hash.

    Combined file sorted:
    00d113fc7a860ac82f1557523fe28803 mona *f:\Mona's Documents\Song1.mp3
    01821ef8637b86f8ac114ecfea82e999 oberon *f:\Brian's Documents\Example1.xls
    01ebab76621c8685b7d7da4b967bfa5e oberon *f:\Brian's Documents\Example2.xls
    01ebab76621c8685b7d7da4b967bfa5e mona *f:\Mona's Documents\Ex2.xls
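
    The combine-and-sort step could be sketched in Python 3 as follows. This is an illustration rather than what TextFX does internally: it sorts on the hash alone, and because Python's sort is stable, lines that share a hash stay in the order the lists were supplied. The name combine_sorted is hypothetical.

```python
def combine_sorted(*tagged_lists):
    """Merge several tagged hash lists and sort the result by md5 hash."""
    combined = [line for lines in tagged_lists for line in lines if line.strip()]
    # Sort key is the first field (the hash), compared case-insensitively.
    return sorted(combined, key=lambda line: line.split(" ", 1)[0].lower())


oberon = [
    r"01821ef8637b86f8ac114ecfea82e999 oberon *f:\Brian's Documents\Example1.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e oberon *f:\Brian's Documents\Example2.xls",
]
mona = [
    r"01ebab76621c8685b7d7da4b967bfa5e mona *f:\Mona's Documents\Ex2.xls",
    r"00d113fc7a860ac82f1557523fe28803 mona *f:\Mona's Documents\Song1.mp3",
]
merged = combine_sorted(oberon, mona)
for line in merged:
    print(line)
```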

    You can then look for adjacent lines that contain the same md5 and copy the line for each duplicate you want to delete into a separate text file. I have a .vbs script which I could modify to list only the md5s of duplicate files, to make this chore easier. Let me know if you want me to do this. When you're done, your new file should list all the files to delete. If the md5 part of each line is still present, you can hold the Alt key to select the md5 text as a block and delete it. Finally, sort the remaining file, which will sort the entries by computer name. A .vbs script could also be written to automate the file deletion task.

    Final file listing files to delete:
    mona *f:\Mona's Documents\Ex2.xls
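
    Picking out the adjacent duplicates by hand gets tedious on long lists. Here is a minimal Python 3 sketch of the same idea, assuming the combined file is already sorted by hash. It keeps the first line of each group and lists the rest as candidates for deletion, with the hash stripped off. The function name delete_candidates is hypothetical, and "keep the first" is only one possible policy.

```python
from itertools import groupby


def delete_candidates(sorted_lines):
    """Return 'computer *path' entries for all but the first file of each hash.

    The input must already be sorted by hash, because groupby only
    groups lines that are adjacent.
    """
    doomed = []
    lines = [line.rstrip("\r\n") for line in sorted_lines if line.strip()]
    for _md5, group in groupby(lines, key=lambda line: line.split(" ", 1)[0]):
        entries = list(group)
        for line in entries[1:]:  # entries[0] is the copy that is kept
            doomed.append(line.split(" ", 1)[1])
    # Sorting the result groups the entries by computer name.
    return sorted(doomed)


combined = [
    r"00d113fc7a860ac82f1557523fe28803 mona *f:\Mona's Documents\Song1.mp3",
    r"01821ef8637b86f8ac114ecfea82e999 oberon *f:\Brian's Documents\Example1.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e oberon *f:\Brian's Documents\Example2.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e mona *f:\Mona's Documents\Ex2.xls",
]
to_delete = delete_candidates(combined)
for entry in to_delete:
    print(entry)
```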
     
  8. dscrap

    Interesting... Thanks for the reply MrBrian. I will look it over and give it a try. I'll let you know how it works...
     
  9. dscrap

    Wow, that was simple. Good thinking MrBrian. I created a text file of the MD5s for a directory using FileVerifier. I saved it, then duplicated it and deleted some entries. I opened the two files in Notepad++, sorted them, and then highlighted everything and used the compare plugin. This highlighted the duplicates in green, greyed out the missing files in between, and highlighted the extra files on the larger list in red. Now all I would have to do is delete the files.
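
    The same two-list comparison can be expressed with set operations. A minimal Python 3 sketch, assuming untagged "hash *path" lines; load_hashes and compare_lists are hypothetical names, and the result only reports hashes, not which file to remove.

```python
def load_hashes(lines):
    """Map each md5 hash to the list of paths recorded for it."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        md5, path = line.split(" ", 1)
        table.setdefault(md5.lower(), []).append(path.lstrip("*"))
    return table


def compare_lists(lines_a, lines_b):
    """Classify hashes as present in both lists or in only one of them."""
    a, b = load_hashes(lines_a), load_hashes(lines_b)
    return {
        "in_both": sorted(a.keys() & b.keys()),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }


list_a = [
    r"01821ef8637b86f8ac114ecfea82e999 *f:\Brian's Documents\Example1.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e *f:\Brian's Documents\Example2.xls",
]
list_b = [
    r"01ebab76621c8685b7d7da4b967bfa5e *f:\Mona's Documents\Ex2.xls",
    r"00d113fc7a860ac82f1557523fe28803 *f:\Mona's Documents\Song1.mp3",
]
result = compare_lists(list_a, list_b)
print(result["in_both"])
```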

    You said you could create a script that could automate the duplicate finding for a particular text file. Would it create a new file with the duplicates? Or would it just list them together in the original file? The reason for having them sorted in the original file would be to make it easier to delete them and then re-save the file with the duplicates omitted. Maybe not... Do you know how many files Notepad++ can have open at one time? Would the compare plugin work across multiple text files? I guess I could open 2 at a time, compare them and update a master list or something, and then compare the master to another file. I'm going to play around with it a bit and see if it will work as is. I appreciate your reply and advice.

    Thanks!
     
  10. MrBrian

    You're welcome :)

    If you want to compare files from only 2 computers, using a file comparison program such as the one you mentioned would work fine. I don't know of a program that compares more than 2 files, so if you use this approach you will have to choose 2 files at a time to compare.

    What I recommended though, since I thought you might have many more than 2 computers involved, is to combine all of the FileVerifier++ output files together into one file (after modifying each individual file as I indicated to insert the computer name or description), and sort by md5, which will group all identical files together.

    The easiest changes to my existing script would do the following: if there are n lines with a given md5 in the combined sorted file, you will get a separate file containing (n-1) copies of that md5, without filenames. Then you could go back to your combined file and copy over the filenames you want to delete. For example, if there are 4 files with md5 01ebab76621c8685b7d7da4b967bfa5e, the script would produce a separate file with these contents:

    01ebab76621c8685b7d7da4b967bfa5e
    01ebab76621c8685b7d7da4b967bfa5e
    01ebab76621c8685b7d7da4b967bfa5e

    Because there are 4 versions of the same file, at most you would want to delete 3 of them (right?). In this case, you would go back to your master file, choose at most 3 files with md5 01ebab76621c8685b7d7da4b967bfa5e, and copy and paste their filenames to the file the script produced.
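
    The .vbs script itself was not posted, so the following is only a Python 3 sketch of the behaviour described above: for every hash that occurs n times, emit n-1 copies of it. The name surplus_hashes is hypothetical.

```python
from collections import Counter


def surplus_hashes(lines):
    """Return n-1 copies of every md5 that appears on n lines."""
    counts = Counter(line.split(" ", 1)[0] for line in lines if line.strip())
    surplus = []
    for md5 in sorted(counts):
        # A hash seen once contributes zero copies.
        surplus.extend([md5] * (counts[md5] - 1))
    return surplus


example = [
    "00d113fc7a860ac82f1557523fe28803 mona *song",
    "01ebab76621c8685b7d7da4b967bfa5e pc1 *copy1",
    "01ebab76621c8685b7d7da4b967bfa5e pc2 *copy2",
    "01ebab76621c8685b7d7da4b967bfa5e pc3 *copy3",
    "01ebab76621c8685b7d7da4b967bfa5e pc4 *copy4",
]
for md5 in surplus_hashes(example):
    print(md5)
```

    The computer names and paths in the example list are placeholders, not real data.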
     
  11. EASTER

    EASTER Registered Member

    Joined:
    Jul 28, 2007
    Posts:
    5,633
    Location:
    U.S.A. (South)
    Have you tried INTEGRITY CHECKER?

    (free) http://www.portablefreeware.com/?sc=57

    THIS IS THE PORTABLE VERSION

    I use it all the time to confirm the hashes are in sync. You can inventory entire FOLDERS AND ALL THEIR FILES AS WELL AS ADD A CONTEXT MENU FOR YOUR CONVENIENCE.


    Regards EASTER
     

    Attached Files:

    • RR.gif (16.3 KB)
  12. MrBrian

    A different idea for the proposed script: it could create a file listing all duplicates, including filename and md5. Then you could manually choose which to delete (but you have to remember not to delete every copy of a given file).
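
    This second idea could be sketched in Python 3 like so. It is an illustration only, and duplicate_report is a hypothetical name. It keeps every line whose hash occurs more than once, with the hash and filename intact, and drops the unique files.

```python
from collections import defaultdict


def duplicate_report(lines):
    """Return every line whose md5 occurs more than once, grouped by md5."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        if line.strip():
            groups[line.split(" ", 1)[0]].append(line)
    report = []
    for md5 in sorted(groups):
        if len(groups[md5]) > 1:  # unique files are left out
            report.extend(groups[md5])
    return report


combined = [
    r"01821ef8637b86f8ac114ecfea82e999 oberon *f:\Brian's Documents\Example1.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e oberon *f:\Brian's Documents\Example2.xls",
    r"01ebab76621c8685b7d7da4b967bfa5e mona *f:\Mona's Documents\Ex2.xls",
    r"00d113fc7a860ac82f1557523fe28803 mona *f:\Mona's Documents\Song1.mp3",
]
for line in duplicate_report(combined):
    print(line)
```

    Unlike the sorted-file approach, this does not need the input to be sorted first.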
     