How to Use Md5deep for Fast Integrity Checks and Forensics

Overview

md5deep is a cross-platform suite of tools (md5deep, sha1deep, sha256deep, hashdeep, and others) that computes hashes recursively, creates and verifies hash lists, and supports positive and negative matching against lists of known hashes. That makes it useful for quick integrity checks and basic forensic audits.

Install

  • Debian/Ubuntu: sudo apt install hashdeep (some releases also provide the package under the older name md5deep), or build from source: https://github.com/jessek/hashdeep
  • macOS: brew install md5deep (or build from source)
  • Windows: use a packaged binary or build from source

Common workflows and commands

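  • Generate a recursive hash list (-l prints each path as given on the command line instead of expanding it to an absolute path, so pass a relative directory if you want relative paths in the list):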

    Code

    md5deep -rl /path/to/dir > hashes.md5
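  • Verify against a stored list (negative matching: -X prints the hash and path of every file whose hash is NOT in the list; files deleted since the list was made are not reported):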

    Code

    md5deep -rX hashes.md5 /path/to/dir
  • Positive match (list files that match known hashes):

    Code

    md5deep -r -m knownhashes.txt /path/to/dir
  • Generate SHA-256 instead (recommended over MD5 for stronger integrity):

    Code

    sha256deep -rl /path/to/dir > hashes.sha256
  • Produce DFXML (forensic XML) output:

    Code

    md5deep -d -r /path/to/dir > output.xml
  • Only process regular files (expert mode):

    Code

    md5deep -o f -r /path/to/dir
  • Null-terminated output (for safe scripting with filenames containing newlines):

    Code

    md5deep -0 -rl /path/to/dir > hashes0.md5
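
The NUL-terminated list can be read back safely in a shell loop. The following is a minimal sketch, assuming bash (read -d '' is not POSIX) and assuming each record has the form "hash, two spaces, path", which is md5deep's default layout. The function name report_missing is hypothetical. It prints every path in the list that no longer exists, something negative matching with -X does not report.

```shell
# Read NUL-terminated "hash  path" records (as written with md5deep -0)
# from stdin and report every listed path that no longer exists.
# Assumption: records look like "<hash><two spaces><path>".
report_missing() {
    local record hash path
    while IFS= read -r -d '' record; do
        hash=${record%%  *}   # everything before the first two-space separator
        path=${record#*  }    # everything after it (may contain newlines)
        [ -e "$path" ] || printf 'missing: %s (%s)\n' "$path" "$hash"
    done
}
```

Usage: report_missing < hashes0.md5. Run it from the same working directory that was used to create the list, so that relative paths resolve to the same files.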

Tips for forensic use

  • Use stronger algorithms (sha256deep) for critical evidence; MD5 is collision-prone.
  • Store hash lists on write-once or offline media to prevent tampering.
  • Record metadata (command, host, timestamp); use -d to generate DFXML with provenance.
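  • Use relative paths (-l together with a relative directory argument, run from a known working directory) to make hash files portable between systems.
  • Use null-terminated output (-0) when a script has to parse the list and filenames may contain newlines or other unusual characters.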
  • Combine with other tools (rsync, git, forensic suites) for workflow automation.
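
The tips on recording metadata and protecting the list can be sketched as a small helper. This is an illustration, not part of md5deep: the function name write_manifest and the .meta file layout are hypothetical, and it assumes bash plus the coreutils sha256sum command (on macOS, shasum -a 256 would be the usual substitute).

```shell
# Write a small provenance record next to an existing hash list and make
# both files read-only. This only records the command; it does not run it.
# Usage: write_manifest LIST_FILE COMMAND [ARGS...]
write_manifest() {
    local list=$1
    shift
    {
        printf 'command: %s\n' "$*"
        printf 'host: %s\n' "$(uname -n)"
        printf 'utc_time: %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        # Hash of the list itself, so later edits to the list are detectable.
        printf 'list_sha256: %s\n' "$(sha256sum "$list" | cut -d' ' -f1)"
    } > "$list.meta"
    # Removing write permission guards against accidents only; it is no
    # substitute for write-once or offline storage.
    chmod a-w "$list" "$list.meta"
}
```

Usage: after sha256deep -rl /srv/www > baseline.sha256, call write_manifest baseline.sha256 sha256deep -rl /srv/www, then copy both files to the protected medium.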

Example: create baseline and later verify

  1. Baseline:

    Code

    sha256deep -rl /srv/www > baseline.sha256
  2. Later, verify and list new or modified files (files deleted since the baseline are not reported by -X):

    Code

    sha256deep -rX baseline.sha256 /srv/www

Limitations & security notes

  • MD5 and SHA-1 are vulnerable to collisions; prefer SHA-256 or stronger for high-assurance needs.
  • md5deep/hashdeep identify identical content by hash but cannot prove provenance or detect sophisticated tampering that produces colliding hashes.

Sources: md5deep/hashdeep documentation and Linux/security articles.
