Friday, October 24, 2025

How to Extract Hidden Metadata from Images using Kali Linux — A Step-by-Step Tutorial

 


Disclaimer & ethics: extracting metadata and hidden data from images can reveal sensitive information (GPS coordinates, camera make/model, editing history, hidden files, or even private messages). Use these techniques only on images you own, images you have explicit permission to analyze, or for legitimate security and forensic purposes. Unauthorized analysis of someone else’s media may be illegal in your jurisdiction.

This tutorial walks you through practical, hands-on steps to discover visible metadata (EXIF/IPTC/XMP) and hidden content inside image files (embedded files, steganography, LSB, appended archives) using Kali Linux tools. I’ll show commands, explain outputs, and give tips for cleaning or safely extracting embedded content.

What you’ll need

  • A machine running Kali Linux (or any Linux with the same tools installed).
  • Terminal access and basic familiarity with bash.
  • Root or sudo privileges for installing packages (if not already installed).
  • Tools used in this guide (most are preinstalled on Kali):
    • exiftool (metadata swiss-army knife)
    • exiv2 or exif (alternate metadata viewers)
    • file, hexdump, xxd (file identification / raw view)
    • strings (extract readable text from binaries)
    • binwalk (scan for embedded files and compressed data)
    • foremost / scalpel (carving embedded files)
    • steghide, stegseek, stegdetect, zsteg, stegsolve (steganography tools)
    • gimp or imagemagick (image inspection / manipulation)
    • hashdeep or sha256sum (integrity checks)
  • A safe working directory to copy and analyze images (do not analyze originals; work on copies).

Quick setup (installing any missing tools)

Open a terminal and run:

sudo apt update
sudo apt install libimage-exiftool-perl exiv2 exif binwalk foremost scalpel steghide stegseek imagemagick gimp
sudo gem install zsteg   # zsteg is distributed as a Ruby gem

If a specific tool isn’t in Kali's repos (stegdetect and stegsolve, for example, may need a manual install), follow the tool’s README. Many Kali images already include the core tools.

Step 1 — Make a copy & preserve integrity

Never work on the only copy of an evidence file. Copy the image to your working folder and compute hashes:

mkdir -p ~/image_analysis
cp /path/to/original.jpg ~/image_analysis/
cd ~/image_analysis
cp original.jpg working.jpg        # work on working.jpg
sha256sum original.jpg > original.sha256
sha256sum working.jpg > working.sha256

Comparing hashes later helps detect accidental modification.
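
A later comparison could be scripted along these lines. This is a minimal sketch: the helper name same_digest is my own illustrative choice, not part of any tool, and it assumes sha256sum and cut from coreutils are available.

```shell
# same_digest FILE_A FILE_B
# Succeeds (exit 0) only when both files have the same SHA-256 digest.
# Hypothetical helper for illustration; not part of any standard tool.
same_digest() {
  digest_a=$(sha256sum "$1" | cut -d' ' -f1)
  digest_b=$(sha256sum "$2" | cut -d' ' -f1)
  [ "$digest_a" = "$digest_b" ]
}
```

Running same_digest original.jpg working.jpg || echo "working copy has changed" flags a modified working copy. The saved hash files can also be re-checked at any time with sha256sum -c original.sha256.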

Step 2 — Basic file identification

Start by asking the filesystem what this file claims to be:

file working.jpg
identify -verbose working.jpg | head -n 20   # ImageMagick identify

file reports the container type (JPEG, PNG, TIFF, WebP). identify -verbose gives image dimensions, color profile, and so on. If the reported type does not match the file extension, be cautious: a mislabeled file may not be an image at all, and even a valid image container can carry other data.
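
If you want to see what such a type check boils down to, the sketch below reads the first bytes itself and compares them with three well-known signatures (JPEG, PNG, GIF). The function name sniff_type is an assumption made for this example; file knows far more formats, so treat this as an illustration rather than a replacement.

```shell
# sniff_type FILE
# Print jpeg, png, gif, or unknown based on the file's leading magic bytes.
# Hypothetical helper; covers only three formats on purpose.
sniff_type() {
  # First 8 bytes as one lowercase hex string with no spaces.
  magic=$(od -An -tx1 -N8 "$1" | tr -d ' \n')
  case "$magic" in
    ffd8ff*)           echo jpeg ;;     # JPEG start-of-image marker
    89504e470d0a1a0a*) echo png ;;      # PNG 8-byte signature
    474946383[79]61*)  echo gif ;;      # GIF87a or GIF89a
    *)                 echo unknown ;;
  esac
}
```

If sniff_type working.jpg prints anything other than jpeg, the extension and the content disagree and the file deserves a closer look.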

Step 3 — Read EXIF/IPTC/XMP metadata (human-readable)

The most common useful metadata lives in EXIF, IPTC, and XMP tags. exiftool is the best all-around tool:

exiftool working.jpg

This lists camera manufacturer, model, creation timestamps, GPS coordinates, software used to edit, resolution, thumbnails, and many other tags.

Key things to look for:

  • CreateDate, DateTimeOriginal — when photo was taken
  • Model, Make — camera or phone used
  • GPSLatitude, GPSLongitude — embedded geolocation
  • Software or ProcessingSoftware — editing apps used
  • Artist, Copyright, ImageDescription — user-supplied tags
  • Thumb* fields — embedded thumbnails that may contain original unedited image

If you want XML/JSON output:

exiftool -j working.jpg   # JSON
exiftool -X working.jpg   # XML (RDF/XML)

Alternative viewers:

exiv2 -pa working.jpg    # prints metadata
exif -m working.jpg      # simpler listing
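
By default exiftool prints GPS values in degrees, minutes, and seconds. For pasting into a map, decimal degrees are easier. The sketch below uses exiftool's -n (no print conversion) and -p (print format) options; the wrapper name gps_decimal is an illustrative assumption.

```shell
# gps_decimal FILE
# Print "latitude,longitude" in signed decimal degrees, or nothing if the
# file has no GPS tags. Hypothetical wrapper around exiftool.
gps_decimal() {
  if ! command -v exiftool >/dev/null 2>&1; then
    echo "exiftool not installed" >&2
    return 127
  fi
  # Single quotes keep the shell from expanding the exiftool tag names.
  exiftool -n -p '$GPSLatitude,$GPSLongitude' "$1" 2>/dev/null
}
```

An empty result simply means no coordinates were embedded; it says nothing about other metadata.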

Step 4 — Search readable strings and hidden text

Files may contain plain text (comments, hidden messages):

strings -n 5 working.jpg | less

-n 5 shows strings >=5 characters. Look for email addresses, URLs, base64 blobs, or suspicious keywords (BEGIN RSA PRIVATE KEY, PK (zip), JFIF, Exif, etc).

If you find base64 blobs, decode and inspect:

echo 'BASE64STRING' | base64 -d > decoded.bin
file decoded.bin
strings decoded.bin | less
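
Spotting base64 by eye in a long strings listing is tedious. One possible shortcut is to grep the file for long runs of base64-alphabet characters, as sketched here. The function name find_base64_candidates and the default minimum length of 40 are my own assumptions; expect false positives, since any long alphanumeric run matches.

```shell
# find_base64_candidates FILE [MINLEN]
# Print runs of base64-alphabet characters at least MINLEN long (default 40),
# optionally followed by = padding. Hypothetical helper; matches are only
# candidates and still need to be decoded and inspected.
find_base64_candidates() {
  minlen=${2:-40}
  # LC_ALL=C and -a make grep treat the binary file as plain bytes.
  LC_ALL=C grep -aoE "[A-Za-z0-9+/]{${minlen},}={0,2}" "$1"
}
```

Each candidate can then be piped through base64 -d and checked with file, exactly as in the commands above.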

Step 5 — Inspect the raw bytes (hex view) to find appended data

Many files hide extra data by appending files after the legitimate image data (e.g., a ZIP appended after JPEG). Use hexdump or xxd to inspect the file tail:

xxd -g 1 -s -512 working.jpg | less
# or show entire file headers:
xxd -l 256 working.jpg

Search for signatures:

  • ZIP: 50 4B 03 04 (PK..)
  • PDF: %PDF
  • PNG chunks: IDAT / IEND
  • JPEG end: FF D9 — anything after FF D9 may be appended data.

If you find a ZIP signature after the image, try extracting the appended data:

# carve the ZIP out (example offset)
dd if=working.jpg of=embedded.zip bs=1 skip=OFFSET
unzip embedded.zip
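
Rather than reading OFFSET off a hex dump, you can let grep report it. The following is a sketch under the assumption that GNU grep is in use; zip_offset and carve_from are hypothetical helper names. tail -c is used instead of dd bs=1 only because it is faster on large files.

```shell
# zip_offset FILE
# Print the 0-based byte offset of the first ZIP local-file header
# (the bytes P K 0x03 0x04), or nothing if there is none.
zip_offset() {
  sig=$(printf 'PK\003\004')
  LC_ALL=C grep -obaF -- "$sig" "$1" | head -n 1 | cut -d: -f1
}

# carve_from FILE OFFSET OUT
# Copy everything from byte OFFSET (0-based) to the end of FILE into OUT.
carve_from() {
  # tail -c +N counts from 1, hence the + 1.
  tail -c "+$(( $2 + 1 ))" "$1" > "$3"
}
```

A typical call would be off=$(zip_offset working.jpg) followed by carve_from working.jpg "$off" embedded.zip. Keep in mind that the four signature bytes can also occur by chance inside compressed image data, so confirm the result with file or unzip -l.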

You can also let binwalk find and extract:

binwalk -e working.jpg
# extracted files appear in _working.jpg.extracted/

binwalk -e tries to detect embedded files and extract them. Always review extracted files in a sandbox.

Step 6 — Recover hidden files with carving tools

If binwalk shows compressed streams or you suspect embedded files but extraction fails, use carving:

foremost -t all -i working.jpg -o foremost_out
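# or (scalpel only carves file types that are uncommented in its config, typically /etc/scalpel/scalpel.conf)
scalpel working.jpg -o scalpel_out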

These tools scan for file signatures and reconstruct files. Output often contains recovered JPEGs, PNGs, ZIPs, PDFs, etc.

Step 7 — Steganography detection and extraction

Steganography hides messages within pixels or audio data. Kali’s toolbox helps detect common methods.

7A — Detect LSB / simple stego heuristics

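Use stegdetect (command line) or stegsolve (GUI) to look for signs of steganography in JPEGs:

stegdetect working.jpg

stegdetect runs statistical tests on a JPEG's DCT coefficients and reports the likely use of embedding tools such as jsteg, jphide, and outguess. False positives occur, so treat a hit as an indicator, not proof.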

stegsolve is a Java GUI that lets you visually inspect color planes, bit planes, and filters. Start it and load the image, then flip planes — hidden messages sometimes appear on certain bit planes.

7B — zsteg for PNG analysis

If the file is PNG, zsteg (Ruby gem) inspects LSBs and color channels:

zsteg working.png

It identifies possible encodings (LSB, RGB LSB, palette LSB) and can dump payloads.

7C — steghide (common stego tool)

steghide embeds files into images and audio using passphrases. Check for steghide data:

steghide info working.jpg
# if it reports "embedded data" you can try extracting:
steghide extract -sf working.jpg -xf extracted.dat
# steghide will prompt for passphrase (try empty passphrase first)

If you don't know the passphrase, a wordlist attack with a tool such as StegCracker or stegseek is possible, but brute forcing can be time consuming and is legally questionable on files that aren't yours.

7D — stegseek to search for hidden messages (attack known payloads)

stegseek can try to recover messages if you suspect a particular payload or password list:

stegseek working.jpg wordlist.txt

It attempts steghide-style extraction with each password from the wordlist.

Step 8 — Extract embedded thumbnails and previous versions

Many camera images include embedded thumbnails or original unedited images (useful if the displayed image was altered). exiftool can extract the thumbnail:

exiftool -b -ThumbnailImage working.jpg > thumbnail.jpg

Also, look for PreviewImage, JpegThumbnail tags and extract them similarly.

Step 9 — Check for hidden data in metadata fields (base64, json, scripts)

Sometimes malicious or interesting info is hidden inside metadata tags as base64 blobs, JSON or scripts. Use exiftool to dump all tags and search:

exiftool -a -u -g1 working.jpg | less
# -a: show duplicate tags; -u: unknown; -g1: group names

If you find long base64 fields, decode them (as shown earlier) and inspect contents.

Step 10 — Image analysis and visualization

Use image tools to expose hidden content visually:

  • Open the image in GIMP and inspect channels, layers, and filters. Use color/contrast adjustments to reveal faint overlays.
  • Use imagemagick to split the image into its color channels for inspection:
convert working.jpg -separate channel_%d.png
# or extract a single channel (here: red)
convert working.jpg -channel R -separate channel_R.png

You can also normalize contrast, sharpen, or apply histogram equalization to reveal faint watermarks or stego artifacts:

convert working.jpg -normalize -contrast -sharpen 0x1 enhanced.png

Step 11 — Document findings and preserve evidence

If you’re performing forensic analysis, record each step, timestamps, commands used, file hashes, and extracted artifacts. Keep chain-of-custody notes if the work is legal evidence.

Example minimal log entry:

2025-10-14 10:12 IST — Copied original.jpg -> working.jpg (sha256: ...)
exiftool working.jpg -> found GPSLatitude/GPSLongitude: 12.9716,77.5946
binwalk -e working.jpg -> extracted embedded.zip (sha256: ...)
steghide info working.jpg -> embedded data present
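
Log lines like these are easy to forget in the middle of an analysis, so a tiny helper can write them for you. This is only a sketch: log_step and log_hash are hypothetical names, and the timestamp is written in UTC rather than the local time used in the example above.

```shell
# log_step LOGFILE MESSAGE...
# Append one UTC-timestamped line to LOGFILE. Hypothetical helper.
log_step() {
  logfile=$1
  shift
  printf '%s  %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> "$logfile"
}

# log_hash LOGFILE FILE
# Record FILE's SHA-256 digest in LOGFILE.
log_hash() {
  log_step "$1" "sha256 $(sha256sum "$2" | cut -d' ' -f1)  $2"
}
```

For instance, log_step notes.log "binwalk -e working.jpg" followed by log_hash notes.log embedded.zip records both the command and the digest of what it produced.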

Step 12 — Remove metadata (if you need to protect privacy)

If your goal is privacy, remove metadata safely:

# remove all metadata (destructive)
exiftool -all= -overwrite_original target.jpg

# to remove GPS only:
exiftool -gps:all= -overwrite_original target.jpg

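Verify by re-running exiftool target.jpg: the removed tags should be gone (exiftool still lists file-level properties such as size and dimensions). Note that -overwrite_original replaces the file in place, so keep a backup.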

For thorough removal, re-encode the image (which often removes extra chunks):

convert target.jpg -strip cleaned.jpg

-strip removes profiles, comments, and ancillary chunks. Writing a JPEG this way recompresses it, so expect a small loss in quality.
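
As a quick byte-level cross-check that does not depend on exiftool, you can search the cleaned file for the identifier strings that introduce EXIF, XMP, and Photoshop/IPTC segments. The sketch below is a rough heuristic, not a parser, and has_metadata_markers is a name invented for this example.

```shell
# has_metadata_markers FILE
# Succeeds if FILE still contains the byte strings that introduce
# EXIF ("Exif"), XMP ("ns.adobe.com/xap/"), or Photoshop/IPTC
# ("Photoshop 3.0") segments. Heuristic only: the same bytes could
# in principle occur by chance in image data.
has_metadata_markers() {
  LC_ALL=C grep -qaE 'Exif|ns\.adobe\.com/xap/|Photoshop 3\.0' "$1"
}
```

Running has_metadata_markers cleaned.jpg && echo "metadata segments may remain" gives a second opinion next to the exiftool check.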

Additional tips & pitfalls

  • False positives: Tools like stegdetect can signal stego where none exists. Always corroborate with multiple methods (visual inspection, different tools).
  • Image recompression: Editing and saving images via editors can alter or remove metadata; always work on copies.
  • Non-image containers: Some “images” are wrappers for other data. file and xxd are quick ways to spot mismatches.
  • Legal & ethical concerns: Don’t attempt password cracking or brute-force extraction on files you don’t own unless authorized.
  • Automate scan pipelines: For many files, script a pipeline: file → exiftool → strings → binwalk → zsteg/steghide. Log outputs to structured files for review.
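
For the batch case in the last tip, a small driver loop is usually enough. The sketch below assumes you already have some per-file scanner command; scan_dir is a hypothetical helper name and the extension list is just a starting point. File names that contain newlines are not handled.

```shell
# scan_dir DIR SCANNER
# Run SCANNER once for every JPEG/PNG file under DIR, in sorted order.
# Hypothetical helper; SCANNER is any command that takes one file path.
scan_dir() {
  find "$1" -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) |
    sort |
    while IFS= read -r img; do
      # </dev/null stops the scanner from consuming the file list on stdin.
      "$2" "$img" < /dev/null || echo "scan failed: $img" >&2
    done
}
```

For example, scan_dir ./evidence exiftool prints metadata for every image in the folder, and a failing file is reported without stopping the rest of the batch.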

Example workflow (compact)

  1. mkdir -p work && cp image.jpg work/ && cd work
  2. sha256sum image.jpg > image.sha256
  3. file image.jpg && identify -verbose image.jpg | head -n 10
  4. exiftool image.jpg > metadata.txt
  5. strings -n 5 image.jpg > strings.txt
  6. binwalk -e image.jpg
  7. tail -c 512 image.jpg | hexdump -C | less
  8. steghide info image.jpg → if embedded: steghide extract -sf image.jpg
  9. zsteg image.png (if PNG)
  10. gimp image.jpg / convert image.jpg -normalize enhanced.jpg

Conclusion

Kali Linux offers a powerful, open toolbox to discover both visible metadata and more deeply hidden content inside images. exiftool is your first stop for human-readable EXIF/IPTC/XMP tags; binwalk, strings, hexdump, and carving tools help find appended or embedded files; steganography tools like steghide, zsteg, and stegsolve tackle hidden payloads within pixel data.

Always respect privacy and law — use these techniques for defensive, forensic, or educational purposes only. When in doubt, ask for explicit permission before analyzing files that aren’t yours.

The rest of this post provides:

  1. A ready-to-use, well-commented bash script that automates a safe scanning pipeline on Kali Linux (creates a working copy, computes hashes, runs file/identify/exiftool/strings/binwalk/steghide/zsteg/foremost, extracts thumbnails, and writes logs).
  2. A printable one-page checklist you can use during manual investigations.
  3. Short instructions on how to save and run the script.

Use the script only on images you own or have explicit permission to analyze.

1) Save-and-run script (automated scan)

Save the following to a file named image_forensic_scan.sh. Make it executable and run it on Kali.

#!/usr/bin/env bash
# image_forensic_scan.sh
# Usage: ./image_forensic_scan.sh /path/to/image.jpg
# Kali-friendly forensic scan pipeline (safe, read-only by default)
# NOTE: Run on copies of originals; the script creates a working dir and logs actions.

set -euo pipefail
IFS=$'\n\t'

if [ $# -lt 1 ]; then
  echo "Usage: $0 /path/to/image"
  exit 2
fi

ORIG_PATH="$1"
TIMESTAMP=$(date -u +"%Y%m%dT%H%M%SZ")
BASENAME="$(basename "$ORIG_PATH")"
WORKDIR="$PWD/image_scan_${BASENAME%.*}_$TIMESTAMP"
LOG="$WORKDIR/scan.log"

mkdir -p "$WORKDIR"
echo "Working directory: $WORKDIR"
exec > >(tee -a "$LOG") 2>&1

echo "==== Image forensic scan ===="
echo "Original file: $ORIG_PATH"
echo "Timestamp (UTC): $TIMESTAMP"
echo

# 1. Make safe copy
COPY_PATH="$WORKDIR/${BASENAME}"
cp -a "$ORIG_PATH" "$COPY_PATH"
echo "[+] Copied original to: $COPY_PATH"

# 2. Hash originals and copy
echo "[+] Computing hashes..."
sha256sum "$ORIG_PATH" | tee "$WORKDIR/original.sha256"
sha256sum "$COPY_PATH" | tee "$WORKDIR/working.sha256"

# 3. Basic file identification
echo; echo "=== file / identify ==="
file "$COPY_PATH" | tee "$WORKDIR/file_output.txt"
if command -v identify >/dev/null 2>&1; then
  identify -verbose "$COPY_PATH" | head -n 40 > "$WORKDIR/identify_head.txt" || true
  echo "[+] ImageMagick identify saved to identify_head.txt"
else
  echo "[!] ImageMagick 'identify' not found; skipping."
fi

# 4. EXIF/IPTC/XMP metadata
echo; echo "=== exiftool (metadata) ==="
if command -v exiftool >/dev/null 2>&1; then
  exiftool -a -u -g1 "$COPY_PATH" > "$WORKDIR/exiftool_all.txt" || true
  exiftool -j "$COPY_PATH" > "$WORKDIR/exiftool.json" || true
  echo "[+] exiftool output saved (text + json)"
else
  echo "[!] exiftool not found; install it (sudo apt install libimage-exiftool-perl)"
fi

# 5. Strings (readable text)
echo; echo "=== strings (readable text) ==="
if command -v strings >/dev/null 2>&1; then
  strings -n 5 "$COPY_PATH" > "$WORKDIR/strings_n5.txt" || true
  echo "[+] strings output saved"
else
  echo "[!] strings not found; skipping."
fi

# 6. Hex tail check for appended content
echo; echo "=== hex tail check ==="
if command -v xxd >/dev/null 2>&1; then
  xxd -g 1 -s -1024 "$COPY_PATH" | tee "$WORKDIR/hex_tail.txt" || true
  echo "[+] last 1024 bytes saved to hex_tail.txt"
else
  echo "[!] xxd not found; skipping hex output."
fi

# 7. Binwalk extraction (embedded files)
echo; echo "=== binwalk (scan & extract) ==="
if command -v binwalk >/dev/null 2>&1; then
  mkdir -p "$WORKDIR/binwalk"
  binwalk -e "$COPY_PATH" -C "$WORKDIR/binwalk" | tee "$WORKDIR/binwalk_stdout.txt" || true
  echo "[+] binwalk extraction saved under $WORKDIR/binwalk"
else
  echo "[!] binwalk not installed; install (sudo apt install binwalk) to enable embedded file extraction."
fi

# 8. Carving (foremost)
echo; echo "=== foremost (carving) ==="
if command -v foremost >/dev/null 2>&1; then
  mkdir -p "$WORKDIR/foremost_out"
  foremost -i "$COPY_PATH" -o "$WORKDIR/foremost_out" || true
  echo "[+] foremost output saved to foremost_out/"
else
  echo "[!] foremost missing; install (sudo apt install foremost) to enable carving."
fi

# 9. Steganography tools: steghide / zsteg / stegdetect
echo; echo "=== steghide / steg tools ==="
if command -v steghide >/dev/null 2>&1; then
  echo "Running: steghide info (non-interactive, empty passphrase)"

  # -p "" supplies an empty passphrase so steghide does not stop to prompt
  steghide info -p "" "$COPY_PATH" > "$WORKDIR/steghide_info.txt" 2>&1 < /dev/null || true
  echo "[+] steghide info -> steghide_info.txt"
else
  echo "[!] steghide not installed (sudo apt install steghide) - skipping."
fi

# zsteg is PNG-specific (Ruby gem). Run if it's a png and zsteg exists
MIME=$(file --brief --mime-type "$COPY_PATH")
if [[ "$MIME" == "image/png" ]] && command -v zsteg >/dev/null 2>&1; then
  echo; echo "=== zsteg (PNG LSB analysis) ==="
  zsteg "$COPY_PATH" > "$WORKDIR/zsteg.txt" 2>&1 || true
  echo "[+] zsteg output saved"
else
  if [[ "$MIME" == "image/png" ]]; then
    echo "[!] zsteg not found; consider installing (gem install zsteg)"
  fi
fi

# 10. Extract embedded thumbnail (exiftool)
echo; echo "=== Extract embedded thumbnail / preview ==="
if command -v exiftool >/dev/null 2>&1; then
  exiftool -b -ThumbnailImage "$COPY_PATH" > "$WORKDIR/thumbnail.jpg" 2>/dev/null || true
  exiftool -b -PreviewImage "$COPY_PATH" > "$WORKDIR/preview.jpg" 2>/dev/null || true
  # verify files
  for f in thumbnail.jpg preview.jpg; do
    if [ -s "$WORKDIR/$f" ]; then
      echo "[+] extracted $f"
    else
      rm -f "$WORKDIR/$f"
    fi
  done
else
  echo "[!] exiftool not installed; cannot extract thumbnails."
fi

# 11. Quick sanity: check for ZIP/PDF signatures in the raw bytes or strings output
echo; echo "=== Quick signature checks ==="
# Match the full ZIP local-file header (PK\x03\x04); a bare "PK" often occurs by chance in compressed data
if LC_ALL=C grep -qaF "$(printf 'PK\003\004')" "$COPY_PATH" 2>/dev/null; then
  echo "[!] ZIP local-file signature spotted: possible embedded ZIP. Inspect hex_tail.txt and binwalk output."
fi
if grep -q "%PDF" "$WORKDIR/strings_n5.txt" 2>/dev/null; then
  echo "[!] '%PDF' signature found in strings -> possible embedded PDF"
fi

# 12. Save a short summary
echo; echo "=== Summary report ==="
SUMMARY="$WORKDIR/summary.txt"
{
  echo "Scan summary for: $COPY_PATH"
  echo "Timestamp (UTC): $TIMESTAMP"
  echo
  echo "file output:"
  file "$COPY_PATH"
  echo
  echo "Top exif tags (sample):"
  if command -v exiftool >/dev/null 2>&1; then
    exiftool -S -s -DateTimeOriginal -Make -Model -GPSLatitude -GPSLongitude -Software "$COPY_PATH" | sed '/^$/d'
  else
    echo "exiftool missing"
  fi
  echo
  echo "Binwalk extract dir: $WORKDIR/binwalk"
  echo "Foremost dir: $WORKDIR/foremost_out"
  echo "Steghide info: $WORKDIR/steghide_info.txt"
  echo
  echo "End of summary."
} > "$SUMMARY"

echo "[+] Summary created at $SUMMARY"
echo "All outputs and logs are in: $WORKDIR"
echo "Scan finished."

# Reminder / safety note
echo
echo "=== Reminder ==="
echo "Work only on copies. Do not attempt password cracking on files you don't own without permission."

How to run:

  1. Save the file: nano image_forensic_scan.sh → paste → save.
  2. Make executable: chmod +x image_forensic_scan.sh
  3. Run: ./image_forensic_scan.sh /path/to/image.jpg
  4. Inspect the created working directory (named image_scan_<name>_<timestamp>) for logs and extracted artifacts.

2) Printable one-page checklist (copy/print)

Use this as your quick reference when you need to run manual checks or verify automated script results.

  1. Prepare

    • Work on a copy. Create a working directory.
    • Compute and save file hashes (SHA256) for original and working copy.
  2. Identify file & basic info

    • file image.jpg
    • identify -verbose image.jpg (ImageMagick)
    • Note differences between extension and actual container.
  3. Read visible metadata

    • exiftool image.jpg → dump to text and JSON.
    • Look for DateTimeOriginal, Make, Model, GPS*, Software, Artist.
  4. Search readable text

    • strings -n 5 image.jpg | less
    • Check for emails, URLs, PK (zip), BEGIN blocks, base64 strings.
  5. Inspect bytes and tail

    • xxd -s -512 image.jpg | less
    • Locate FF D9 (JPEG end). Anything after end-of-image may be appended data.
  6. Extract embedded files

    • binwalk -e image.jpg → check _image.jpg.extracted/
    • If PK found, carve/extract appended zip (dd by offset or binwalk carve).
  7. Carve and recover

    • foremost -i image.jpg -o foremost_out
    • scalpel as alternative.
  8. Steganography checks

    • steghide info image.jpg → try steghide extract (authorized only).
    • zsteg image.png for PNG LSB inspection.
    • stegsolve GUI for visual bit-plane flipping.
  9. Thumbnails & previews

    • exiftool -b -ThumbnailImage image.jpg > thumbnail.jpg
    • exiftool -b -PreviewImage image.jpg > preview.jpg
  10. Visual inspection & processing

    • Open in GIMP; inspect channels, layers, bit planes.
    • Use convert image.jpg -normalize -contrast enhanced.jpg to reveal faint features.
  11. Document everything

    • Save commands, outputs, timestamps, hashes, and extracted artifacts.
    • Keep chain-of-custody notes if needed.
  12. Cleanup / privacy

    • To remove metadata: exiftool -all= -overwrite_original file.jpg
    • Or convert file.jpg -strip cleaned.jpg (creates new file).

3) Notes, tips & safety reminders

  • The script calls many tools that may not be installed by default on all setups. It prints friendly messages telling you which are missing and how to install them.
  • No brute-force password cracking is included. If you want to attempt password recovery, that requires explicit legal permission and careful resource planning (not included here).
  • For PNG steganography, zsteg (Ruby gem) and visual tools are valuable. For JPEG LSBs, stegsolve and stegdetect help.


