Private keys accidentally (or intentionally) left lying around in production are one of the most common and dangerous mistakes engineering and development teams can make. This post walks through realistic ways developers and engineers might hide or store private keys and, more importantly, practical detection techniques you can use to find them. Use this as a defensive reference to build host scans, EDR rules, and repo checks.
How people typically hide private keys (high level)
- Plaintext files with nonstandard names/paths: keys saved with innocuous names like config.bin, data.dat, backup.bak, .env.local, or log.old, or hidden inside dotfolders (.cache/, .local/) and odd application directories.
- Renamed / extension-changed files: keys renamed to .jpg, .dat, .log, or other benign-looking extensions so naïve filename checks miss them.
- Embedded in source code / configuration files: keys placed inline inside JSON, YAML, TOML, JS, Python, or shell files, often split across multiple lines or broken into variables to avoid simple pattern matches.
- Base64 / alternative encodings: key material encoded as base64 (or otherwise transformed) and stored without the usual BEGIN/END headers so simple string searches don't flag it.
- Stored inside archives or package files: keys placed inside zip or tar archives or vendor packages, sometimes added to release artifacts or build outputs (see the archive-scanning sketch after this list).
- Packed into binaries / compiled resources: key material compiled or embedded into executables and resource sections so it never appears as a standalone text file.
- Hidden in Git history: keys committed and then "deleted" still exist in the repository's object history, tags, or reflogs.
- Stored in environment variables / process memory: keys placed into environment variables, systemd unit files, or command-line arguments are visible in /proc/*/environ, process listings, or memory scans.
- Alternate filesystems / slack space / ADS (platform-specific): keys stored in obscure filesystem locations, slack space, alternate data streams (Windows), or on unexpected mounts to avoid casual inspection.
- Split across many files: a single key is split into multiple small fragments (e.g., part.aa, part.ab) so short-file scans don't trigger.
Detection artifacts & heuristics: what to look for
Below are robust indicators you can code into scanners, EDR rules, and host checks. Combine multiple signals (headers, entropy, file paths) to reduce false positives.
1. Literal PEM / OpenSSH headers
Look for the canonical markers:
-----BEGIN OPENSSH PRIVATE KEY-----
-----BEGIN RSA PRIVATE KEY-----
-----BEGIN EC PRIVATE KEY-----
-----BEGIN PRIVATE KEY-----
Also look for the ASCII marker openssh-key-v1, which may appear in binaries or hex dumps.
2. Long base64-like blocks
Detect long continuous base64 strings, which often indicate encoded keys or key blobs. Example pattern:
[A-Za-z0-9+/]{120,}={0,2}
(Adjust the threshold to your environment; 120–200 characters is a common starting point.)
3. High-entropy blobs
Private keys (or large encrypted blobs) exhibit high entropy. Measuring Shannon entropy on candidate strings/blobs is a useful second-stage filter once you find base64-like runs.
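To make the base64 and entropy checks concrete, here is a minimal sketch in plain Python; the length and entropy thresholds are assumptions to tune, and candidate.conf is a hypothetical input file. As an extra signal, it also decodes each candidate run: decoded OpenSSH keys start with the openssh-key-v1 magic, and larger DER-encoded keys (e.g. RSA in PKCS#1/PKCS#8 form) typically begin with an ASN.1 SEQUENCE header (0x30 0x82).

import base64
import math
import re

# Length threshold is an assumption; tune it for your environment.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{120,}={0,2}")

def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    return -sum((n / len(s)) * math.log2(n / len(s))
                for n in (s.count(c) for c in set(s)))

def classify(run: str, entropy_threshold: float = 4.5):
    """Label a base64-like run, or return None if it looks benign."""
    ent = shannon_entropy(run)
    if ent < entropy_threshold:
        return None
    stripped = run.rstrip("=")
    try:
        decoded = base64.b64decode(stripped + "=" * (-len(stripped) % 4))
    except Exception:
        return f"high-entropy run (entropy {ent:.2f}), not decodable"
    # Decoded OpenSSH keys start with the 'openssh-key-v1' magic;
    # larger DER key bodies typically start with SEQUENCE + 2-byte length (0x30 0x82).
    if decoded.startswith(b"openssh-key-v1"):
        return "decodes to an OpenSSH private-key structure"
    if decoded[:2] == b"\x30\x82":
        return f"high-entropy run decoding to DER-like data (entropy {ent:.2f})"
    return f"high-entropy base64 run (entropy {ent:.2f})"

# Example: scan a single file (path is hypothetical)
text = open("candidate.conf", errors="ignore").read()
for m in BASE64_RUN.finditer(text):
    label = classify(m.group(0))
    if label:
        print(f"offset {m.start()}: {label}")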
4. Suspicious filenames & locations (heuristic)
Common names and locations worth scanning (a small path-heuristic sketch follows this list):
- Filenames: id_rsa, id_ed25519, private, server.key, key.pem, and .ppk files
- Paths: ~/.ssh/, /etc/ssh/, /root/.ssh/, /tmp/, /var/tmp/, application config directories, build contexts
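A minimal sketch of that heuristic in Python, using only the names and locations listed above; the sets are illustrative and should be extended (and allowlisted) for your environment.

from pathlib import Path

# Names, extensions, and locations mirroring the heuristics above; extend as needed.
SUSPICIOUS_NAMES = {"id_rsa", "id_ed25519", "private", "server.key", "key.pem"}
SUSPICIOUS_SUFFIXES = {".pem", ".key", ".ppk"}
SCAN_ROOTS = [Path.home() / ".ssh", Path("/etc/ssh"), Path("/root/.ssh"),
              Path("/tmp"), Path("/var/tmp")]

def suspicious_paths(roots):
    """Yield paths whose name or extension matches the heuristics."""
    for root in roots:
        try:
            if not root.is_dir():
                continue
            for p in root.rglob("*"):
                if p.name in SUSPICIOUS_NAMES or p.suffix.lower() in SUSPICIOUS_SUFFIXES:
                    yield p
        except OSError:
            continue  # unreadable root (e.g. /root/.ssh without privileges)

if __name__ == "__main__":
    for path in suspicious_paths(SCAN_ROOTS):
        print(f"[candidate] {path}")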
5. Git history indicators
Even if a file is removed in the latest commit, it might still exist in .git/objects. Scan history with specialized tools to find committed secrets.
6. Process / environment indicators
Look for environment variables or process arguments containing KEY, PRIVATE, SSH, TOKEN, or long base64 strings. Scan /proc/*/environ and unit files for suspicious content (requires appropriate privileges).
Common detection commands & scripts
Drop these into host scans, scheduled cron jobs, or EDR playbooks. Tweak paths, thresholds, and allowlists for your environment.
- Find literal PEM / OpenSSH headers:
grep -R --line-number -E "-----BEGIN .*PRIVATE KEY-----|openssh-key-v1" /path/to/scan 2>/dev/null
- Find long base64-like strings (requires rg / ripgrep; fall back to grep -P if needed):
rg --no-messages --hidden -U -n '[A-Za-z0-9+/]{200,}={0,2}' /path/to/scan
- Detect openssh-key-v1 in binaries or text via strings:
strings /path/to/file | rg -i "openssh-key-v1" && echo "/path/to/file: OpenSSH signature found"
- Hex-magic check with xxd (fast binary sniff; -p emits a continuous hex stream so the pattern is not split by xxd's column grouping):
xxd -p -l 32 file | grep -i '6f70656e7373682d6b65792d7631' && echo "openssh-key-v1 (hex) found in file"
# or simply: head -c 64 file | strings | rg -i "BEGIN OPENSSH"
- Check running processes for environment variables that look like keys:
for pid in /proc/[0-9]*; do
  envf="$pid/environ"
  [ -r "$envf" ] || continue
  tr '\0' '\n' < "$envf" | rg -n "PRIVATE|SSH|KEY|TOKEN|SECRET" && echo "pid $pid has suspicious env"
done
Git-history scanning
Use purpose-built tools (recommended):
gitleaks detect -s /path/to/repo --report-format json --report-path gitleaks-report.json
trufflehog --regex --entropy=True /path/to/repo
These tools will search both current tree and history for secrets.
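If you cannot install those tools, a rough fallback is to grep the tree of every commit reachable from any ref using stock git commands. A hedged sketch via Python's subprocess follows; it is slow on large histories and only catches plaintext markers, not encoded or split keys.

import subprocess

# Pattern kept simple on purpose; align it with the heuristics above.
PATTERN = r"-----BEGIN .*PRIVATE KEY-----|openssh-key-v1"

def scan_history(repo_path: str):
    """Run git grep over every commit reachable from any ref."""
    revs = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    for rev in revs:
        # git grep exits 1 when nothing matches, so no check=True here.
        hit = subprocess.run(
            ["git", "-C", repo_path, "grep", "-E", "-e", PATTERN, rev],
            capture_output=True, text=True,
        )
        if hit.returncode == 0:
            print(hit.stdout, end="")

if __name__ == "__main__":
    scan_history(".")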
YARA rule (host/EDR)
A compact YARA rule to flag PEM headers, OpenSSH signature, or long base64 runs:
rule PrivateKeyCandidates
{
    meta:
        author = "sec-team"
        purpose = "Detect potential private key material"
    strings:
        $pem1 = "-----BEGIN OPENSSH PRIVATE KEY-----" ascii
        $pem2 = "-----BEGIN RSA PRIVATE KEY-----" ascii
        $pem3 = "-----BEGIN PRIVATE KEY-----" ascii
        $openssh = "openssh-key-v1" ascii
        $long_base64 = /[A-Za-z0-9+\/]{120,}={0,2}/
    condition:
        any of ($pem*) or $openssh or $long_base64
}
Tune the $long_base64 length and combine with entropy checks where your YARA engine supports it.
Python program example
Finally, here is a standalone Python detector. It scans a directory (or single file), looks for PEM/OpenSSH headers, the openssh-key-v1 signature, and long base64-like runs, and applies a Shannon-entropy check to reduce false positives. It outputs findings as JSON lines for easy ingestion by a SIEM, EDR, or CI pipeline. Part of the script below has been generated using AI; review it carefully before using it in your environment.
#!/usr/bin/env python3
"""
detect_keys.py

Detect likely private-key material on disk.

Scans files under a given path for:
- PEM headers (BEGIN ... PRIVATE KEY)
- OpenSSH native signature: 'openssh-key-v1'
- Long base64-like runs (configurable threshold)
- Entropy on candidate base64 runs (configurable threshold)

Outputs JSON lines, each with: path, type(s), evidence, sample, score (entropy or -1).

Usage:
  python3 detect_keys.py /path/to/scan --base64_min 150 --entropy 4.5 --max_size 10485760

Notes:
- Tune --base64_min and --entropy for your environment.
- Default: skip files > 10 MB to avoid scanning large binaries; adjust with --max_size.
- Requires Python 3.8+ (uses pathlib).
"""

import argparse
import json
import math
import re
import sys
from pathlib import Path

# Regex patterns
PEM_HEADER_RE = re.compile(rb"-----BEGIN [A-Z0-9 ]*PRIVATE KEY-----")
OPENSSH_SIG_RE = re.compile(rb"openssh-key-v1")
# base64-like run (no newlines required); length threshold set via CLI.
# Braces are doubled so str.format() only substitutes {min_len}.
BASE64_RUN_RE_TEMPLATE = r"([A-Za-z0-9+/]{{{min_len},}}={{0,2}})"


# Utility functions
def shannon_entropy_bytes(bs: bytes) -> float:
    if not bs:
        return 0.0
    counts = {}
    for b in bs:
        counts[b] = counts.get(b, 0) + 1
    length = len(bs)
    ent = 0.0
    for cnt in counts.values():
        p = cnt / length
        ent -= p * math.log2(p)
    return ent


def find_base64_runs(bs: bytes, min_len: int):
    """Return list of (match_bytes, start, end) for base64-like runs."""
    pattern = re.compile(BASE64_RUN_RE_TEMPLATE.format(min_len=min_len).encode("ascii"))
    return [(m.group(1), m.start(1), m.end(1)) for m in pattern.finditer(bs)]


def scan_file(path: Path, cfg):
    """Scan a single file and return findings (list)."""
    findings = []
    try:
        size = path.stat().st_size
        if cfg.max_size and size > cfg.max_size:
            return findings  # skip very large files
        # Read up to cfg.read_limit bytes (to avoid OOM on huge files),
        # but ensure we cover headers/base64 occurrences near start of file.
        with path.open("rb") as fh:
            raw = fh.read(cfg.read_limit)
    except Exception:
        # unreadable files (permission, device, etc.) - ignore or log externally
        return findings

    # 1) PEM header
    pem_match = PEM_HEADER_RE.search(raw)
    if pem_match:
        findings.append({
            "type": "PEM_HEADER",
            "desc": "PEM header found (BEGIN ... PRIVATE KEY)",
            "sample": pem_match.group(0).decode(errors="ignore"),
            "score": -1,
        })

    # 2) OpenSSH signature
    if OPENSSH_SIG_RE.search(raw):
        findings.append({
            "type": "OPENSSH_SIGNATURE",
            "desc": "OpenSSH native signature 'openssh-key-v1' found",
            "sample": "openssh-key-v1",
            "score": -1,
        })

    # 3) Long base64 runs (candidate)
    base64_runs = find_base64_runs(raw, cfg.base64_min)
    for match_bytes, s, e in base64_runs:
        # compute entropy on the matched bytes (approx: treat match_bytes as raw ASCII)
        ent = shannon_entropy_bytes(match_bytes)
        sample = (match_bytes[:200].decode("ascii", errors="ignore")
                  + ("..." if len(match_bytes) > 200 else ""))
        # Entropy threshold often ~4.0-4.8 for meaningful data; CLI controls it
        if ent >= cfg.entropy_threshold:
            findings.append({
                "type": "BASE64_HIGH_ENTROPY",
                "desc": f"Long base64-like run ({len(match_bytes)} chars) with entropy {ent:.2f}",
                "sample": sample,
                "score": ent,
            })
        elif cfg.report_low_entropy:
            # optional: report long base64s that don't meet entropy threshold as low-priority
            findings.append({
                "type": "BASE64_LOW_ENTROPY",
                "desc": f"Long base64-like run ({len(match_bytes)} chars) but low entropy {ent:.2f}",
                "sample": sample,
                "score": ent,
            })

    return findings


def walk_and_scan(root: Path, cfg):
    """Walk filesystem and scan files. Yields (path, findings) for matches."""
    # optional path whitelist/blacklist logic can be added here
    files_checked = 0
    for p in root.rglob("*"):
        if p.is_symlink() and not cfg.follow_symlinks:
            continue
        if p.is_dir():
            continue
        files_checked += 1
        if cfg.verbose and files_checked % 1000 == 0:
            print(f"[+] scanned {files_checked} files...", file=sys.stderr)
        # skip some likely-noise filetypes by extension (images, videos) optionally
        if cfg.skip_exts and p.suffix.lower() in cfg.skip_exts:
            continue
        findings = scan_file(p, cfg)
        if findings:
            yield p, findings
    if cfg.verbose:
        print(f"[+] done scanning {files_checked} files", file=sys.stderr)


def parse_args():
    ap = argparse.ArgumentParser(description="Detect likely private key material on disk.")
    ap.add_argument("path", type=str, help="Path (file or directory) to scan.")
    ap.add_argument("--base64_min", type=int, default=150,
                    help="Minimum length for base64-like runs to consider (default: 150).")
    ap.add_argument("--entropy", type=float, dest="entropy_threshold", default=4.5,
                    help="Entropy threshold for base64 runs (default: 4.5).")
    ap.add_argument("--read_limit", type=int, default=262144,  # 256 KiB
                    help="How many bytes to read from each file (default: 262144).")
    ap.add_argument("--max_size", type=int, default=10 * 1024 * 1024,
                    help="Skip files larger than this (bytes). Set 0 for no limit. Default 10MB.")
    ap.add_argument("--report_low_entropy", action="store_true",
                    help="Also report long base64 runs that fail entropy threshold (lower priority).")
    ap.add_argument("--follow_symlinks", action="store_true",
                    help="Follow symlinks while walking.")
    ap.add_argument("--skip_exts", nargs="*",
                    default=[".jpg", ".jpeg", ".png", ".gif", ".mp4", ".mkv", ".iso", ".exe", ".dll"],
                    help="Skip files with these extensions (default list). Use an empty string to disable.")
    ap.add_argument("--verbose", action="store_true",
                    help="Verbose progress to stderr.")
    ap.add_argument("--jsonl", action="store_true",
                    help="Output JSON lines instead of the human-friendly format.")
    return ap.parse_args()


def main():
    cfg = parse_args()
    # normalise skip_exts so "--skip_exts ''" really disables the filter
    cfg.skip_exts = {e.lower() for e in (cfg.skip_exts or []) if e}
    root = Path(cfg.path)
    if not root.exists():
        print(f"Path not found: {root}", file=sys.stderr)
        sys.exit(2)

    # support scanning a single file as well as a directory tree
    if root.is_file():
        file_findings = scan_file(root, cfg)
        results = [(root, file_findings)] if file_findings else []
    else:
        results = walk_and_scan(root, cfg)

    for path, findings in results:
        out = {
            "path": str(path),
            "size": path.stat().st_size,
            "findings": findings,
        }
        if cfg.jsonl:
            print(json.dumps(out, ensure_ascii=False))
        else:
            # human-friendly
            print("=" * 80)
            print(f"FILE: {out['path']} (size={out['size']})")
            for f in findings:
                print(f"- {f['type']}: {f['desc']}")
                sample = f.get("sample")
                if sample:
                    print("  sample:", sample)
                score = f.get("score")
                if score is not None:
                    print("  score:", score)
            print()


if __name__ == "__main__":
    main()
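As a small usage follow-up, here is a hypothetical triage helper (the filenames findings.jsonl and triage.py are illustrative) that reads the JSON-lines report produced with --jsonl and prints flagged files with the highest-scoring findings first; header/signature hits (score -1) are ranked above entropy-based hits.

import json
import sys

# Usage (filenames are hypothetical):
#   python3 detect_keys.py /path/to/scan --jsonl > findings.jsonl
#   python3 triage.py findings.jsonl

def load_records(report_path):
    """Yield one dict per JSON line in the detect_keys.py report."""
    with open(report_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)

def main():
    rows = []
    for record in load_records(sys.argv[1]):
        for finding in record["findings"]:
            score = finding.get("score", -1)
            # PEM/OpenSSH hits carry score -1; rank them above entropy-based hits.
            priority = float("inf") if score == -1 else float(score)
            rows.append((priority, record["path"], finding["type"], finding["desc"]))
    for priority, path, ftype, desc in sorted(rows, reverse=True):
        print(f"{path}\t{ftype}\t{desc}")

if __name__ == "__main__":
    main()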