HackingJun 14, 202623 min read

Tank: I Built an AI Partner for the First Security Hire

A local-first onboarding and daily companion for the engineer who just walked into a company blind. Everything gets redacted on your machine before a single byte reaches the model. Nothing leaves in cleartext.

Part 1 of 1— Series

Tank: Your AI Security Engineer Companion

Ongoing

01Tank: I Built an AI Partner for the First Security Hire

Honest disclaimer up front: I built Tank as a side project, the same way I built Nyx. For fun, to scratch an itch, and to keep teaching myself to build instead of just break. It works. I use it. But it is a personal project with rough edges, known gaps, and design choices a senior engineer would side-eye. This is not a polished enterprise product. It is a tool that does enough useful things that I wanted to write about it. Take it for what it is. (And like the Nyx post — fair warning ......this one runs long.)

If you read my Nyx Post , you know the shape of these projects by now: a security problem I actually have, a pile of Claude Code sessions, and a tool at the end that turned out better than I expected. Tank is the second one.

Tank dashboard — Tank Dashboard after using it for a few days

Tank's Today home — your daily companion view.

Why This Exists

For the last six months or so I kept seeing the same job posting over and over. Different companies, same headline: "We're hiring our first security engineer." Or first Head of Security. Or first AppSec hire. Or first detection engineer. First, first, first. And every time I read one I thought about what that job actually looks like on day one.

You walk into a company you don't really know. You don't know the architecture. You don't know who owns what. You don't know which of the forty microservices is the one that'll page you at 3 a.m. You don't know the compliance posture, the IAM sprawl, the detection coverage, the threat model. Spoiler alert there is no threat model. That's why they hired you. You are the security program, and right now the security program is a person with a laptop and a Slack handle.

So what do you do? You read. You read architecture docs and Confluence pages, internal wikis, code commits and old incident write-ups and whatever the last person left behind. You sit in meetings nodding while people reference systems you've never heard of. You're trying to build a mental map of an entire company from scratch, fast, while also being expected to do security from week one.

That's the problem I wanted to solve. Not for a security team that already exists - for the one person who is the team, on day one, drinking from a firehose.

You need an assistant. Something you can feed everything — the docs, the repos, the CMDB, the people, the IAM policies, the detection rules and then ask questions. "Who owns the payments service?", "What talks to the PHI database?", "What should I be worried about in my first 90 days?" Something that read everything so you don't have to hold it all in your head at once.

That's Tank.

Why "Tank"?

Quick tangent, because the name matters to me.

Who's on the other end of the line whenever Neo or Morpheus or Trinity called the Nebuchadnezzar? It was Tank. Who ran the training programs — loaded the kung fu, the helicopter piloting, the whole library — into people's heads? Tank. Who sat at the operator's chair holding the entire knowledge base that trained practically everyone on that ship? Tank.

He was always there, at a moment's notice, with the answer. And if he didn't have the answer, he'd get it. He's a deeply underrated character - no powers, no chosen-one prophecy, just the person who knew everything and was always reachable when it mattered.

That's exactly the role I wanted this tool to play. The operator on the other end of the line. The one who read the whole library so you don't have to. The one you call when you're in the thick of it and need an answer now. Naming it Tank is my little homage to the most quietly essential person on the Neb.

(Nyx got named after the Greek goddess of night, the one who illuminates what others can't see. Tank got named after a guy in a chair with a headset and the entire knowledge base. Both fitting, honestly.)

What Tank Is

Tank is a local-first, privacy-preserving onboarding and daily-companion partner for a new senior, staff, or manager-level security engineer. You point it at your new employer's stuff and it becomes the thing that reads all of it.

It does a lot of small things that add up:

Ingests architecture docs, source repos, CMDB exports, people info, Sigma detection rules, IAM policies, and control frameworks — PDFs, Word docs, Markdown, CSV/JSON/YAML, images, even Mermaid diagrams.
Redacts every internal hostname, email, IP, ARN, cloud account, and secret locally before anything reaches the API.
Builds a typed knowledge graph of your org — services, people, teams, controls, detections, IAM policies — with a trust badge on every fact saying where it came from.
Answers questions in chat over that whole knowledge base, with 15 tools it can call to look things up.
Generates 14 kinds of reports — threat landscape, cross-service gaps, a 30/60/90 plan, stakeholder map, control matrix, ATT&CK mapping, IAM audit, risk register, program roadmap, and more.
Hosts living artifacts — versioned threat models that notice when they've drifted, a decisions log, design reviews, postmortems, tabletops, IR runbooks.
Acts as a daily companion — a journal, kanban boards, meeting prep, a Day-1 brief, nudges, post-mortem notes.
Tracks your security program — a dashboard of health metrics, a risk register, a vulnerability triage queue.
Walks you through onboarding — an intake interview, GitHub/team discovery, a stack audit, a 90-day plan generator.

The goal is not to replace your judgment. It's to be the operator's chair. The place where everything you've fed it lives, ready to answer.

The One Rule: Nothing Leaves in Cleartext

This is the mission statement, and it's non-negotiable.

You're about to feed this thing your employer's most sensitive internal documentation. Architecture. IAM. Secrets that got committed where they shouldn't have. The entire point of the tool requires you to hand it the crown jewels. So the headline guarantee has to be airtight: nothing reaches the Anthropic API in cleartext. Ever.

Here's how that actually works.

Every byte that goes to the model passes through a single chokepoint — app/redact/engine.py. One function, apply_redactions(text). There is exactly one door, and everything walks through it. Tank redacts across 11 categories: emails, internal hostnames, private IPs, AWS account IDs, ARNs, GCP projects, Azure subscriptions, secret tokens, and a few more.

The important design decision: Tank redacts at ingest time, not at send time. When a document comes in, it gets chunked, and each chunk is redacted immediately and stored that way. The column that flows out to Claude (chunks.text_redacted) is already safe before it's ever needed. The original (chunks.text_original) stays on your disk, local-only, for your own forensics. This matters because the alternative — redacting right before each API call means that the day you add a new code path that talks to Claude and forget to redact, you've leaked. Redacting at ingest makes the invariant simple and permanent: if it's in the redacted column, it's already safe. Secrets get special treatment. They are one-way SHA-256 hashed. Not encrypted - hashed.

That means even Tank itself cannot rehydrate a secret. If an API key shows up in a doc, Tank stores a hash of it, sends a placeholder to Claude, and there is no path — internal or otherwise — to turn that placeholder back into the original key. By design. (Everything else is reversibly placeholdered, so when Claude says "the service-a host," Tank swaps the real hostname back in locally before showing you the answer.)

There's defense-in-depth, too: even the results of the chat tools get run through apply_redactions again before they go back to Claude, in case something un-redacted slipped into the graph. And the embeddings never leave either. The whole thing is built so that you could run it on an airplane and the only thing that ever crosses the wire is text with every identifier already scrubbed out. That's it. That's the mission. Hand it the crown jewels; the crown jewels never leave the building.

How It Works

The data flow is the privacy contract made literal:

1parse → CHUNKER → REDACT → SQLite → (retrieve + cache) → Anthropic API
2                              ↓ retrieve                    ↑ (redacted)
3                          cached KB block            rehydrate locally

A document comes in. It gets parsed by a per-format parser, chunked (800 tokens with 120 overlap), redacted, and written to SQLite — into a full-text index, a vector index, and the chunk table. Then Tank runs entity extraction over it to grow the knowledge graph.

When you ask a question, Tank embeds your query locally, does a hybrid search over the knowledge base (vector + keyword + reciprocal rank fusion), pulls the relevant chunks and entity cards, assembles a prompt with a cached system block and a cached knowledge block, streams the answer back, runs any tools Claude asks for, and then rehydrates the response — swapping real names back in for the placeholders — before you ever see it.

You see real hostnames. Claude only ever saw placeholders. That gap is the whole product.

The Features

Tank grew into a lot. Here's the tour of a few features, there are many more and each of these is a real screen you can click into.

Ingest and the knowledge graph

You feed Tank documents (drag-drop or by path) and source repos. Documents can be PDF, Word, Markdown, CSV, JSON, YAML, images, or Mermaid .mmd files. Images go through Claude vision to extract what's in the diagram.

Source code is special: no raw source ever goes to Claude. A repo gets walked and summarized into a structured RepoSummary — manifests, Dockerfiles, CI config, CODEOWNERS, the README, and secret-grep hits and only that summary is ever sent. Claude reasons about your codebase without ever seeing a line of it. This is by design. If you want to view the code, open up your preferred IDE clone it and if you want Claude to see it, hook in claude code. :-)

Out of all this, Tank builds a typed knowledge graph: 14 entity types (Service, Person, Team, Org, Asset, CloudAccount, Control, Decision, Detection, AttackTechnique, IAMPolicy, Endpoint, Repo, Runbook) connected by 9 relationship kinds (owns, manages, runs_on, has_control, has_threat, implements, covers, links_to, mentions). Every single entity carries a provenance badge: source (stated in a doc), inferred (Claude concluded it), claim (unverified assertion), or user (you added it).

So you always know whether a "fact" is grounded or a guess.

The knowledge graph. Every node is clickable; every edge came from a document. The graph above was built using the sample data from MedScribe-R-Us.

Chat with 15 tools

The chat is the operator's chair. You ask, Claude answers, and along the way it can call any of 15 tools to look things up in your knowledge base:

Tool	What it does
search_kb	hybrid retrieval over all your docs
get_entity	full card for a service/person/etc.
list_relationships	walk the graph
find_control_gaps	services missing a control
get_threat_model	latest TM for a service
find_decisions / get_recent_decisions	the decisions log
find_detection_for_technique	ATT&CK coverage
find_iam_risks	riskiest IAM policies
find_evidence_for_control	compliance mapping
search_lessons	the lessons-learned DB
get_risk_register	open risks by score
find_ir_runbooks	incident runbooks
…and a few more	document lookup, entity listing

It streams (Server-Sent Events), so you watch it think and call tools in real time. There's a full-page /chat and a persistent, resizable side panel that follows you around the app. And when your knowledge base is still sparse (on day one for example) it shifts into a discovery mode that helps you fill the graph instead of pretending it's full.

Chat with tool use. You can see exactly what it looked up to answer.

Living artifacts

This is the part I'm proudest of. Tank doesn't just generate one-shot reports — it hosts the documents a staff-level security engineer maintains week to week, and keeps them alive:

Threat models — per-service, versioned, and drift-aware. Each model stores a hash over the architecture chunks it was built from. When the underlying docs change, the hash stops matching, and Tank tells you the threat model has drifted. Regenerating doesn't start from scratch — it feeds the prior version back in and asks Claude to mark each old threat still_valid, updated, or invalidated, and add genuinely new ones.
Decisions log — deliberate choices (design choice, accepted risk, deferred fix, security invariant), with expiry dates that nudge you when a decision is about to lapse.
Design reviews — freewrite intake → checklist → approval, which spawns decisions.
Postmortems — freewrite → structured doc → action items become follow-ups, and lessons get auto-extracted.
Tabletops — scenario generator with timed injects and a scoring rubric.
IR runbooks — per-service, per-scenario playbooks grounded in that service's threat model, IAM, and past incidents.

DFD threat modeling

A dedicated three-stage pipeline for data-flow-diagram threat modeling. Feed it a Mermaid diagram, an architecture doc, an image of a whiteboard, or just a plain-language description. It generates the diagram if needed, runs a STRIDE analysis with a live progress tracker, and drops you into a split-panel workspace: diagram on the left, threats on the right. Click a node, the threats filter to it. Each threat gets an ID, a STRIDE category, a severity, a CVSS estimate, a mitigation, and references. Export to annotated Mermaid, JSON, or a print-ready PDF.

The DFD workspace. Click a box, see its threats.

Partner mode — the daily companion

The onboarding stuff matters for week one. Partner mode is what keeps Tank open in a tab six months later:

Journal with a real Markdown editor, plus an evening nudge if you haven't written anything.
Kanban boards for tracking your own work.
Meeting prep — a one-screen brief on who you're meeting, the overlap, the unknowns, and ranked questions to ask. It can auto-generate these overnight for tomorrow's calendar.
Notes — freewrite, and Tank extracts the structured bits (but asks you to confirm before committing anything to the knowledge base).
Anniversary retros at day 30/60/90/180/365: both a generic "how's it going" and a security-focused one.

Security program, risk, and vulnerabilities

A security program dashboard that aggregates health metrics from every table: open risks, vuln age, threat-model drift, compliance coverage, incident counts. These come with a weekly snapshot and an executive brief generator. A full risk register (likelihood × impact, Claude-assisted assessment grounded in your KB, review deadlines). And a soon to be functional vulnerability triage queue — intake, triage, assign, close, or promote-to-risk.

Fun detail: This post is the first in a series regarding Tank. I plan to hook functionality from Nyx into Tank. That way the repos, scanner findings and vuln triage will live in another system but you will get a snapshot in Tank.

The program dashboard — your posture at a glance.

The Stack

Like Nyx, the stack came together by building. I'm a hacker, not a software developer. I leaned on Claude Code hard for the implementation details while I focused on the security logic, which is the part I actually know.

AI: two Anthropic models, strictly split — Claude Sonnet 4.6 for all the reasoning (chat, reports, threat models, design reviews, vision) and Claude Haiku 4.5 for cheap structured extraction (entity extraction, meeting prep, lesson extraction). More on why below.
Backend: FastAPI, Python, server-rendered Jinja templates.
Storage: a single SQLite file (WAL mode), with FTS5 for keyword search and sqlite-vec for vector search.
Embeddings: local sentence-transformers (all-MiniLM-L6-v2). Never remote.
Frontend: server-rendered HTML, a D3 force graph for the knowledge graph, a purple-tinted dark theme (a Nyx-family resemblance, if you squint).

The whole thing boots with ./scripts/start.sh and runs about 53 tests across ~150 routes.

How Tank runs. One SQLite file, two models, everything local.

Design Choices I Made Along the Way

The features are the what. Here's the why. The decisions that shaped the thing, several of which I'd defend and a couple I'd reconsider.

Redact at ingest, not at send time. Covered above, but it's the single most important call in the codebase. It costs a few milliseconds per chunk and it kills an entire category of "oops, we forgot to scrub that one code path" bugs. Permanently.

Secrets are hashed, not encrypted. You genuinely cannot un-hash them, which means you can't be socially-engineered or subpoenaed into making Tank cough up a secret it ingested. It only ever knew the hash. The tradeoff: if you need the original value, you go look at your own source doc. Worth it.

Two models, strict split. Reasoning is hard and ambiguous? That's Sonnet. Entity extraction is a constrained-schema, low-ambiguity grind that runs constantly during bulk ingest? That's Haiku, roughly 10× cheaper. Routing the high-volume extraction work to the cheap model is what makes ingesting a whole company's worth of docs not cost a fortune. When I first started this project, all calls ran through Sonnet and it was $$$.

Prompt caching at two breakpoints. The system prompt (stable for a session) caches for an hour; the retrieved knowledge block (changes every turn) caches for five minutes. And a shared "scope block" is reused across all the report generators — so running six reports in a row pays for the context once and reads it cached five times. Caching is the biggest cost lever at this scale, same as it is in Nyx.

Living artifacts with a drift hash. Instead of regenerating threat models on a timer (wasteful) or never (stale), I hash the source chunks and only flag drift when the architecture actually moved. The artifact survives change and tells you when it's lying to you.

SQLite, one file, single connection. No "real" vector database, no Postgres, no service to operate. One file you can back up by copying it. The cost is that it's firmly single-user — which is fine, because the whole premise is one person, their machine, their data.

Browser print for PDFs. Reports export to PDF via window.print() and CSS, not a heavyweight rendering library. The output is browser-dependent and slightly less pretty, but it added zero dependencies and zero new failure modes. Right call for a personal tool.

What It Cannot Do (Honestly)

Here's the part most tool write-ups skip. Tank has real limits, some are intentional, and some are just unfinished.

Known limitations:

There's no authentication. Tank assumes one user on one machine behind a tunnel. Anyone who can reach the port can use it. No login, no CSRF tokens, no rate limiting. That's a deliberate single-user assumption, not an oversight — but it means do not expose this to the open internet.

It's single-user, structurally. One global SQLite connection behind a lock. Two people hammering it concurrently is not a supported scenario as of right now.

Secrets can't be rehydrated — by design, but worth restating. If Tank ate a secret, the original is gone from Tank's perspective forever.

No raw source code ever reaches Claude — only the structured repo summary. Great for privacy; it does mean Tank reasons about your code at a coarser grain than a tool that reads every line.

No two-way integrations. Tank reads (folders, calendars, CVE feeds, GitHub). It does not write back to Jira, Slack, or anything else. Continuous ingest is read-only on purpose — fewer outbound credentials sitting in a .env is a smaller attack surface.

IR runbook generation blocks the page for 20–60 seconds while it thinks. I know. A background task with a progress stream would be nicer adn eventually I will build this in. This is the first post in a series.

DFD caching is strict. The cache key is a hash of the exact diagram, so changing one character misses the cache and re-runs the analysis. Intentional, I'd rather re-run than show you stale threats but it can feel wasteful.

The vulnerability lifecycle is thin. There's an intake queue and a promote-to-risk path; there isn't a full-blown triage workflow UI. Vendor / third-party risk tracking isn't there at all. This will be changed in the future.

Known bugs and rough edges:

There are bugs. I use this thing and I still trip over them. Several of the newer Claude call sites compile clean but haven't been hammered against every real-world input yet, so expect the occasional odd output on a weird doc. Some UI flows are clunky. Postmortems use plain textareas with no live preview. I fix what I find and keep a running list. If you hit something, I want to know. I have yet to be hired as the 'first security engineer' somewhere so all my testing is theoretical and using sample data.

This is a personal project that got more serious than I intended. That's a good thing. But it carries the marks of one person with opinions building iteratively It is not something that went through formal QA.

Who Built This (and Why That Matters)

Same disclosure as the Nyx post, because it's still true: I am not a software developer by trade. I'm a hacker. Ten-plus years in infosec. I read code, find the holes, and exploit them. Building production software is not my background; I've always lived on the breaking side.

So, like Nyx, I leaned on Claude Code to bridge the gap. I brought the security logic, the framework of Tank, the redaction model, the threat-modeling flow, what a first hire actually needs etc. And I used Claude Code for the FastAPI patterns and implementation details I was learning as I went. This is not a vibe-coded app; I cared about getting the privacy contract right and I built it deliberately. But I'd be lying if I said I wrote every line from a place of deep expertise. I wrote it from a place of knowing the problem cold and figuring out the building part along the way.

That's the whole arc of these projects for me: breaker to builder. Take a problem I actually understand and build the tool I wish existed.

Why I'd Reach For It

Honestly? Because the next time I'm staring at one of those "first security hire" job posts thinking what would my first week even look like — this is the thing I'd open. It turns "read forty documents and hold them all in your head" into "feed it forty documents and ask it questions." It turns a blank threat model into a draft. It turns the firehose into a watering bucket.

That wasn't the original plan. The plan was to mess around with a privacy-preserving RAG pipeline and see if I could make a redaction contract I actually trusted. What I ended up with was an operator's chair. The place where everything you've learned about a company lives, ready to answer when you call.

That felt worth writing about. Oh and did I mention there is a cost/token tracker? :-)

Try It

Github here -> https://github.com/LeSpookyHacker/tank/

Tank is local-first and self-hosted:

1git clone <tank>
2cd tank
3./scripts/start.sh          # creates the venv, installs deps, inits the DB, boots the server

start.sh is idempotent. You can run it as many times as you like. Drop your ANTHROPIC_API_KEY in .env, and there's a sample data pack you can load up.

Sample Data - The Return of MedScribe-R-Us

What is MedScribe-R-Us? Those of you familiar with my blog and github repos know that I have ran a theoretical case study of being the first security engineer at a medical scribe startup. You can find all the blog posts for that project here at the Grimoire or here: MedScribe-R-Us Git Repo

Anyways, I expanded on that functional company and created sample data (ALL FICTIONAL) for me and you to play around with in Tank. You can load to see the whole thing populated:

1python -m scripts._gen_fixtures      # synthesize the sample docs
2python -m scripts.load_fixtures      # ingest them
3python -m scripts.verify_privacy --fixture-pack   # assert nothing leaked

That last command is the one I care about most. It scans the database for any planted identifier that escaped into a redacted column. Exit 0 means the privacy contract held. Run it after ingesting anything real.

Or you can simple copy/paste or drag and drop files into the ingestion pipeline.

There's a healthcheck at /healthz, full docs in docs/, and the complete build history (every phase, every tradeoff) in HISTORY.md.

Final Thoughts

So thats Tank! It cant teach you Kung Fu, or show you how to operate a helicopter. But it can help you excel or at least try to excel as the first security engineer at your new company. It started as a fun project. It stayed fun. That's probably the best thing I can say about it. And who knows? Maybe one day I will be using it professionally.

If you try it and hit bugs, or have feedback, or want to tell me what it's missing from your first-week-on-the-job nightmare. Email me: wanderersgrimoire@gmail.com. I'm genuinely curious how other people would use this.

Github here: -> https://github.com/LeSpookyHacker/tank/

Topics Not Covered in This Post

Tank has a long tail I didn't get into here:

Compliance framework wizard (CIS / NIST / SOC 2 recommendations)
The prioritization engine (top-5 quarterly action plan)
Security policy generation (5 policy kinds)
Org → Team → Project hierarchy and multi-project scoping
Attack-surface ledger with weekly diffs
Sigma / IAM / control-framework ingest as first-class types
ATT&CK technique coverage mapping
The glossary builder and the evolving philosophy doc
Token usage and cost dashboard
The continuous-ingestion connectors (folder / calendar / CVE feed / GitHub)
And other fun easter eggs and functionality!

See the docs for the rest.

Glossary of Acronyms

ARN — Amazon Resource Name
ATT&CK — Adversarial Tactics, Techniques, and Common Knowledge (MITRE)
CMDB — Configuration Management Database
CVSS — Common Vulnerability Scoring System
DAST — Dynamic Application Security Testing
DFD — Data Flow Diagram
EPSS — Exploit Prediction Scoring System
FTS5 — Full-Text Search v5 (SQLite)
IAM — Identity and Access Management
IR — Incident Response
KB — Knowledge Base
PHI — Protected Health Information
RACI — Responsible, Accountable, Consulted, Informed
RAG — Retrieval-Augmented Generation
RRF — Reciprocal Rank Fusion
SAST — Static Application Security Testing
SHA — Secure Hash Algorithm
SLA — Service Level Agreement
SSE — Server-Sent Events
STRIDE — Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege
TM — Threat Model
WAL — Write-Ahead Logging (SQLite)

Tank is an independent personal project. It is not affiliated with Anthropic, GitHub, my employer or any of the tools or frameworks it integrates with. If you build on it, credit me.

Built by Le Spooky Hacker

Tags:General Security AppSec

Bluesky LinkedIn Reddit Hacker News

Join the Grimoire

Get notified when I publish new posts. No spam, unsubscribe anytime.