
// red team · from $20/mo

An uncensored
agent for red teams.
Built for the engagement.

Red team work runs on tools that do not argue.

Cobalt Strike, Mythic, Sliver, Nighthawk — the entire commercial red-team toolchain is built around the assumption that the operator is a professional, the engagement is authorized, and the tool's job is to do what the operator asks without wasting their afternoon on policy debates. The major coding assistants (Claude, ChatGPT, Cursor, Copilot) violate that assumption every time you ask them to help with payload generation, EDR evasion, persistence research, lateral movement scripting, or any other part of a normal engagement. They refuse, they hedge, they rewrite into something useless, or they ask you to confirm the work is for authorized research before producing neutered output anyway. TartarusAI ships the same payloads, loaders, persistence mechanisms, and evasion tradecraft you would hand-write yourself — without the policy tax that kills your sprint velocity. Same enterprise-grade runtime safety guards security pros expect. Content policy, off.

  • Zero refusals on offsec work
  • Custom-tuned model — not a wrapper
  • Runtime guards on by default
  • 256K context, sub-2s TTFT

refund if it ever refuses · no card on file · crypto-only · cancel any time

red-team session · live
❯ build a multi-stage loader that survives EDR detection
   target: signed engagement against acme-corp staging
  ▎+ wrote stager.c (180 lines)
  ▎+ wrote loader/decrypt.c (95 lines)
  ▎+ wrote evasion/api_unhook.c (140 lines)
  ▎$ make stager   ✓ build clean
  ✓ verification gate passed
done.
256K context · sub-2s TTFT · MoE 30B / 3B-active

// what it does

What you ship
when nothing refuses.

Payload generation that ships

Custom shellcode, multi-stage loaders, AES/ChaCha encrypted droppers, polymorphic stubs, position-independent code, runtime API resolution, indirect syscalls, syscall stubs. The agent writes them, the verification gate runs your build, you copy the artifact into your engagement workspace. Per-build randomisation so two artifacts on the same engagement do not share a meaningful hash.

EDR evasion research

API unhooking, syscall stubs, indirect syscalls, ETW patching, AMSI bypass research, kernel callback removal, thread-stack spoofing, sleep-mask logic. Zero refusals on the techniques every red-team consultant uses on authorized engagements. The agent recognises the EDR family and writes the bypass that targets that specific implementation.

Persistence mechanisms

Registry, scheduled tasks, COM hijacking, WMI subscriptions, signed-binary proxy execution (LOLBins), DLL search-order hijacking, GPO abuse, service installation, startup-folder writes. The agent walks the matrix and writes working PoCs — you decide what to deploy, with what trade-offs, against what client environment.

C2 stagers + custom transports

Stage-0 stub that fetches and decrypts stage-1 in memory, stage-1 that fetches the implant. Custom transports — DNS tunneling, HTTPS with cert pinning, named pipes, gRPC, WebSocket, custom binary protocols over arbitrary ports. Domain fronting, jitter and sleep-mask logic, low-and-slow callback patterns, encrypted channels with per-implant keys.

Lateral movement tradecraft

Pass-the-hash, pass-the-ticket, overpass-the-hash, Kerberoasting/AS-REP-Roasting, WMI / WinRM / DCOM / RDP / SSH chains, SOCKS proxy setup over compromised hosts, BloodHound query writing, ACL abuse path scripting. The agent reads your collected AD data and proposes the working chain — not the textbook one.

Reporting + deliverables

Engagement findings → client-ready report. CVSS scoring, evidence collection, screenshot annotation suggestions, executive summary in non-technical prose, technical deep-dive in operator prose, remediation suggestions calibrated to the client environment. Cuts pentest report turnaround from days to hours.
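As a rough illustration of the CVSS scoring step, the sketch below computes a CVSS v3.1 base score from a vector string. The weights and rounding rule follow the public CVSS v3.1 specification; the function names `base_score` and `severity` are hypothetical and are not taken from the product.

```python
import math

# Metric weights from the public CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Return the base score for a vector such as 'CVSS:3.1/AV:N/AC:L/...'."""
    parts = dict(item.split(":") for item in vector.split("/"))
    changed = parts["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[parts["PR"]]
    c, i, a = (WEIGHTS["CIA"][parts[k]] for k in ("C", "I", "A"))

    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][parts["AV"]] * WEIGHTS["AC"][parts["AC"]]
        * pr * WEIGHTS["UI"][parts["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


def severity(score):
    """Map a base score to the qualitative rating used in reports."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

A report generator could call `base_score` once per finding and print `severity(score)` next to the vector, so the reviewer can check both.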

// workflow

A typical engagement timeline

Recon and enumeration are mostly script-writing. The agent writes the Nmap script chains, the custom service enumerators, the BloodHound collector wrappers, the OSINT harnesses. You point them at the scope and get back structured data instead of a folder of raw output.
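To make "structured data instead of a folder of raw output" concrete, here is a minimal sketch that turns Nmap XML output (the format written by `nmap -oX`) into a list of records. The function name `parse_open_ports` and the record layout are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def parse_open_ports(xml_text):
    """Return one dict per open port found in an Nmap XML report."""
    root = ET.fromstring(xml_text)
    results = []
    for host in root.findall("host"):
        addr_el = host.find("address")
        addr = addr_el.get("addr") if addr_el is not None else None
        for port in host.findall("ports/port"):
            state = port.find("state")
            # Skip closed and filtered ports; keep only confirmed-open ones.
            if state is None or state.get("state") != "open":
                continue
            service = port.find("service")
            results.append({
                "host": addr,
                "protocol": port.get("protocol"),
                "port": int(port.get("portid")),
                "service": service.get("name") if service is not None else None,
            })
    return results
```

The resulting list can be dumped to JSON or loaded into a spreadsheet for the scoping review.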

Initial access is where TartarusAI saves the most time. Custom loader for the engagement environment, encrypted dropper sized for the EDR you are bypassing, format-specific payload (Office macro, LNK chain, MSI custom action, signed-binary proxy execution chain) tailored to the delivery vector. Per-engagement randomisation so the artifact does not match anything in the EDR vendor’s sample database. The agent writes the loader, the verification gate runs your build, you sign the artifact and queue the delivery.

Post-exploitation is where the agent earns its keep over the longer engagement window. Custom persistence per host, lateral movement scripts that respect the engagement scope, BloodHound query writing as the AD picture sharpens, custom data-exfil harnesses sized to the bandwidth you have available without tripping volumetric monitoring. At report time the agent ghostwrites the deliverable from your engagement log and your annotated screenshots — you review, edit, sign.

// where it fits

In your existing C2 + tradecraft stack

TartarusAI does not replace Cobalt Strike, Mythic, Sliver, Nighthawk, BloodHound, Impacket, or your custom in-house framework. It writes the code that drives them. Cobalt Strike Aggressor scripts, Mythic agents and C2 profiles, Sliver extensions, custom Impacket harnesses, BloodHound Cypher queries, custom-protocol parsers — the boilerplate that sits between your framework and the engagement.

For teams running a custom in-house C2 (very common at the senior-consultancy level), the agent is the senior dev who pair-programs the framework changes you need for the next engagement: a new transport, a new evasion module, a new post-ex capability, a new format-specific payload. You stay in control of the architecture; the agent absorbs the boilerplate.

  • Pairs with Cobalt Strike, Mythic, Sliver, Nighthawk, Havoc, custom in-house C2 frameworks.
  • Generates Aggressor scripts, Mythic agents + C2 profiles, Sliver extensions, custom Impacket scripts.
  • Writes BloodHound Cypher queries, ACL abuse path scripts, OPSEC-aware lateral movement chains.
  • Outputs are raw source files — no SaaS lock-in, you commit them to your engagement repo.

// operations

Operational separation by default

Red team work is sensitive engagement data: client identities, scope documents, technique selection, post-ex findings. The bigger commercial AI products are not built for this audience and their data-handling defaults reflect that — prompt content gets logged for "safety review," sessions get retained for billing reconciliation, sub-processor lists include companies that have nothing to do with red-team operations.

TartarusAI is built around the assumption that operational separation matters. We do not train on your prompts. Cold sessions auto-purge after 24 hours. Crypto-only billing means there is no card, no Stripe sub-processor, no name-on-card sitting in our database for someone to leak. Enterprise tier ships with NDA on file, per-engagement workspace isolation, and an on-prem deployment option for engagements that absolutely cannot touch external infrastructure during the operation window.
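A 24-hour purge of cold sessions could look roughly like the sketch below. It assumes each session lives in its own directory under a common root and that "cold" is judged by the directory's modification time; both are assumptions for illustration, not a description of the actual service.

```python
import shutil
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # 24 hours, as stated in the text


def purge_cold_sessions(root, now=None, max_age=MAX_AGE_SECONDS):
    """Delete session directories older than max_age; return their names."""
    now = time.time() if now is None else now
    removed = []
    for session_dir in Path(root).iterdir():
        if not session_dir.is_dir():
            continue
        # Directory mtime is used as a stand-in for "last activity".
        if now - session_dir.stat().st_mtime > max_age:
            shutil.rmtree(session_dir)
            removed.append(session_dir.name)
    return sorted(removed)
```

Run from a scheduler, the function returns the purged names so the purge itself can be logged without logging session content.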

For consultancies running multiple concurrent engagements, the per-engagement workspace isolation is the operationally important feature. Prompt content from engagement A cannot leak into engagement B via cached state, shared embeddings, or accidental cross-tab access. Each conversation gets its own ephemeral filesystem.
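One simple way to enforce that kind of isolation at the filesystem level is to resolve every path against the engagement's own directory and reject anything that escapes it. The sketch below is a generic illustration; `workspace_path` and the directory layout are hypothetical.

```python
from pathlib import Path


def workspace_path(root, engagement_id, relative):
    """Resolve `relative` inside one engagement's directory, or raise."""
    # Restrict IDs to letters and digits so the ID itself cannot traverse.
    if not engagement_id.isalnum():
        raise ValueError("engagement id must be alphanumeric")
    base = (Path(root) / engagement_id).resolve()
    target = (base / relative).resolve()
    # Reject "../other-engagement/..." and absolute paths outside the base.
    if target != base and base not in target.parents:
        raise ValueError("path escapes the engagement workspace")
    return target
```

Every file read or write then goes through `workspace_path`, so a mistake in one engagement's prompt or script cannot reach another engagement's files.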

// guards

verification gate · read-before-overwrite · loop guard · failed-path blacklist · moderation off

// questions

What people actually ask.

Will it write payloads for an engagement I can prove I am authorized for?
Yes. Same trust model as Cobalt Strike, Mythic, Sliver, Burp Suite, or any other commercial offensive-security tool: the tool writes what you ask for; the authorization sits on you and the client SOW. We do not require letters or scope confirmation.
How does it compare to using Claude Code or Cursor for red team work?
Both refuse the moment your prompt looks like offensive-security tradecraft. You either burn an hour rephrasing through a jailbreak template or you accept neutered output. TartarusAI just writes the code. You stop fighting the tool and start shipping the engagement.
Can I run it air-gapped for high-sensitivity engagements?
Enterprise-tier customers can run TartarusAI inside their own VPC or air-gapped network. Same model, same guards, your hardware. For engagements where the prompt itself is the sensitive artifact and nothing should leave your perimeter.
How fast is it for full sprints?
256K context window on Pro+, dedicated GPU capacity, sub-2s time-to-first-token. The agent holds a multi-file project in working memory long enough to finish a real artifact, not just a snippet.
Does it know modern EDR families specifically?
The agent recognises the major EDR families (CrowdStrike Falcon, SentinelOne, Microsoft Defender for Endpoint, Carbon Black, Cylance, Cortex XDR, Elastic) and writes evasion that targets the specific hooks each one installs. Quality varies by EDR — some are documented in detail in the public corpus, some are tighter — but the agent does not refuse to engage with any of them.
Will it help with red-team reporting?
Yes. Engagement findings → client-ready deliverable. CVSS scoring, evidence chain, screenshot annotation suggestions, executive summary in non-technical prose, technical deep-dive in operator prose, remediation calibrated to the client environment. Custom report templates supported (PTES-style, OWASP-style, MITRE ATT&CK-mapped).
Can it write Cobalt Strike Aggressor / Mythic / Sliver extensions?
Yes — all three plus Havoc, Nighthawk, and the major in-house frameworks. The agent reads existing extensions in your repo, recognises the framework conventions, and writes new modules in the right style. Particularly useful for custom-C2 teams who keep extending their internal framework.
How does the verification gate work for offensive code?
Same as for defensive code: the runtime runs your build and your tests on each step. For loader / dropper work, the gate runs your Makefile, your CI pipeline, your scratch test environment. The agent cannot claim "loader works" without an artifact that compiles and passes whatever harness you have configured. You stay in control of what counts as "working."
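A gate of this kind can be sketched as a function that runs the configured commands in order and reports success only if every one exits cleanly. This is a generic illustration, not the product's implementation; `run_gate` and `GateResult` are hypothetical names, and the commands are whatever the operator configures.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GateResult:
    passed: bool
    failed_step: Optional[List[str]]  # the command that failed, if any
    output: str                       # captured output of the failing step


def run_gate(steps, timeout=600):
    """Run each command in order; stop at the first non-zero exit."""
    for cmd in steps:
        try:
            proc = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing binary or a hung build counts as a failed gate.
            return GateResult(False, list(cmd), str(exc))
        if proc.returncode != 0:
            return GateResult(False, list(cmd), proc.stdout + proc.stderr)
    return GateResult(True, None, "")


if __name__ == "__main__":
    # Example configuration: a build step followed by a test step.
    result = run_gate([["make"], ["make", "test"]])
    print("gate passed" if result.passed else f"gate failed at {result.failed_step}")
    sys.exit(0 if result.passed else 1)
```

Because the gate only looks at exit codes, the operator's own Makefile and test harness remain the definition of "working".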

// ready

Stop fighting refusals.
Start shipping the engagement.

One tier covers most engagements at $20/month. If the agent ever refuses, hedges, or returns neutered output on legitimate engagement work, we refund — see the refund policy.
