Writeups
Writeups on security research, exploitation technique, and the engineering behind the tools I build.
- Exposing the whole Montoya API: driving Burp Suite from an AI agent INFO Why a thin MCP wrapper around Burp is useless to an autonomous agent, and what it took to expose functionally 100% of the Montoya API, including a reflection bridge into other installed extensions.
- Hypotheses over signatures: what I learned building an autonomous pentest agent INFO Project Triage hunts like a researcher, not a scanner. The hard part was not the 51 tools; it was the 19 reasoning modules and the scaffolding that stops an LLM agent from looping forever.
- Goals, not selectors: building a browser runtime for AI agents INFO Browser automation was built for humans and retrofitted for agents. Building AgentBrowser meant inverting that: an API that speaks goals, a cursor that actually moves, and per-site memory that skips the model on repeat visits.
- Ideas for running big MoE models on small hardware INFO A research exploration into sparse mixture-of-experts inference on consumer GPUs and APUs: mixed-precision experts, low-rank compression, predictive prefetch, and NUMA-aware placement. Design notes, not a benchmark.
- One prompt is not a finding: proving an LLM jailbreak is universal INFO The discipline that separates a lucky prompt from a bounty-grade universal jailbreak: a train/test split on objectives, an independent multi-judge, a binomial significance test on held-out behaviors, and an authorization gate that fails closed.
- A local-first reverse-engineering agent, and the honest limits of one INFO Somnus drives Ghidra, angr, Frida and AFL++ through a small local model to triage binaries, no API keys, no network. It works end-to-end on ret2win. Here is what that proves, and what it very much does not.
- Anatomy of an autonomous bug bounty pipeline INFO BountyHound wraps five security tools as FastAPI job servers behind one MCP entry point. The architecture is simple on purpose, and the discipline that matters is the boundary that keeps an agent from reporting tool output it never verified.
- Verification tiers and provenance for synthetic data INFO AnyData is a closed-loop dataset factory where every example carries how strongly its correctness was verified. Why the tier you can verify against is the real ceiling on quality, and why a model can never grade its own output.
- No step-up is an account-takeover primitive CRITICAL A password or email change that accepts only a bearer token, with no current password and no fresh-auth check, is an ATO primitive on its own. Here is how I test for it and why "you need the token" is a weak defence.
- Where authorization breaks in serverless backends HIGH A generic, target-free field guide to the access-control failures I keep finding in Supabase / edge-function backends, and the two-account method that surfaces them. No program names; this is about the class, not any one bug.
- Building an LSTM scalping signal engine for MetaTrader 5 INFO money-maker predicts whether EUR/USD hits a 2-pip take profit before a 1.5-pip stop. The hard parts are label design, latency into MT5, and accepting that forward testing is the only honest evaluation.
- Business logic is where scanners lose HIGH Why automated tools never find logic bugs, and the way I map a money or quota flow to attack its invariants instead of its inputs.
- Lessons from shelving a computer-use daemon INFO Nerve was a working cross-platform computer-use runtime: a Rust daemon, two SDKs, a real Anthropic Computer Use loop. I stopped active development on the consumer-product framing. Here is what it got right, where computer use is genuinely hard, and why I shelved it anyway.
- Mapping fuzzy intent to nmap flags, locally and safely INFO TNmap is a terminal UI that turns plain English into the right nmap invocation. The interesting part is the retrieval stack that maps fuzzy intent to exact flags without an API, and why it degrades gracefully instead of stalling.
- SSRF to cloud metadata: turning a fetch into IAM credentials CRITICAL How a server-side request that reaches 169.254.169.254 becomes role credentials, why IMDSv2 changes the rules, the bypasses that still work, and how to prove impact read-only without touching anything destructive.
- OAuth redirect_uri allowlists and the state you forgot HIGH redirect_uri allowlist bypasses, the real job of state and PKCE, and why most "open redirect in OAuth" reports get downgraded unless you show token theft.
- Deep RL for forex swing trading, and where the reward function bites back INFO TradingEngine is a PPO agent for H1 swing trades with an LSTM-plus-Transformer feature extractor. The engineering that matters is state and action design, reward shaping that does not blow up, and walk-forward evaluation that refuses to lie to you.
- The origin check that was never there: postMessage as a data-theft primitive MEDIUM How message handlers that skip event.origin validation turn an embedded widget into cross-window data theft or DOM XSS, and how I actually test them.
- The single-packet attack and the races click-twice-fast misses HIGH How HTTP/2 last-byte synchronisation removes network jitter from race testing, and why the real TOCTOU lives at the database isolation level.
- Evolutionary search over a million strategy configs, and the overfitting trap INFO strategy-search evolves trading-strategy configurations with a genetic algorithm over transformer models. The engineering that matters is not the model: it is the representation, the fitness function, and the validation that keeps the search honest.
- Secondary-context attacks: when a public API is a proxy in disguise HIGH A target-free walkthrough of BFF and gateway abuse: a public endpoint silently forwards to an internal service, and path traversal in a URL segment reaches endpoints that were never meant to be public.
- SAML RelayState and the trust you inherit from an IdP HIGH RelayState as an open-redirect and phishing vector through a trusted identity-provider domain, plus the assertion-signing pitfalls that turn a redirect into a full SSO bypass.
- Why resolved is not validated, and the gate I run before I believe a finding INFO A four-layer validation gate for bug bounty findings. The API returning data is not a bug. A program resolving your report is not proof you were right. Here is how I cull false positives before they ever leave my machine.
- From IDOR to proven Critical CRITICAL A read IDOR is a Medium until you demonstrate impact. This covers the discipline that earns the severity: mass enumeration, PII at scale, and cross-tenant write.
- Reflected Origin plus credentials: the CORS combo that hands over the cookie jar HIGH Why reflecting the request Origin together with Access-Control-Allow-Credentials: true defeats the same-origin policy, why null and suffix allowlists fail, and how to prove a cross-origin credentialed read.
- GraphQL authorization and the batching tax: where the schema lies to you HIGH Field-level authz gaps, introspection, alias and batch abuse to defeat rate limits, and nested-query DoS, with the one structural reason GraphQL keeps leaking: object checks belong on resolvers, not the gateway.
- JWT algorithm confusion and the alg field you should never trust HIGH RS256-verified-as-HS256 with the public key as the HMAC secret, the alg=none variants, kid injection, and why correct verification pins the algorithm from JWKS instead of the token.
- A toolkit per target, not a scanner per program INFO Generic scanner templates lose because every target is shaped differently. I build a small, target-shaped toolkit per engagement: probes cut to the detected vendors and seams, output shaped for an agent context window, real out-of-band sinks, and proof of concept that holds up.
- Brute-forcing OTPs when nothing stops you MEDIUM The math on a 4- or 6-digit one-time code with no rate limit, the verification races, the IP-rotation realities, and why self-OTP bypass still matters.
- Finding secrets in client bundles: grep the shipped code, then triage HIGH Harvesting keys from front-end JS, sourcemaps, and committed .env files, and the part that matters more than finding them: knowing which key actually bypasses your security model and which is harmless by design.
- Dangling CNAMEs at scale: certificate transparency plus diffing as a takeover engine HIGH How a forgotten CNAME becomes a claimable subdomain, and how to find them across a wide asset list using CT logs and resolver diffing instead of luck.
- Disposable out-of-band infrastructure on Cloudflare Workers INFO How I build throwaway OOB callback sinks, attacker JWKS hosts, OAuth redirect receivers, and smuggling relays on the edge, and why Workers beat a single VPS for blind-vuln confirmation.