// case_study/smtbot.md

SMTbot case study.

A 24/7 crypto futures scalper on Bybit V5. The interesting story isn't the strategy; it's the two-day rewrite that replaced a brittle TradingView scraper chain with a native WebSocket pipeline and made the bot ~6700× faster overnight without changing a single trade decision.

Bybit V5 native · ~640 tests · ~24 ms cycle
01 · Architecture

Event-driven, from bar boundary to order placement.

A Bybit V5 WebSocket fans live kline streams into a per-symbol buffer and a closed-bar scheduler. A Python-native Pine v6 emulator computes the indicators in-process — bit-perfect parity with the original TradingView readings, but at ~24 ms instead of ~160 s. The BotRunner cascades three entry paths through a single decision diamond; only TAKE reaches the Order Router. SQLite is the async journal underneath everything.

Figure: SMTbot architecture, four layers. A four-stage pipeline. Stage one ingests Bybit V5 WebSocket kline streams through a WS client into a per-symbol KlineBuffer and a CycleScheduler. Stage two computes indicators with a Python-native Pine v6 emulator (Heikin Ashi, WaveTrend, MFI, EMA200, VWAP) into a MarketState snapshot. Stage three runs a BotRunner that dispatches evaluate_entry across three cascade paths (cross-based, pre-cross, HA reversal) into a TAKE / NO_SETUP / REJECT decision. Stage four executes only on TAKE via an Order Router into Bybit V5 REST, while every cycle writes to a SQLite journal that a FastAPI dashboard reads.

Diagram labels: 01 · Data ingress (1m closed-bar streams, gap-fill on reconnect, event-driven, 25 pairs in parallel); 02 · Indicator layer (in-process Pine v6 emulator, 10/10 bit-perfect diff vs TradingView); 03 · Decision cascade (three entry paths, one decision, single exit doctrine); 04 · Execute + journal (TAKE only, append-only journal, out-of-band read-only dashboard).
02 · Stack

What the bot is made of.

Every component picked for one reason: it has to survive a 24/7 loop without you watching it. No frameworks for show, no abstractions that aren't paying for themselves.

Streams
  • Bybit V5 WebSocket: kline.{1,3,5,15}.SYMBOL subscriptions via pybit
  • BybitWSClient: async subscribe + auto-reconnect, no GUI dependency
  • KlineBuffer: per-symbol deque with gap-fill on reconnect
  • CycleScheduler: 1m closed-bar dispatch + heartbeat; event-driven, never polling
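The buffer and closed-bar logic listed above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: `Bar`, `closed_bars`, `missing_starts`, and `append_closed` are hypothetical names, and the message layout (a `data` list whose items carry `start`, OHLC strings, and a boolean `confirm` that marks a closed candle) is an assumption based on the Bybit V5 public kline topic and should be checked against the current API reference.

```python
from collections import deque
from dataclasses import dataclass

BAR_MS = 60_000  # one 1m bar in milliseconds


@dataclass(frozen=True)
class Bar:
    start_ms: int  # bar open time, epoch milliseconds
    open: float
    high: float
    low: float
    close: float


def closed_bars(msg: dict) -> list[Bar]:
    """Keep only candles flagged as closed (assumed `confirm` field)."""
    return [
        Bar(int(k["start"]), float(k["open"]), float(k["high"]),
            float(k["low"]), float(k["close"]))
        for k in msg.get("data", [])
        if k.get("confirm")
    ]


class KlineBuffer:
    """Per-symbol ring buffer of closed bars (hypothetical sketch)."""

    def __init__(self, maxlen: int = 500) -> None:
        self._bars: dict[str, deque] = {}
        self._maxlen = maxlen

    def missing_starts(self, symbol: str, next_start_ms: int) -> list[int]:
        """Open times of bars skipped between the last stored bar and the next."""
        bars = self._bars.get(symbol)
        if not bars:
            return []
        expected = bars[-1].start_ms + BAR_MS
        return list(range(expected, next_start_ms, BAR_MS))

    def append_closed(self, symbol: str, bar: Bar) -> bool:
        """Store a closed bar; reject duplicates and out-of-order bars."""
        bars = self._bars.setdefault(symbol, deque(maxlen=self._maxlen))
        if bars and bar.start_ms <= bars[-1].start_ms:
            return False
        bars.append(bar)
        return True
```

After a reconnect, a caller would ask `missing_starts` which bar open times were skipped, backfill those from REST (not shown), append them in order, and only then append the live bar. Rejecting duplicates keeps a replayed snapshot from triggering a second cycle for the same bar.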
Brain
  • Python 3.11+: fully async (asyncio), event-driven from bar boundary to order
  • Pine v6 emulator: HA · WaveTrend · MFI · EMA200 · VWAP; bit-perfect parity, in-process
  • evaluate_entry(): cascade dispatch: cross_based → pre_cross → ha_reversal
  • Exit doctrine: position-attached SL, post-only reduce-only TP, idempotent BE lock
  • pydantic · pyyaml: typed config + env overrides, schema validation at boot
  • FastAPI + uvicorn: read-only journal viewer, 5 s poll, live PnL ledger
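To give a flavor of what the in-process emulator has to reproduce, here is a sketch of the standard Heikin Ashi recurrence plus a signed color-streak counter. It is an illustration under assumptions, not the bot's emulator: `Candle`, `heikin_ashi`, and `ha_streak` are hypothetical names, and seeding the first HA open with (open + close) / 2 is the textbook convention, which may or may not match the original chart script.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float


def heikin_ashi(candles: list[Candle]) -> list[Candle]:
    """Standard Heikin Ashi transform over closed bars."""
    out: list[Candle] = []
    for c in candles:
        ha_close = (c.open + c.high + c.low + c.close) / 4
        if out:
            # HA open is the midpoint of the previous HA body.
            ha_open = (out[-1].open + out[-1].close) / 2
        else:
            # Assumed seed for the very first bar.
            ha_open = (c.open + c.close) / 2
        out.append(Candle(
            ha_open,
            max(c.high, ha_open, ha_close),
            min(c.low, ha_open, ha_close),
            ha_close,
        ))
    return out


def ha_streak(ha: list[Candle]) -> int:
    """Signed run length of the latest color: +n green, -n red, 0 if none."""
    streak = 0
    for c in reversed(ha):
        color = (c.close > c.open) - (c.close < c.open)  # +1, -1, or 0
        if color == 0 or (streak and (color > 0) != (streak > 0)):
            break
        streak += color
    return streak
```

The streak counter is the kind of place where an off-by-one hides: whether a doji ends the run, and whether the current bar counts, are both decisions that must match the reference exactly.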
Execution
  • Bybit V5 REST: futures trading via pybit (V5 endpoints, hedge-aware)
  • Order Router: routes only on TAKE; NO_SETUP and REJECT short-circuit before the wire
  • UTA + hedge mode: Unified Trading Account, cross-margin USDT/USDC
  • Limit + SL doctrine: no market entries, ever; risk bounded by the stop, maker rebate by design
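One way to picture the routing rule and the limit + SL doctrine together is a function that builds an order payload only for TAKE. This is a hedged sketch: `build_entry_order` is a hypothetical helper, and the field names (`category`, `orderType`, `timeInForce`, `positionIdx`, `stopLoss`) follow the Bybit V5 create-order request as I understand it, so they should be verified against the current API reference before use.

```python
from decimal import Decimal
from typing import Optional


def build_entry_order(
    decision: str,
    symbol: str,
    side: str,  # "Buy" or "Sell"
    qty: Decimal,
    price: Decimal,
    stop_loss: Decimal,
) -> Optional[dict]:
    """Return an order payload for TAKE, or None so nothing reaches the wire."""
    if decision != "TAKE":
        return None  # NO_SETUP and REJECT stop here
    return {
        "category": "linear",        # USDT perpetuals (assumption)
        "symbol": symbol,
        "side": side,
        "orderType": "Limit",        # no market entries
        "qty": str(qty),
        "price": str(price),
        "timeInForce": "PostOnly",   # rest in the book as maker or be cancelled
        "positionIdx": 1 if side == "Buy" else 2,  # hedge-mode side index
        "stopLoss": str(stop_loss),  # stop travels with the order
    }
```

Returning `None` for anything but TAKE keeps the gate in one place: the caller sends a payload only if it exists, for example by passing it as keyword arguments to the REST client's order call.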
Tuning & Tests
  • aiosqlite: append-only async journal; trades, decision log, position snapshots
  • pytest: ~640 cases pinning strategy, data pipeline, journal, execution
  • Optuna (TPE + CMA-ES): two-stage walk-forward tune; TPE for wide search, CMA-ES for refinement
  • Claude Code: pair-programming AI; codes, audits diffs (decisions stay in Python)
  • stable-baselines3: RL roadmap (Phase 6); current bot is rule-based VMC
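The append-only journal from the list above can be enforced at the schema level rather than by convention. The sketch below uses the standard-library `sqlite3` module synchronously for brevity, where the bot uses aiosqlite; the table name, columns, and helper functions are all hypothetical.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS decision_log (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    ts_ms    INTEGER NOT NULL,
    symbol   TEXT NOT NULL,
    decision TEXT NOT NULL CHECK (decision IN ('TAKE', 'NO_SETUP', 'REJECT')),
    detail   TEXT NOT NULL
);
CREATE TRIGGER IF NOT EXISTS decision_log_no_update
BEFORE UPDATE ON decision_log
BEGIN SELECT RAISE(ABORT, 'decision_log is append-only'); END;
CREATE TRIGGER IF NOT EXISTS decision_log_no_delete
BEFORE DELETE ON decision_log
BEGIN SELECT RAISE(ABORT, 'decision_log is append-only'); END;
"""


def open_journal(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the journal and install the append-only triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def log_decision(conn: sqlite3.Connection, symbol: str,
                 decision: str, detail: dict) -> int:
    """Insert one decision row and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO decision_log (ts_ms, symbol, decision, detail) "
            "VALUES (?, ?, ?, ?)",
            (int(time.time() * 1000), symbol, decision,
             json.dumps(detail, sort_keys=True)),
        )
    return cur.lastrowid
```

With the triggers in place, any UPDATE or DELETE raises an error inside SQLite itself, so a read-only dashboard and the writer can share the file without trusting each other's code.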
03 · Problems

Three hard problems.

P1

The cutover — and what survives it.

The first data layer was a chain of indirection: TradingView Desktop → Electron CDP → Node.js MCP daemon → Python bridge → cell-by-cell signal-table parser. It worked, but pinned the bot to a Windows machine with TV open, serial-swept 15 symbols in ~160 seconds, and forced the strategy to think in 5-minute cadence. A two-day rewrite replaced the entire chain with a Bybit V5 WebSocket + Python-native Pine v6 emulator. The hard part wasn't speed — it was parity. A single off-by-one in the HA streak counter or a wrong sign in WaveTrend would silently change every entry decision. A diagnostic diffed every signal cell between the old and new pipelines: 10/10 bit-perfect on the first clean build. Post-cutover backtest cohort matched the pre-cutover cohort within ±0.01R per trade. No regression.
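A parity diagnostic of the kind described here can be as small as a dictionary diff. The following is a sketch under assumptions, not the actual tool: `diff_signal_cells` is a hypothetical name, cells are assumed to be keyed by (symbol, cell name), and floats are compared on their raw IEEE 754 bytes so that "bit-perfect" means exactly that (NaN matches NaN, while 0.0 and -0.0 do not match).

```python
import struct


def _bits(x: float) -> bytes:
    """Raw 8-byte IEEE 754 representation of a float."""
    return struct.pack("<d", x)


def diff_signal_cells(old: dict, new: dict) -> list:
    """Return the sorted keys whose cells differ between two pipelines."""
    mismatches = []
    for key in sorted(old.keys() | new.keys()):
        if key not in old or key not in new:
            mismatches.append(key)  # a cell missing on one side is a diff
            continue
        a, b = old[key], new[key]
        if isinstance(a, float) and isinstance(b, float):
            same = _bits(a) == _bits(b)
        else:
            # Require matching types so 3 and 3.0 are not silently equal.
            same = type(a) is type(b) and a == b
        if not same:
            mismatches.append(key)
    return mismatches
```

An empty result on every sampled bar is the fact that a "10/10" claim rests on; any non-empty result names the exact cell to investigate.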

P2

Three cascade entry paths, one scoring engine.

The entry is a mean-reversion trade that fires at the moment a trend exhausts: WaveTrend cross + Heikin Ashi color flip + a multi-timeframe soft factor stack. If the primary path (cross_based) doesn't fire, the bot falls through to a slope-based pre_cross detector, then to a fast ha_reversal detector. Each path feeds the same scoring engine against direction-aligned signals on the 3m, 5m, and 15m timeframes plus a BTC/ETH composite bias. The cascade short-circuits on the first TAKE: no double-counting, no path-fighting.
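The short-circuit behavior can be sketched as an ordered loop over path functions. Everything here is illustrative: `cascade_entry`, `Verdict`, and the path signature are hypothetical, and the rule for telling REJECT from NO_SETUP when no path takes (REJECT if any path rejected) is an assumption, since the text does not spell it out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Sequence


class Decision(Enum):
    TAKE = "TAKE"
    NO_SETUP = "NO_SETUP"
    REJECT = "REJECT"


@dataclass(frozen=True)
class Verdict:
    decision: Decision
    path: Optional[str] = None  # which path produced a TAKE, if any


PathFn = Callable[[dict], Decision]


def cascade_entry(state: dict, paths: Sequence[tuple[str, PathFn]]) -> Verdict:
    """Try each path in order and stop at the first TAKE."""
    saw_reject = False
    for name, evaluate in paths:
        decision = evaluate(state)
        if decision is Decision.TAKE:
            return Verdict(Decision.TAKE, name)  # later paths never run
        saw_reject = saw_reject or decision is Decision.REJECT
    # Assumed tie-break: a rejected setup outranks "nothing seen".
    return Verdict(Decision.REJECT if saw_reject else Decision.NO_SETUP)
```

Because later paths are never evaluated after a TAKE, one bar can produce at most one entry per symbol, which is what rules out double-counting.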

P3

A single exit doctrine.

One position, three exits, no Python timers. Position-attached SL lives on Bybit, not in the bot — it fires even if the process is dead. Post-only reduce-only TP limit sits in the book, so a fill collects the maker rebate instead of paying taker. An idempotent break-even lock moves the stop to entry the moment unrealized P&L crosses a threshold — and idempotent means: if the lock has already moved the stop, calling it again is a no-op. That property survives reconnects, crashes, clock skew, and the 3am cycle that overlaps a fill.
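The idempotence property falls out naturally if the lock decides from the stop currently attached to the position instead of from a local "already done" flag. A minimal sketch, with hypothetical names (`Position`, `break_even_target`) and an assumed threshold expressed in multiples of initial risk (R):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Position:
    side: str     # "Buy" for long, "Sell" for short
    entry: float  # average entry price
    stop: float   # stop price currently attached on the exchange


def break_even_target(pos: Position, mark: float,
                      trigger_r: float = 1.0) -> Optional[float]:
    """Return the new stop price if the lock should fire, else None."""
    is_long = pos.side == "Buy"
    already_locked = pos.stop >= pos.entry if is_long else pos.stop <= pos.entry
    if already_locked:
        return None  # repeat calls are a no-op
    risk = abs(pos.entry - pos.stop)  # non-zero, since the stop is not at entry
    gain = (mark - pos.entry) if is_long else (pos.entry - mark)
    return pos.entry if gain / risk >= trigger_r else None
```

Because the only input that changes the answer is state held by the exchange, a restarted process that re-reads the position reaches the same conclusion as the one that crashed.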

04 · What I learned

Three lines I'd write on a Post-it.

  • Latency is architecture, not optimization. The ~6700× speedup (~160 s down to ~24 ms) didn't come from tuning the chain; it came from replacing it. When the gap is more than three orders of magnitude, the answer is rarely a faster loop.
  • Parity is a first-class requirement when you port a chart engine. A 10/10 bit-perfect signal diff is what made the post-cutover backtest cohort credible. Without it, every divergence is a discussion instead of a fact.
  • Publish the architecture, withhold the parameters. The tuned weights — RR multiples, soft factor weights, per-symbol risk — are the edge. Publishing them turns a private signal into a crowded one. They live in the private repo, and they stay there.