Building an arbitrage bot
A working end-to-end CEX-CEX spread bot in Python. By the end you'll have a script that polls DataMaxi+ for cross-exchange premiums, decides when a spread is wide enough to act on, and hands off to a (mocked) execution layer.
This is intentionally a teaching example. The data side and the decision logic are real and you can run them today. Order placement requires a venue-specific exchange client (ccxt, python-binance, pybit, etc.), so that part is mocked here to keep the tutorial focused.
What you'll build
A loop that does this every few seconds:
- Poll the premium endpoint for a list of symbols.
- Filter to spreads above a configurable threshold.
- Size the trade against an account-balance cap.
- Submit mock buy/sell orders to the two legs.
- Log the decision with enough context to replay.
Strategy depth (which symbols, what threshold, fees, slippage, hedge sizing) is covered in CEX-CEX spread strategy. This tutorial is the engineering scaffolding around that strategy.
Prerequisites
- API key — see How to get an API key.
- Python 3.10+ (we'll use match and modern typing).
- httpx for HTTP, or swap in the DataMaxi+ Python SDK if you prefer:
pip install httpx
export DTMX_API_KEY="your_api_key_here"
1. The data layer
DataMaxi+ already does the heavy lifting — for any symbol on multiple venues it computes the premium (relative price difference). Hit /api/v1/premium to get a snapshot.
# datafeed.py
import os

import httpx

DTMX_KEY = os.environ["DTMX_API_KEY"]
BASE = "https://api.datamaxiplus.com"

client = httpx.Client(
    base_url=BASE,
    headers={"X-DTMX-APIKEY": DTMX_KEY},
    timeout=5.0,
)

def fetch_premium(symbol: str) -> list[dict]:
    """
    Returns a list of per-(buy, sell) venue rows for `symbol`,
    each with the current premium in basis points.
    """
    r = client.get("/api/v1/premium", params={"symbol": symbol})
    r.raise_for_status()
    return r.json().get("data", [])
The exact response shape is documented in the Premium REST reference. For this tutorial we'll assume rows of:
{
  "symbol": "BTC-USDT",
  "buyExchange": "bybit",
  "sellExchange": "binance",
  "buyPrice": 98410.0,
  "sellPrice": 98612.3,
  "premium": 20.5
}
premium is in basis points. 20.5 bps is 0.205% — sell side is 0.205% above buy side.
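As a sanity check, the premium in the sample row can be recomputed from its two prices. This is a minimal sketch that assumes the premium is defined as (sellPrice - buyPrice) / buyPrice, expressed in basis points. That definition matches the sample row, but confirm it against the Premium REST reference. `premium_bps_from_prices` is a hypothetical helper, not part of DataMaxi+.

```python
def premium_bps_from_prices(buy_price: float, sell_price: float) -> float:
    """Relative price difference of the sell leg over the buy leg, in bps.

    Assumed definition: (sell - buy) / buy * 10_000.
    """
    if buy_price <= 0:
        raise ValueError("buy_price must be positive")
    return (sell_price - buy_price) / buy_price * 10_000


# Sample row from above (only the fields this check needs).
row = {"buyPrice": 98410.0, "sellPrice": 98612.3, "premium": 20.5}
recomputed = premium_bps_from_prices(row["buyPrice"], row["sellPrice"])

# The recomputed value should sit close to the reported premium.
print(f"reported={row['premium']:.1f} bps, recomputed={recomputed:.2f} bps")
```

A check like this is cheap to run on every row and can flag a stale or mismatched price before the decision layer sees it.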
2. The decision layer
Pure functions — no I/O, easy to unit-test.
# strategy.py
from dataclasses import dataclass

@dataclass(frozen=True)
class Opportunity:
    symbol: str
    buy_exchange: str
    sell_exchange: str
    buy_price: float
    sell_price: float
    premium_bps: float

def find_opportunities(
    rows: list[dict],
    threshold_bps: float,
) -> list[Opportunity]:
    out = []
    for row in rows:
        if row["premium"] < threshold_bps:
            continue
        out.append(Opportunity(
            symbol=row["symbol"],
            buy_exchange=row["buyExchange"],
            sell_exchange=row["sellExchange"],
            buy_price=row["buyPrice"],
            sell_price=row["sellPrice"],
            premium_bps=row["premium"],
        ))
    return out

def size_trade(
    opp: Opportunity,
    max_notional_usd: float,
    available_balance_usd: float,
) -> float:
    """
    Return the quote-currency notional to deploy on this opportunity.
    Cap by the smaller of the configured max and current balance.
    """
    return min(max_notional_usd, available_balance_usd)
Your real threshold_bps needs to cover taker fees on both venues plus a slippage buffer plus a margin of profit. A rough rule:
threshold_bps = (taker_bps_buy + taker_bps_sell) * 2 + slippage_bps + min_profit_bps
The * 2 is because you're paying fees on both the entry and the eventual unwind. For two 0.05% (5 bps) venues, fees alone come to (5 + 5) * 2 = 20 bps, so once you add a slippage buffer expect a floor around 25-30 bps before any spread is actually profitable.
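The rough rule can be written as a small pure function, which keeps the threshold next to the fee inputs it depends on. This is a sketch of the formula as stated. The 5 bps slippage and 5 bps profit figures in the example are illustrative assumptions, not recommendations.

```python
def threshold_bps(
    taker_bps_buy: float,
    taker_bps_sell: float,
    slippage_bps: float,
    min_profit_bps: float,
) -> float:
    """Minimum premium (in bps) worth acting on, per the rough rule."""
    # Taker fees are paid on both venues at entry and again at the unwind.
    round_trip_fees = (taker_bps_buy + taker_bps_sell) * 2
    return round_trip_fees + slippage_bps + min_profit_bps


# Two 5 bps venues, with an assumed 5 bps slippage buffer and 5 bps margin.
floor = threshold_bps(5.0, 5.0, slippage_bps=5.0, min_profit_bps=5.0)
print(floor)
```

With these inputs the fees contribute 20 bps and the result is 30 bps, the upper end of the range quoted above.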
3. The execution layer (mocked)
Real execution means signing requests to two exchange APIs. That's venue-specific, requires your own keys, and is outside the scope of DataMaxi+. We mock the interface:
# execution.py
from strategy import Opportunity

class MockBroker:
    def __init__(self, name: str):
        self.name = name

    def buy(self, symbol: str, notional_usd: float, price: float):
        qty = notional_usd / price
        print(f" [{self.name}] BUY {qty:.6f} {symbol} @ {price}")
        return {"status": "filled", "qty": qty, "price": price}

    def sell(self, symbol: str, notional_usd: float, price: float):
        qty = notional_usd / price
        print(f" [{self.name}] SELL {qty:.6f} {symbol} @ {price}")
        return {"status": "filled", "qty": qty, "price": price}

BROKERS = {
    "binance": MockBroker("binance"),
    "bybit": MockBroker("bybit"),
    "okx": MockBroker("okx"),
}

def execute(opp: Opportunity, notional_usd: float):
    buy_br = BROKERS[opp.buy_exchange]
    sell_br = BROKERS[opp.sell_exchange]
    # In real life: place both legs concurrently to minimise leg-out risk.
    buy_br.buy(opp.symbol, notional_usd, opp.buy_price)
    sell_br.sell(opp.symbol, notional_usd, opp.sell_price)
To go live, swap MockBroker for a wrapper around your venue client. Keep the buy / sell interface, and the bot doesn't need to know.
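The comment in execute notes that real legs should go out concurrently. One way to sketch that with standard-library threads is shown here, assuming your brokers expose the same blocking buy / sell methods as MockBroker. `StubBroker` and `execute_concurrently` are hypothetical names for this example, and the abort path for a one-sided fill is left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor


class StubBroker:
    """Stand-in with the same buy/sell interface as MockBroker."""

    def __init__(self, name: str):
        self.name = name

    def buy(self, symbol: str, notional_usd: float, price: float) -> dict:
        return {"venue": self.name, "side": "buy", "status": "filled",
                "qty": notional_usd / price, "price": price}

    def sell(self, symbol: str, notional_usd: float, price: float) -> dict:
        return {"venue": self.name, "side": "sell", "status": "filled",
                "qty": notional_usd / price, "price": price}


def execute_concurrently(buy_br, sell_br, symbol, notional_usd,
                         buy_price, sell_price, timeout_s=5.0):
    """Submit both legs at once and return (buy_result, sell_result).

    Each result is either the broker's dict or the exception that leg
    raised, so the caller can detect a one-sided fill and unwind it.
    """
    results = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        buy_f = pool.submit(buy_br.buy, symbol, notional_usd, buy_price)
        sell_f = pool.submit(sell_br.sell, symbol, notional_usd, sell_price)
        for fut in (buy_f, sell_f):
            try:
                results.append(fut.result(timeout=timeout_s))
            except Exception as exc:  # keep the failure for the caller
                results.append(exc)
    # Note: leaving the `with` block still waits for both threads to end.
    return tuple(results)


buy_res, sell_res = execute_concurrently(
    StubBroker("bybit"), StubBroker("binance"),
    "BTC-USDT", 1_000.0, 98410.0, 98612.3,
)
# True when exactly one leg failed, i.e. the position is unhedged.
legged_out = isinstance(buy_res, Exception) != isinstance(sell_res, Exception)
```

Threads are used here because most exchange clients are blocking. If your venue client is async, the same shape works with asyncio.gather.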
4. The main loop
# bot.py
import time

from datafeed import fetch_premium
from strategy import find_opportunities, size_trade
from execution import execute

SYMBOLS = ["BTC-USDT", "ETH-USDT", "SOL-USDT"]
THRESHOLD_BPS = 30.0
MAX_NOTIONAL_USD = 1_000.0
POLL_INTERVAL_S = 5.0

def get_balance() -> float:
    # Stub. In real life: query each exchange and return min usable.
    return 5_000.0

def tick():
    balance = get_balance()
    for symbol in SYMBOLS:
        rows = fetch_premium(symbol)
        opps = find_opportunities(rows, THRESHOLD_BPS)
        for opp in opps:
            notional = size_trade(opp, MAX_NOTIONAL_USD, balance)
            print(
                f"[{opp.symbol}] {opp.buy_exchange} -> {opp.sell_exchange} "
                f"{opp.premium_bps:.1f}bps notional=${notional:.0f}"
            )
            execute(opp, notional)

def main():
    while True:
        try:
            tick()
        except Exception as e:
            print(f"tick failed: {e!r}")
        time.sleep(POLL_INTERVAL_S)

if __name__ == "__main__":
    main()
Run it:
python bot.py
You'll see a stream of decisions and mock fills. Nothing leaves your machine.
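The loop prints each decision, but replaying a session later is easier with structured records. A minimal sketch follows, assuming one JSON object per line is an acceptable log format. `decision_record`, `append_decision`, and the `decisions.jsonl` path are hypothetical names, and the Opportunity dataclass is repeated only so the snippet runs on its own.

```python
import json
import time
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Opportunity:
    # Same fields as strategy.Opportunity, repeated so this runs standalone.
    symbol: str
    buy_exchange: str
    sell_exchange: str
    buy_price: float
    sell_price: float
    premium_bps: float


def decision_record(opp, notional_usd, threshold_bps, ts=None) -> str:
    """Serialise one decision, with its inputs, as a single JSON line."""
    record = {
        "ts": time.time() if ts is None else ts,
        "threshold_bps": threshold_bps,
        "notional_usd": notional_usd,
        **asdict(opp),
    }
    return json.dumps(record, sort_keys=True)


def append_decision(path: str, line: str) -> None:
    """Append one record to a JSON Lines file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")


# Illustrative values: the sample row's prices with a 15 bps threshold.
opp = Opportunity("BTC-USDT", "bybit", "binance", 98410.0, 98612.3, 20.5)
line = decision_record(opp, 1_000.0, 15.0, ts=1_700_000_000.0)
```

In tick, a call such as append_decision("decisions.jsonl", decision_record(opp, notional, THRESHOLD_BPS)) just before execute would capture the prices, threshold, and size behind each decision.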
5. What you'd add before going live
The scaffold above (roughly 100 lines) is enough to demonstrate the pipeline; it is not enough to put money on. Before you go live:
▸ Fees. Pull each venue's actual taker fee per account tier; raise THRESHOLD_BPS accordingly. DataMaxi+ has /api/v1/trading-fees for the published schedules.
▸ Slippage model. A 30 bps spread at $100 notional may shrink to 5 bps once you're sending $100k orders. Use order-book depth from the orderbook endpoints to estimate fill price.
▸ Inventory tracking. After execution you're long on one venue and short on another. Track positions. Unwind when the spread mean-reverts. Don't double-up on the same pair while a position is open.
▸ Funding rate hedge. If the "sell" leg is a perp, you're paying or receiving funding every interval. Subscribe to funding rates and incorporate into the threshold dynamically.
▸ Reconcile. Periodically cross-check exchange-reported balances with your internal state. Discrepancies catch bugs and missed fills.
▸ Concurrent legs. The mock places legs sequentially. Real execution must place both legs concurrently and have an abort path if one fills and the other doesn't (leg-out risk).
▸ Persistence. A bot that forgets its open positions on restart is a bot that will accidentally close them at a loss. Use a small SQLite file or a Postgres table.
▸ Rate limits. Polling premium every 5 seconds is fine. Polling 100 symbols every 100ms is not. See Rate Limits.
▸ Switch to WebSocket. Once your symbol list grows, replace fetch_premium polling with the premium WS stream. REST polling caps out fast.
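For the slippage item in the list above, one common approach is to walk the order book and compute the average price a given notional would actually fill at. The sketch below assumes the book side has already been reduced to a list of (price, base_qty) pairs with the best price first. That shape is an assumption for illustration, not the documented response of the orderbook endpoints, and both function names are hypothetical.

```python
def estimate_fill_price(
    levels: list[tuple[float, float]],
    notional_usd: float,
) -> float:
    """Average fill price for spending `notional_usd` across `levels`.

    `levels` holds (price, base_qty) pairs, best price first.
    Raises ValueError if the visible book cannot absorb the order.
    """
    if notional_usd <= 0:
        raise ValueError("notional_usd must be positive")
    remaining = notional_usd
    qty_filled = 0.0
    for price, qty in levels:
        # Take as much quote notional as this level offers, or what is left.
        take = min(remaining, price * qty)
        qty_filled += take / price
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 1e-9:
        raise ValueError("not enough visible depth for this notional")
    return notional_usd / qty_filled


def slippage_bps(levels: list[tuple[float, float]], notional_usd: float) -> float:
    """Distance between the average fill and the best price, in bps."""
    best_price = levels[0][0]
    avg_price = estimate_fill_price(levels, notional_usd)
    return abs(avg_price - best_price) / best_price * 10_000


# Toy ask side: 1 unit at 100, then 10 units at 101.
asks = [(100.0, 1.0), (101.0, 10.0)]
small = slippage_bps(asks, 50.0)    # fits inside the first level
large = slippage_bps(asks, 201.0)   # spills into the second level
```

Running the estimate on both legs and subtracting the two results from the quoted premium gives a size-aware spread to compare against THRESHOLD_BPS.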
Next steps
- CEX-CEX spread strategy — the strategy side of the same problem (sizing, risk, expected returns).
- Streaming funding rates — feed dynamic costs into your threshold.
- Python SDK quickstart — swap raw httpx for the typed SDK.