
Building an arbitrage bot

A working end-to-end CEX-CEX spread bot in Python. By the end you'll have a script that polls DataMaxi+ for cross-exchange premiums, decides when a spread is wide enough to act on, and hands off to a (mocked) execution layer.

This is intentionally a teaching example. The data side and the decision logic are real and you can run them today. Order placement requires a venue-specific exchange client (ccxt, python-binance, pybit, etc.), so that part is mocked here to keep the tutorial focused.

What you'll build

A loop that does this every few seconds:

  1. Poll the premium endpoint for a list of symbols.
  2. Filter to spreads above a configurable threshold.
  3. Size the trade against an account-balance cap.
  4. Submit mock buy/sell orders to the two legs.
  5. Log the decision with enough context to replay.

Strategy depth (which symbols, what threshold, fees, slippage, hedge sizing) is covered in CEX-CEX spread strategy. This tutorial is the engineering scaffolding around that strategy.

Prerequisites

  • API key — see How to get an API key.
  • Python 3.10+ (the code uses built-in generic type hints such as list[dict]).
  • httpx for HTTP, or swap in the DataMaxi+ Python SDK if you prefer:
pip install httpx
export DTMX_API_KEY="your_api_key_here"

1. The data layer

DataMaxi+ already does the heavy lifting — for any symbol on multiple venues it computes the premium (relative price difference). Hit /api/v1/premium to get a snapshot.

# datafeed.py
import os
import httpx

DTMX_KEY = os.environ["DTMX_API_KEY"]
BASE = "https://api.datamaxiplus.com"

# One shared client so connections are reused across polls.
client = httpx.Client(
    base_url=BASE,
    headers={"X-DTMX-APIKEY": DTMX_KEY},
    timeout=5.0,
)

def fetch_premium(symbol: str) -> list[dict]:
    """
    Returns a list of per-(buy, sell) venue rows for `symbol`,
    each with the current premium in basis points.
    """
    r = client.get("/api/v1/premium", params={"symbol": symbol})
    r.raise_for_status()
    return r.json().get("data", [])

The exact response shape is documented in the Premium REST reference. For this tutorial we'll assume rows of:

{
  "symbol": "BTC-USDT",
  "buyExchange": "bybit",
  "sellExchange": "binance",
  "buyPrice": 98410.0,
  "sellPrice": 98612.3,
  "premium": 20.5
}

premium is in basis points. 20.5 bps is 0.205% — sell side is 0.205% above buy side.
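To make the unit concrete, here is a small sketch that converts the sample premium into a percentage and into a gross dollar edge on a given notional. The helper names are illustrative, not part of the DataMaxi+ API, and the result is before fees and slippage.

```python
def bps_to_percent(bps: float) -> float:
    """1 basis point = 0.01%."""
    return bps / 100


def gross_edge_usd(notional_usd: float, premium_bps: float) -> float:
    """Gross spread captured on `notional_usd`, ignoring fees and slippage."""
    return notional_usd * premium_bps / 10_000


print(f"{bps_to_percent(20.5):.3f}%")          # 0.205%
print(f"${gross_edge_usd(1_000.0, 20.5):.2f}")  # $2.05
```

A 20.5 bps premium on $1,000 is about two dollars gross, which is why fees decide whether a spread is tradable at all.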

2. The decision layer

Pure functions — no I/O, easy to unit-test.

# strategy.py
from dataclasses import dataclass

@dataclass(frozen=True)
class Opportunity:
    symbol: str
    buy_exchange: str
    sell_exchange: str
    buy_price: float
    sell_price: float
    premium_bps: float

def find_opportunities(
    rows: list[dict],
    threshold_bps: float,
) -> list[Opportunity]:
    out = []
    for row in rows:
        # Skip spreads too narrow to cover costs.
        if row["premium"] < threshold_bps:
            continue
        out.append(Opportunity(
            symbol=row["symbol"],
            buy_exchange=row["buyExchange"],
            sell_exchange=row["sellExchange"],
            buy_price=row["buyPrice"],
            sell_price=row["sellPrice"],
            premium_bps=row["premium"],
        ))
    return out

def size_trade(
    opp: Opportunity,
    max_notional_usd: float,
    available_balance_usd: float,
) -> float:
    """
    Return the quote-currency notional to deploy on this opportunity.
    Cap by the smaller of the configured max and current balance.
    """
    return min(max_notional_usd, available_balance_usd)

Your real threshold_bps needs to cover taker fees on both venues plus a slippage buffer plus a margin of profit. A rough rule:

threshold_bps = (taker_bps_buy + taker_bps_sell) * 2 + slippage_bps + min_profit_bps

The * 2 is because you're paying fees on both the entry and the eventual unwind. For two 0.05% (5 bps) venues, expect a floor around 25-30 bps before any spread is actually profitable.
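The rule of thumb above can be written as a small function. The slippage and profit figures in the example are placeholder assumptions; only the 5 bps taker fees come from the text.

```python
def breakeven_threshold_bps(
    taker_bps_buy: float,
    taker_bps_sell: float,
    slippage_bps: float,
    min_profit_bps: float,
) -> float:
    """Minimum premium (bps) worth acting on under the rough rule above."""
    # Fees are paid on both legs at entry and again at the unwind.
    round_trip_fees_bps = (taker_bps_buy + taker_bps_sell) * 2
    return round_trip_fees_bps + slippage_bps + min_profit_bps


# Two 5 bps venues, with an assumed 5 bps slippage buffer and 5 bps target profit.
print(breakeven_threshold_bps(5.0, 5.0, slippage_bps=5.0, min_profit_bps=5.0))  # 30.0
```

Fees alone account for 20 bps here, so the slippage buffer and profit margin are what move the floor into the 25-30 bps range.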

3. The execution layer (mocked)

Real execution means signing requests to two exchange APIs. That's venue-specific, requires your own keys, and is outside the scope of DataMaxi+. We mock the interface:

# execution.py
from strategy import Opportunity

class MockBroker:
    def __init__(self, name: str):
        self.name = name

    def buy(self, symbol: str, notional_usd: float, price: float):
        qty = notional_usd / price
        print(f" [{self.name}] BUY {qty:.6f} {symbol} @ {price}")
        return {"status": "filled", "qty": qty, "price": price}

    def sell(self, symbol: str, notional_usd: float, price: float):
        qty = notional_usd / price
        print(f" [{self.name}] SELL {qty:.6f} {symbol} @ {price}")
        return {"status": "filled", "qty": qty, "price": price}

# Venue name (as it appears in premium rows) -> broker instance.
BROKERS = {
    "binance": MockBroker("binance"),
    "bybit": MockBroker("bybit"),
    "okx": MockBroker("okx"),
}

def execute(opp: Opportunity, notional_usd: float):
    buy_br = BROKERS[opp.buy_exchange]
    sell_br = BROKERS[opp.sell_exchange]
    # In real life: place both legs concurrently to minimise leg-out risk.
    buy_br.buy(opp.symbol, notional_usd, opp.buy_price)
    sell_br.sell(opp.symbol, notional_usd, opp.sell_price)

To go live, swap MockBroker for a wrapper around your venue client. Keep the buy / sell interface, and the bot doesn't need to know.
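One way to keep that interface explicit is a typing.Protocol plus a thin adapter. This is a sketch only: `VenueBroker`, `FakeClient`, and the `place_order` method are hypothetical names, and you would map `place_order` onto whatever call your real exchange library provides.

```python
from typing import Any, Protocol


class Broker(Protocol):
    """Structural interface the bot relies on: anything with buy() and sell()."""

    def buy(self, symbol: str, notional_usd: float, price: float) -> dict: ...
    def sell(self, symbol: str, notional_usd: float, price: float) -> dict: ...


class VenueBroker:
    """Adapts a venue client to the Broker interface (hypothetical wrapper)."""

    def __init__(self, name: str, client: Any):
        self.name = name
        self.client = client

    def _order(self, side: str, symbol: str, notional_usd: float, price: float) -> dict:
        qty = notional_usd / price
        # place_order is an assumed method name, not a real library call.
        return self.client.place_order(symbol=symbol, side=side, qty=qty, price=price)

    def buy(self, symbol: str, notional_usd: float, price: float) -> dict:
        return self._order("buy", symbol, notional_usd, price)

    def sell(self, symbol: str, notional_usd: float, price: float) -> dict:
        return self._order("sell", symbol, notional_usd, price)


class FakeClient:
    """Stand-in for a real exchange client so the sketch runs offline."""

    def __init__(self) -> None:
        self.orders: list[dict] = []

    def place_order(self, symbol: str, side: str, qty: float, price: float) -> dict:
        order = {"status": "filled", "symbol": symbol, "side": side, "qty": qty, "price": price}
        self.orders.append(order)
        return order


fake = FakeClient()
broker: Broker = VenueBroker("binance", fake)
fill = broker.buy("BTC-USDT", 1_000.0, 100_000.0)
print(fill["side"], fill["qty"])
```

Because the bot only depends on buy and sell, the same adapter shape also makes it easy to test execution code against a fake client.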

4. The main loop

# bot.py
import time
from datafeed import fetch_premium
from strategy import find_opportunities, size_trade
from execution import execute

SYMBOLS = ["BTC-USDT", "ETH-USDT", "SOL-USDT"]
THRESHOLD_BPS = 30.0
MAX_NOTIONAL_USD = 1_000.0
POLL_INTERVAL_S = 5.0

def get_balance() -> float:
    # Stub. In real life: query each exchange and return min usable.
    return 5_000.0

def tick():
    balance = get_balance()
    for symbol in SYMBOLS:
        rows = fetch_premium(symbol)
        opps = find_opportunities(rows, THRESHOLD_BPS)
        for opp in opps:
            notional = size_trade(opp, MAX_NOTIONAL_USD, balance)
            if notional <= 0:
                continue  # balance already committed this tick
            print(
                f"[{opp.symbol}] {opp.buy_exchange} -> {opp.sell_exchange} "
                f"{opp.premium_bps:.1f}bps notional=${notional:.0f}"
            )
            execute(opp, notional)
            balance -= notional  # don't commit the same funds twice in one tick

def main():
    while True:
        try:
            tick()
        except Exception as e:
            # Log and keep polling; one bad tick shouldn't kill the bot.
            print(f"tick failed: {e!r}")
        time.sleep(POLL_INTERVAL_S)

if __name__ == "__main__":
    main()

Run it:

python bot.py

You'll see a stream of decisions and mock fills. Nothing leaves your machine.

5. What you'd add before going live

The scaffold above is enough to demonstrate the pipeline; it is not enough to put money on. Before you go live:

Fees. Pull each venue's actual taker fee per account tier; raise THRESHOLD_BPS accordingly. DataMaxi+ has /api/v1/trading-fees for the published schedules.

Slippage model. A 30 bps spread on $100 notional may shrink to 5 bps once you're sending $100k orders. Use order book depth from the orderbook endpoints to estimate your fill price at the size you intend to trade.

Inventory tracking. After execution you're long on one venue and short on another. Track positions. Unwind when the spread mean-reverts. Don't double-up on the same pair while a position is open.
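A minimal in-memory sketch of the "don't double-up" rule, keyed by symbol and venue pair. `PositionBook` is a hypothetical helper; a real tracker would also store quantities and entry prices per leg.

```python
class PositionBook:
    """Open spread positions keyed by (symbol, buy venue, sell venue)."""

    def __init__(self) -> None:
        self._open: dict[tuple[str, str, str], float] = {}

    def is_open(self, symbol: str, buy_exchange: str, sell_exchange: str) -> bool:
        return (symbol, buy_exchange, sell_exchange) in self._open

    def open(self, symbol: str, buy_exchange: str, sell_exchange: str,
             notional_usd: float) -> None:
        key = (symbol, buy_exchange, sell_exchange)
        if key in self._open:
            raise ValueError(f"position already open for {key}")
        self._open[key] = notional_usd

    def close(self, symbol: str, buy_exchange: str, sell_exchange: str) -> float:
        """Remove the position and return its notional; KeyError if none is open."""
        return self._open.pop((symbol, buy_exchange, sell_exchange))


book = PositionBook()
pair = ("BTC-USDT", "bybit", "binance")
if not book.is_open(*pair):
    book.open(*pair, 1_000.0)
print(book.is_open(*pair))  # True, so a second signal on this pair is skipped
```

In the main loop, the is_open check would sit just before execute, and close would run once the unwind fills.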

Funding rate hedge. If the "sell" leg is a perp, you're paying or receiving funding every interval. Subscribe to funding rates and incorporate them into the threshold dynamically.
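One hedged way to fold funding into the threshold is to treat expected funding over the planned holding period as a cost (or rebate) in bps. The sketch assumes the common convention that a positive funding rate means longs pay shorts, which you should confirm per venue; the function name and the holding-period input are illustrative.

```python
def funding_adjusted_threshold_bps(
    base_threshold_bps: float,
    funding_rate_bps: float,
    expected_intervals: int,
    perp_side: str = "short",
) -> float:
    """Raise the threshold by expected funding paid, or lower it by funding received.

    Assumes positive funding_rate_bps means longs pay shorts.
    """
    total_bps = funding_rate_bps * expected_intervals
    # A short receives positive funding (negative cost); a long pays it.
    cost_bps = -total_bps if perp_side == "short" else total_bps
    return base_threshold_bps + cost_bps


# Short perp, +1 bps per interval, expecting to hold for 3 intervals.
print(funding_adjusted_threshold_bps(30.0, 1.0, 3))   # 27.0
# Same position when funding flips to -1 bps: now the short pays.
print(funding_adjusted_threshold_bps(30.0, -1.0, 3))  # 33.0
```

Funding rates change between intervals, so treat the holding-period estimate as a guess and recompute on every tick.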

Reconcile. Periodically cross-check exchange-reported balances with your internal state. Discrepancies catch bugs and missed fills.
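Reconciliation can start as a plain dictionary diff. In this sketch both inputs are assumed to be asset-to-balance mappings that you build yourself from your internal ledger and from each exchange's balance call; the tolerance is a placeholder.

```python
def reconcile(
    internal: dict[str, float],
    reported: dict[str, float],
    tolerance: float = 1e-8,
) -> dict[str, float]:
    """Return {asset: reported - internal} for every asset that disagrees."""
    diffs: dict[str, float] = {}
    for asset in internal.keys() | reported.keys():
        delta = reported.get(asset, 0.0) - internal.get(asset, 0.0)
        if abs(delta) > tolerance:
            diffs[asset] = delta
    return diffs


internal = {"USDT": 4_000.0, "BTC": 0.01}   # what the bot believes it holds
reported = {"USDT": 4_000.0, "BTC": 0.0}    # what the venue says
print(reconcile(internal, reported))        # {'BTC': -0.01}
```

A non-empty result here would point at a buy the bot recorded as filled that the venue never executed, which is a reason to halt trading until it is explained.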

Concurrent legs. The mock places legs sequentially. Real execution must place both legs concurrently and have an abort path if one fills and the other doesn't (leg-out risk).
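A sketch of concurrent submission with an abort path, using a thread pool from the standard library. The leg callables and the `unwind` callback are hypothetical: in a real bot they would wrap your broker's buy, sell, and position-flattening calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def execute_concurrently(
    buy_leg: Callable[[], dict],
    sell_leg: Callable[[], dict],
    unwind: Callable[[str, dict], None],
) -> dict[str, dict]:
    """Submit both legs at once; call unwind(side, fill) if only one leg filled."""
    results: dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {"buy": pool.submit(buy_leg), "sell": pool.submit(sell_leg)}
        for side, future in futures.items():
            try:
                results[side] = future.result()
            except Exception as exc:  # a rejected or errored order counts as unfilled
                results[side] = {"status": "failed", "error": repr(exc)}
    filled = [side for side, r in results.items() if r.get("status") == "filled"]
    if len(filled) == 1:
        # Leg-out: one side is exposed, so flatten it.
        unwind(filled[0], results[filled[0]])
    return results


def ok_leg() -> dict:
    return {"status": "filled", "qty": 0.01}


def failing_leg() -> dict:
    raise RuntimeError("venue rejected order")


unwound: list[str] = []
results = execute_concurrently(ok_leg, failing_leg, lambda side, fill: unwound.append(side))
print(results["sell"]["status"], unwound)  # failed ['buy']
```

Threads suit blocking exchange clients; with an async client the same shape works with asyncio.gather and return_exceptions=True.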

Persistence. A bot that forgets its open positions on restart is a bot that will accidentally close them at a loss. Use a small SQLite file or a Postgres database.
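A minimal sketch with the standard-library sqlite3 module. The table layout is an assumption for illustration, and the example uses an in-memory database; pass a file path such as "bot.db" to survive restarts.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and create if needed) the positions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS positions (
            symbol        TEXT NOT NULL,
            buy_exchange  TEXT NOT NULL,
            sell_exchange TEXT NOT NULL,
            notional_usd  REAL NOT NULL,
            opened_at     REAL NOT NULL,
            PRIMARY KEY (symbol, buy_exchange, sell_exchange)
        )
        """
    )
    conn.commit()
    return conn


def save_position(conn: sqlite3.Connection, symbol: str, buy_exchange: str,
                  sell_exchange: str, notional_usd: float, opened_at: float) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO positions VALUES (?, ?, ?, ?, ?)",
            (symbol, buy_exchange, sell_exchange, notional_usd, opened_at),
        )


def load_positions(conn: sqlite3.Connection) -> list[tuple]:
    """Call this at startup to restore what the bot had open."""
    return conn.execute(
        "SELECT symbol, buy_exchange, sell_exchange, notional_usd FROM positions"
    ).fetchall()


conn = open_store()
save_position(conn, "BTC-USDT", "bybit", "binance", 1_000.0, 1_700_000_000.0)
print(load_positions(conn))
```

The primary key doubles as a guard: a second insert for the same symbol and venue pair raises sqlite3.IntegrityError instead of silently doubling the position.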

Rate limits. Polling premium every 5 seconds is fine. Polling 100 symbols every 100ms is not. See Rate Limits.
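A client-side throttle keeps the bot under whatever limit applies. The 2 calls per second below is a placeholder, not the actual DataMaxi+ limit, and the injected clock and sleep functions exist only so the sketch can be demonstrated without waiting.

```python
import time
from typing import Callable


class Throttle:
    """Spaces calls at least 1 / max_calls_per_s seconds apart."""

    def __init__(
        self,
        max_calls_per_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / max_calls_per_s
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next call is allowed, then reserve the following slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
        self._next_allowed = max(now, self._next_allowed) + self.min_interval


# Demonstration with a frozen clock and a recording sleep.
sleeps: list[float] = []
throttle = Throttle(2.0, clock=lambda: 100.0, sleep=sleeps.append)
for _ in range(3):
    throttle.wait()  # in the bot: call this right before fetch_premium(symbol)
print(sleeps)  # [0.5, 1.0]
```

With the default clock and sleep, calling wait() before each request is enough; the first call passes immediately and later calls are spaced out.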

Switch to WebSocket. Once your symbol list grows, replace fetch_premium polling with the premium WS stream. REST polling caps out fast.

Next steps