
Back in 2020 I wrote about rate limiting writes to MySQL with ProxySQL. The main idea is still valid, but two things changed enough to write it again. ProxySQL moved on, and the global delay variable that the old post used is now the rough way to do it. We also have a better way to answer a question the original post did not really solve: not only if we should throttle, but which writes to throttle.
This version rewrites everything in Python, targets ProxySQL 3.0.x (the current Stable tier) and MySQL 8.4 LTS, and adds binlog change information so the throttle goes after the table that is really causing the replica lag, instead of guessing.
What a throttle is
In simple words: we hold the queries for a short time so the replicas can catch up. It is similar to Galera flow control, with one important difference. Flow control, when it triggers, stops the writes completely until the nodes catch up. A throttle never stops a write, it only delays it. You pay a bit of write latency and you get a replica that stays close to the primary, which is usually what you want during a traffic spike.
There are several ways to do this. One popular tool is Freno. But ProxySQL can also do it by itself, and that is what we are going to build.
Two mechanisms and why we use the per-rule one
The original post started with mysql-default_query_delay, a global variable that delays every query that passes through ProxySQL. It still exists and still works, it is not deprecated. But it is global by definition: it cannot tell one table from another, and that difference is the whole point of this article.
The more precise mechanism is the per-rule delay column in mysql_query_rules: the number of milliseconds to delay the execution of any query that matches the rule. It is meant for throttling and QoS. We use it for two reasons. First, it is precise, it only affects the queries that the rule matches, so we can target one single table. Second, when a rule sets delay, the global mysql-default_query_delay is ignored for that query, so the two do not fight each other.
The trigger: replica lag in MySQL 8.4

A short note about the names, because this confuses people. In MySQL 8.0.22 the replication status fields were renamed. The SHOW SLAVE STATUS and Seconds_Behind_Master from the 2020 script are now deprecated aliases. On 8.4 you should use SHOW REPLICA STATUS and Seconds_Behind_Source.
Reading it from Python with PyMySQL (the lightest option, no C extension to compile) looks like this:
def get_replica_lag(conn):
"""Return Seconds_Behind_Source as an int, or None when replication is
not reporting a value. NULL is NOT zero."""
with conn.cursor() as cur:
cur.execute("SHOW REPLICA STATUS") # 8.0.22+ / 8.4 wording
row = cur.fetchone()
if not row:
return None
val = row.get("Seconds_Behind_Source")
return int(val) if val is not None else None
Two details matter here, because both can leave a naive throttle stuck. Seconds_Behind_Source is NULL when replication is not really applying, for example the SQL thread is stopped, or the relay log is empty while the receiver thread is down.
That means unknown, not caught up, so we return None and handle that case on purpose. Also, the value compares the SQL (applier) thread against the IO (receiver) thread, so on a slow network it is only an approximation. For a throttle that reacts in seconds it is good enough, just do not read it as an exact transaction-level lag.
Where the 2020 version was guessing
The original script chose which rule to throttle by reading the ProxySQL hit counters and picking the rule with the fewest hits. In that case it was the DELETE rule, because it had the fewest hits at the proxy. The idea was to keep the collateral damage low, which is reasonable. But it uses the wrong signal.
The ProxySQL hit counts tell you how often a query runs as seen by the proxy. They tell you nothing about how much binlog each statement produces, and the binlog volume is exactly what the replica has to replay to catch up. One big multi-row write to a hot table can put more rows into the binlog than thousands of small point reads. If you throttle the rule with the fewest hits, you can be delaying queries that have little to do with the lag, while the real cause keeps running.
What you really want to know is: which table is producing the most change volume in the binlog right now. Then throttle the writes to that table.
Letting the binlog answer
So we need to know, cheaply and often, which table is producing the most binlog change right now. The binlog already has this information, we only need it in a form we can query.
bintrail does that. Its stream command tails the binlog and indexes every row event into a MySQL table, binlog_events, with the schema name, the table name, the event type and the event timestamp. Once that table exists, “which table is hottest” is one GROUP BY:
SELECT schema_name, table_name, COUNT(*) AS c
FROM binlog_events
WHERE event_timestamp >= UTC_TIMESTAMP() - INTERVAL 60 SECOND
GROUP BY schema_name, table_name
ORDER BY c DESC
LIMIT 1;
The index is local and the query is cheap, so the answer is current within seconds, fast enough for a control loop that reacts in seconds. In Python:
def top_table_by_volume(index):
sql = (
"SELECT schema_name, table_name, COUNT(*) AS c "
"FROM binlog_events "
"WHERE event_timestamp >= UTC_TIMESTAMP() - INTERVAL %s SECOND "
"GROUP BY schema_name, table_name "
"ORDER BY c DESC LIMIT 1"
)
index.ping(reconnect=True)
with index.cursor() as cur:
cur.execute(sql, (VOLUME_WINDOW_S,))
row = cur.fetchone()
if not row:
return None
return row[0], row[1]
Three things to be careful about.
First, this returns absolute counts, not a rate. For “which table is hottest” the ranking is all we need, so the raw count is fine; if you want changes per second, divide by your window.
Second, pin the index session to UTC. bintrail stores the binlog event timestamps in UTC, so comparing against UTC_TIMESTAMP() keeps the window correct no matter what timezone the index server runs in. The script does SET time_zone = '+00:00' on connect.
Third, do not turn a query error into “no hot table”. If the index is down or the connection blips, the query raises, and the caller keeps the previous target instead of reading the error as “nothing is hot”. An error that silently becomes “nothing is hot” would stop the throttle while the replica keeps lagging, which is the opposite of what we want.
The lookup is a network round trip to the index, so we do not call it in a tight loop. We poll the replica lag often, because that is cheap and local, and we only re-query the hot table every few seconds.
Applying the delay to the right rule
Now we turn “schema.table is hot” into a ProxySQL action. We keep one dedicated throttle rule and set its delay. There are two mistakes the obvious version makes:
- Do not throttle reads. A plain table-name match also catches
SELECT, and reads do not produce binlog, so delaying them only hurts queries that are not causing the lag. We anchor the match to a write verb. - Do not ignore the schema. If you match only the table name, a table with the same name in another schema gets throttled too. We put the schema in the pattern.
def set_throttle(conn, schema, table, delay_ms):
# ProxySQL allows PCRE (default since v1.4.0) and RE2. We avoid lookbehind
# and consume a separator char right before the table token, so `orders`
# won't match inside `customer_orders` (the char there is `_`, not a separator).
sch = re.escape(schema)
tbl = re.escape(table)
pattern = (
rf"(?is)^\s*(?:INSERT|UPDATE|DELETE|REPLACE)\b"
rf".*[\s,(]`?(?:{sch}`?\.`?)?{tbl}`?\b"
)
with conn.cursor() as cur:
cur.execute(
"""REPLACE INTO mysql_query_rules
(rule_id, active, match_digest, delay, apply)
VALUES (%s, 1, %s, %s, 0)""",
(THROTTLE_RULE_ID, pattern, delay_ms),
)
cur.execute("LOAD MYSQL QUERY RULES TO RUNTIME")
match_digest matches the normalized query digest (the literal values are replaced), using a regular expression. ProxySQL supports both PCRE and RE2 engines, and PCRE is the default since v1.4.0. The schema part is optional in the pattern because many applications send unqualified writes against their default database. The price is that an unqualified write to a same-named table in another schema can still match. If you need stricter isolation, use match_pattern against the raw query.
One ordering detail. ProxySQL evaluates the rules in ascending rule_id, and a rule with apply=1 stops the evaluation for that query. So the throttle rule has to come before any apply=1 routing rule, or your routing rules have to use apply=0, which is how the statement-type setup in the original post was configured. Choose a rule_id with that in mind. And check stats_mysql_query_rules.hits to confirm that the rule is matching real traffic, because a rule that matches nothing throttles nothing while it looks like it is working. That hits counter is eventually consistent, it lags the live count by about one query, so read it over a few seconds and not query by query.
One deployment note: the script talks to the ProxySQL admin interface (port 6032). On ProxySQL 3.0 the default admin account only accepts connections from the ProxySQL host itself, so either run the script on that host, or add a second non-local admin credential (admin_credentials="admin:admin;radmin:radmin") and point the script to that one.
def clear_throttle(conn):
with conn.cursor() as cur:
cur.execute("DELETE FROM mysql_query_rules WHERE rule_id=%s", (THROTTLE_RULE_ID,))
cur.execute("LOAD MYSQL QUERY RULES TO RUNTIME")
Why not throttle the SELECTs? They are most of the traffic
This is the normal question. On almost any real database the reads are much more than the writes, so throttling reads would have the biggest effect. Is that not where the leverage is?
It is the biggest effect on proxy throughput, and no effect at all on what we are fixing. The trigger is the replica apply lag, and that lag depends only on write volume. Reads never go into the binlog, the replica never receives them, and the replica SQL thread never replays them. Throttling reads would slow down the 90% of traffic that has nothing to do with the lag, and the lag would stay the same.
You do not have to trust me. On a primary and replica pair, watch the binlog position and the number of replicated transactions while you send traffic through ProxySQL:

| Traffic through ProxySQL | binlog position | new replicated transactions |
|---|---|---|
| baseline | 6573 | – |
500 × SELECT | 6573 (unchanged) | +0 |
200 × UPDATE | 66773 (+60 KB) | +200 |
Five hundred reads add nothing to the replication stream. Two hundred writes are the whole work the replica has to apply. The reads are not slow replication, they are invisible to it.
This is also why a binlog signal is the right one and the 2020 hit-counter is not. Because the reads dominate, any counter based on query frequency is full of SELECTs and points you to the wrong rule. The binlog_events index records only writes, so a GROUP BY over it cannot be confused by read volume. The fact that most traffic is reads is the reason to use a write-only signal, not a reason against it.
There is one case where reads do matter: a primary that is CPU or IO bound because of read load. But that is a different problem with a different solution, read/write splitting, and a write-volume tool is not the right one for it. For the replica lag, the reads are negligible.
The control loop

The shape is the same as in 2020: watch the lag, apply the delay when it grows, remove it when the replica catches up. What changes is the diagnosis step, plus three robustness points that the bash version could ignore but a long running Python daemon cannot:
- Use hysteresis, not exact zero. Engage at
lag >= 3, release atlag <= 1. If you release only at exactly0, the throttle stays forever on a busy replica that sits at 1-2 seconds. - A flaky dependency must not kill the daemon. Each iteration is wrapped, so an index timeout or a ProxySQL problem is logged and retried, never fatal. A crash would leave the throttle active with nobody watching.
- Do not reload rules that did not change.
LOAD MYSQL QUERY RULES TO RUNTIMErebuilds the whole active rule set under a lock, so we only call it when the hot table really changes.
def tick(proxysql, replica, index, state):
replica.ping(reconnect=True)
lag = get_replica_lag(replica)
if lag is None: # replication broken/unknown, surface it
print("WARN: replica lag is NULL - replication stopped/broken", file=sys.stderr)
if state["throttling"]:
clear_throttle(proxysql)
state.update(throttling=False, hot=None, rule_table=None)
return
print(f"lag={lag}s")
if lag >= LAG_THRESHOLD_S:
now = time.monotonic()
if state["hot"] is None or (now - state["last_hotspot"]) >= HOTSPOT_REFRESH_S:
new_hot = top_table_by_volume(index) # raises on error -> retried
if new_hot is not None:
state["hot"] = new_hot
state["last_hotspot"] = now
if state["hot"]:
schema, table = state["hot"]
if not state["throttling"] or table != state["rule_table"]:
set_throttle(proxysql, schema, table, DELAY_MS)
state.update(throttling=True, rule_table=table)
print(f"throttling {schema}.{table} by {DELAY_MS}ms")
elif state["throttling"] and lag <= CLEAR_THRESHOLD_S:
clear_throttle(proxysql)
state.update(throttling=False, hot=None, rule_table=None)
print(f"lag back to {lag}s, throttle removed")
And the signal handler that removes the throttle on exit, the equivalent of the trap in the bash version, with one rule: never print “cleared” if the clear actually failed. A false success is worse than a clear failure, because the operator leaves thinking the writes are at full speed when they are still delayed.
def shutdown(*_):
try:
clear_throttle(proxysql)
except Exception as exc:
print(f"ERROR: failed to clear throttle - rule {THROTTLE_RULE_ID} may still "
f"be ACTIVE. Remove: DELETE FROM mysql_query_rules WHERE "
f"rule_id={THROTTLE_RULE_ID}; ({exc})", file=sys.stderr)
sys.exit(1)
print("throttle cleared, bye")
sys.exit(0)
The complete script that you can run is here: https://github.com/dbtrail/bintrail/blob/main/scripts/adaptive_throttle.py
Beyond lag
The lag is only one possible trigger. Replaceget_replica_lag with anything you can read from a status variable: Threads_running for contention, the InnoDB checkpoint age for IO pressure. The throttle mechanism does not care what triggered it.
The new freedom is on the other side of the loop. Because the binlog is indexed and queryable, you can split the cause further by adding event_type to the same GROUP BY. The throttle can be as specific as “delay the UPDATEs to this one table” instead of “delay all the writes and hope”.
With the binlog indexed and queryable, the throttle stops guessing and starts aiming.