MySQL 9 Binlog: What's Actually Different
I added MySQL 9.x support to MygramDB. MygramDB ingests data from MySQL via GTID-based binlog replication and parses the binary format itself. If an INSERT or UPDATE hits a table with a VECTOR column, the parser encounters an unknown column type and breaks. It had to be dealt with.
I lined up the source code of MySQL 8.4.8 and 9.6.0 and compared every piece of binlog-related code. Event headers, event types, GTID format, ROWS events, TABLE_MAP metadata — I diffed everything.
The binlog has spent over 20 years accumulating backward compatibility. Event headers carry event lengths so unknown events can be skipped. Metadata uses TLV for extensibility. That design held up in 9.x. The format's skeleton is intact.
New data types, though, are a different story. When parsing row data in a ROWS event, the byte layout is determined by the column type. An unknown column type means not knowing how many bytes to advance. You can't even skip it.
What Stayed the Same
The reassuring part first.
| Area | Changed? |
|---|---|
| Event header (19 bytes, fixed) | Identical |
| Checksum (CRC32, 4 bytes) | Identical |
| Binlog version | Still 4 |
| Event type numbers (0–42) | Identical |
| GTID / Tagged GTID wire format | Identical |
| Transaction Payload compression (type 40) | Identical |
| DECIMAL binary encoding | Identical |
| Packed integer | Identical |
Tagged GTID events (type 42) were already introduced in 8.4, and nothing about them changed in 9.x. The skeleton of the binlog stream is completely unchanged. Most existing binlog parsers can read a 9.x stream with zero modifications, as long as no row event touches a table with a VECTOR column.
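The skip-by-length behaviour of the fixed header can be sketched in Python as follows. This is a minimal illustration, not MygramDB's code: `parse_header`, `iter_events`, and the `known_types` parameter are hypothetical names, and the input is assumed to be a buffer of back-to-back version 4 events with the 4-byte binlog file magic already stripped.

```python
import struct
from typing import Iterator, NamedTuple, Optional, Set, Tuple

HEADER_LEN = 19  # fixed-size common header in binlog version 4


class EventHeader(NamedTuple):
    timestamp: int
    type_code: int
    server_id: int
    event_length: int  # total size of the event, header included
    next_position: int
    flags: int


def parse_header(buf: bytes, offset: int = 0) -> EventHeader:
    # Little-endian: u32 timestamp, u8 type, u32 server id,
    # u32 event length, u32 next position, u16 flags = 19 bytes.
    return EventHeader(*struct.unpack_from("<IBIIIH", buf, offset))


def iter_events(
    buf: bytes, known_types: Set[int]
) -> Iterator[Tuple[EventHeader, Optional[bytes]]]:
    """Yield (header, body); body is None for event types we do not handle."""
    offset = 0
    while offset + HEADER_LEN <= len(buf):
        header = parse_header(buf, offset)
        end = offset + header.event_length
        if header.event_length < HEADER_LEN or end > len(buf):
            raise ValueError("corrupt or truncated event")
        body = buf[offset + HEADER_LEN:end]
        yield header, (body if header.type_code in known_types else None)
        # The length field lets us step over events we know nothing about.
        offset = end
```

An unknown event type costs nothing here: the loop advances by `event_length` whether or not the body was understood. Any trailing checksum is left inside `body` in this sketch.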
What Changed
Three changes. Only one of them forces binlog consumers to act.
1. MYSQL_TYPE_VECTOR = 242 — The Only Breaking Change
MySQL 9.0 added the VECTOR type. include/field_types.h now has MYSQL_TYPE_VECTOR = 242. In 8.4.8, code 242 was unassigned.
The background is the explosive rise of RAG. Since LLMs went mainstream, the demand for storing embedding vectors in databases and running similarity searches has surged. PostgreSQL got there first with pgvector (2021 onward) and was becoming the de facto choice for vector search. MySQL 9.0 (July 2024, Innovation Release) added a native VECTOR type to catch up. It's a separate release line from the 8.4 LTS, but any tool that reads 9.x binlog streams has to deal with it.
As a side effect of this competition between databases, the binlog gained a new column type. The good news: VECTOR's binlog encoding is identical to BLOB.
In MySQL's source (sql/field.h), Field_vector inherits from Field_blob. packlength is fixed at 4 (4-byte length prefix). The payload is an array of IEEE 754 float32 values — dimensions × 4 bytes, fixed length.
TABLE_MAP metadata is the same 1 byte as other BLOB types (indicating the length prefix size). ROWS encoding is the same length prefix + variable-length binary.
In practice, add type 242 to the same branch as BLOB (252) in field size calculation, and add VECTOR to the BLOB handling in TABLE_MAP metadata parsing and ROWS decoding. For MygramDB, it was a few extra case statements across three files.
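A minimal sketch of that branch in Python, assuming the TABLE_MAP metadata byte for the column has already been read into `pack_length`. `read_blob_like` and `BLOB_LIKE_TYPES` are hypothetical names chosen for illustration, and a real decoder would handle every other column type it supports alongside these two.

```python
from typing import Tuple

MYSQL_TYPE_VECTOR = 242  # new in MySQL 9.0
MYSQL_TYPE_BLOB = 252

# Types whose row image is "<length prefix><payload>".
BLOB_LIKE_TYPES = {MYSQL_TYPE_BLOB, MYSQL_TYPE_VECTOR}


def read_blob_like(
    row: bytes, offset: int, col_type: int, pack_length: int
) -> Tuple[bytes, int]:
    """Return (payload, next_offset) for a BLOB-encoded column value."""
    if col_type not in BLOB_LIKE_TYPES:
        # Without knowing the type there is no way to tell how far to advance.
        raise ValueError(f"unsupported column type {col_type}")
    if pack_length not in (1, 2, 3, 4):
        raise ValueError(f"invalid length prefix size {pack_length}")
    start = offset + pack_length
    if start > len(row):
        raise ValueError("truncated length prefix")
    # The prefix is a little-endian integer of pack_length bytes.
    length = int.from_bytes(row[offset:start], "little")
    end = start + length
    if end > len(row):
        raise ValueError("truncated column payload")
    return row[start:end], end
```

For a three-dimensional VECTOR value the call consumes 4 prefix bytes plus 12 payload bytes and returns the offset of the next column. Removing 242 from `BLOB_LIKE_TYPES` reproduces the failure described earlier: the decoder stops at the first VECTOR column.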
How mysqlbinlog displays VECTOR
MySQL's mysqlbinlog decodes VECTOR data using float4get() and prints something like [1.00000e+00,2.00000e+00,3.00000e+00] (log_event.cc, lines 2040–2051). Binlog consumers don't need to do the same. Passing the binary data through as-is is often enough. MygramDB converts it to a hex string.
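If a consumer does want the values, the payload can be unpacked in a few lines. The sketch below is illustrative only: the function names are hypothetical, it assumes the float32 values are stored little-endian, and `format_like_mysqlbinlog` only imitates the display format quoted above rather than reproducing mysqlbinlog's code.

```python
import struct
from typing import List


def vector_to_floats(payload: bytes) -> List[float]:
    """Interpret a VECTOR payload as consecutive float32 values."""
    if len(payload) % 4 != 0:
        raise ValueError("VECTOR payload length must be a multiple of 4")
    count = len(payload) // 4
    return list(struct.unpack(f"<{count}f", payload))


def vector_to_hex(payload: bytes) -> str:
    """Pass-through representation: the raw bytes as a hex string."""
    return payload.hex()


def format_like_mysqlbinlog(payload: bytes) -> str:
    """Render the values in scientific notation with five decimals."""
    return "[" + ",".join(f"{v:.5e}" for v in vector_to_floats(payload)) + "]"
```

The hex form is lossless and needs no knowledge of the dimensionality, which is one reason passing the bytes through is usually the simpler choice.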
2. VECTOR_DIMENSIONALITY — A New TABLE_MAP Metadata Field
A new TLV field, VECTOR_DIMENSIONALITY, was added to TABLE_MAP optional metadata (rows_event.h). It stores the dimensionality of VECTOR columns.
This is a "safe to ignore" change. TABLE_MAP optional metadata uses TLV (Type-Length-Value) encoding — unknown types can be skipped by reading the length and advancing past them. This mechanism has been the same since before 8.4. If your TLV skipping is implemented correctly, there's nothing to do.
Only add parsing if you need the dimensionality (for VECTOR column validation, etc.).
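A TLV walker that skips unknown fields might look like the sketch below. It assumes `buf` holds only the optional metadata region of a TABLE_MAP event, that each field is a 1-byte type followed by a packed-integer length and then the value, and that the caller supplies handlers keyed by type code. `read_packed_int` and `parse_optional_metadata` are hypothetical names, and no real type codes are hard-coded.

```python
from typing import Callable, Dict, Tuple


def read_packed_int(buf: bytes, offset: int) -> Tuple[int, int]:
    """Decode a MySQL packed (length-encoded) integer; return (value, next_offset)."""
    first = buf[offset]
    if first < 251:
        return first, offset + 1
    widths = {252: 2, 253: 3, 254: 8}
    if first not in widths:
        raise ValueError(f"invalid packed integer marker {first}")
    end = offset + 1 + widths[first]
    if end > len(buf):
        raise ValueError("truncated packed integer")
    return int.from_bytes(buf[offset + 1:end], "little"), end


def parse_optional_metadata(
    buf: bytes, handlers: Dict[int, Callable[[bytes], object]]
) -> Dict[int, object]:
    """Walk the TLV fields, decoding known types and skipping the rest."""
    parsed: Dict[int, object] = {}
    offset = 0
    while offset < len(buf):
        field_type = buf[offset]
        length, offset = read_packed_int(buf, offset + 1)
        end = offset + length
        if end > len(buf):
            raise ValueError("truncated metadata field")
        handler = handlers.get(field_type)
        if handler is not None:
            parsed[field_type] = handler(buf[offset:end])
        # Known or not, the length tells us where the next field starts.
        offset = end
    return parsed
```

With this shape, a new field such as VECTOR_DIMENSIONALITY is simply a type code with no handler registered, so it is stepped over. Supporting it later means adding one entry to `handlers`.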
3. USE_SQL_FOREIGN_KEY_F — A New Row Event Flag
Bit 4 (1U << 4) was added to ROWS event flags (rows_event.h). It indicates whether SQL-standard foreign key constraint handling is enabled.
| Bit | Name | Meaning |
|---|---|---|
| 0 | STMT_END_F | Last event of a statement |
| 1 | NO_FOREIGN_KEY_CHECKS_F | Foreign key checks disabled |
| 2 | RELAXED_UNIQUE_CHECKS_F | Relaxed unique constraint checks |
| 3 | COMPLETE_ROWS_F | All columns included |
| 4 | USE_SQL_FOREIGN_KEY_F | SQL-standard FK handling (new in 9.x) |
| 5–15 | — | Reserved |
Also safe to ignore. If your implementation ignores unknown flag bits, no action needed. Only consumers that interpret individual row event flags for replication control need to care.
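Ignoring unknown bits comes down to masking with the flags you understand. A minimal sketch, using the bit positions listed above; `decode_row_flags` and `KNOWN_FLAGS` are hypothetical names, and the flags value is assumed to be the 16-bit field from the ROWS event.

```python
from typing import Set, Tuple

STMT_END_F = 1 << 0
NO_FOREIGN_KEY_CHECKS_F = 1 << 1
RELAXED_UNIQUE_CHECKS_F = 1 << 2
COMPLETE_ROWS_F = 1 << 3
USE_SQL_FOREIGN_KEY_F = 1 << 4  # new in 9.x

KNOWN_FLAGS = {
    STMT_END_F: "STMT_END_F",
    NO_FOREIGN_KEY_CHECKS_F: "NO_FOREIGN_KEY_CHECKS_F",
    RELAXED_UNIQUE_CHECKS_F: "RELAXED_UNIQUE_CHECKS_F",
    COMPLETE_ROWS_F: "COMPLETE_ROWS_F",
    USE_SQL_FOREIGN_KEY_F: "USE_SQL_FOREIGN_KEY_F",
}
KNOWN_MASK = sum(KNOWN_FLAGS)  # bits 0 through 4


def decode_row_flags(flags: int) -> Tuple[Set[str], int]:
    """Return (names of known flags that are set, leftover unknown bits)."""
    names = {name for bit, name in KNOWN_FLAGS.items() if flags & bit}
    unknown = flags & ~KNOWN_MASK & 0xFFFF
    return names, unknown
```

The leftover bits are returned rather than rejected, so a consumer can log them if it cares and otherwise carry on. A parser written before 9.x that follows the same pattern sees bit 4 as just another unknown bit.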
What Changed Outside the Binlog
Not binlog format changes per se, but two changes that affect tools implementing binlog replication.
binlog_format=ROW Deprecated
The --binlog-format option is deprecated in MySQL 9.x. Row-based is the only format. There was almost no reason to use anything else in production as of 8.x, so the practical impact is small. But if your Docker Compose or my.cnf explicitly sets --binlog-format=ROW, you'll get a startup warning. Remove it.
mysql_native_password Removed
The server-side mysql_native_password plugin is gone in 9.x. caching_sha2_password is the default and only option.
SSL connections work as before. For non-SSL connections, the client needs to set MYSQL_OPT_GET_SERVER_PUBLIC_KEY to fetch the RSA public key. For MygramDB, I enabled this via libmysqlclient's mysql_options().
mysql_native_password still exists on the client side
Confusingly, libmysqlclient (the client library) still has a mysql_native_password implementation. In 9.6.0, it just moved to libmysql/authentication_native_password/. What was removed is the server-side plugin. Connecting from a client to a pre-9.x server still works.
Summary
The full comparison of MySQL 8.4 and 9.x binlog formats:
| Change | Action required? | Why |
|---|---|---|
| MYSQL_TYPE_VECTOR (242) | Yes | Unknown column types break parsing |
| VECTOR_DIMENSIONALITY metadata | No | TLV skip handles it |
| USE_SQL_FOREIGN_KEY_F flag | No | Unknown bits safely ignored |
| binlog_format=ROW deprecated | Remove the setting | Just a warning |
| mysql_native_password removed | Connection config change | Unrelated to binlog format |
The binlog's backward-compatible design works as expected in 9.x. Unknown events can be skipped. Unknown metadata fields can be skipped. Unknown flag bits can be ignored.
Unknown column types cannot be skipped. Row data in ROWS events has its byte layout determined by column type. An unknown type means not knowing how far to advance. Whether the binlog designers chose not to make this extensible, or whether type additions were too rare to matter, I don't know. Either way, VECTOR handling needs to be added. That its encoding is identical to BLOB is a mercy.