diff --git a/mariadb-query-optimization/SKILL.md b/mariadb-query-optimization/SKILL.md
index 822e485..bf0ca36 100644
--- a/mariadb-query-optimization/SKILL.md
+++ b/mariadb-query-optimization/SKILL.md
@@ -7,291 +7,15 @@ description: "Best practices for query optimization in MariaDB — indexing stra
 
 *Last updated: 2026-05-25*
 
-> **Requires:** MariaDB 10.1+ for `ANALYZE` and histograms; optimizer improvements through **11.8 LTS** (GA May 2025) form the baseline below.
+> **Requires:** MariaDB 10.1+ for `ANALYZE` and histograms.
 >
-> **Default context:** Assume MariaDB **11.8 LTS** unless the user states another version. Features marked **12.x** may be suggested when relevant (including as upgrade options), but always state the minimum version — do not present them as available on 11.8.
+> **Server context:** See [MariaDB Versioning Context](../_shared/versioning.md).
 
-## What LLMs Get Wrong
+## Documentation
 
-| Pattern | What to do instead |
-|---|---|
-| `SELECT * FROM table LIMIT 10 OFFSET 50000` | Use cursor-based pagination — `OFFSET` scans all skipped rows |
-| Blanket rule "functions on indexed columns kill indexes" | Outdated on MariaDB 11.1+/11.3+ for many cases. `YEAR(col) = const` and `UPPER(col) = const` on case-insensitive columns can now use indexes — see [Functions on indexed columns](#functions-on-indexed-columns) below |
-| Adding an index to a low-cardinality column (boolean, status with 2-3 values) | Optimizer skips indexes with low selectivity and does a table scan anyway |
-| Not running `ANALYZE TABLE` after bulk inserts | Histogram statistics become stale; optimizer makes poor plan choices |
-| Composite index `(a, b, c)` used in `WHERE b = 1 AND c = 2` | Leftmost prefix rule: this skips `a`, so the index is not used |
-| `SELECT *` in queries with JOINs | Name only the columns needed — prevents accidentally blocking covering indexes |
-
-## Reading EXPLAIN
-
-Run `EXPLAIN` before tuning anything:
-
-```sql
-EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;
-```
-
-**Red flags in the output:**
-
-| Field | Red flag | What it means |
-|---|---|---|
-| `type` | `ALL` | Full table scan — missing index or index not used |
-| `key` | `NULL` | No index used despite one existing — check for function on column or type mismatch |
-| `rows` | Very high number | Optimizer estimates scanning many rows |
-| `Extra` | `Using filesort` | Expensive sort not covered by an index |
-| `Extra` | `Using temporary` | Temp table created — often from `GROUP BY` or `DISTINCT` |
-| `Extra` | `Using index` | ✅ Good — covering index, no table row access needed |
-
-**`ANALYZE`** statement (MariaDB 10.1+) actually executes the query and shows real row counts vs. estimates — more reliable than `EXPLAIN` alone. Note: MariaDB uses `ANALYZE`, not `EXPLAIN ANALYZE`:
-
-```sql
-ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-```
-
-**Optimizer Trace** shows the optimizer's full decision process. Since MariaDB 12.1 the trace can include full table and view definitions (`optimizer_record_context` system variable). Since 13.0 it also includes the specific statistics (histograms, index stats) used for cardinality estimates — together they're powerful for diagnosing surprising `rows` estimates:
-
-```sql
-SET optimizer_trace = 'enabled=on';
-SELECT * FROM orders WHERE customer_id = 42;
-SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
-SET optimizer_trace = 'enabled=off';
-```
-
-## Indexing Rules
-
-### The Leftmost Prefix Rule
-
-For a composite index `(a, b, c)`, MariaDB can use:
-- `WHERE a = 1` ✅
-- `WHERE a = 1 AND b = 2` ✅
-- `WHERE a = 1 AND b = 2 AND c = 3` ✅
-- `WHERE b = 2` ✗ — skips `a`, index not used
-- `WHERE a = 1 AND c = 3` — only `a` part is used
-
-Put the most selective equality conditions first, then range conditions last:
-```sql
--- Query: WHERE status = 'active' AND created_at > '2025-01-01' ORDER BY created_at
-INDEX (status, created_at) -- ✅ equality first, range last
-INDEX (created_at, status) -- ✗ range first breaks the prefix for status
-```
-
-### Covering Indexes
-
-A covering index includes all columns needed by the query — no table row access needed (`Using index` in EXPLAIN):
-
-```sql
--- Query fetches id, status, created_at for a customer
--- Covering index includes all three:
-CREATE INDEX idx_customer_cover ON orders (customer_id, status, created_at);
--- Now EXPLAIN shows: Extra = Using index
-```
-
-### When NOT to Add an Index
-
-- **Low-cardinality columns**: a `status` column with values `active`/`inactive` affects 50% of rows — the optimizer prefers a table scan. Index useful only when combined with other high-selectivity columns.
-- **Small tables** (< a few thousand rows): full scans are faster than index lookups for tiny tables.
-- **Write-heavy columns**: every index slows `INSERT`, `UPDATE`, `DELETE` — don't index columns that are rarely queried.
-
-### Functions on Indexed Columns
-
-The classic rule "any function on an indexed column disables the index" is **outdated for MariaDB 11.1+ and 11.4 LTS**. The optimizer can now use indexes for a number of common function patterns:
-
-| Pattern | Works on the index? | Since |
-|---|---|---|
-| `WHERE YEAR(col) = 2025` | ✅ — sargable, picks the right range | 11.1+ (MDEV-8320) |
-| `WHERE DATE(col) <= '2025-12-31'` | ✅ — sargable | 11.1+ (MDEV-8320) |
-| `WHERE UPPER(varchar_col) = '...'` on a case-insensitive collation (e.g. `utf8mb4_uca1400_ai_ci`) | ✅ — `sargable_casefold=ON` is the default | 11.3+ (MDEV-31496) |
-| `WHERE SUBSTR(col, 1, n) = 'abc'` | ✅ — leading-prefix `SUBSTR` is optimized | 11.8+ (MDEV-34911) |
-| `WHERE LOWER(case_sensitive_col) = '...'` | ✗ — index not used (collation isn't case-insensitive) | — |
-| `WHERE CAST(col AS UNSIGNED) = 1` or other type-changing transforms | ✗ — index not used | — |
-
-For cases that the optimizer still can't sargabilize, the rewrite-to-range pattern remains valid:
-
-```sql
-WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'
-```
-
-Verify with `EXPLAIN` rather than assuming: on 11.4+ many "won't use the index" rewrites are now no-ops. If `EXPLAIN` still shows `type=ALL` for a sargable pattern, check `@@optimizer_switch` for `sargable_casefold` and confirm the column's collation is `_ci`.
-
-## Pagination: Cursor-Based Instead of OFFSET
-
-`OFFSET` is a hidden performance trap. `LIMIT 10 OFFSET 50000` scans and discards 50,000 rows on every page load.
-
-```sql
--- ✗ Slow — scans 50,000 rows to skip them:
-SELECT id, title FROM posts ORDER BY id DESC LIMIT 10 OFFSET 50000;
-
--- ✅ Fast — index seek directly to the cursor position:
--- First page:
-SELECT id, title FROM posts ORDER BY id DESC LIMIT 10;
-
--- Next page (pass last id from previous result as $last_id):
-SELECT id, title FROM posts WHERE id < $last_id ORDER BY id DESC LIMIT 10;
-```
-
-For filtered queries, include the filter column in the index alongside id:
-
-```sql
--- Query: WHERE category = 'news' ORDER BY id DESC
-CREATE INDEX idx_cat_id ON posts (category, id);
--- Cursor query:
-SELECT id, title FROM posts WHERE category = 'news' AND id < $last_id ORDER BY id DESC LIMIT 10;
-```
-
-To detect whether another page exists, fetch `LIMIT 11` and check if the 11th row appears.
-
-## Histogram Statistics
-
-Histograms let the optimizer understand data distribution on non-indexed columns — critical for query plan quality on complex queries. Without them, the optimizer assumes uniform distribution and can choose wrong join orders.
-
-```sql
--- Collect histograms for a table (requires a full scan — run during low traffic):
-ANALYZE TABLE orders;
-
--- Verify histograms were collected:
-SELECT * FROM mysql.column_stats WHERE table_name = 'orders';
-```
-
-**When to run `ANALYZE TABLE`:**
-- After bulk inserts or large data changes
-- When `EXPLAIN` shows unexpectedly high `rows` estimates
-- After initially creating a table and loading data
-
-**Tune histogram granularity** for tables with highly skewed data distributions:
-```sql
-SET histogram_size = 100; -- default is 0 (disabled) in older versions, 254 in 10.4.3+
-ANALYZE TABLE orders;
-```
-
-Histograms are collected per-column automatically when using `ANALYZE TABLE` with `histogram_size > 0`. They are stored in `mysql.column_stats` and consulted when `optimizer_use_condition_selectivity >= 4` (default in 10.4.1+).
-
-## MariaDB Optimizer Switches
-
-MariaDB's optimizer has more tunable flags than MySQL. The most useful for developers:
-
-```sql
--- See current settings:
-SELECT @@optimizer_switch\G
-
--- Disable a specific optimization for a session (useful for debugging):
-SET optimizer_switch = 'derived_merge=off';
-
--- Re-enable:
-SET optimizer_switch = 'derived_merge=on';
-```
-
-**Most impactful flags:**
-
-| Flag | Default | Effect |
-|---|---|---|
-| `derived_merge` | on | Merges derived tables into outer query — usually faster |
-| `semijoin` | on | Optimizes `IN`/`EXISTS` subqueries — disable to debug unexpected plans |
-| `subquery_cache` | on | Caches correlated subquery results — big win for repeated subqueries |
-| `rowid_filter` | on | Pre-filters rowids before fetching rows — helps range queries |
-| `mrr` | off | Multi-Range Read — enable for large range scans on spinning disks |
-
-Turn flags off one at a time to isolate which optimization is causing a bad plan, then report via JIRA if a default setting produces a worse plan than the alternative.
-
-### Optimizer Improvements in the 10.7–10.11 LTS Window
-
-The 10.11 LTS line bundles features that arrived in the 10.7–10.10 short-term releases:
-
-- **JSON-format histograms** (10.8+, MDEV-21130, MDEV-26519) — histogram statistics are stored in JSON and are more precise than the older binary format. Just running `ANALYZE TABLE` on 10.8+ gives the optimizer better cardinality estimates.
-- **Descending indexes** (10.8+, MDEV-13756) — `CREATE INDEX idx ON t (a ASC, b DESC)` is supported; useful for composite `ORDER BY a, b DESC` patterns and for `MIN()`/`MAX()` on descending indexes.
-- **`SHOW ANALYZE [FORMAT=JSON]`** (10.9+, MDEV-27021) — get the optimizer plan and runtime stats for a query running in another connection without intrusion. `EXPLAIN FOR CONNECTION` syntax also supported (MDEV-10000).
-- **Improved optimization for joins with many `eq_ref` tables** (10.10+, MDEV-28852, MDEV-26278) — large star-schema-style joins plan dramatically better.
-- **`ANALYZE FORMAT=JSON` reports time spent in the optimizer itself** (10.11+, MDEV-28926) — separates planning time from execution time.
-
-### Optimizer Improvements in 11.4 LTS
-
-The 11.4 LTS line continues the overhaul:
-
-- **New cost-based cost model** (11.0+) — replaces the older rule-based heuristics with a tuned model aware of SSDs and per-engine characteristics. `EXPLAIN` and join-order choices in 10.6 vs. 11.4 can differ noticeably on the same query. If you have manual `optimizer_adjust_secondary_key_costs` settings from 10.x, remove them — they're no-ops on 11.4+.
-- **Semi-join optimization for single-table `UPDATE`/`DELETE`** (11.1+, MDEV-7487) — subqueries inside `UPDATE`/`DELETE` can now use the same subquery rewrites that `SELECT` uses (materialization, semi-join, etc.). Often a large speedup, no rewrite needed.
-- **Sargable `DATE`/`YEAR` comparisons against constants** (11.1+, MDEV-8320) — see [Functions on Indexed Columns](#functions-on-indexed-columns) above.
-- **Sargable case-folding** (11.3+, MDEV-31496, `sargable_casefold` on by default) — `UCASE`/`LCASE`/`UPPER`/`LOWER` on a column with a case-insensitive collation can use the index.
-
-### Optimizer Improvements in 11.5–11.8 LTS
-
-These are part of the current LTS baseline — useful for understanding what the optimizer can do today:
-
-- **Index Condition Pushdown on partitioned tables** (11.5+, MDEV-12404) — previously partitioned tables couldn't use ICP; now they do, often a large speedup on partitioned schemas
-- **`ANALYZE` shows selectivity of pushed index condition** (11.5+, MDEV-18478) — useful when diagnosing whether ICP is helping
-- **Charset Narrowing Optimization on by default** (11.8+, MDEV-34380) — eliminates unnecessary character set conversions in WHERE clauses
-- **`SUBSTR(col, 1, n) = const_str` optimization** (11.8+, MDEV-34911) — the optimizer can now use a column index even when the condition is a leading-prefix `SUBSTR`
-- **Virtual column support in the optimizer** (11.8+, MDEV-35616) — see [Virtual Column Support in the Optimizer](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/virtual-column-support-in-the-optimizer); previously, virtual columns were largely invisible to the optimizer
-- **Cost-based subquery strategy for single-table `UPDATE`/`DELETE`** (11.8+, MDEV-25008) — the optimizer now picks between subquery strategies by cost
-
-### Optimizer Improvements in MariaDB 12.x
-
-Several further limitations were lifted in the 12.x rolling releases:
-
-- **Rowid filtering on reverse-ordered scans** (12.0+) — previously `ORDER BY ... DESC` queries couldn't benefit from rowid filtering; now they can
-- **Index Condition Pushdown on reverse-ordered scans** (12.0+) — same fix for ICP
-- **Loose Index Scan ("Use index for group-by") works with `DESC` key parts** (12.0+) — previously required `ASC` indexes
-- **GROUP BY / ORDER BY can use indexes on virtual columns** (12.1+)
-- **Reorderable LEFT JOIN optimization** (12.3+) — the optimizer can now reorder more `LEFT JOIN` combinations
-- **Distinct GROUP BY column inference** (12.2+) — derived tables with `GROUP BY` are recognized as having distinct group keys, enabling more optimizations downstream
-
-If you target the 11.8 LTS baseline and see a plan that looks needlessly slow on a reverse-ordered or virtual-column query, it may be one of these — verify by running the same query on a 12.x version.
-
-## Optimizer Hints
-
-MariaDB 12.0 introduced a comprehensive MySQL-8-style optimizer hints framework (MDEV-35504), with additional hints added through 12.1 and 12.2. Hints go in a `/*+ ... */` comment right after `SELECT` and override the optimizer for one query without changing session settings:
-
-```sql
-SELECT /*+ JOIN_ORDER(o, c) */ *
-FROM orders o JOIN customers c ON c.id = o.customer_id;
-```
-
-**Available hints:**
-
-| Hint | Since | Purpose |
-|---|---|---|
-| `QB_NAME(name)` | 12.0 | Name a query block so other hints can target it from outside |
-| `JOIN_FIXED_ORDER` / `JOIN_ORDER(t1, t2, ...)` | 12.0 | Force a join order (`JOIN_FIXED_ORDER` is similar to `STRAIGHT_JOIN`) |
-| `JOIN_PREFIX(t1, ...)` / `JOIN_SUFFIX(t1, ...)` | 12.0 | Force specific tables to be first or last in the join order |
-| `MAX_EXECUTION_TIME(ms)` | 12.0 | Abort the query if it runs longer than the timeout |
-| `[NO_]MRR` / `[NO_]BKA` / `[NO_]BNL` | 12.0 | Toggle Multi-Range Read, Batched Key Access, Block Nested Loop |
-| `[NO_]ICP` | 12.0 | Toggle Index Condition Pushdown |
-| `[NO_]RANGE_OPTIMIZATION` | 12.0 | Toggle range optimizer |
-| `SEMIJOIN(strategy, ...)` / `SUBQUERY(strategy)` | 12.0 | Pick subquery rewrite strategy |
-| `[NO_]INDEX(t idx, ...)` / `[NO_]JOIN_INDEX` / `[NO_]GROUP_INDEX` / `[NO_]ORDER_INDEX` | 12.1 | Force / forbid specific index usage by purpose |
-| `[NO_]SPLIT_MATERIALIZED` / `[NO_]DERIVED_CONDITION_PUSHDOWN` / `[NO_]MERGE` | 12.1 | Control subquery / derived-table optimizations |
-| `[NO_]ROWID_FILTER` / `[NO_]INDEX_MERGE` | 12.2 | Toggle rowid filtering and index merge |
-
-**`QB_NAME()` example** — name a subquery so an outer hint can target it:
-
-```sql
-SELECT /*+ NO_MERGE(@sub) */ *
-FROM (
-  SELECT /*+ QB_NAME(sub) */ customer_id, COUNT(*) AS n
-  FROM orders
-  GROUP BY customer_id
-) t
-WHERE n > 10;
-```
-
-Hints are more targeted than `SET optimizer_switch` because they apply only to the query they're in, not the whole session.
-
-## Quick Wins Checklist
-
-Before adding indexes or rewriting queries, check these first:
-
-1. `EXPLAIN` the slow query — confirm where the time actually is
-2. `ANALYZE TABLE` — stale statistics cause bad plans
-3. Check for functions on indexed columns in `WHERE` — note many cases are now sargable on 11.4+ (`YEAR()`, `DATE()`, `UPPER()` on `_ci` collations)
-4. Check for `OFFSET` in pagination queries
-5. Verify composite index column order matches query predicates (leftmost prefix)
-6. Check `EXPLAIN` Extra column for `Using filesort` or `Using temporary` — these often point to a missing or misordered index
-
-## Sources
-
-- [Query Optimizations — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations)
-- [EXPLAIN — MariaDB KB](https://mariadb.com/docs/server/reference/sql-statements/administrative-sql-statements/analyze-and-explain-statements/explain)
-- [optimizer_switch — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimizer-switch)
-- [Getting Started with Indexes — MariaDB KB](https://mariadb.com/docs/server/mariadb-quickstart-guides/mariadb-indexes-guide)
-- [Building the Best Index for a Given SELECT — MariaDB Docs](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/optimization-and-indexes/building-the-best-index-for-a-given-select)
-- [Histogram-Based Statistics — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/histogram-based-statistics)
-- [Pagination Optimization — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/pagination-optimization)
-
-*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).*
+- [Deep Dive Index]
+  - [Execution Plans](docs/execution-plans.md)
+  - [Index Strategies](docs/index-strategies.md)
+  - [Pagination](docs/pagination.md)
+  - [Histograms](docs/histograms.md)
+- [Sources](docs/sources.md)
diff --git a/mariadb-query-optimization/docs/execution-plans.md b/mariadb-query-optimization/docs/execution-plans.md
new file mode 100644
index 0000000..2e801b5
--- /dev/null
+++ b/mariadb-query-optimization/docs/execution-plans.md
@@ -0,0 +1,146 @@
+# Reading EXPLAIN
+
+Run `EXPLAIN` before tuning anything:
+
+```sql
+EXPLAIN SELECT * FROM orders WHERE customer_id = 42 ORDER BY created_at DESC LIMIT 10;
+```
+
+**Red flags in the output:**
+
+| Field | Red flag | What it means |
+|---|---|---|
+| `type` | `ALL` | Full table scan — missing index or index not used |
+| `key` | `NULL` | No index used despite one existing — check for function on column or type mismatch |
+| `rows` | Very high number | Optimizer estimates scanning many rows |
+| `Extra` | `Using filesort` | Expensive sort not covered by an index |
+| `Extra` | `Using temporary` | Temp table created — often from `GROUP BY` or `DISTINCT` |
+| `Extra` | `Using index` | ✅ Good — covering index, no table row access needed |
+
+**`ANALYZE`** statement (MariaDB 10.1+) actually executes the query and shows real row counts vs. estimates — more reliable than `EXPLAIN` alone. Note: MariaDB uses `ANALYZE`, not `EXPLAIN ANALYZE`:
+
+```sql
+ANALYZE SELECT * FROM orders WHERE customer_id = 42;
+```
+
+**Optimizer Trace** shows the optimizer's full decision process. Since MariaDB 12.1 the trace can include full table and view definitions (`optimizer_record_context` system variable). Since 13.0 it also includes the specific statistics (histograms, index stats) used for cardinality estimates — together they're powerful for diagnosing surprising `rows` estimates:
+
+```sql
+SET optimizer_trace = 'enabled=on';
+SELECT * FROM orders WHERE customer_id = 42;
+SELECT * FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE\G
+SET optimizer_trace = 'enabled=off';
+```
+
+## MariaDB Optimizer Switches
+
+MariaDB's optimizer has more tunable flags than MySQL. The most useful for developers:
+
+```sql
+-- See current settings:
+SELECT @@optimizer_switch\G
+
+-- Disable a specific optimization for a session (useful for debugging):
+SET optimizer_switch = 'derived_merge=off';
+
+-- Re-enable:
+SET optimizer_switch = 'derived_merge=on';
+```
+
+**Most impactful flags:**
+
+| Flag | Default | Effect |
+|---|---|---|
+| `derived_merge` | on | Merges derived tables into outer query — usually faster |
+| `semijoin` | on | Optimizes `IN`/`EXISTS` subqueries — disable to debug unexpected plans |
+| `subquery_cache` | on | Caches correlated subquery results — big win for repeated subqueries |
+| `rowid_filter` | on | Pre-filters rowids before fetching rows — helps range queries |
+| `mrr` | off | Multi-Range Read — enable for large range scans on spinning disks |
+
+Turn flags off one at a time to isolate which optimization is causing a bad plan, then report via JIRA if a default setting produces a worse plan than the alternative.
+
+### Optimizer Improvements in the 10.7–10.11 LTS Window
+
+The 10.11 LTS line bundles features that arrived in the 10.7–10.10 short-term releases:
+
+- **JSON-format histograms** (10.8+, MDEV-21130, MDEV-26519) — histogram statistics are stored in JSON and are more precise than the older binary format. Just running `ANALYZE TABLE` on 10.8+ gives the optimizer better cardinality estimates.
+- **Descending indexes** (10.8+, MDEV-13756) — `CREATE INDEX idx ON t (a ASC, b DESC)` is supported; useful for composite `ORDER BY a, b DESC` patterns and for `MIN()`/`MAX()` on descending indexes.
+- **`SHOW ANALYZE [FORMAT=JSON]`** (10.9+, MDEV-27021) — get the optimizer plan and runtime stats for a query running in another connection without intrusion. `EXPLAIN FOR CONNECTION` syntax also supported (MDEV-10000).
+- **Improved optimization for joins with many `eq_ref` tables** (10.10+, MDEV-28852, MDEV-26278) — large star-schema-style joins plan dramatically better.
+- **`ANALYZE FORMAT=JSON` reports time spent in the optimizer itself** (10.11+, MDEV-28926) — separates planning time from execution time.
+
+### Optimizer Improvements in 11.4 LTS
+
+The 11.4 LTS line continues the overhaul:
+
+- **New cost-based cost model** (11.0+) — replaces the older rule-based heuristics with a tuned model aware of SSDs and per-engine characteristics. `EXPLAIN` and join-order choices in 10.6 vs. 11.4 can differ noticeably on the same query. If you have manual `optimizer_adjust_secondary_key_costs` settings from 10.x, remove them — they're no-ops on 11.4+.
+- **Semi-join optimization for single-table `UPDATE`/`DELETE`** (11.1+, MDEV-7487) — subqueries inside `UPDATE`/`DELETE` can now use the same subquery rewrites that `SELECT` uses (materialization, semi-join, etc.). Often a large speedup, no rewrite needed.
+- **Sargable `DATE`/`YEAR` comparisons against constants** (11.1+, MDEV-8320) — see [Functions on Indexed Columns](index-strategies.md#functions-on-indexed-columns).
+- **Sargable case-folding** (11.3+, MDEV-31496, `sargable_casefold` on by default) — `UCASE`/`LCASE`/`UPPER`/`LOWER` on a column with a case-insensitive collation can use the index.
+
+### Optimizer Improvements in 11.5–11.8 LTS
+
+These are part of the current LTS baseline — useful for understanding what the optimizer can do today:
+
+- **Index Condition Pushdown on partitioned tables** (11.5+, MDEV-12404) — previously partitioned tables couldn't use ICP; now they do, often a large speedup on partitioned schemas
+- **`ANALYZE` shows selectivity of pushed index condition** (11.5+, MDEV-18478) — useful when diagnosing whether ICP is helping
+- **Charset Narrowing Optimization on by default** (11.8+, MDEV-34380) — eliminates unnecessary character set conversions in WHERE clauses
+- **`SUBSTR(col, 1, n) = const_str` optimization** (11.8+, MDEV-34911) — the optimizer can now use a column index even when the condition is a leading-prefix `SUBSTR`
+- **Virtual column support in the optimizer** (11.8+, MDEV-35616) — see [Virtual Column Support in the Optimizer](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/virtual-column-support-in-the-optimizer); previously, virtual columns were largely invisible to the optimizer
+- **Cost-based subquery strategy for single-table `UPDATE`/`DELETE`** (11.8+, MDEV-25008) — the optimizer now picks between subquery strategies by cost
+
+### Optimizer Improvements in MariaDB 12.x
+
+Several further limitations were lifted in the 12.x rolling releases:
+
+- **Rowid filtering on reverse-ordered scans** (12.0+) — previously `ORDER BY ... DESC` queries couldn't benefit from rowid filtering; now they can
+- **Index Condition Pushdown on reverse-ordered scans** (12.0+) — same fix for ICP
+- **Loose Index Scan ("Use index for group-by") works with `DESC` key parts** (12.0+) — previously required `ASC` indexes
+- **GROUP BY / ORDER BY can use indexes on virtual columns** (12.1+)
+- **Reorderable LEFT JOIN optimization** (12.3+) — the optimizer can now reorder more `LEFT JOIN` combinations
+- **Distinct GROUP BY column inference** (12.2+) — derived tables with `GROUP BY` are recognized as having distinct group keys, enabling more optimizations downstream
+
+If you target the 11.8 LTS baseline and see a plan that looks needlessly slow on a reverse-ordered or virtual-column query, it may be one of these — verify by running the same query on a 12.x version.
+
+## Optimizer Hints
+
+MariaDB 12.0 introduced a comprehensive MySQL-8-style optimizer hints framework (MDEV-35504), with additional hints added through 12.1 and 12.2. Hints go in a `/*+ ... */` comment right after `SELECT` and override the optimizer for one query without changing session settings:
+
+```sql
+SELECT /*+ JOIN_ORDER(o, c) */ *
+FROM orders o JOIN customers c ON c.id = o.customer_id;
+```
+
+**Available hints:**
+
+| Hint | Since | Purpose |
+|---|---|---|
+| `QB_NAME(name)` | 12.0 | Name a query block so other hints can target it from outside |
+| `JOIN_FIXED_ORDER` / `JOIN_ORDER(t1, t2, ...)` | 12.0 | Force a join order (`JOIN_FIXED_ORDER` is similar to `STRAIGHT_JOIN`) |
+| `JOIN_PREFIX(t1, ...)` / `JOIN_SUFFIX(t1, ...)` | 12.0 | Force specific tables to be first or last in the join order |
+| `MAX_EXECUTION_TIME(ms)` | 12.0 | Abort the query if it runs longer than the timeout |
+| `[NO_]MRR` / `[NO_]BKA` / `[NO_]BNL` | 12.0 | Toggle Multi-Range Read, Batched Key Access, Block Nested Loop |
+| `[NO_]ICP` | 12.0 | Toggle Index Condition Pushdown |
+| `[NO_]RANGE_OPTIMIZATION` | 12.0 | Toggle range optimizer |
+| `SEMIJOIN(strategy, ...)` / `SUBQUERY(strategy)` | 12.0 | Pick subquery rewrite strategy |
+| `[NO_]INDEX(t idx, ...)` / `[NO_]JOIN_INDEX` / `[NO_]GROUP_INDEX` / `[NO_]ORDER_INDEX` | 12.1 | Force / forbid specific index usage by purpose |
+| `[NO_]SPLIT_MATERIALIZED` / `[NO_]DERIVED_CONDITION_PUSHDOWN` / `[NO_]MERGE` | 12.1 | Control subquery / derived-table optimizations |
+| `[NO_]ROWID_FILTER` / `[NO_]INDEX_MERGE` | 12.2 | Toggle rowid filtering and index merge |
+
+**`QB_NAME()` example** — name a subquery so an outer hint can target it:
+
+```sql
+SELECT /*+ NO_MERGE(@sub) */ *
+FROM (
+  SELECT /*+ QB_NAME(sub) */ customer_id, COUNT(*) AS n
+  FROM orders
+  GROUP BY customer_id
+) t
+WHERE n > 10;
+```
+
+Hints are more targeted than `SET optimizer_switch` because they apply only to the query they're in, not the whole session.
+
+### Quick Wins
+1. `EXPLAIN` the slow query — confirm where the time actually is.
+6. Check `EXPLAIN` Extra column for `Using filesort` or `Using temporary` — these often point to a missing or misordered index.
diff --git a/mariadb-query-optimization/docs/histograms.md b/mariadb-query-optimization/docs/histograms.md
new file mode 100644
index 0000000..17ea78a
--- /dev/null
+++ b/mariadb-query-optimization/docs/histograms.md
@@ -0,0 +1,30 @@
+# Histogram Statistics
+
+Histograms let the optimizer understand data distribution on non-indexed columns — critical for query plan quality on complex queries. Without them, the optimizer assumes uniform distribution and can choose wrong join orders.
+
+```sql
+-- Collect histograms for a table (requires a full scan — run during low traffic):
+ANALYZE TABLE orders;
+
+-- Verify histograms were collected:
+SELECT * FROM mysql.column_stats WHERE table_name = 'orders';
+```
+
+**When to run `ANALYZE TABLE`:**
+- After bulk inserts or large data changes
+- When `EXPLAIN` shows unexpectedly high `rows` estimates
+- After initially creating a table and loading data
+
+**Tune histogram granularity** for tables with highly skewed data distributions:
+```sql
+SET histogram_size = 100; -- default is 0 (disabled) in older versions, 254 in 10.4.3+
+ANALYZE TABLE orders;
+```
+
+Histograms are collected per-column automatically when using `ANALYZE TABLE` with `histogram_size > 0`. They are stored in `mysql.column_stats` and consulted when `optimizer_use_condition_selectivity >= 4` (default in 10.4.1+).
+
+### What LLMs Get Wrong
+- Not running `ANALYZE TABLE` after bulk inserts: Histogram statistics become stale; optimizer makes poor plan choices.
+
+### Quick Wins
+2. `ANALYZE TABLE` — stale statistics cause bad plans.
diff --git a/mariadb-query-optimization/docs/index-strategies.md b/mariadb-query-optimization/docs/index-strategies.md
new file mode 100644
index 0000000..bceacaa
--- /dev/null
+++ b/mariadb-query-optimization/docs/index-strategies.md
@@ -0,0 +1,65 @@
+# Indexing Rules
+
+## The Leftmost Prefix Rule
+
+For a composite index `(a, b, c)`, MariaDB can use:
+- `WHERE a = 1` ✅
+- `WHERE a = 1 AND b = 2` ✅
+- `WHERE a = 1 AND b = 2 AND c = 3` ✅
+- `WHERE b = 2` ✗ — skips `a`, index not used
+- `WHERE a = 1 AND c = 3` — only `a` part is used
+
+Put the most selective equality conditions first, then range conditions last:
+```sql
+-- Query: WHERE status = 'active' AND created_at > '2025-01-01' ORDER BY created_at
+INDEX (status, created_at) -- ✅ equality first, range last
+INDEX (created_at, status) -- ✗ range first breaks the prefix for status
+```
+
+## Covering Indexes
+
+A covering index includes all columns needed by the query — no table row access needed (`Using index` in EXPLAIN):
+
+```sql
+-- Query fetches id, status, created_at for a customer
+-- Covering index includes all three:
+CREATE INDEX idx_customer_cover ON orders (customer_id, status, created_at);
+-- Now EXPLAIN shows: Extra = Using index
+```
+
+## When NOT to Add an Index
+
+- **Low-cardinality columns**: a `status` column with values `active`/`inactive` affects 50% of rows — the optimizer prefers a table scan. Index useful only when combined with other high-selectivity columns.
+- **Small tables** (< a few thousand rows): full scans are faster than index lookups for tiny tables.
+- **Write-heavy columns**: every index slows `INSERT`, `UPDATE`, `DELETE` — don't index columns that are rarely queried.
+
+## Functions on Indexed Columns
+
+The classic rule "any function on an indexed column disables the index" is **outdated for MariaDB 11.1+ and 11.4 LTS**. The optimizer can now use indexes for a number of common function patterns:
+
+| Pattern | Works on the index? | Since |
+|---|---|---|
+| `WHERE YEAR(col) = 2025` | ✅ — sargable, picks the right range | 11.1+ (MDEV-8320) |
+| `WHERE DATE(col) <= '2025-12-31'` | ✅ — sargable | 11.1+ (MDEV-8320) |
+| `WHERE UPPER(varchar_col) = '...'` on a case-insensitive collation (e.g. `utf8mb4_uca1400_ai_ci`) | ✅ — `sargable_casefold=ON` is the default | 11.3+ (MDEV-31496) |
+| `WHERE SUBSTR(col, 1, n) = 'abc'` | ✅ — leading-prefix `SUBSTR` is optimized | 11.8+ (MDEV-34911) |
+| `WHERE LOWER(case_sensitive_col) = '...'` | ✗ — index not used (collation isn't case-insensitive) | — |
+| `WHERE CAST(col AS UNSIGNED) = 1` or other type-changing transforms | ✗ — index not used | — |
+
+For cases that the optimizer still can't sargabilize, the rewrite-to-range pattern remains valid:
+
+```sql
+WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'
+```
+
+Verify with `EXPLAIN` rather than assuming: on 11.4+ many "won't use the index" rewrites are now no-ops. If `EXPLAIN` still shows `type=ALL` for a sargable pattern, check `@@optimizer_switch` for `sargable_casefold` and confirm the column's collation is `_ci`.
+
+### What LLMs Get Wrong
+- Blanket rule "functions on indexed columns kill indexes": Outdated on MariaDB 11.1+/11.3+ for many cases. `YEAR(col) = const` and `UPPER(col) = const` on case-insensitive columns can now use indexes.
+- Adding an index to a low-cardinality column (boolean, status with 2-3 values): Optimizer skips indexes with low selectivity and does a table scan anyway.
+- Composite index `(a, b, c)` used in `WHERE b = 1 AND c = 2`: Leftmost prefix rule: this skips `a`, so the index is not used.
+- `SELECT *` in queries with JOINs: Name only the columns needed — prevents accidentally blocking covering indexes.
+
+### Quick Wins
+3. Check for functions on indexed columns in `WHERE` — note many cases are now sargable on 11.4+ (`YEAR()`, `DATE()`, `UPPER()` on `_ci` collations).
+5. Verify composite index column order matches query predicates (leftmost prefix).
diff --git a/mariadb-query-optimization/docs/pagination.md b/mariadb-query-optimization/docs/pagination.md new file mode 100644 index 0000000..650425c --- /dev/null +++ b/mariadb-query-optimization/docs/pagination.md @@ -0,0 +1,32 @@ +# Pagination: Cursor-Based Instead of OFFSET + +`OFFSET` is a hidden performance trap. `LIMIT 10 OFFSET 50000` scans and discards 50,000 rows on every page load. + +```sql +-- ✗ Slow — scans 50,000 rows to skip them: +SELECT id, title FROM posts ORDER BY id DESC LIMIT 10 OFFSET 50000; + +-- ✅ Fast — index seek directly to the cursor position: +-- First page: +SELECT id, title FROM posts ORDER BY id DESC LIMIT 10; + +-- Next page (pass last id from previous result as $last_id): +SELECT id, title FROM posts WHERE id < $last_id ORDER BY id DESC LIMIT 10; +``` + +For filtered queries, include the filter column in the index alongside id: + +```sql +-- Query: WHERE category = 'news' ORDER BY id DESC +CREATE INDEX idx_cat_id ON posts (category, id); +-- Cursor query: +SELECT id, title FROM posts WHERE category = 'news' AND id < $last_id ORDER BY id DESC LIMIT 10; +``` + +To detect whether another page exists, fetch `LIMIT 11` and check if the 11th row appears. + +### What LLMs Get Wrong +- `SELECT * FROM table LIMIT 10 OFFSET 50000`: Use cursor-based pagination — `OFFSET` scans all skipped rows. + +### Quick Wins +4. Check for `OFFSET` in pagination queries. 
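
The `LIMIT 11` trick and the cursor pattern above can be combined as in the sketch below. The user variables stand in for values the application would pass as bound parameters, and the `published_at` column with an index on `(published_at, id)` is an assumption made for the second query; neither appears in the original schema.

```sql
-- Hypothetical cursor value: the id of the last row shown on the previous page.
SET @last_id = 50000;

-- Page size is 10; fetch 11 rows so the extra row signals that a next page exists.
SELECT id, title
FROM posts
WHERE id < @last_id
ORDER BY id DESC
LIMIT 11;
-- Application side: display at most 10 rows and keep the id of the
-- last displayed row as the next cursor.

-- Sorting by a non-unique column needs a tie-breaker in the cursor.
-- Assumes a published_at column and an index on (published_at, id).
SET @last_ts = '2025-06-01 12:00:00';
SET @last_row_id = 50000;

SELECT id, title, published_at
FROM posts
WHERE published_at < @last_ts
   OR (published_at = @last_ts AND id < @last_row_id)
ORDER BY published_at DESC, id DESC
LIMIT 11;
```

The expanded `OR` form is used instead of a row-value comparison so the intent stays explicit; it is worth confirming with `EXPLAIN` that the composite index is actually chosen on your server version.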
diff --git a/mariadb-query-optimization/docs/sources.md b/mariadb-query-optimization/docs/sources.md new file mode 100644 index 0000000..7744745 --- /dev/null +++ b/mariadb-query-optimization/docs/sources.md @@ -0,0 +1,11 @@ +# Sources + +- [Query Optimizations — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations) +- [EXPLAIN — MariaDB KB](https://mariadb.com/docs/server/reference/sql-statements/administrative-sql-statements/analyze-and-explain-statements/explain) +- [optimizer_switch — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/optimizer-switch) +- [Getting Started with Indexes — MariaDB KB](https://mariadb.com/docs/server/mariadb-quickstart-guides/mariadb-indexes-guide) +- [Building the Best Index for a Given SELECT — MariaDB Docs](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/optimization-and-indexes/building-the-best-index-for-a-given-select) +- [Histogram-Based Statistics — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/statistics-for-optimizing-queries/histogram-based-statistics) +- [Pagination Optimization — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/optimization-and-tuning/query-optimizations/pagination-optimization) + +*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).* diff --git a/mariadb-replication-and-ha/SKILL.md b/mariadb-replication-and-ha/SKILL.md index d2ca700..bbbae1e 100644 --- a/mariadb-replication-and-ha/SKILL.md +++ b/mariadb-replication-and-ha/SKILL.md @@ -15,236 +15,16 @@ MariaDB offers three tiers of replication depending on your consistency and avai | **Semi-synchronous replication** | Stronger (1 replica ACK) | Manual or tool-assisted | Reducing data loss risk without full sync overhead | | **Galera Cluster** | Synchronous (multi-primary) | 
Automatic | Zero-data-loss HA, multi-datacenter writes | -> **Requires:** GTID replication: MariaDB 10.0+. Semi-sync built-in: 10.3+. Parallel replication optimistic mode: 10.5.1+. Current LTS is 11.8 (GA May 2025). +> **Requires:** GTID replication: MariaDB 10.0+. Semi-sync built-in: 10.3+. Parallel replication optimistic mode: 10.5.1+. > -> **Default context:** Assume MariaDB **11.8 LTS** unless the user states another version. Features marked **12.x** or **13.0** may be suggested when relevant (including as upgrade options), but always state the minimum version — do not present them as available on 11.8. +> **Server context:** See [MariaDB Versioning Context](../_shared/versioning.md). -## What LLMs Get Wrong +## Documentation -| What you might see | What's correct | -|---|---| -| MySQL GTID format or `gtid_mode=ON` syntax | MariaDB GTID uses a different format (`domain-server-seq`) and different commands — MySQL and MariaDB GTIDs are **incompatible** | -| "Install the Galera plugin" | Galera Cluster is built into MariaDB — no plugin installation required | -| Assuming sequential `AUTO_INCREMENT` in Galera | Galera produces gaps in auto-increment sequences across nodes by design — never rely on sequential values | -| `LOCK TABLES` or `GET_LOCK()` in a Galera environment | Not supported in Galera — use transactions instead | -| Treating a replica as a backup | Replication is not a backup — a `DROP TABLE` on the primary replicates immediately to all replicas | -| Tables without primary keys in a Galera cluster | All tables in Galera must have a primary key — `DELETE` fails on keyless tables | - -## Standard Async Replication - -The foundation: one primary, one or more replicas. The primary writes to the binary log; replicas apply changes asynchronously. - -**GTID-based replication is the default since MariaDB 10.10** (MDEV-19801) and remains so on 10.11 LTS, 11.4 LTS, and 11.8 LTS. 
On a fresh replica start, a `RESET SLAVE`, or a `CHANGE MASTER TO` that omits `MASTER_USE_GTID`, the replica defaults to `slave_pos` instead of legacy file/position. If you have configs that rely on the old behavior, set `MASTER_USE_GTID=no` explicitly. - -```sql --- On replica (10.10+ — MASTER_USE_GTID is optional, slave_pos is the default): -CHANGE MASTER TO - MASTER_HOST='primary.host', - MASTER_USER='repl_user', - MASTER_PASSWORD='password', - MASTER_USE_GTID = slave_pos; - -START SLAVE; -``` - -**Promoting a replica to primary** — historically `MASTER_USE_GTID=current_pos` was used to include locally-written GTIDs. **`current_pos` is deprecated since 10.10** (MDEV-20122). Use `MASTER_DEMOTE_TO_SLAVE=1` instead: it converts the old primary's `gtid_binlog_pos` into `gtid_slave_pos` so the demoted server can attach to the new primary cleanly without race conditions. - -```sql --- On the former primary, being demoted to a replica (10.10+): -CHANGE MASTER TO - MASTER_HOST='new_primary.host', - ..., - MASTER_DEMOTE_TO_SLAVE=1; -START SLAVE; -``` - -Since MariaDB 13.0, `CHANGE MASTER` also resets `Master_Server_Id` in `SHOW SLAVES STATUS`. On older versions this field could carry stale values across primary changes — check it explicitly when reconfiguring replication on pre-13.0 servers. - -### MariaDB GTID Format - -MariaDB GTIDs have three components: `domain_id-server_id-sequence` (e.g., `0-1-247`). - -This is **different from MySQL's** `server_uuid:sequence` format. They are not compatible — a MariaDB primary cannot replicate to a MySQL replica using GTIDs, and vice versa. 
- -**Domain IDs** enable multi-source replication: assign each primary a distinct `gtid_domain_id` so replicas can track multiple sources independently: -```sql --- On primary A: -SET GLOBAL gtid_domain_id = 1; - --- On primary B: -SET GLOBAL gtid_domain_id = 2; -``` - -Since MariaDB 13.0, `default_master_connection` can be set at the global level — convenient for replicas that connect to one logical "primary" source across multiple servers without specifying the connection name in every replication command. - -### Parallel Replication - -By default, replicas apply events serially. Parallel replication (up to 10× faster on write-heavy workloads) uses a pool of worker threads: - -```ini -# my.cnf on replica: -slave_parallel_threads = 4 -slave_parallel_mode = optimistic # default since 10.5.1 — tries parallel, retries on conflict -``` - -`optimistic` mode applies transactions in parallel and retries on conflict. Use `conservative` for stricter workloads where conflict retries are unacceptable. - -Since MariaDB 12.1, parallel replication also works when **asynchronously replicating between two Galera clusters** (MDEV-20065) — useful for cross-datacenter or DR setups where one Galera cluster is an async replica of another. - -### Replication Improvements in 10.7–10.11 LTS - -- **Optimistic two-phase `ALTER TABLE` replication** (10.8+, MDEV-11675, `binlog_alter_two_phase`) — opt-in: when enabled, a large `ALTER TABLE` is started on the replica in parallel with the primary's execution rather than after, drastically reducing replication lag during schema changes. Off by default for compatibility. -- **`mariadb-binlog --gtid-strict-mode` and GTID range filtering via `--start-position` / `--stop-position`** (10.8+, MDEV-4989) — point-in-time replay tools can target GTIDs directly without needing file/offset pairs. 
-- **`slave_max_statement_time`** (10.10+, MDEV-27161) — caps the execution time of a single replicated query on the SQL thread, useful when you must keep lag bounded and would rather skip a slow statement than fall further behind. -- **`mariadb-binlog --do-domain-ids` / `--ignore-domain-ids` / `--ignore-server-ids`** (10.9+, MDEV-20119) — domain/server filtering when extracting binlog events. -- **Multi-source replication CHANNEL syntax** (10.7+, MDEV-26307) — MySQL-style `FOR CHANNEL 'name'` clauses now work in `CHANGE MASTER TO`, `START SLAVE`, etc. - -### Replication Improvements in 11.4 LTS - -- **Global limit on binary log disk space** (11.4+, MDEV-31404) — `max_binlog_total_size` (alias `binlog_space_limit`, default `0` = no limit) triggers binlog purging when the total size of all binlogs exceeds the threshold. Combine with `--slave-connections-needed-for-purge` (default `1`) so purging won't run if a configured replica is disconnected. New status variable `binlog_disk_use` reports current disk usage. -- **GTID index for the binary log** (11.4+, MDEV-4991) — a new GTID-to-position index lets reconnecting replicas seek straight to their start position without scanning whole binlog files. Controlled by `binlog_gtid_index` (default `ON`), `binlog_gtid_index_page_size`, and `binlog_gtid_index_span_min`. Status variables `binlog_gtid_index_hit` / `binlog_gtid_index_miss` let you confirm it's being used. -- **`SQL_BEFORE_GTIDS` / `SQL_AFTER_GTIDS` for `START SLAVE UNTIL`** (11.4+, MDEV-27247) — finer-grained stopping for staged failover or PITR replay. -- **Detailed replication-lag fields** (11.4+, MDEV-29639) — `SHOW REPLICA STATUS` adds `Master_last_event_time`, `Slave_last_event_time`, `Master_Slave_time_diff` for clearer lag interpretation than `Seconds_Behind_Master` alone (the 11.6 update built on this — see below). 
- -### Binlog Performance Improvements in 11.7 - -- **Large-transaction commit no longer freezes other transactions** (11.7+, MDEV-32014) — previously, committing a very large transaction while `log_bin` was on would stall all other transactions until the binlog write completed. This bottleneck is gone. -- **Async rollback of prepared transactions during binlog crash recovery** (11.7+, MDEV-33853) — faster startup after a crash with many prepared transactions. -- **`slave_abort_blocking_timeout`** (11.7+, MDEV-34857) — kill long-running queries on a replica when they block replication progress past a threshold. Useful on read replicas that occasionally run long analytical queries. - -### Monitoring Replication Lag - -```sql -SHOW SLAVE STATUS\G --- Key fields: --- Seconds_Behind_Master: estimated lag in seconds --- Last_SQL_Error: last error stopping the SQL thread --- Relay_Log_Pos vs Read_Master_Log_Pos: how far behind the relay log is -``` - -Alert when `Seconds_Behind_Master > 5` for latency-sensitive applications. A value of `NULL` means replication is not running. Note: `Seconds_Behind_Master` can be misleading on idle primaries — use heartbeat tools (e.g., `pt-heartbeat`) for accurate measurement. - -Since MariaDB 11.6 (MDEV-33856), the definition of `Seconds_Behind_Master` was refined and three new columns were added to `SHOW ALL REPLICAS STATUS` plus a new Information Schema `SLAVE_STATUS` table, providing more nuanced lag visibility (e.g., separate measurements for IO vs SQL thread lag). - -## Semi-Synchronous Replication - -The primary waits for at least one replica to acknowledge receipt before committing. Reduces data loss risk on failover without requiring full synchronous overhead. 
- -```sql --- Enable on primary: -SET GLOBAL rpl_semi_sync_master_enabled = 1; - --- Enable on replica: -SET GLOBAL rpl_semi_sync_slave_enabled = 1; -``` - -If no replica acknowledges within `rpl_semi_sync_master_timeout` (default 10 seconds), the primary falls back to async. Built-in since MariaDB 10.3 — no plugin needed. - -Use when: you need stronger data durability than async but your workload tolerates a small write latency increase. - -## Galera Cluster - -Multi-primary synchronous replication — all nodes accept reads and writes, changes are certified across the cluster before committing. No single point of failure. Built into MariaDB. - -> **Packaging change (12.3+):** The Galera library is no longer included as a server-package dependency or in the MariaDB repositories by default (MDEV-38744). On 12.3+ you must install `galera-4` (or your distro's equivalent) separately when setting up a Galera node. The MariaDB server still understands Galera natively — only the library distribution changed. - -### Developer Constraints - -These will break in Galera if you're not aware of them: - -**All tables must have a primary key:** -```sql --- ✗ DELETE fails in Galera on keyless tables: -CREATE TABLE logs (message TEXT); - --- ✅ Always define a PK: -CREATE TABLE logs (id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, message TEXT); -``` - -**AUTO_INCREMENT values have gaps** — Galera uses `auto_increment_increment` and `auto_increment_offset` per node to avoid conflicts, resulting in non-sequential IDs. Never rely on sequential auto-increment in Galera. - -**LOCK TABLES, GET_LOCK(), and `FLUSH TABLES {table list} WITH READ LOCK` are not supported** — use transactions. Note: global `FLUSH TABLES WITH READ LOCK` (no table list) IS supported: -```sql --- ✗ Not supported in Galera: -LOCK TABLES orders WRITE; --- ✅ Use a transaction instead: -BEGIN; -SELECT ... FOR UPDATE; -UPDATE ...; -COMMIT; -``` - -**InnoDB only** — Galera replicates only InnoDB tables. 
MyISAM has experimental support via `wsrep_replicate_myisam` but is not recommended for production. - -**Transaction size limits** — default caps: 128K rows (`wsrep_max_ws_rows`) and 2GB (`wsrep_max_ws_size`). Extremely large transactions degrade cluster performance significantly and may require config tuning. Note: in MariaDB 13.0+, the default `binlog_row_event_max_size` is 64 KB (up from older 8 KB) — relevant when sizing replication events for write-heavy workloads. - -**Binary log must be ROW format** — do not change `binlog_format` at runtime in a Galera cluster. - -**Write-set retry on conflict** (12.1+) — `wsrep_applier_retry_count` controls how many times an applier retries a write set before erroring out. Tune this if your workload sees transient certification conflicts on busy clusters. - -**Automatic SST user account management** (11.6+, MDEV-31809) — Galera now manages the dedicated SST (State Snapshot Transfer) user account automatically; you no longer have to create and grant it manually on every node. - -**IP allowlist for nodes joining the cluster** (10.10+, MDEV-27246) — `wsrep_allowlist` restricts which IPs can make SST/IST requests, reducing the attack surface on a Galera cluster's intra-node traffic. - -**Query cache:** The query cache was removed in later MariaDB versions and was not required in Galera since MariaDB 10.1.2. No action needed on modern installations. - -### Stale Reads and Consistency - -Galera is "virtually synchronous" — a committed write on one node may not be immediately visible on another without additional synchronization: - -```sql --- Force a sync point before reading (performance cost): -SET SESSION wsrep_sync_wait = 1; -SELECT * FROM orders WHERE id = 42; -``` - -Use `wsrep_sync_wait` only where strict read-after-write consistency is required. For most reads, eventual consistency across nodes (milliseconds) is acceptable. 
- -### Lost Updates in Galera - -Galera does not prevent lost updates in read-modify-write patterns. Use `SELECT ... FOR UPDATE` explicitly: - -```sql --- ✗ Race condition — another node could modify between SELECT and UPDATE: -SELECT balance FROM accounts WHERE id = 1; --- ... application logic ... -UPDATE accounts SET balance = new_value WHERE id = 1; - --- ✅ Lock the row at read time: -BEGIN; -SELECT balance FROM accounts WHERE id = 1 FOR UPDATE; -UPDATE accounts SET balance = new_value WHERE id = 1; -COMMIT; -``` - -## Replication is Not a Backup - -A `DROP TABLE` or `DELETE FROM table` on the primary replicates to all replicas immediately. Replication protects against hardware failure — not against accidental data changes. Maintain independent backups (`mariadb-dump`, `mariadb-backup`) on a schedule separate from replication. - -Delayed replication is one mitigation — intentionally lag a replica by a set time: -```sql -CHANGE MASTER TO MASTER_DELAY = 3600; -- 1 hour lag -``` - -This gives a recovery window for accidental changes, but it is still not a substitute for backups. - -**Point-in-time recovery (13.0+)** — the new `innodb_log_archive` variable instructs InnoDB to preserve the write-ahead log as a continuous sequence of files instead of overwriting a ring buffer. Combined with a base backup, this enables PITR and incremental backups without needing the binary log alone. Use this on systems that need to roll forward to a precise transaction. - -## Failover Tools - -- **MaxScale** — MariaDB's proxy with automatic failover, read-write splitting, and connection routing. Detects primary failure and promotes the most up-to-date replica. Requires GTID replication. [mariadb.com/docs/maxscale](https://mariadb.com/docs/maxscale) -- **ProxySQL** — third-party proxy, widely used for read-write splitting and connection pooling with MariaDB - -For Galera, failover is automatic — any surviving node continues accepting writes. 
No proxy required for basic HA, though MaxScale or ProxySQL add connection routing. - -## Sources - -- [Replication Overview — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/replication-overview) -- [GTID — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/gtid) -- [Parallel Replication — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/parallel-replication) -- [Semi-Synchronous Replication — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/semisynchronous-replication) -- [Galera Cluster Known Limitations — MariaDB Docs](https://mariadb.com/docs/galera-cluster/reference/mariadb-galera-cluster-known-limitations) -- [Auto-Increments in Galera — mariadb.org](https://mariadb.org/auto-increments-in-galera/) -- [Automatic Failover with MariaDB Monitor — MaxScale Docs](https://mariadb.com/docs/maxscale/mariadb-maxscale-tutorials/automatic-failover-with-mariadb-monitor) - -*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).* +- [Deep Dive Index] + - [Overview](docs/overview.md) + - [Standard Async Replication](docs/async-replication.md) + - [Semi-Synchronous Replication](docs/semi-sync-replication.md) + - [Galera Cluster](docs/galera-cluster.md) + - [Failover Tools](docs/failover-tools.md) +- [Sources](docs/sources.md) diff --git a/mariadb-replication-and-ha/docs/async-replication.md b/mariadb-replication-and-ha/docs/async-replication.md new file mode 100644 index 0000000..33add91 --- /dev/null +++ b/mariadb-replication-and-ha/docs/async-replication.md @@ -0,0 +1,93 @@ +# Standard Async Replication + +The foundation: one primary, one or more replicas. The primary writes to the binary log; replicas apply changes asynchronously. + +## MariaDB GTID + +GTID-based replication is the default since MariaDB 10.10 (MDEV-19801) and remains so on 10.11 LTS, 11.4 LTS, and 11.8 LTS. 
On a fresh replica start, a `RESET SLAVE`, or a `CHANGE MASTER TO` that omits `MASTER_USE_GTID`, the replica defaults to `slave_pos` instead of legacy file/position. If you have configs that rely on the old behavior, set `MASTER_USE_GTID=no` explicitly. + +```sql +-- On replica (10.10+ — MASTER_USE_GTID is optional, slave_pos is the default): +CHANGE MASTER TO + MASTER_HOST='primary.host', + MASTER_USER='repl_user', + MASTER_PASSWORD='password', + MASTER_USE_GTID = slave_pos; + +START SLAVE; +``` + +**Promoting a replica to primary** — historically `MASTER_USE_GTID=current_pos` was used to include locally-written GTIDs. **`current_pos` is deprecated since 10.10** (MDEV-20122). Use `MASTER_DEMOTE_TO_SLAVE=1` instead: it converts the old primary's `gtid_binlog_pos` into `gtid_slave_pos` so the demoted server can attach to the new primary cleanly without race conditions. + +```sql +-- On the former primary, being demoted to a replica (10.10+): +CHANGE MASTER TO + MASTER_HOST='new_primary.host', + ..., + MASTER_DEMOTE_TO_SLAVE=1; +START SLAVE; +``` + +Since MariaDB 13.0, `CHANGE MASTER` also resets `Master_Server_Id` in `SHOW SLAVE STATUS`. On older versions this field could carry stale values across primary changes — check it explicitly when reconfiguring replication on pre-13.0 servers. + +### GTID Format and Multi-Source + +MariaDB GTIDs have three components: `domain_id-server_id-sequence` (e.g., `0-1-247`). + +This is **different from MySQL's** `server_uuid:sequence` format. They are not compatible — a MariaDB primary cannot replicate to a MySQL replica using GTIDs, and vice versa. 
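
To see the three-part GTID format on a running server, the position variables can be read directly. This is a small sketch using standard MariaDB system variables; the values returned depend entirely on the server it runs against, so no output is shown.

```sql
-- The first two components of every GTID this server generates.
SELECT @@global.gtid_domain_id AS domain_id,
       @@global.server_id      AS server_id;

-- Positions are comma-separated lists of domain_id-server_id-sequence,
-- one entry per replication domain.
SELECT @@global.gtid_binlog_pos  AS binlog_pos,   -- last GTID written to this server's binlog
       @@global.gtid_slave_pos   AS slave_pos,    -- last GTID applied by replication
       @@global.gtid_current_pos AS current_pos;  -- combination of the two
```

Comparing `gtid_slave_pos` on a replica with `gtid_binlog_pos` on its primary is a quick way to confirm that both sides agree on the domain and server IDs in use.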
 + +**Domain IDs** enable multi-source replication: assign each primary a distinct `gtid_domain_id` so replicas can track multiple sources independently: +```sql +-- On primary A: +SET GLOBAL gtid_domain_id = 1; + +-- On primary B: +SET GLOBAL gtid_domain_id = 2; +``` + +Since MariaDB 13.0, `default_master_connection` can be set at the global level — convenient for replicas that connect to one logical "primary" source across multiple servers without specifying the connection name in every replication command. + +## Parallel Replication + +By default, replicas apply events serially. Parallel replication (up to 10× faster on write-heavy workloads) uses a pool of worker threads: + +```ini +# my.cnf on replica: +slave_parallel_threads = 4 +slave_parallel_mode = optimistic # default since 10.5.1 — tries parallel, retries on conflict +``` + +`optimistic` mode applies transactions in parallel and retries on conflict. Use `conservative` for stricter workloads where conflict retries are unacceptable. + +Since MariaDB 12.1, parallel replication also works when **asynchronously replicating between two Galera clusters** (MDEV-20065) — useful for cross-datacenter or DR setups where one Galera cluster is an async replica of another. + +## Replication & Binlog Improvements (10.7–11.4) + +- **Optimistic two-phase `ALTER TABLE` replication** (10.8+, MDEV-11675, `binlog_alter_two_phase`) — opt-in: when enabled, a large `ALTER TABLE` is started on the replica in parallel with the primary's execution rather than after, drastically reducing replication lag during schema changes. +- **`mariadb-binlog` improvements** (10.8+) — `--gtid-strict-mode` and GTID range filtering via `--start-position` / `--stop-position` let point-in-time replay tools target GTIDs directly. +- **`slave_max_statement_time`** (10.10+, MDEV-27161) — caps the execution time of a single replicated query on the SQL thread. 
+- **`mariadb-binlog` filtering** (10.9+, MDEV-20119) — `--do-domain-ids` / `--ignore-domain-ids` / `--ignore-server-ids` for binlog event extraction. +- **Multi-source replication CHANNEL syntax** (10.7+, MDEV-26307) — MySQL-style `FOR CHANNEL 'name'` clauses work in `CHANGE MASTER TO`, `START SLAVE`, etc. +- **Global limit on binary log disk space** (11.4+, MDEV-31404) — `max_binlog_total_size` triggers binlog purging when total size exceeds threshold. +- **GTID index for binary log** (11.4+, MDEV-4991) — new GTID-to-position index lets reconnecting replicas seek straight to their start position. +- **Detailed replication-lag fields** (11.4+, MDEV-29639) — `SHOW REPLICA STATUS` adds fields for clearer lag interpretation than `Seconds_Behind_Master` alone. + +## Performance Improvements (11.7+) + +- **Large-transaction commit no longer freezes other transactions** (11.7+, MDEV-32014) — committing large transactions while `log_bin` is on no longer stalls other transactions. +- **Async rollback of prepared transactions during binlog crash recovery** (11.7+, MDEV-33853). +- **`slave_abort_blocking_timeout`** (11.7+, MDEV-34857) — kill long-running queries on a replica when they block replication progress past a threshold. + +## Monitoring Replication Lag + +```sql +SHOW SLAVE STATUS\G +-- Key fields: +-- Seconds_Behind_Master: estimated lag in seconds +-- Last_SQL_Error: last error stopping the SQL thread +-- Relay_Log_Pos vs Read_Master_Log_Pos: how far behind the relay log is +``` + +Alert when `Seconds_Behind_Master > 5` for latency-sensitive applications. A value of `NULL` means replication is not running. Note: `Seconds_Behind_Master` can be misleading on idle primaries — use heartbeat tools (e.g., `pt-heartbeat`) for accurate measurement. 
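
The heartbeat idea mentioned above can be approximated in plain SQL. The sketch below is a hand-rolled illustration, not how `pt-heartbeat` is implemented: the table `repl_heartbeat` and the event `ev_repl_heartbeat` are hypothetical names, the event scheduler must be enabled on the primary, and the result is only meaningful when the clocks of primary and replica are synchronized.

```sql
-- On the primary: a single-row table that is rewritten every second.
CREATE TABLE repl_heartbeat (
  id TINYINT UNSIGNED PRIMARY KEY,
  ts DATETIME(6) NOT NULL
) ENGINE=InnoDB;

SET GLOBAL event_scheduler = ON;

CREATE EVENT IF NOT EXISTS ev_repl_heartbeat
  ON SCHEDULE EVERY 1 SECOND
  DO REPLACE INTO repl_heartbeat (id, ts) VALUES (1, UTC_TIMESTAMP(6));

-- On the replica: compare the replicated timestamp with the local clock.
SELECT TIMESTAMPDIFF(MICROSECOND, ts, UTC_TIMESTAMP(6)) / 1000000 AS lag_seconds
FROM repl_heartbeat
WHERE id = 1;
```

Because the primary writes continuously, the measured lag stays meaningful even when the application itself is idle, which is the case where `Seconds_Behind_Master` tends to mislead.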
+ +Since MariaDB 11.6 (MDEV-33856), the definition of `Seconds_Behind_Master` was refined and new columns were added to `SHOW ALL REPLICAS STATUS` plus a new Information Schema `SLAVE_STATUS` table, providing more nuanced lag visibility (e.g., separate measurements for IO vs SQL thread lag). diff --git a/mariadb-replication-and-ha/docs/failover-tools.md b/mariadb-replication-and-ha/docs/failover-tools.md new file mode 100644 index 0000000..8934019 --- /dev/null +++ b/mariadb-replication-and-ha/docs/failover-tools.md @@ -0,0 +1,6 @@ +# Failover Tools + +- **MaxScale** — MariaDB's proxy with automatic failover, read-write splitting, and connection routing. Detects primary failure and promotes the most up-to-date replica. Requires GTID replication. [mariadb.com/docs/maxscale](https://mariadb.com/docs/maxscale) +- **ProxySQL** — third-party proxy, widely used for read-write splitting and connection pooling with MariaDB + +For Galera, failover is automatic — any surviving node continues accepting writes. No proxy required for basic HA, though MaxScale or ProxySQL add connection routing. diff --git a/mariadb-replication-and-ha/docs/galera-cluster.md b/mariadb-replication-and-ha/docs/galera-cluster.md new file mode 100644 index 0000000..6ce24bf --- /dev/null +++ b/mariadb-replication-and-ha/docs/galera-cluster.md @@ -0,0 +1,55 @@ +# Galera Cluster + +Multi-primary synchronous replication — all nodes accept reads and writes, changes are certified across the cluster before committing. No single point of failure. Built into MariaDB. + +> **Packaging change (12.3+):** The Galera library is no longer included as a server-package dependency or in the MariaDB repositories by default (MDEV-38744). On 12.3+ you must install `galera-4` (or your distro's equivalent) separately when setting up a Galera node. The MariaDB server still understands Galera natively — only the library distribution changed. 
+ +## Developer Constraints + +These will break in Galera if you're not aware of them: + +- **All tables must have a primary key:** + ```sql + -- ✗ DELETE fails in Galera on keyless tables: + CREATE TABLE logs (message TEXT); + + -- ✅ Always define a PK: + CREATE TABLE logs (id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY, message TEXT); + ``` +- **AUTO_INCREMENT values have gaps** — Galera uses `auto_increment_increment` and `auto_increment_offset` per node to avoid conflicts, resulting in non-sequential IDs. +- **LOCK TABLES, GET_LOCK(), and FLUSH TABLES {table list} WITH READ LOCK are not supported** — use transactions. Note: global `FLUSH TABLES WITH READ LOCK` (no table list) IS supported. +- **InnoDB only** — Galera replicates only InnoDB tables. +- **Transaction size limits** — default caps: 128K rows (`wsrep_max_ws_rows`) and 2GB (`wsrep_max_ws_size`). +- **Binary log must be ROW format**. +- **Write-set retry on conflict** (12.1+) — `wsrep_applier_retry_count` controls how many times an applier retries a write set. +- **Automatic SST user account management** (11.6+, MDEV-31809) — Galera now manages the dedicated SST (State Snapshot Transfer) user account automatically. +- **IP allowlist for nodes joining the cluster** (10.10+, MDEV-27246) — `wsrep_allowlist` restricts which IPs can make SST/IST requests. + +## Stale Reads and Consistency + +Galera is "virtually synchronous" — a committed write on one node may not be immediately visible on another without additional synchronization: + +```sql +-- Force a sync point before reading (performance cost): +SET SESSION wsrep_sync_wait = 1; +SELECT * FROM orders WHERE id = 42; +``` + +Use `wsrep_sync_wait` only where strict read-after-write consistency is required. + +## Lost Updates in Galera + +Galera does not prevent lost updates in read-modify-write patterns. Use `SELECT ... 
FOR UPDATE` explicitly: + +```sql +-- ✗ Race condition — another node could modify between SELECT and UPDATE: +SELECT balance FROM accounts WHERE id = 1; +-- ... application logic ... +UPDATE accounts SET balance = new_value WHERE id = 1; + +-- ✅ Lock the row at read time: +BEGIN; +SELECT balance FROM accounts WHERE id = 1 FOR UPDATE; +UPDATE accounts SET balance = new_value WHERE id = 1; +COMMIT; +``` diff --git a/mariadb-replication-and-ha/docs/overview.md b/mariadb-replication-and-ha/docs/overview.md new file mode 100644 index 0000000..84cb245 --- /dev/null +++ b/mariadb-replication-and-ha/docs/overview.md @@ -0,0 +1,25 @@ +# Replication & HA: Overview + +## What LLMs Get Wrong + +| What you might see | What's correct | +|---|---| +| MySQL GTID format or `gtid_mode=ON` syntax | MariaDB GTID uses a different format (`domain-server-seq`) and different commands — MySQL and MariaDB GTIDs are **incompatible** | +| "Install the Galera plugin" | Galera Cluster is built into MariaDB — no plugin installation required | +| Assuming sequential `AUTO_INCREMENT` in Galera | Galera produces gaps in auto-increment sequences across nodes by design — never rely on sequential values | +| `LOCK TABLES` or `GET_LOCK()` in a Galera environment | Not supported in Galera — use transactions instead | +| Treating a replica as a backup | Replication is not a backup — a `DROP TABLE` on the primary replicates immediately to all replicas | +| Tables without primary keys in a Galera cluster | All tables in Galera must have a primary key — `DELETE` fails on keyless tables | + +## Replication is Not a Backup + +A `DROP TABLE` or `DELETE FROM table` on the primary replicates to all replicas immediately. Replication protects against hardware failure — not against accidental data changes. Maintain independent backups (`mariadb-dump`, `mariadb-backup`) on a schedule separate from replication. 
 + +Delayed replication is one mitigation — intentionally lag a replica by a set time: +```sql +CHANGE MASTER TO MASTER_DELAY = 3600; -- 1 hour lag +``` + +This gives a recovery window for accidental changes, but it is still not a substitute for backups. + +**Point-in-time recovery (13.0+)** — the new `innodb_log_archive` variable instructs InnoDB to preserve the write-ahead log as a continuous sequence of files instead of overwriting a ring buffer. Combined with a base backup, this enables PITR and incremental backups without relying on the binary log alone. Use this on systems that need to roll forward to a precise transaction. diff --git a/mariadb-replication-and-ha/docs/semi-sync-replication.md b/mariadb-replication-and-ha/docs/semi-sync-replication.md new file mode 100644 index 0000000..b9ddb3f --- /dev/null +++ b/mariadb-replication-and-ha/docs/semi-sync-replication.md @@ -0,0 +1,15 @@ +# Semi-Synchronous Replication + +The primary waits for at least one replica to acknowledge receipt before committing. Reduces data loss risk on failover without requiring full synchronous overhead. + +```sql +-- Enable on primary: +SET GLOBAL rpl_semi_sync_master_enabled = 1; + +-- Enable on replica: +SET GLOBAL rpl_semi_sync_slave_enabled = 1; +``` + +If no replica acknowledges within `rpl_semi_sync_master_timeout` (default 10 seconds), the primary falls back to async. Built-in since MariaDB 10.3 — no plugin needed. + +Use when: you need stronger data durability than async but your workload tolerates a small write latency increase. 
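
After enabling semi-sync, it helps to confirm that it is actually active rather than silently running in async fallback. The sketch below uses status and system variables that ship with MariaDB's built-in semi-sync support; the values shown by each statement depend on the running server, so none are listed here.

```sql
-- On the replica: if replication was already running when semi-sync was
-- enabled, restart the I/O thread so it registers as a semi-sync replica.
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

-- On the primary: ON means semi-sync is in effect, OFF means async fallback.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';

-- Number of connected semi-sync replicas, and commits that were not acknowledged.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_clients';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_no_tx';

-- The timeout is expressed in milliseconds (10000 corresponds to 10 seconds).
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master_timeout';

-- On the replica: confirms the replica side is participating.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave_status';
```

A growing `Rpl_semi_sync_master_no_tx` counter suggests the primary is regularly timing out and falling back to async, which is worth alerting on if semi-sync is part of the durability plan.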
diff --git a/mariadb-replication-and-ha/docs/sources.md b/mariadb-replication-and-ha/docs/sources.md new file mode 100644 index 0000000..1971d34 --- /dev/null +++ b/mariadb-replication-and-ha/docs/sources.md @@ -0,0 +1,11 @@ +# Sources + +- [Replication Overview — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/replication-overview) +- [GTID — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/gtid) +- [Parallel Replication — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/parallel-replication) +- [Semi-Synchronous Replication — MariaDB KB](https://mariadb.com/docs/server/ha-and-performance/standard-replication/semisynchronous-replication) +- [Galera Cluster Known Limitations — MariaDB Docs](https://mariadb.com/docs/galera-cluster/reference/mariadb-galera-cluster-known-limitations) +- [Auto-Increments in Galera — mariadb.org](https://mariadb.org/auto-increments-in-galera/) +- [Automatic Failover with MariaDB Monitor — MaxScale Docs](https://mariadb.com/docs/maxscale/mariadb-maxscale-tutorials/automatic-failover-with-mariadb-monitor) + +*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).* diff --git a/oracle-to-mariadb/SKILL.md b/oracle-to-mariadb/SKILL.md index ac7da6a..c316ed6 100644 --- a/oracle-to-mariadb/SKILL.md +++ b/oracle-to-mariadb/SKILL.md @@ -17,147 +17,17 @@ Additional advantages over other open source alternatives: - **ColumnStore** — columnar analytics engine, comparable to Oracle's In-Memory ColumnStore - **Cost**: organizations typically achieve 70–90% cost reduction vs Oracle licensing -> **Requires:** MariaDB Community Server 10.3+ for Oracle compatibility mode. 10.6+ for `ROWNUM`, `TO_CHAR()`, `ADD_MONTHS()`, `SYS_GUID()`. Current LTS is 11.8 (May 2025). 
- -## The First Step: Enable Oracle Compatibility Mode - -```sql -SET sql_mode = 'ORACLE'; --- or permanently in configuration: --- sql_mode = ORACLE -``` - -Without this, PL/SQL syntax, Oracle data type synonyms, and Oracle-style functions will not work. This is the single most important step and the most commonly missed. - -## What LLMs Get Wrong - -| What you might see | What's correct | -|---|---| -| PL/SQL that fails with syntax errors | Set `sql_mode=ORACLE` first — without it, PL/SQL constructs are not recognized | -| Oracle `DATE` mapped to MariaDB `DATE` | Oracle `DATE` stores date AND time — map to `DATETIME`, not `DATE` | -| Assuming 100% PL/SQL compatibility | ~80% works without changes; `SYNONYM`, `INSERT ALL`, and `CONNECT BY` usually require rewrites — `(+)` joins work in Oracle mode on 12.1+ (see next row) | -| `SYNONYM` usage in schema or code | No equivalent in MariaDB — replace with views or direct object references | -| `(+)` outer join notation | Supported in Oracle mode since MariaDB 12.1 (MDEV-13817) — on older versions rewrite as `LEFT JOIN` / `RIGHT JOIN` | -| `START WITH ... 
CONNECT BY` | Not supported — rewrite as recursive CTE using `WITH RECURSIVE` | -| `TIMESTAMP WITH TIME ZONE` | Loses timezone on migration — becomes `DATETIME`; handle timezone in application | - -## Data Type Mapping - -`sql_mode=ORACLE` handles these automatically: - -| Oracle | MariaDB | Notes | -|---|---|---| -| `VARCHAR2(n)` | `VARCHAR(n)` | Automatic | -| `NUMBER(p,s)` | `DECIMAL(p,s)` | Automatic | -| `NUMBER` | `DOUBLE` | Automatic | -| `DATE` | `DATETIME` | **Automatic, but note:** Oracle DATE includes time; MariaDB DATE does not | -| `CLOB` | `LONGTEXT` | Automatic | -| `BLOB` | `LONGBLOB` | Automatic | -| `RAW(n)` | `VARBINARY(n)` | Automatic | -| `CHAR(n > 255)` | `VARCHAR(n)` | Manual | -| `TIMESTAMP WITH TIME ZONE` | `DATETIME` | Manual — timezone info lost | -| `BFILE` | `LONGBLOB` | Manual — file path storage differs | -| `ROWID` | `CHAR(10)` | Manual | - -## What sql_mode=ORACLE Covers - -### PL/SQL Syntax (10.3+) -- Packages: `CREATE PACKAGE`, `CREATE PACKAGE BODY` — since 11.4 also work outside `sql_mode=ORACLE` (MDEV-10075), letting you use packages in mixed-mode codebases -- Cursors: explicit, implicit, parameterized, `%ISOPEN`, `%ROWCOUNT`, `%FOUND`, `%NOTFOUND` -- Cursors on prepared statements (12.3+) -- Cursor variables: `TYPE ... IS REF CURSOR` (13.0+) — pass cursors as procedure parameters and return values -- Pre-defined weak `SYS_REFCURSOR` (12.0+) — built-in cursor type, no `TYPE` declaration needed; `max_open_cursors` system variable caps concurrent open cursors -- Variable types: `:=` assignment, `%TYPE`, `%ROWTYPE` -- `ROW` data type as stored function return value (11.7+) — function returns a structured row, similar to Oracle row types -- `RECORD` types in routine parameters and function `RETURN` clauses (13.0+) -- Associative arrays: `DECLARE TYPE ... TABLE OF ... 
INDEX BY` (12.1+) -- Stored routine parameters can have default values (11.8+) — call procedures with fewer arguments, like Oracle's `DEFAULT` clause -- Control flow: `FOR i IN 1..10 LOOP`, `GOTO`, `EXIT WHEN`, `ELSIF`, `CONTINUE` -- Exception handling: `EXCEPTION WHEN TOO_MANY_ROWS / NO_DATA_FOUND / DUP_VAL_ON_INDEX` -- Anonymous blocks: `BEGIN ... END` -- Dynamic SQL: `EXECUTE IMMEDIATE ... USING` -- Trigger variables: `:NEW`, `:OLD` - -### Oracle-Compatible Functions -| Function | Available Since | -|---|---| -| `DECODE()` | 10.3 | -| `CHR()`, `SUBSTR()` with position 0 | 10.3 | -| `ADD_MONTHS()` | 10.6 | -| `TO_CHAR()` | 10.6 (FM padding-suppression format added in 12.0) | -| `SYS_GUID()` | 10.6 | -| `ROWNUM` | 10.6 | -| `TO_NUMBER()` | 12.2.1 | -| `TO_DATE()` | 12.3 (native; on older versions use `STR_TO_DATE()`) | -| `TRUNC()` | 12.2 | - -### NULL Handling -Oracle treats empty strings as `NULL`. `sql_mode=ORACLE` does **not** activate this automatically — `EMPTY_STRING_IS_NULL` must be added separately: - -```sql -SET sql_mode = 'ORACLE,EMPTY_STRING_IS_NULL'; -``` - -Without `EMPTY_STRING_IS_NULL`, `''` is not `NULL` in MariaDB even in Oracle mode. - -### Other Automatic Behaviors -- `||` as string concatenation (NULL-ignoring) -- `MINUS` as synonym for `EXCEPT` (10.6+) -- Named placeholders (`:1`, `:2`) in prepared statements -- `SELECT UNIQUE` as synonym for `SELECT DISTINCT` -- `DUAL` table support - -## What Requires Manual Rewriting - -These Oracle features have no direct equivalent and require code changes: - -- **`SYNONYM`** — replace with views (`CREATE VIEW`) or update object references directly -- **`INSERT ALL` / `INSERT FIRST`** — rewrite as multiple `INSERT` statements or application logic -- **`(+)` outer join syntax** — supported natively since MariaDB 12.1 in Oracle mode. 
On older versions rewrite as `LEFT JOIN` / `RIGHT JOIN`: - ```sql - -- Oracle: - SELECT * FROM a, b WHERE a.id = b.id(+); - -- MariaDB: - SELECT * FROM a LEFT JOIN b ON a.id = b.id; - ``` -- **`START WITH ... CONNECT BY`** — rewrite as recursive CTE: - ```sql - -- MariaDB equivalent: - WITH RECURSIVE tree AS ( - SELECT id, parent_id, name FROM categories WHERE parent_id IS NULL - UNION ALL - SELECT c.id, c.parent_id, c.name FROM categories c - JOIN tree t ON c.parent_id = t.id - ) - SELECT * FROM tree; - ``` -- **`TIMESTAMP WITH TIME ZONE`** — store timezone offset in a separate column or handle in application -- **Object types and inheritance** — no equivalent; restructure as normalized tables - -## Migration Tools - -- **[MariaDB Migration Assessment Tool](https://mariadb.com/resources/blog/mariadb-migration-assessment-tool-how-ready-are-you-to-migrate-to-mariadb-from-oracle/)** — analyzes Oracle DDL export and scores compatibility before you start -- **Connect Storage Engine** — create MariaDB tables that read live Oracle data via ODBC; useful for phased migration without full cutover -- **DBeaver** — schema mapping, data comparison, and transfer between Oracle and MariaDB -- **MaxScale query rewriting** — intercept and rewrite unsupported Oracle syntax on-the-fly for cases that can't be changed in the application - -## Key Gotchas - -- **Autocommit**: MariaDB has autocommit enabled by default; Oracle does not. Add explicit `COMMIT`/`ROLLBACK` handling or disable autocommit. -- **`SYSDATE`**: In MariaDB Oracle mode, `SYSDATE` works but returns `DATETIME`. Verify date-only comparisons. -- **`DUAL`**: Works in MariaDB but not identically — avoid schema-qualified references like `schema.DUAL`. -- **Sequences**: `CREATE SEQUENCE` works in MariaDB 10.3+ with Oracle-compatible syntax (`MINVALUE`, `MAXVALUE`, `INCREMENT BY`). 
-- **`DROP USER` with active sessions** (12.1+): Oracle-compatible behavior — in Oracle mode, `DROP USER` fails if the user has active sessions; in other modes, it issues a warning. -- **~20% rewrite**: No tool achieves 100% Oracle-to-MariaDB conversion. Plan for manual review of complex PL/SQL, object types, and unsupported syntax. - -## Sources - -- [sql_mode=ORACLE — MariaDB Docs](https://mariadb.com/docs/release-notes/community-server/about/compatibility-and-differences/sql_modeoracle) -- [Oracle to MariaDB Migration — mariadb.com](https://mariadb.com/migrations/oracle-to-mariadb/) -- [Easier Oracle to MariaDB Migrations with sql_mode and DBeaver — MariaDB Blog](https://mariadb.com/resources/blog/easier-oracle-to-mariadb-migrations-with-sql_mode-and-dbeaver/) -- [Migration Assessment Tool — MariaDB Blog](https://mariadb.com/resources/blog/mariadb-migration-assessment-tool-how-ready-are-you-to-migrate-to-mariadb-from-oracle/) -- [MariaDB vs PostgreSQL for Oracle Migration — mariadb.com](https://mariadb.com/products/enterprise/comparison/mariadb-vs-postgresql/) -- [Migration from Oracle to MariaDB Deep Dive — Severalnines](https://severalnines.com/blog/migration-oracle-database-mariadb-deep-dive/) -- [Data migration from Oracle to MariaDB with Connect SE — mariadb.org](https://mariadb.org/data-migration-from-oracle-to-mariadb-with-docker-and-connect-se-a-step-by-step-guide/) - -*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).* +> **Requires:** MariaDB Community Server 10.3+ for Oracle compatibility mode; 10.6+ for `ROWNUM`, `TO_CHAR()`, `ADD_MONTHS()`, `SYS_GUID()`. +> +> **Server context:** See [MariaDB Versioning Context](../_shared/versioning.md). 
+ +## Documentation + +- [Deep Dive Index] + - [Overview](docs/overview.md) + - [Data Type Mapping](docs/data-types.md) + - [PL/SQL Syntax](docs/plsql-syntax.md) + - [Oracle-Compatible Functions](docs/functions.md) + - [Manual Rewriting Required](docs/manual-rewrite.md) + - [Key Gotchas](docs/gotchas.md) +- [Sources](docs/sources.md) diff --git a/oracle-to-mariadb/docs/data-types.md b/oracle-to-mariadb/docs/data-types.md new file mode 100644 index 0000000..7a0d458 --- /dev/null +++ b/oracle-to-mariadb/docs/data-types.md @@ -0,0 +1,17 @@ +# Data Type Mapping + +`sql_mode=ORACLE` handles these automatically: + +| Oracle | MariaDB | Notes | +|---|---|---| +| `VARCHAR2(n)` | `VARCHAR(n)` | Automatic | +| `NUMBER(p,s)` | `DECIMAL(p,s)` | Automatic | +| `NUMBER` | `DOUBLE` | Automatic | +| `DATE` | `DATETIME` | **Automatic, but note:** Oracle DATE includes time; MariaDB DATE does not | +| `CLOB` | `LONGTEXT` | Automatic | +| `BLOB` | `LONGBLOB` | Automatic | +| `RAW(n)` | `VARBINARY(n)` | Automatic | +| `CHAR(n > 255)` | `VARCHAR(n)` | Manual | +| `TIMESTAMP WITH TIME ZONE` | `DATETIME` | Manual — timezone info lost | +| `BFILE` | `LONGBLOB` | Manual — file path storage differs | +| `ROWID` | `CHAR(10)` | Manual | diff --git a/oracle-to-mariadb/docs/functions.md b/oracle-to-mariadb/docs/functions.md new file mode 100644 index 0000000..f4c9cc5 --- /dev/null +++ b/oracle-to-mariadb/docs/functions.md @@ -0,0 +1,13 @@ +# Oracle-Compatible Functions + +| Function | Available Since | +|---|---| +| `DECODE()` | 10.3 | +| `CHR()`, `SUBSTR()` with position 0 | 10.3 | +| `ADD_MONTHS()` | 10.6 | +| `TO_CHAR()` | 10.6 (FM padding-suppression format added in 12.0) | +| `SYS_GUID()` | 10.6 | +| `ROWNUM` | 10.6 | +| `TO_NUMBER()` | 12.2.1 | +| `TO_DATE()` | 12.3 (native; on older versions use `STR_TO_DATE()`) | +| `TRUNC()` | 12.2 | diff --git a/oracle-to-mariadb/docs/gotchas.md b/oracle-to-mariadb/docs/gotchas.md new file mode 100644 index 0000000..a049956 --- /dev/null +++ 
b/oracle-to-mariadb/docs/gotchas.md @@ -0,0 +1,8 @@ +# Key Gotchas + +- **Autocommit**: MariaDB has autocommit enabled by default; Oracle does not. Add explicit `COMMIT`/`ROLLBACK` handling or disable autocommit. +- **`SYSDATE`**: In MariaDB Oracle mode, `SYSDATE` works but returns `DATETIME`. Verify date-only comparisons. +- **`DUAL`**: Works in MariaDB but not identically — avoid schema-qualified references like `schema.DUAL`. +- **Sequences**: `CREATE SEQUENCE` works in MariaDB 10.3+ with Oracle-compatible syntax (`MINVALUE`, `MAXVALUE`, `INCREMENT BY`). +- **`DROP USER` with active sessions** (12.1+): Oracle-compatible behavior — in Oracle mode, `DROP USER` fails if the user has active sessions; in other modes, it issues a warning. +- **~20% rewrite**: No tool achieves 100% Oracle-to-MariaDB conversion. Plan for manual review of complex PL/SQL, object types, and unsupported syntax. diff --git a/oracle-to-mariadb/docs/manual-rewrite.md b/oracle-to-mariadb/docs/manual-rewrite.md new file mode 100644 index 0000000..0a6ec2c --- /dev/null +++ b/oracle-to-mariadb/docs/manual-rewrite.md @@ -0,0 +1,26 @@ +# What Requires Manual Rewriting + +These Oracle features have no direct equivalent and require code changes: + +- **`SYNONYM`** — replace with views (`CREATE VIEW`) or update object references directly +- **`INSERT ALL` / `INSERT FIRST`** — rewrite as multiple `INSERT` statements or application logic +- **`(+)` outer join syntax** — supported natively since MariaDB 12.1 in Oracle mode. On older versions rewrite as `LEFT JOIN` / `RIGHT JOIN`: + ```sql + -- Oracle: + SELECT * FROM a, b WHERE a.id = b.id(+); + -- MariaDB: + SELECT * FROM a LEFT JOIN b ON a.id = b.id; + ``` +- **`START WITH ... 
CONNECT BY`** — rewrite as recursive CTE: + ```sql + -- MariaDB equivalent: + WITH RECURSIVE tree AS ( + SELECT id, parent_id, name FROM categories WHERE parent_id IS NULL + UNION ALL + SELECT c.id, c.parent_id, c.name FROM categories c + JOIN tree t ON c.parent_id = t.id + ) + SELECT * FROM tree; + ``` +- **`TIMESTAMP WITH TIME ZONE`** — store timezone offset in a separate column or handle in application +- **Object types and inheritance** — no equivalent; restructure as normalized tables diff --git a/oracle-to-mariadb/docs/overview.md b/oracle-to-mariadb/docs/overview.md new file mode 100644 index 0000000..e612893 --- /dev/null +++ b/oracle-to-mariadb/docs/overview.md @@ -0,0 +1,40 @@ +# Oracle to MariaDB: Overview + +## The First Step: Enable Oracle Compatibility Mode + +```sql +SET sql_mode = 'ORACLE'; +-- or permanently in configuration: +-- sql_mode = ORACLE +``` + +Without this, PL/SQL syntax, Oracle data type synonyms, and Oracle-style functions will not work. This is the single most important step and the most commonly missed. + +## What LLMs Get Wrong + +| What you might see | What's correct | +|---|---| +| PL/SQL that fails with syntax errors | Set `sql_mode=ORACLE` first — without it, PL/SQL constructs are not recognized | +| Oracle `DATE` mapped to MariaDB `DATE` | Oracle `DATE` stores date AND time — map to `DATETIME`, not `DATE` | +| Assuming 100% PL/SQL compatibility | ~80% works without changes; `SYNONYM`, `INSERT ALL`, and `CONNECT BY` usually require rewrites — `(+)` joins work in Oracle mode on 12.1+ (see next row) | +| `SYNONYM` usage in schema or code | No equivalent in MariaDB — replace with views or direct object references | +| `(+)` outer join notation | Supported in Oracle mode since MariaDB 12.1 (MDEV-13817) — on older versions rewrite as `LEFT JOIN` / `RIGHT JOIN` | +| `START WITH ... 
CONNECT BY` | Not supported — rewrite as recursive CTE using `WITH RECURSIVE` | +| `TIMESTAMP WITH TIME ZONE` | Loses timezone on migration — becomes `DATETIME`; handle timezone in application | + +## NULL Handling + +Oracle treats empty strings as `NULL`. `sql_mode=ORACLE` does **not** activate this automatically — `EMPTY_STRING_IS_NULL` must be added separately: + +```sql +SET sql_mode = 'ORACLE,EMPTY_STRING_IS_NULL'; +``` + +Without `EMPTY_STRING_IS_NULL`, `''` is not `NULL` in MariaDB even in Oracle mode. + +## Migration Tools + +- **[MariaDB Migration Assessment Tool](https://mariadb.com/resources/blog/mariadb-migration-assessment-tool-how-ready-are-you-to-migrate-to-mariadb-from-oracle/)** — analyzes Oracle DDL export and scores compatibility before you start +- **Connect Storage Engine** — create MariaDB tables that read live Oracle data via ODBC; useful for phased migration without full cutover +- **DBeaver** — schema mapping, data comparison, and transfer between Oracle and MariaDB +- **MaxScale query rewriting** — intercept and rewrite unsupported Oracle syntax on-the-fly for cases that can't be changed in the application diff --git a/oracle-to-mariadb/docs/plsql-syntax.md b/oracle-to-mariadb/docs/plsql-syntax.md new file mode 100644 index 0000000..f75be48 --- /dev/null +++ b/oracle-to-mariadb/docs/plsql-syntax.md @@ -0,0 +1,24 @@ +# PL/SQL Syntax (10.3+) + +- Packages: `CREATE PACKAGE`, `CREATE PACKAGE BODY` — since 11.4 also work outside `sql_mode=ORACLE` (MDEV-10075), letting you use packages in mixed-mode codebases +- Cursors: explicit, implicit, parameterized, `%ISOPEN`, `%ROWCOUNT`, `%FOUND`, `%NOTFOUND` +- Cursors on prepared statements (12.3+) +- Cursor variables: `TYPE ... 
IS REF CURSOR` (13.0+) — pass cursors as procedure parameters and return values +- Pre-defined weak `SYS_REFCURSOR` (12.0+) — built-in cursor type, no `TYPE` declaration needed; `max_open_cursors` system variable caps concurrent open cursors +- Variable types: `:=` assignment, `%TYPE`, `%ROWTYPE` +- `ROW` data type as stored function return value (11.7+) — function returns a structured row, similar to Oracle row types +- `RECORD` types in routine parameters and function `RETURN` clauses (13.0+) +- Associative arrays: `DECLARE TYPE ... TABLE OF ... INDEX BY` (12.1+) +- Stored routine parameters can have default values (11.8+) — call procedures with fewer arguments, like Oracle's `DEFAULT` clause +- Control flow: `FOR i IN 1..10 LOOP`, `GOTO`, `EXIT WHEN`, `ELSIF`, `CONTINUE` +- Exception handling: `EXCEPTION WHEN TOO_MANY_ROWS / NO_DATA_FOUND / DUP_VAL_ON_INDEX` +- Anonymous blocks: `BEGIN ... END` +- Dynamic SQL: `EXECUTE IMMEDIATE ... USING` +- Trigger variables: `:NEW`, `:OLD` + +## Other Automatic Behaviors +- `||` as string concatenation (NULL-ignoring) +- `MINUS` as synonym for `EXCEPT` (10.6+) +- Named placeholders (`:1`, `:2`) in prepared statements +- `SELECT UNIQUE` as synonym for `SELECT DISTINCT` +- `DUAL` table support diff --git a/oracle-to-mariadb/docs/sources.md b/oracle-to-mariadb/docs/sources.md new file mode 100644 index 0000000..b40ae51 --- /dev/null +++ b/oracle-to-mariadb/docs/sources.md @@ -0,0 +1,11 @@ +# Sources + +- [sql_mode=ORACLE — MariaDB Docs](https://mariadb.com/docs/release-notes/community-server/about/compatibility-and-differences/sql_modeoracle) +- [Oracle to MariaDB Migration — mariadb.com](https://mariadb.com/migrations/oracle-to-mariadb/) +- [Easier Oracle to MariaDB Migrations with sql_mode and DBeaver — MariaDB Blog](https://mariadb.com/resources/blog/easier-oracle-to-mariadb-migrations-with-sql_mode-and-dbeaver/) +- [Migration Assessment Tool — MariaDB 
Blog](https://mariadb.com/resources/blog/mariadb-migration-assessment-tool-how-ready-are-you-to-migrate-to-mariadb-from-oracle/) +- [MariaDB vs PostgreSQL for Oracle Migration — mariadb.com](https://mariadb.com/products/enterprise/comparison/mariadb-vs-postgresql/) +- [Migration from Oracle to MariaDB Deep Dive — Severalnines](https://severalnines.com/blog/migration-oracle-database-mariadb-deep-dive/) +- [Data migration from Oracle to MariaDB with Connect SE — mariadb.org](https://mariadb.org/data-migration-from-oracle-to-mariadb-with-docker-and-connect-se-a-step-by-step-guide/) + +*For topics not covered here, see the official MariaDB documentation at [mariadb.com/docs](https://mariadb.com/docs).*