MySQL 生产优化与高可用 — SRE 话题文档 #4
本文档涵盖 MySQL 在生产环境中的部署架构、参数调优、监控告警、故障排查及最佳实践,面向 SRE 工程师和 DBA。
目录
1. 生产环境部署架构
1.1 主从复制架构(Master-Slave)
┌─────────────────────────────────────────────────────────┐
│ 应用层 (Application) │
│ ┌──────────────────────────────────┐ │
│ │ ProxySQL / HAProxy │ │
│ │ (读写分离 + 负载均衡) │ │
│ └──────────────┬───────────────────┘ │
└────────────────────────┼────────────────────────────────┘
│
┌────────────────────────┼────────────────────────────────┐
│ │ │
▼ ▼ │
┌──────────────────┐ ┌──────────────────┐ │
│ Master (RW) │ ───▶ │ Slave (RO) │ │
│ Primary Node │ GTID │ Replica Node │ │
│ │ │ │ │
│ - 写入流量 │ │ - 读取流量 │ │
│ - 全量数据 │ │ - 数据同步 │ │
└────────┬─────────┘ └──────────────────┘ │
│ │
│ 异步/半同步复制 │
│ (Async/Semi-Sync Replication) │
▼ │
┌──────────────────┐ │
│ Slave (RO) │ ← 可扩展多个只读副本 │
│ Replica Node │ │
└──────────────────┘ │
│
┌─────────────────────────────────────────────────────────────────────────┘
│
│ 数据流向: App → ProxySQL → Master(写) / Slave(读)
│ 高可用: Master 故障时,通过 Orchestrator/MHA 提升 Slave 为新 Master
└──────────────────────────────────────────────────────────────────────────
1.2 MGR 高可用集群架构(MySQL Group Replication)
┌─────────────────────────────────────────────────────────┐
│ 应用层 (Application) │
│ ┌──────────────────────────────────┐ │
│ │ MySQL Router / ProxySQL │ │
│ │ (自动路由 + 故障转移) │ │
│ └──────────────┬───────────────────┘ │
└────────────────────────┼────────────────────────────────┘
│
┌────────────────────────┼────────────────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ Node 1 (RW) │◀───▶ │ Node 2 (RO) │◀───▶ │ Node 3 (RO) │
│ PRIMARY │ │ SECONDARY │ │ SECONDARY │
│ │ │ │ │ │
│ - 单点写入 │ │ - 只读副本 │ │ - 只读副本 │
│ - 自动选主 │ │ - 自动晋升 │ │ - 自动晋升 │
└──────────────────┘ └──────────────────┘ └──────────────────┘
│ │ │
└────────────────────────┼─────────────────────────┘
│
┌──────────────┴──────────────┐
│ Group Replication Plugin │
│ (Paxos 协议一致性) │
│ - 多数派确认写入 │
│ - 自动故障检测与切换 │
└─────────────────────────────┘
特性:
- 单主模式: 仅 Primary 可写,自动选主
- 多主模式: 所有节点可写(需处理冲突)
- 强一致性: 基于 Paxos 的分布式共识
- 自动容错: 节点故障自动剔除,恢复后自动加入
1.3 Kubernetes 部署配置
1.3.1 StatefulSet 部署 MySQL 主从
# mysql-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
namespace: database
labels:
app: mysql
spec:
serviceName: mysql-headless
replicas: 3
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
initContainers:
- name: init-mysql
image: mysql:8.0.35
command:
- bash
- "-c"
- |
set -ex
# 从 Pod 序号生成 server-id
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
echo [mysqld] > /mnt/conf.d/server-id.cnf
# server-id 不能为 0
echo server-id=$((100 + ordinal)) >> /mnt/conf.d/server-id.cnf
# 序号为 0 的是主节点
if [[ $ordinal -eq 0 ]]; then
cp /mnt/config-map/master.cnf /mnt/conf.d/
else
cp /mnt/config-map/slave.cnf /mnt/conf.d/
fi
volumeMounts:
- name: conf
mountPath: /mnt/conf.d
- name: config-map
mountPath: /mnt/config-map
- name: clone-mysql
image: gcr.io/google-samples/xtrabackup:1.0
command:
- bash
- "-c"
- |
set -ex
[[ -d /var/lib/mysql/mysql ]] && exit 0
[[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
[[ $ordinal -eq 0 ]] && exit 0
# 从前一个 Pod 克隆数据
ncat --recv-only mysql-$(($ordinal-1)).mysql-headless 3307 | xbstream -x -C /var/lib/mysql
xtrabackup --prepare --target-dir=/var/lib/mysql
volumeMounts:
- name: data
mountPath: /var/lib/mysql
subPath: mysql
- name: conf
mountPath: /etc/mysql/conf.d
containers:
- name: mysql
image: mysql:8.0.35
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: root-password
- name: MYSQL_DATABASE
value: "production"
ports:
- name: mysql
containerPort: 3306
volumeMounts:
- name: data
mountPath: /var/lib/mysql
subPath: mysql
- name: conf
mountPath: /etc/mysql/conf.d
resources:
requests:
cpu: "500m"
memory: "1Gi"
limits:
cpu: "2000m"
memory: "4Gi"
livenessProbe:
exec:
command: ["mysqladmin", "ping", "-p${MYSQL_ROOT_PASSWORD}"]
initialDelaySeconds: 30
periodSeconds: 10
timeoutSeconds: 5
readinessProbe:
exec:
command: ["mysql", "-h", "127.0.0.1", "-p${MYSQL_ROOT_PASSWORD}", "-e", "SELECT 1"]
initialDelaySeconds: 5
periodSeconds: 2
timeoutSeconds: 1
- name: xtrabackup
image: gcr.io/google-samples/xtrabackup:1.0
ports:
- name: xtrabackup
containerPort: 3307
command:
- bash
- "-c"
- |
set -ex
cd /var/lib/mysql
if [[ -f xtrabackup_slave_info && "x$(<xtrabackup_slave_info)" != "x" ]]; then
cat xtrabackup_slave_info | sed 's/$/\,/g' > change_master_to.sql.in
fi
exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
"exec xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root --password=${MYSQL_ROOT_PASSWORD}"
volumeMounts:
- name: data
mountPath: /var/lib/mysql
subPath: mysql
- name: conf
mountPath: /etc/mysql/conf.d
volumes:
- name: conf
emptyDir: {}
- name: config-map
configMap:
name: mysql-config
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: "ssd-storage"
resources:
requests:
storage: 100Gi
1.3.2 ConfigMap 配置
# mysql-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: mysql-config
namespace: database
data:
master.cnf: |
[mysqld]
log-bin=mysql-bin
binlog-format=ROW
gtid-mode=ON
enforce-gtid-consistency=ON
server-id=100
binlog-cache-size=1M
max-binlog-size=512M
expire-logs-days=7
sync-binlog=1
innodb-flush-log-at-trx-commit=1
slave.cnf: |
[mysqld]
log-bin=mysql-bin
binlog-format=ROW
gtid-mode=ON
enforce-gtid-consistency=ON
server-id=101
relay-log=relay-bin
relay-log-index=relay-bin.index
read-only=ON
binlog-cache-size=1M
max-binlog-size=512M
expire-logs-days=7
1.3.3 Service 配置
# mysql-service.yaml
apiVersion: v1
kind: Service
metadata:
name: mysql-headless
namespace: database
spec:
clusterIP: None
selector:
app: mysql
ports:
- name: mysql
port: 3306
targetPort: 3306
---
apiVersion: v1
kind: Service
metadata:
name: mysql-read
namespace: database
spec:
selector:
app: mysql
ports:
- name: mysql
port: 3306
targetPort: 3306
---
apiVersion: v1
kind: Service
metadata:
name: mysql-write
namespace: database
spec:
selector:
app: mysql
statefulset.kubernetes.io/pod-name: mysql-0
ports:
- name: mysql
port: 3306
targetPort: 3306
1.4 Docker Compose 部署(开发/测试环境)
# docker-compose.yml
version: '3.8'
services:
mysql-master:
image: mysql:8.0.35
container_name: mysql-master
hostname: mysql-master
environment:
MYSQL_ROOT_PASSWORD: ${MYSQL_ROOT_PASSWORD:-root123}
MYSQL_DATABASE: production
MYSQL_USER: app_user
MYSQL_PASSWORD: ${MYSQL_APP_PASSWORD:-app123}
ports:
- "3306:3306"
volumes:
- mysql-master-data:/var/lib/mysql
- ./conf/master.cnf:/etc/mysql/conf.d/custom.cnf:ro
- ./init:/docker-entrypoint-initdb.d:ro
command:
- --server-id=1
- --log-bin=mysql-bin
- --binlog-format=ROW
- --gtid-mode=ON
- --enforce-gtid-consistency=ON
- --character-set-server=utf8mb4
- --collation-server=utf8mb4_unicode_ci
networks:
- mysql-network
healthcheck:
test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-p${MYSQL_ROOT_PASSWORD:-root123}"]
interval: 10s
timeout: 5s
retries: 5
mysql-slave:
image: mysql:8.0.35
container_name: mysql-slave
hostname: mysql-slave
environment:
MYSQL_ROOT_PASSWORD: ${MYSQL_ROOT_PASSWORD:-root123}
ports:
- "3307:3306"
volumes:
- mysql-slave-data:/var/lib/mysql
- ./conf/slave.cnf:/etc/mysql/conf.d/custom.cnf:ro
command:
- --server-id=2
- --log-bin=mysql-bin
- --binlog-format=ROW
- --gtid-mode=ON
- --enforce-gtid-consistency=ON
- --read-only=ON
- --relay-log=relay-bin
- --character-set-server=utf8mb4
- --collation-server=utf8mb4_unicode_ci
networks:
- mysql-network
depends_on:
mysql-master:
condition: service_healthy
proxysql:
image: proxysql/proxysql:2.5.5
container_name: proxysql
ports:
- "6033:6033" # MySQL 代理端口
- "6032:6032" # 管理端口
- "6080:6080" # Web 管理界面
volumes:
- ./proxysql/proxysql.cnf:/etc/proxysql.cnf:ro
networks:
- mysql-network
depends_on:
- mysql-master
- mysql-slave
mysql-exporter:
image: prom/mysqld-exporter:v0.15.0
container_name: mysql-exporter
ports:
- "9104:9104"
environment:
DATA_SOURCE_NAME: "root:${MYSQL_ROOT_PASSWORD:-root123}@(mysql-master:3306)/"
networks:
- mysql-network
depends_on:
- mysql-master
networks:
mysql-network:
driver: bridge
volumes:
mysql-master-data:
mysql-slave-data:
2. 关键参数调优
2.1 核心配置文件 (my.cnf)
# /etc/my.cnf - MySQL 8.0 生产环境配置
# 适用于 64GB 内存服务器,请根据实际硬件调整
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
default-character-set = utf8mb4
[mysqld]
# ============================================
# 基础配置
# ============================================
user = mysql
port = 3306
socket = /var/run/mysqld/mysqld.sock
pid-file = /var/run/mysqld/mysqld.pid
basedir = /usr
datadir = /data/mysql/data
tmpdir = /data/mysql/tmp
# 字符集配置
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init_connect = 'SET NAMES utf8mb4'
# 默认存储引擎
default-storage-engine = InnoDB
# ============================================
# 连接与线程配置
# ============================================
# 最大连接数 (根据业务峰值设置,监控 Max_used_connections)
max_connections = 2000
# 为每个连接预留的缓冲区
# 每个连接占用内存 = thread_stack + join_buffer + sort_buffer + read_buffer + read_rnd_buffer
thread_stack = 256K
# 并发线程数限制(防止过多线程同时执行)
thread_concurrency = 32
thread_handling = one-thread-per-connection
# 线程缓存,减少线程创建销毁开销
# 建议设置为 max_connections 的 10-20%
thread_cache_size = 200
# 连接超时设置
connect_timeout = 10
wait_timeout = 28800
interactive_timeout = 28800
# ============================================
# InnoDB 存储引擎配置(核心!)
# ============================================
# 缓冲池大小 - 最重要的参数
# 建议设置为物理内存的 50-80%
innodb_buffer_pool_size = 48G
# 缓冲池实例数(提高并发性能)
# 每个 instance 建议 1GB,最大 64
innodb_buffer_pool_instances = 8
# 日志文件大小(影响写入性能和恢复时间)
# 建议设置为 buffer_pool_size 的 25%
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
# 日志缓冲区大小
innodb_log_buffer_size = 64M
# 刷盘策略(1=最安全,0/2=性能更好但有风险)
# 生产环境强烈建议设为 1
innodb_flush_log_at_trx_commit = 1
# Redo log 刷盘策略
innodb_log_writer_threads = ON
# 数据文件配置
innodb_data_file_path = ibdata1:1G:autoextend
innodb_autoextend_increment = 64M
# 文件格式
innodb_file_per_table = ON
# IO 配置
# 后台 IO 线程数(读/写)
innodb_read_io_threads = 8
innodb_write_io_threads = 8
# 异步 IO(Linux 推荐开启)
innodb_use_native_aio = ON
# 刷新邻接页(SSD 设为 OFF,HDD 设为 ON)
innodb_flush_neighbors = OFF
# 脏页刷盘策略
innodb_max_dirty_pages_pct = 75
innodb_max_dirty_pages_pct_lwm = 10
# Change Buffer(减少二级索引更新 IO)
innodb_change_buffer_max_size = 25
# Doublewrite Buffer(数据安全,SSD 可考虑关闭)
innodb_doublewrite = ON
# 锁配置
innodb_lock_wait_timeout = 50
innodb_deadlock_detect = ON
# 自适应哈希索引
innodb_adaptive_hash_index = ON
# ============================================
# Binary Log 配置(复制与恢复)
# ============================================
log_bin = mysql-bin
binlog_format = ROW
binlog_cache_size = 1M
max_binlog_size = 512M
# Binlog 过期时间(天)
binlog_expire_logs_seconds = 604800
# 或使用旧语法(已废弃): expire_logs_days = 7
# GTID 配置(推荐开启)
gtid_mode = ON
enforce_gtid_consistency = ON
# 主从同步安全
sync_binlog = 1
binlog_checksum = CRC32
log_bin_trust_function_creators = ON
# Row 格式下是否记录全镜像
binlog_row_image = FULL
# ============================================
# 慢查询日志配置
# ============================================
slow_query_log = ON
slow_query_log_file = /data/mysql/logs/slow.log
long_query_time = 1.0
log_queries_not_using_indexes = ON
log_slow_admin_statements = ON
log_slow_slave_statements = ON
min_examined_row_limit = 100
# ============================================
# 错误日志
# ============================================
log_error = /data/mysql/logs/error.log
log_error_verbosity = 3
# ============================================
# 临时表配置
# ============================================
tmp_table_size = 64M
max_heap_table_size = 64M
# ============================================
# 查询缓存(MySQL 8.0 已移除)
# ============================================
# query_cache_type = 0
# query_cache_size = 0
# ============================================
# 表缓存
# ============================================
table_open_cache = 4000
table_definition_cache = 2000
table_open_cache_instances = 16
# ============================================
# 排序与连接缓冲区
# ============================================
# 注意:这些是每连接缓冲区,设置过大可能导致内存溢出
sort_buffer_size = 2M
join_buffer_size = 2M
read_buffer_size = 1M
read_rnd_buffer_size = 1M
# ============================================
# 安全配置
# ============================================
# 禁止 DNS 解析(提升连接速度)
skip_name_resolve = ON
# 密码验证策略(可选)
# validate_password.length = 12
# validate_password.policy = MEDIUM
# ============================================
# 其他优化
# ============================================
# 打开文件限制
open_files_limit = 65535
# 二进制日志同步
binlog_sync_period = 0
# 性能 Schema
performance_schema = ON
performance_schema_instrument = '%=ON'
[mysqldump]
quick
max_allowed_packet = 64M
single-transaction
routines
triggers
[mysql]
no-auto-rehash
default-character-set = utf8mb4
prompt = '\\u@\\h [\\d]> '
[mysql.server]
user = mysql
basedir = /usr
2.2 关键参数详解表
| 参数名 | 推荐值 | 说明 | 调优依据 |
|---|---|---|---|
innodb_buffer_pool_size |
物理内存 50-80% | InnoDB 缓冲池,缓存数据和索引 | 越大越好,但不能导致系统 swap |
innodb_log_file_size |
buffer_pool 25% | Redo log 文件大小 | 影响写入性能和恢复时间 |
innodb_flush_log_at_trx_commit |
1 | 日志刷盘策略 | 1=ACID 安全,2=性能折中 |
sync_binlog |
1 | Binlog 同步策略 | 主库必须为 1,从库可设为 0 |
max_connections |
2000-5000 | 最大连接数 | 根据业务峰值设置 |
innodb_io_capacity |
SSD: 2000-5000 | IO 能力估算 | 影响 dirty page flush 速度 |
innodb_io_capacity_max |
SSD: 4000-10000 | IO 峰值能力 | 建议 2x innodb_io_capacity |
innodb_buffer_pool_instances |
buffer_pool/1G | 缓冲池实例数 | 减少锁竞争 |
2.3 动态参数在线调整
-- 查看当前参数值
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'max_connections';
-- 动态调整参数(不需要重启)
SET GLOBAL max_connections = 3000;
SET GLOBAL slow_query_log = ON;
SET GLOBAL long_query_time = 2.0;
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
-- 在线调整 InnoDB Buffer Pool(MySQL 5.7.5+)
SET GLOBAL innodb_buffer_pool_size = 53687091200; -- 50GB
-- 注意:调整过程会阻塞,可通过 innodb_buffer_pool_chunk_size 控制步进
-- 查看参数状态
SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';
3. 系统内核/OS 层优化
3.1 sysctl 内核参数优化
# /etc/sysctl.d/99-mysql.conf
# MySQL 生产环境内核优化
# ============================================
# 内存管理
# ============================================
# 减少系统使用 swap 的倾向(推荐 1-10)
vm.swappiness = 1
# 脏页比例控制(用于数据库服务器)
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
vm.dirty_expire_centisecs = 1500
vm.dirty_writeback_centisecs = 300
# 内存过量分配策略
vm.overcommit_memory = 1
vm.overcommit_ratio = 80
# ============================================
# 文件系统
# ============================================
# 增加系统最大文件描述符
fs.file-max = 2000000
# 文件句柄缓存
fs.nr_open = 2000000
# ============================================
# 网络参数
# ============================================
# 增加系统端口范围
net.ipv4.ip_local_port_range = 1024 65535
# TCP 连接复用
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
# TCP 连接队列
net.ipv4.tcp_max_syn_backlog = 65535
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# TCP keepalive
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
# TCP 缓冲区
net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# TIME_WAIT 套接字数量限制
net.ipv4.tcp_max_tw_buckets = 262144
# ============================================
# 共享内存(MySQL 使用较少,但可能需要)
# ============================================
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
# 应用配置
sysctl -p /etc/sysctl.d/99-mysql.conf
3.2 用户限制配置
# /etc/security/limits.d/mysql.conf
# MySQL 用户资源限制
mysql soft nofile 65535
mysql hard nofile 65535
mysql soft nproc 65535
mysql hard nproc 65535
mysql soft stack 10240
mysql hard stack 32768
mysql soft memlock unlimited
mysql hard memlock unlimited
# 需确保 PAM 配置生效
# /etc/pam.d/common-session (Debian/Ubuntu)
session required pam_limits.so
# /etc/pam.d/login (RHEL/CentOS)
session required pam_limits.so
3.3 磁盘 I/O 调度器
# 查看当前调度器
cat /sys/block/sda/queue/scheduler
# 输出示例: [noop] deadline cfq
# SSD 推荐使用 noop 或 none
echo noop > /sys/block/sda/queue/scheduler
echo none > /sys/block/nvme0n1/queue/scheduler
# HDD 推荐使用 deadline
echo deadline > /sys/block/sda/queue/scheduler
# 永久配置(通过 udev 规则)
cat > /etc/udev/rules.d/60-scheduler.rules << 'EOF'
# SSD 使用 noop
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"
ACTION=="add|change", KERNEL=="nvme[0-9]*", ATTR{queue/scheduler}="none"
# HDD 使用 deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="deadline"
EOF
# 应用规则
udevadm control --reload-rules
udevadm trigger
3.4 文件系统选择与挂载选项
# 推荐:XFS 或 ext4
# XFS 挂载选项(推荐用于 MySQL)
# noatime: 不更新访问时间
# nodiratime: 不更新目录访问时间
# nobarrier: 禁用写入屏障(有电池备份的 RAID 卡可用)
/dev/sdb1 /data/mysql xfs noatime,nodiratime,nobarrier 0 0
# ext4 挂载选项
/dev/sdb1 /data/mysql ext4 noatime,nodiratime,data=writeback 0 0
# 创建文件系统
mkfs.xfs -f -L mysql-data /dev/sdb1
mkfs.ext4 -L mysql-data /dev/sdb1
# 禁用文件系统日志(不推荐,仅极端性能场景)
# tune2fs -O ^has_journal /dev/sdb1
3.5 Transparent Huge Pages (THP)
# THP 可能导致 MySQL 性能问题,建议禁用
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# 永久禁用
cat > /etc/systemd/system/disable-thp.service << 'EOF'
[Unit]
Description=Disable Transparent Huge Pages (THP)
DefaultDependencies=no
After=sysinit.target local-fs.target
Before=basic.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'
[Install]
WantedBy=basic.target
EOF
systemctl daemon-reload
systemctl enable disable-thp
systemctl start disable-thp
3.6 NUMA 配置
# 查看 NUMA 拓扑
numactl --hardware
numastat -m
# MySQL 在 NUMA 架构下的建议:
# 方法1:使用 numactl 启动 MySQL
numactl --interleave=all mysqld_safe &
# 方法2:在 systemd 服务中配置
cat > /etc/systemd/system/mysqld.service.d/override.conf << 'EOF'
[Service]
ExecStartPre=/usr/bin/numactl --interleave=all
EOF