If you are running the MariaDB 10.1 branch on any major distro (Ubuntu, CentOS/RHEL…) with WSREP=ON (indicating replication such as Galera), then your MariaDB systemd unit will most likely fail to start after upgrading to 10.1.28.
Steps to reproduce:
- Install mariadb-10.1.27 or earlier for any distro
- Enable WSREP (have a cluster running)
- Upgrade one node to 10.1.28
- mysqld will simply quit every time you start the systemd unit, with no fishy warnings or error messages whatsoever, just a silent insta-death.
If you now try to run mysqld in the foreground (do a "which mysqld" to find it), it will actually keep on running.
Note: use "killall mysqld" in a different shell to gracefully quit mysqld when it is running in the foreground, because mysqld will ignore you hammering Ctrl+C, duh.
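The foreground test can be wrapped in a small helper. This is only a sketch: `find_mysqld` is a hypothetical function name, and the fallback paths /usr/sbin/mysqld and /usr/libexec/mysqld are assumptions about where common packages put the binary (those directories are often missing from a regular user's PATH, so a plain "which mysqld" may come back empty).

```shell
#!/bin/sh
# Hypothetical helper: locate mysqld roughly the way "which mysqld" would,
# then fall back to directories that are often not on a user's PATH.
find_mysqld() {
    if command -v mysqld >/dev/null 2>&1; then
        command -v mysqld
        return 0
    fi
    for candidate in /usr/sbin/mysqld /usr/libexec/mysqld; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Shell 1: run the server in the foreground (the user name is an assumption):
#   "$(find_mysqld)" --user=mysql
# Shell 2: stop it again. killall sends SIGTERM by default, which mysqld
# treats as a normal shutdown request:
#   killall mysqld
```

If the server stays up this way but dies under systemd, the binary itself is fine and the problem sits somewhere in the startup wrapper.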
So my guess was: "This has to do with the unit file, or with mysqld_safe". Turns out I was right: after a bit of googling I found this patch for mysqld_safe, which is just broken in that release.
diff mysqld_safe.bug mysqld_safe.fixed
249c249
< local wr_logfile=$(mktemp wsrep_recovery.XXXXXX)
---
> local wr_logfile=$(mktemp -t wsrep_recovery.XXXXXX)
271c271
< eval_log_error "$mysqld_cmd --wsrep_recover $wr_options 2> $wr_logfile"
---
> eval_log_error "$mysqld_cmd --wsrep_recover $wr_options > $wr_logfile"
Source: click here
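To see whether a node still carries the unpatched script, one can grep for the mktemp call from the first hunk. A minimal sketch, with some assumptions: `is_unpatched` is a hypothetical helper name, and the path /usr/bin/mysqld_safe in the usage comment is a guess, so check "which mysqld_safe" on your system. For context: with GNU mktemp, a template given without -t and without a slash is created relative to the current working directory, while -t puts it under $TMPDIR or /tmp. That would plausibly explain why the script behaves differently depending on where it is started from.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given mysqld_safe script still contains
# the pre-patch mktemp call, i.e. "mktemp" followed directly by the template
# with no "-t" in between.
is_unpatched() {
    grep -Eq 'mktemp[[:space:]]+wsrep_recovery\.XXXXXX' "$1"
}

# Usage (the path is an assumption; check "which mysqld_safe"):
#   if is_unpatched /usr/bin/mysqld_safe; then
#       echo "mysqld_safe still has the broken mktemp call"
#   fi
```

This only looks at the first hunk of the patch; the redirect change in the second hunk would need its own check.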
I wonder how this can happen. Do they run their RPM package tests only without replication enabled!?
Of course I hit this in production, where else!? You don’t run replication in testing, because nobody is willing to pay for the extra server(s) 😀