If you are running the MariaDB 10.1 branch on any major distro (Ubuntu, CentOS/RHEL…) with WSREP=ON (meaning replication such as Galera is enabled), then your MariaDB systemd unit will most likely fail to start after upgrading to 10.1.28.
Steps to reproduce:
- Install mariadb-10.1.27 or earlier on any distro
- Enable WSREP (have a cluster running)
- Upgrade one node to 10.1.28
- mysql will simply quit every time you start the systemd unit, with no fishy warnings or error messages whatsoever, just a silent insta-death.
If you now try to run mysqld in the foreground (do a "which mysqld" to find the binary), it will actually keep on running.
Note: use "killall mysqld" in a different shell to gracefully quit mysqld when running it in the foreground, because mysqld will ignore you hammering Ctrl+C, duh.
So my guess was: "This has to do with the unit file, or with mysqld_safe". Turns out I was right: after a bit of googling I found this patch for mysqld_safe, which is simply broken in the 10.1.28 release.
diff mysqld_safe.bug mysqld_safe.fixed
< local wr_logfile=$(mktemp wsrep_recovery.XXXXXX)
> local wr_logfile=$(mktemp -t wsrep_recovery.XXXXXX)
< eval_log_error "$mysqld_cmd --wsrep_recover $wr_options 2> $wr_logfile"
> eval_log_error "$mysqld_cmd --wsrep_recover $wr_options > $wr_logfile"
Source: click here
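To see what the two changed lines in the diff actually do differently, here is a small shell sketch. It assumes GNU coreutils mktemp (what the distros above ship), and fake_cmd is a made-up stand-in for the mysqld call, not anything from mysqld_safe. My guess, and it is only a guess since the patch does not say so, is that the missing -t hurts because the file then lands in whatever the current directory is, which the mysql user may not be able to write to.

```shell
#!/bin/sh
# Sketch only: shows the behaviour of the two changed lines in isolation.
# Assumes GNU coreutils mktemp.

scratch=$(mktemp -d)          # throwaway working directory
cd "$scratch" || exit 1

# Without -t the template is resolved relative to the current directory.
in_cwd=$(mktemp wsrep_recovery.XXXXXX)
# With -t the file is created under $TMPDIR, or /tmp if TMPDIR is unset.
in_tmp=$(mktemp -t wsrep_recovery.XXXXXX)

echo "without -t: $in_cwd (inside $scratch)"
echo "with -t:    $in_tmp"

# Hypothetical stand-in for the mysqld call: one line on each stream.
fake_cmd() { echo "to stdout"; echo "to stderr" >&2; }

fake_cmd 2> "$in_cwd" > /dev/null   # "2>" puts only stderr into the file
fake_cmd  > "$in_tmp" 2> /dev/null  # ">"  puts only stdout into the file

from_2gt=$(cat "$in_cwd")
from_gt=$(cat "$in_tmp")
echo "captured by 2>: $from_2gt"
echo "captured by >:  $from_gt"

# Clean up the temporary files and the scratch directory.
rm -f "$in_cwd" "$in_tmp"
cd / && rmdir "$scratch"
```

The first mktemp call prints a bare file name while the second prints a full path under the temp directory, and the two redirects end up capturing different streams.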
I wonder how this can happen. Do they only run their RPM package tests without replication enabled!?
Of course I hit this in production, where else!? You don't run replication in testing, because nobody is willing to pay for the extra server(s) 😀