GitLab, thanks for using PostgreSQL 9.6 and its replication and backup facilities.
We’re sorry that you lost your database: https://about.gitlab.com/2017/02/01/gitlab-dot-com-database-incident/
Thank you for posting this publicly, which allows us to comment on it for your postmortem analysis.
I’m very happy that you monitor replication lag; that is good. Replication lag of 4GB is at times normal, so it shouldn’t have caused major concern. I recently fixed a bug in replication that caused replication to hang in some cases for up to a minute; we released a public fix for that, and it will be included in the next maintenance release of PostgreSQL 9.6. It’s not certain that the bug was hit or, if it was, whether that was enough to cause the slowdown noted. The openness of your response means we should be equally open, so I’m mentioning the issue here for that reason.
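To make the lag figure concrete, here is a minimal sketch of how replication lag in bytes can be derived from two WAL positions (LSNs), which PostgreSQL prints as two hexadecimal numbers separated by a slash. The function names and the 4 GiB warning threshold are assumptions chosen for illustration; they are not taken from GitLab's monitoring or from PostgreSQL itself.

```python
# Sketch: compute replication lag in bytes from two PostgreSQL LSN strings.
# An LSN such as "16/B374D848" is the high and low 32 bits of a 64-bit
# WAL byte position, each written in hexadecimal.
#
# On PostgreSQL 9.6 the server can do the same arithmetic directly, e.g.:
#   SELECT pg_xlog_location_diff(pg_current_xlog_location(), replay_location)
#   FROM pg_stat_replication;

# Hypothetical alert threshold (4 GiB); pick a value that fits your workload.
LAG_WARN_BYTES = 4 * 1024**3


def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN string like '1/A0000000' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Bytes of WAL the standby still has to receive or replay."""
    return lsn_to_bytes(primary_lsn) - lsn_to_bytes(standby_lsn)


def lag_is_concerning(primary_lsn: str, standby_lsn: str,
                      threshold: int = LAG_WARN_BYTES) -> bool:
    """True only when the lag strictly exceeds the chosen threshold."""
    return replication_lag_bytes(primary_lsn, standby_lsn) > threshold
```

A single threshold check like this is only a starting point: since a lag of this size can be normal at times, alerting on lag that keeps growing over several samples is usually more informative than alerting on one large reading.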
Restarting replication was probably unnecessary, but if you shut down the
Original URL: http://feedproxy.google.com/~r/feedsapi/BwPx/~3/ty8AsKhb4HI/