This morning around 6:00 am UTC Bitbucket began to fail with intermittent 500 errors, which continued for several hours.
Our investigation shows that the root cause was our syslog server crashing. The syslog queues on all of our other servers filled up and they became unresponsive.
We’re currently re-examining our syslog configuration, particularly looking at on-disk queuing, to ensure this single point of failure is avoided. We’re also adding new monitoring to detect if a similar situation were to reoccur.
We’re very sorry for the inconvenience this downtime caused. This and other service related updates can be found on our status site status.bitbucket.org.