Database Issue
Incident Report for Fundraise Up Status
Resolved
Donations made in the hour and a half before the incident are now showing up on the Dashboard. There are no longer any delays in display, search, or CRM synchronization. We are closing this incident.

However, our work does not stop here. We still have to conduct a detailed analysis of what happened and develop a comprehensive set of measures to prevent such incidents in the future.

Thank you for following our updates. If you have any questions, our support team is always here to help at support@fundraiseup.com.
Posted Apr 25, 2024 - 16:53 EDT
Update
While our engineers work on data recovery, we would like to briefly explain what happened.

We store our clients' data across multiple database clusters. Each cluster consists of dozens of physical servers networked together, located in various data centers and countries. As part of standard maintenance, we routinely replace servers and update systems and configurations. We never perform updates on all clusters simultaneously. Likewise, before any significant change, we test everything we plan to do on test stands equivalent to the production environment.

Last night, during the routine reconfiguration of two clusters, we made a critical error in the configuration file. For several reasons, this error went undetected on the test stand. As a result, the two clusters collapsed and data began to be deleted rapidly. Within 5 minutes, our incident response team was on a Zoom call, discussing the emergency measures we needed to take.

We have several levels of backups set up. We make full backups of all databases daily, as well as incremental backups every hour. The data volumes are measured in terabytes, so the bulk of the time was spent simply transferring data across the network and rebuilding the clusters.
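
To illustrate how a daily full backup and hourly incremental backups combine during a restore, here is a minimal sketch. It is not our actual tooling: the `Backup` type, the `restore_chain` function, and the timestamps are hypothetical names and values chosen for the example. It assumes the usual scheme in which a restore starts from the most recent full backup and then applies every later incremental backup in order.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    kind: str            # "full" or "incremental" (hypothetical labels)
    taken_at: datetime   # when the backup was taken


def restore_chain(backups, target):
    """Return the backups to apply, oldest first, to restore state as of `target`."""
    # Only backups taken at or before the target time can be used.
    usable = sorted(
        (b for b in backups if b.taken_at <= target),
        key=lambda b: b.taken_at,
    )
    fulls = [b for b in usable if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup at or before the target time")
    base = fulls[-1]  # most recent full backup
    # Apply every incremental taken after that full backup, in order.
    increments = [
        b for b in usable
        if b.kind == "incremental" and b.taken_at > base.taken_at
    ]
    return [base] + increments
```

In this sketch, changes written after the last backup in the chain are not covered by the chain itself and would have to be recovered by other means.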

During the recovery process, we had to disable the ability to log into the Dashboard and Donor Portal for all organizations so that users would not make changes that we could not later reconcile with the data restored from backups. Also, many organizations whose data was stored in the damaged clusters could not accept donations during our recovery efforts.

The system is now fully operational, but it will take some more time to restore data that was changed in the hour and a half before the incident began. We expect a full data recovery, with no data loss.

Incidents of this nature are extremely rare for us, and we've never faced such a significant problem in our history. Nonetheless, we thoroughly investigate every incident, identify the reason it occurred, and develop a comprehensive set of measures to prevent similar incidents in the future.
Posted Apr 25, 2024 - 10:09 EDT
Monitoring
The root cause of the incident has been addressed, and all major systems are back online. All organizations now have access to the Dashboard, and both Checkout and Checkout Pages are fully operational. Donors are able to access the Donor Portal.

However, we still have some recovery work to do. Some organizations may temporarily not see donations made in the hour and a half before the incident in the Dashboard. Additionally, some organizations may experience delays in donations appearing in search, insights, and CRM synchronization.

We are diligently working to resolve these issues. Also, once things have settled down, we will provide a detailed account of what happened and the measures we will take to prevent such incidents in the future.
Posted Apr 25, 2024 - 08:01 EDT
Update
We're still working on resolving the issue.
Posted Apr 25, 2024 - 06:16 EDT
Update
We're still working hard to resolve the issue. Unfortunately, it's going to take us a bit longer to fix.
Posted Apr 25, 2024 - 05:32 EDT
Update
Our incident response team is in sync and working on resolving the issue. Currently, we estimate that the incident will be resolved in about 30 minutes.
Posted Apr 25, 2024 - 04:46 EDT
Identified
We have identified that two of our database clusters were damaged due to a configuration error. As a result, the system is unavailable for some organizations, and they are unable to accept donations. We are working on resolving the issue.
Posted Apr 25, 2024 - 04:15 EDT
Investigating
Our monitoring system has reported a database malfunction that has made the Dashboard and Donor Portal unavailable. We are currently investigating the incident and will keep you updated.
Posted Apr 25, 2024 - 03:54 EDT
This incident affected: Dashboard, Donor Portal, Elements, Checkout, and Rest API.