Backing up Validate data
This article describes backing up and restoring Validate data.
All data generated during project builds, including build information in the MariaDB database and defect data in Lucene, is stored in the projects_root directory. The data changes frequently as users load builds, cite issues, and change settings or configuration. Backing up the projects_root directory ensures that all relevant data can be restored, if needed. For information about the default location and other considerations, see Projects_root directory.
The Validate installation directory contains base checker configuration files, custom checkers, and database configuration files that may need to be backed up periodically when they are modified, added, or removed. If you modify configuration files in your Validate installation directory, back them up after making changes. For examples of these files, see Before you begin.
If you have created and deployed custom checkers, back up the <server_install>/plugins directory.
As part of an integration build analysis, Validate copies any necessary data from the tables directory to the projects_root directory, so you do not need to back up the tables directory.
Before you begin
Configuration settings and data in your Validate installation directory that are shared across all servers running from that installation are not backed up automatically. If you plan to change machines or migrate your Validate installation, manually back up your Validate installation folder and include this data. Examples of this data include the following:
- Modified compiler configurations, settings, and custom checkers in the config and plugins folders.
- The kwfilter.conf and kwmysql.ini files in the config folder.
- Custom taxonomies in the taxonomies folder.
- The clients.json file in the clients folder.
Tips for protecting your data
We recommend taking the following precautions when backing up a project or Validate server.
- To protect a backup from a potential server or system failure, store the backup in a secure location that is separate from the server it was created on.
- To minimize security risks, unintended changes, and unauthorized access, use a service account instead of a system user account when creating a backup.
- To protect a backup from tampering, create a hash of the backup before storing it and keep the hash in a secure location. When you're ready to restore the backup, generate the hash again and make sure it matches the original to confirm that the backup has not been altered.
- To protect a backup from unauthorized access, encrypt the backup folder using a method of your choice. For example, place the backup folder in a password-protected location or apply password protection directly to the folder.
- To ensure all data is captured, back up any customized configuration files and custom checkers (non-volatile data on which it is safe to perform a hot backup).
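The hash check described above can be sketched in Python as follows. This is a minimal illustration, not part of Validate: it assumes the backup has been packed into a single archive file, and the function names sha256_of_file and backup_is_unaltered are hypothetical. SHA-256 is used here as one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_unaltered(archive: Path, recorded_hash: str) -> bool:
    """Recompute the digest and compare it with the one recorded at backup time."""
    return sha256_of_file(archive) == recorded_hash.strip().lower()
```

Record the value returned by sha256_of_file when the backup is created, store it separately from the backup, and call backup_is_unaltered before restoring.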
What should I do if my data is corrupted?
Corruptions can happen at any time and without warning. To preserve analysis history that can be used for further investigation, make sure to preserve any corrupted databases. Without this data, you may have to recreate entire projects.
To protect against future corruptions, back up your data at regular intervals as described in the section How often should I back up my data?
How often should I back up my data?
We recommend backing up your data at regular intervals. The frequency you choose may range from days to months depending on several factors, which include the following:
- Organizational standards. For example, your organization may have an existing policy that determines backup frequency or maintenance schedules.
- Risk tolerance. For example, a large organization that uses analysis results for compliance certification may want to back up more frequently.
Backup methods
We recommend reading the descriptions and procedures in the sections below to decide which method is right for you.
Cold backup
We recommend using a cold backup to back up an entire server without having to back up each project individually.
A cold backup shuts down the Validate server for a short period to ensure the following:
- In-flight data transactions are completed before backing up data
- Restoration from the backup is consistent
Hot backup
A hot backup is performed while the Validate server is running and users are logged into the system. While this minimizes downtime, it can lead to data loss or corruption as users continue to work during the process. The only hot backup method that is officially supported and will not lead to data loss or corruption is Method 1: Use supported scripts.
Hybrid backup
A hybrid backup is a hot backup followed by a cold backup. This minimizes downtime while ensuring data is consistent.
Performing a cold backup
- Back up any customized configuration files from <server_install>/config separately.
- Stop the Validate server. See Stopping the servers running as regular processes.
- Use normal operating system (or backup utility) commands to copy the entire projects_root directory to your desired backup media.
- Start the Validate server. See Starting the servers as regular processes.
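The copy step can be done with any operating system command or backup utility. As one possible sketch, the Python function below copies the whole projects_root tree into a new timestamped folder. The function name copy_projects_root, the folder naming scheme, and the paths are assumptions for illustration. The function does not stop or start the Validate server, so it must only be run between those two steps.

```python
import shutil
from datetime import datetime
from pathlib import Path


def copy_projects_root(projects_root: Path, backup_parent: Path) -> Path:
    """Copy the whole projects_root tree into a new timestamped folder.

    Assumes the Validate server has already been stopped; this function
    does not stop or start the server.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = backup_parent / f"projects_root-{stamp}"
    # copytree raises FileExistsError if the destination already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(projects_root, destination, symlinks=True)
    return destination
```

Writing each backup to its own folder keeps earlier backups intact, at the cost of extra disk space.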
Restoring from a cold backup
- Stop the Validate server. See Stopping the servers running as regular processes.
- If the entire installation was corrupted and reinstalled, and if you backed up any configuration files during the preparation phase, copy the files back into the newly installed <server_install>/config directory.
- Delete the contents of the corrupted projects_root directory.
- Restore the projects_root from backup media to the projects_root location.
- Start the Validate server. See Starting the servers as regular processes.
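The delete and restore steps above can be sketched as follows. This is an illustration under stated assumptions, not a supported tool: the name restore_projects_root is hypothetical, the Validate server is assumed to be stopped, and any corrupted data you want to keep for investigation is assumed to have been preserved elsewhere first. It requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def restore_projects_root(backup_copy: Path, projects_root: Path) -> None:
    """Replace the contents of projects_root with the contents of a backup copy.

    Assumes the Validate server is stopped and that any corrupted data
    worth investigating has been preserved elsewhere first.
    """
    if not backup_copy.is_dir():
        raise FileNotFoundError(f"Backup not found: {backup_copy}")
    projects_root.mkdir(parents=True, exist_ok=True)
    # Delete the current contents but keep the directory itself.
    for entry in projects_root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # Copy the backup into the now-empty directory.
    shutil.copytree(backup_copy, projects_root, symlinks=True, dirs_exist_ok=True)
```

The check on backup_copy runs before anything is deleted, so a mistyped backup path does not empty the projects_root directory.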
Performing a hot backup
The following methods can be used to perform a backup while your server is running.
Method 1: Use supported scripts
To minimize downtime, you can back up your server and project information without ever stopping your server. Later, you can restore a backup to a new projects root.
This method is similar to Method 2: Use kwprojcopy with some additional benefits:
- To prevent file corruption, project information is locked while creating a backup archive.
- You never have to stop your server from running.
- You can customize the scripts to suit your needs. For example, you can execute your preferred file copy tools while creating a backup archive.
Instead of manually running the Python scripts, use the launchers validate_backup and validate_restore to run the scripts automatically. The launchers ensure that the scripts will work within a virtual environment, use the bundled Python installation, and install any required dependencies.
Run the commands that apply to your system.
Using supported scripts on Windows
Run the following commands:
<Validate-installation>/bin/validate_backup.cmd # For backup
<Validate-installation>/bin/validate_restore.cmd # For restore
Using supported scripts on Linux
- Install the system libraries that are required to run backup.py and restore.py on Linux. Run the example commands below, adjusting them to suit your Linux distribution:
curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
sudo apt-get install libmariadb3=1:11.5.2+maria~ubu2404 libmariadb-dev=1:11.5.2+maria~ubu2404
- Next, run the following commands:
<Validate-installation>/bin/validate_backup # For backup
<Validate-installation>/bin/validate_restore # For restore
To specify the database host, add the --db-host parameter to the command, followed by the fully qualified domain name (FQDN) or the IP address of the host where the database is running. For example: backup --db-host 127.0.0.1
Backing up, restoring, and verifying project and server data
For full instructions, see the following articles:
Method 2: Use kwprojcopy
Kwprojcopy incrementally backs up each project. While kwprojcopy is running, it locks projects and collects the data required for a project backup. You can later restore the data using the kwprojcopy import function.
Kwprojcopy requires no downtime, but it backs up only project-specific data and not global data (such as report definitions).
Method 3: Back up an entire Virtual Machine
You can use a virtual machine (VM) backup to increase backup speed while your server is shut down. To determine if this method is viable, first investigate how long it takes to create a snapshot in different VM architectures.
- Stop the Validate server.
- Back up the VM image according to your requirements and procedures.
- Start the Validate server.
To restore from the backup, stop the Validate server and replace your current VM image and/or projects_root directory. Then, start the Validate server to complete the restoration.
Performing a hybrid backup
You can use rsync to perform a two-pass backup. Rsync copies files incrementally: after an initial full backup, the next rsync run copies only the files that have changed, which reduces the time required for later backups.
- While the Validate server is running, run "rsync -a projects_root projects_root_backup" on the live projects_root.
- Stop the Validate server.
- Run "rsync -a --delete projects_root projects_root_backup" to copy the changed files. Because only the files that changed since the first pass are copied, this pass should take less time to complete.
- Start the Validate server.
To restore from the backup, stop the Validate server and replace your current projects_root directory with the projects_root_backup you created. Then, start the Validate server to complete the restoration.
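The two-pass sequence can be scripted. The sketch below is one possible Python wrapper around the two rsync commands shown in the steps above. The names rsync_command and hybrid_backup are hypothetical, and stop_server and start_server are placeholders that you would supply for your own environment, because this article does not define commands for them. It assumes rsync is installed and on the PATH.

```python
import subprocess
from typing import Callable, List, Sequence


def rsync_command(source: str, destination: str, final_pass: bool) -> List[str]:
    """Build the rsync command for the first (hot) or second (cold) pass."""
    command = ["rsync", "-a"]
    if final_pass:
        # --delete removes files from the backup that no longer exist in the source.
        command.append("--delete")
    return command + [source, destination]


def hybrid_backup(
    source: str,
    destination: str,
    stop_server: Callable[[], None],
    start_server: Callable[[], None],
    run: Callable[[Sequence[str]], object] = lambda cmd: subprocess.run(cmd, check=True),
) -> None:
    """Run a hot rsync pass, then a short cold pass with the server stopped."""
    # First pass: the server is still running, so this copy may be inconsistent.
    run(rsync_command(source, destination, final_pass=False))
    stop_server()
    try:
        # Second pass: the server is stopped, so only the changes are copied.
        run(rsync_command(source, destination, final_pass=True))
    finally:
        # Restart the server even if the second pass fails.
        start_server()
```

The server is restarted in a finally block so that a failed second pass does not leave Validate offline. If the first pass fails, the server is never stopped.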