Backing up Validate data

This article describes backing up and restoring Validate data.

All data generated during project builds, including build information in the MariaDB database and defect data in Lucene, is stored in the projects_root directory. The data changes frequently as users load builds, cite issues, and change settings or configuration. Backing up the projects_root directory ensures that all relevant data can be restored, if needed. For information about the default location and other considerations, see Projects_root directory.

The Validate installation directory contains base checker configuration files, custom checkers, and database configuration files that may need to be backed up periodically when they are modified, added, or removed. If you modify configuration files in your Validate installation directory, back them up after making changes. For examples of these files, see Before you begin.

If you have created and deployed custom checkers, back up the <server_install>/plugins directory.

As part of an integration build analysis, Validate copies any necessary data from the tables directory to the projects_root directory, so you do not need to back up the tables directory.

Before you begin

Failure to follow the instructions for restoring a project from a hot backup can leave the project in an inconsistent state.

Configuration settings and data in your Validate installation directory that are shared across all servers running from that installation are not backed up automatically. If you plan to move to a different machine or migrate your Validate installation, manually back up your Validate installation folder and include this data. Examples of this data include the following:

  • Modified compiler configurations, settings, and custom checkers in the config and plugins folders.
  • The kwfilter.conf and kwmysql.ini files in the config folder.
  • Custom taxonomies in the taxonomies folder.
  • The clients.json file in the clients folder.

Tips for protecting your data

We recommend taking the following precautions when backing up a project or Validate server.

  • To protect a backup from a potential server or system failure, store the backup in a secure location that is separate from the server it was created on.
  • To minimize security risks, unintended changes, and unauthorized access, use a service account instead of a system user account when creating a backup.
  • To protect a backup from tampering, create a hash of the backup before storing it and keep the hash in a secure location. When you're ready to restore the backup, generate the hash again and make sure it matches the original to confirm that the backup has not been altered.
  • To protect a backup from unauthorized access, encrypt the backup folder using a method of your choice. For example, place the backup folder in a password-protected location or apply password protection directly to the folder.
  • To ensure all data is captured, back up any customized configuration files and custom checkers (non-volatile data on which it is safe to perform a hot backup).
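
The hashing precaution above can be sketched as follows. This is a minimal example, assuming a Linux host with sha256sum installed and a backup stored as a single archive file. The function names record_hash and verify_hash are illustrative and are not part of Validate.

```shell
# Sketch only: record_hash and verify_hash are illustrative helper names,
# not Validate commands. Assumes sha256sum (GNU coreutils) is installed.

# Store the SHA-256 digest of a backup archive in a separate hash file.
record_hash() {
    local archive="$1" hash_file="$2"
    sha256sum "$archive" | awk '{print $1}' > "$hash_file"
}

# Succeed only if the archive still matches the recorded digest.
verify_hash() {
    local archive="$1" hash_file="$2"
    local expected actual
    expected="$(cat "$hash_file")" || return 1
    actual="$(sha256sum "$archive" | awk '{print $1}')"
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

You might call record_hash with the archive path and a hash file path right after creating the backup, then call verify_hash with the same two arguments before restoring. Keep the hash file somewhere other than next to the archive, so that tampering with one does not expose the other.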

What should I do if my data is corrupted?

Data corruption can happen at any time and without warning. To retain analysis history that can be used for further investigation, keep a copy of any corrupted databases. Without this data, you may have to recreate entire projects.

To protect against future corruptions, back up your data at regular intervals as described in the section How often should I back up my data?

How often should I back up my data?

We recommend backing up your data at regular intervals. The frequency you choose may range from days to months depending on several factors, which include the following:

  • Organizational standards. For example, your organization may have an existing policy that determines backup frequency or maintenance schedules.
  • Risk tolerance. For example, a large organization that uses analysis results for compliance certification may want to back up more frequently.

Backup methods

We recommend reading the descriptions and procedures in the sections below to decide which method is right for you.

Cold backup

We recommend using a cold backup to back up an entire server without having to back up each project individually.

A cold backup shuts down the Validate server for a short period to ensure the following:

  • In-flight data transactions are completed before backing up data
  • Restoration from the backup is consistent

Hot backup

A hot backup is performed while the Validate server is running and users are logged into the system. While this minimizes downtime, it can lead to data loss or corruption as users continue to work during the process. The only hot backup method that is officially supported and will not lead to data loss or corruption is Method 1: Use supported scripts.

Hybrid backup

A hybrid backup is a hot backup followed by a cold backup. This minimizes downtime while ensuring data is consistent.

Performing a cold backup

Inform your users that services will be temporarily unavailable before shutting down the Validate servers.
  1. Back up any customized configuration files from <server_install>/config separately.
  2. Stop the Validate server. See Stopping the servers running as regular processes.
  3. Use normal operating system (or backup utility) commands to copy the entire projects_root directory to your desired backup media.
  4. Start the Validate server. See Starting the servers as regular processes.
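
As one possible way to carry out the copy in step 3, the sketch below archives projects_root into a timestamped tar file. It assumes a Linux host with tar and gzip available. The function name archive_projects_root and the archive naming scheme are illustrative, not part of Validate. Run it only after the server has been stopped as described in step 2.

```shell
# Sketch only: archive_projects_root is an illustrative helper name.
# Run it while the Validate server is stopped (steps 2 and 4 are not shown).

# Copy projects_root into a timestamped tar.gz archive under backup_dir
# and print the path of the archive that was created.
archive_projects_root() {
    local projects_root="$1" backup_dir="$2"
    local stamp archive
    stamp="$(date +%Y%m%d-%H%M%S)"
    archive="$backup_dir/projects_root-$stamp.tar.gz"
    mkdir -p "$backup_dir" || return 1
    # -C stores paths relative to the parent of projects_root, so the
    # archive contains a single top-level directory.
    tar -czf "$archive" \
        -C "$(dirname "$projects_root")" "$(basename "$projects_root")" || return 1
    printf '%s\n' "$archive"
}
```

The timestamp in the file name keeps successive backups from overwriting each other. Any other copy tool or backup utility works equally well for this step.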

Restoring from a cold backup

Since you are restoring data from the last time you performed a cold backup, you will lose any analyses and transactions that were performed since that backup. You cannot automatically reapply transactions to the last backup.
  1. Stop the Validate server. See Stopping the servers running as regular processes.
  2. If the entire installation was corrupted and reinstalled, and if you backed up any configuration files during the preparation phase, copy the files back into the newly installed <server_install>/config directory.
  3. Delete the contents of the corrupted projects_root directory.
  4. Restore the projects_root from backup media to the projects_root location.
  5. Start the Validate server. See Starting the servers as regular processes.
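
Steps 3 and 4 can be sketched as shown below. The example assumes the backup is a tar.gz archive that contains projects_root as its single top-level directory, and that tar supports --strip-components (GNU tar and bsdtar both do). The function name restore_projects_root is illustrative, not part of Validate. Run it only while the server is stopped.

```shell
# Sketch only: restore_projects_root is an illustrative helper name.
# Assumes the archive holds projects_root as its single top-level directory.

restore_projects_root() {
    local archive="$1" projects_root="$2"
    # Refuse to run without an archive, without a target directory,
    # or against the filesystem root.
    [ -f "$archive" ] || return 1
    [ -d "$projects_root" ] || return 1
    [ "$projects_root" != "/" ] || return 1
    # Step 3: delete the contents of the corrupted projects_root,
    # keeping the directory itself.
    find "$projects_root" -mindepth 1 -delete || return 1
    # Step 4: extract the backup, dropping the archive's top-level
    # directory name so files land directly in projects_root.
    tar -xzf "$archive" -C "$projects_root" --strip-components=1
}
```

Because the function deletes data, consider moving the corrupted directory aside instead of deleting it if you want to keep it for investigation.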

Performing a hot backup

The following methods can be used to perform a backup while your server is running.

Method 1: Use supported scripts is the supported method for creating a hot backup of a project that can be restored later.

Method 1: Use supported scripts

This method does not restore user and group roles that were assigned to specific projects. However, this method will restore user and group global roles when you restore server information.

To minimize downtime, you can back up your server and project information without ever stopping your server. Later, you can restore a backup to a new projects root.

This method is similar to Method 2: Use kwprojcopy with some additional benefits:

  • To prevent file corruption, project information is locked while creating a backup archive.
  • You never have to stop your server from running.
  • You can customize the scripts to suit your needs. For example, you can execute your preferred file copy tools while creating a backup archive.

Instead of manually running the Python scripts, use the launchers validate_backup and validate_restore to run the scripts automatically. The launchers ensure that the scripts will work within a virtual environment, use the bundled Python installation, and install any required dependencies.

A backup created with one version of validate_backup cannot be restored with a different version of validate_restore. For example, a backup created with Validate 24.4 cannot be restored using Validate 24.3 or 25.1. Use the same version of Validate to back up and restore a project or server. To avoid compatibility issues when you migrate to a new server version, we recommend that you create new project and server backups after you finish the migration.

Run the commands that apply to your system.

Using supported scripts on Windows

Run the following commands:

<Validate-installation>/bin/validate_backup.cmd  # For backup
<Validate-installation>/bin/validate_restore.cmd  # For restore

Using supported scripts on Linux

  1. To install the system libraries that are required to run backup.py and restore.py on Linux, run the example commands below, adjusting them to suit your Linux distribution:
    curl -sS https://downloads.mariadb.com/MariaDB/mariadb_repo_setup | sudo bash
    sudo apt-get install libmariadb3=1:11.5.2+maria~ubu2404 libmariadb-dev=1:11.5.2+maria~ubu2404
  2. Next, run the following commands:
    <Validate-installation>/bin/validate_backup  # For backup
    <Validate-installation>/bin/validate_restore # For restore
  3. If the database host is not properly specified in the command, you may encounter a MySQL "socket error" while running the scripts on Linux. Workaround: To connect the script to the correct MySQL host, add the --db-host parameter to the command, followed by the fully qualified domain name (FQDN) or the IP address of the host where the database is running. For example: backup --db-host 127.0.0.1

Backing up, restoring, and verifying project and server data

For full instructions, see the following articles:

Method 2: Use kwprojcopy

This method does not restore user and group roles that were assigned to specific projects.

Kwprojcopy incrementally backs up each project. While kwprojcopy is running, it locks projects and collects the data required for a project backup. You can later restore the data using the kwprojcopy import function.

Using kwprojcopy requires no downtime, but it backs up only project-specific data and not global data (such as report definitions).

Method 3: Back up an entire Virtual Machine

You can use a virtual machine (VM) backup to increase backup speed while your server is shut down. To determine if this method is viable, first investigate how long it takes to create a snapshot in different VM architectures.

  1. Stop the Validate server.
  2. Back up the VM image according to your requirements and procedures.
  3. Start the Validate server.

To restore from the backup, stop the Validate server and replace your current VM image and/or projects_root directory. Then, start the Validate server to complete the restoration.

Performing a hybrid backup

You can use rsync to perform a hybrid backup in two passes. Rsync can incrementally copy files at the binary level. After a full backup, the next rsync run copies only the files that have changed, which reduces the time required for incremental backups.

  1. While the Validate server is running, run "rsync -a projects_root projects_root_backup" on the live projects_root.
  2. Stop the Validate server.
  3. Run "rsync -a --delete projects_root projects_root_backup" to copy the changed files. Because the number of changed files is likely small, this pass should take less time to complete.
  4. Start the Validate server.

To restore from the backup, stop the Validate server and replace your current projects_root directory with the projects_root_backup you created. Then, start the Validate server to complete the restoration.
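
The two rsync passes from the procedure above can be wrapped in a small script, sketched below. The function names hot_pass and cold_pass are illustrative, and the steps for stopping and starting the Validate server are deliberately left out because they depend on how your servers are run.

```shell
# Sketch only: hot_pass and cold_pass are illustrative helper names.
# Assumes rsync is installed. Stop the Validate server between the two
# passes and start it again afterwards (not shown here).

# Pass 1: copy the live projects_root while the server is running.
hot_pass() {
    local source_dir="$1" backup_dir="$2"
    rsync -a "$source_dir" "$backup_dir"
}

# Pass 2: with the server stopped, copy only what changed since pass 1
# and remove files from the backup that no longer exist in the source.
cold_pass() {
    local source_dir="$1" backup_dir="$2"
    rsync -a --delete "$source_dir" "$backup_dir"
}
```

Because the source path is given without a trailing slash, rsync places the copy at projects_root_backup/projects_root. Keep this nested path in mind when you restore.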