Server Maintenance: Cleanup, Backup, and Restoration
Overview
Questions:
- How can I back up my Galaxy?
- What data should be included?
- How can I ensure jobs get cleaned up appropriately?
- How do I maintain a Galaxy server?
- What happens if I lose everything?
Objectives:
- Learn about different maintenance steps
- Set up PostgreSQL backups
- Set up cleanups
- Learn what to back up and how to recover
Requirements:
- Slides: Galaxy Installation with Ansible
- Hands-on: Galaxy Installation with Ansible
- A VM with at least 2 vCPUs and 4 GB RAM, preferably running Ubuntu 18.04 - 20.04.
Time estimation: 30 minutes
Published: Apr 16, 2023
Last modification: Jul 13, 2023
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
PURL: https://gxy.io/GTN:T00324
Revision: 5
Keeping your Galaxy cleaned up is an important way to save disk space, which for many groups is the limiting factor in their deployment.
Backups are also necessary so that you can safely recover if you ever experience system-level failures.
Comment: Galaxy Admin Training Path
The yearly Galaxy Admin Training follows a specific ordering of tutorials. Use this timeline to help keep track of where you are in Galaxy Admin Training.
Step 1: ansible-galaxy
Step 2: backup-cleanup
Step 3: customization
Step 4: tus
Step 5: cvmfs
Step 6: apptainer
Step 7: tool-management
Step 8: reference-genomes
Step 9: data-library
Step 10: dev/bioblend-api
Step 11: connect-to-compute-cluster
Step 12: job-destinations
Step 13: pulsar
Step 14: celery
Step 15: gxadmin
Step 16: reports
Step 17: monitoring
Step 18: tiaas
Step 19: sentry
Step 20: ftp
Step 21: beacon
Cleanups
Running a Galaxy produces two kinds of data that can be cleaned up: files that users create and then delete or purge, and files that Galaxy creates itself. Both can be cleaned to save space.
User Created Files
You can use gxadmin to clean up user-created files. gxadmin is covered in more detail in its own dedicated tutorial.
Hands On: Installing gxadmin with Ansible
Edit your requirements.yml and add the following:
--- a/requirements.yml
+++ b/requirements.yml
@@ -11,3 +11,6 @@
   version: 0.3.1
 - src: usegalaxy_eu.certbot
   version: 0.1.11
+# gxadmin (used in cleanup, and later monitoring.)
+- src: galaxyproject.gxadmin
+  version: 0.0.12
Install the role with:
Code In: Bash
ansible-galaxy install -p roles -r requirements.yml
Add the role to your playbook:
--- a/galaxy.yml
+++ b/galaxy.yml
@@ -27,3 +27,4 @@
       become: true
       become_user: "{{ galaxy_user_name }}"
     - galaxyproject.nginx
+    - galaxyproject.gxadmin
Set up a cleanup task to run regularly:
--- a/galaxy.yml
+++ b/galaxy.yml
@@ -28,3 +28,11 @@
       become_user: "{{ galaxy_user_name }}"
     - galaxyproject.nginx
     - galaxyproject.gxadmin
+  post_tasks:
+    - name: Setup gxadmin cleanup task
+      ansible.builtin.cron:
+        name: "Cleanup Old User Data"
+        user: galaxy # Run as the Galaxy user
+        minute: "0"
+        hour: "0"
+        job: "SHELL=/bin/bash source {{ galaxy_venv_dir }}/bin/activate && GALAXY_LOG_DIR=/tmp/gxadmin/ GALAXY_ROOT={{ galaxy_root }}/server GALAXY_CONFIG_FILE={{ galaxy_config_file }} /usr/local/bin/gxadmin galaxy cleanup 60"
This will cause datasets that have been deleted for more than 60 days to be purged.
Run the playbook:
Code In: Bash
ansible-playbook galaxy.yml
Whenever gxadmin runs, it writes logs to /tmp/gxadmin, which you can check later.
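To look at the most recent run, a small helper along these lines may be handy. It is only a sketch: the function name latest_log is hypothetical and not part of gxadmin, and it assumes nothing about the log file names beyond them being regular files in the directory set by GALAXY_LOG_DIR in the cron job. It relies on GNU find, as shipped on Ubuntu.

```shell
#!/usr/bin/env bash
# latest_log DIR: print the path of the most recently modified regular file in DIR.
# Hypothetical helper for illustration; DIR defaults to the log directory used in the cron job.
latest_log() {
    local dir="${1:-/tmp/gxadmin}"
    # Fail clearly if the directory does not exist yet (e.g. the cron job has not run).
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # %T@ prints the modification time in seconds since the epoch (GNU find),
    # so a numeric sort puts the newest file last.
    find "$dir" -maxdepth 1 -type f -printf '%T@ %p\n' | sort -n | tail -n 1 | cut -d' ' -f2-
}
```

For example, `less "$(latest_log)"` would open the newest file in /tmp/gxadmin.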
Galaxy Created Files
Before we begin backing up our Galaxy data, let’s set up automated cleanups to ensure we back up only the minimal required set of data.
Hands On: Configuring Automated Cleanup
Edit galaxy.yml to install tmpwatch (if using RHEL/CentOS/Rocky) or tmpreaper (if using Debian/Ubuntu):
--- a/galaxy.yml
+++ b/galaxy.yml
@@ -21,6 +21,14 @@
     - name: Install Dependencies
       package:
         name: ['acl', 'bzip2', 'git', 'make', 'tar', 'python3-venv', 'python3-setuptools']
+    - name: Install RHEL/CentOS/Rocky specific dependencies
+      package:
+        name: ['tmpwatch']
+      when: ansible_os_family == 'RedHat'
+    - name: Install Debian/Ubuntu specific dependencies
+      package:
+        name: ['tmpreaper']
+      when: ansible_os_family == 'Debian'
   roles:
     - galaxyproject.galaxy
     - role: galaxyproject.miniconda
Edit group_vars/galaxyservers.yml and add a variable to enable the cleanup task:
--- a/group_vars/galaxyservers.yml
+++ b/group_vars/galaxyservers.yml
@@ -2,6 +2,7 @@
 galaxy_create_user: true # False by default, as e.g. you might have a 'galaxy' user provided by LDAP or AD.
 galaxy_separate_privileges: true # Best practices for security, configuration is owned by 'root' (or a different user) than the processes
 galaxy_manage_paths: true # False by default as your administrator might e.g. have root_squash enabled on NFS. Here we can create the directories so it's fine.
+galaxy_manage_cleanup: true
 galaxy_layout: root-dir
 galaxy_root: /srv/galaxy
 galaxy_user: {name: "{{ galaxy_user_name }}", shell: /bin/bash}
Run the playbook:
Code In: Bash
ansible-playbook galaxy.yml
Check out the cleanup task which has been generated in:
/etc/cron.d/ansible_galaxy_tmpclean
This will set up tmpwatch (or tmpreaper on Debian/Ubuntu) to clean up a few folders:
- the job working directory, which matters if you set cleanup: onsuccess, so that old failed jobs are removed once you’re done debugging their failures.
- the new file upload path, to catch uploaded temporary files that are no longer necessary.
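Before trusting an automated reaper with a directory, it can help to preview which files it would likely consider stale. The sketch below is not how tmpwatch or tmpreaper are implemented; it approximates the same idea with find and deletes nothing. The function name list_stale and any paths you pass to it are assumptions for illustration.

```shell
#!/usr/bin/env bash
# list_stale DIR DAYS: print regular files under DIR that have not been
# modified for more than DAYS days. Read-only preview; nothing is removed.
list_stale() {
    local dir="$1" days="$2"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -mtime +N matches files whose age, in whole 24-hour periods, is greater than N.
    find "$dir" -type f -mtime "+$days" -print | sort
}
```

Running it against your job working directory with the retention period you have in mind gives a rough idea of how much the cron task will remove.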
Backups
There are a few important things to back up with your Ansible-managed Galaxy:
- Galaxy
    - The Galaxy-managed config files
    - The playbooks
- The Database
- The Data
Galaxy
By using Ansible, as long as you are storing your playbooks on another system, you are generally safe from failures of the Galaxy node, and you’ll be able to re-run your playbook at a later date.
However, playbooks often do not include:
- Which tools you’ve installed (have you ever installed a tool outside of ephemeris? This might be lost!)
- Conda environments, which will not always resolve identically over time. If strong guarantees of reproducibility are important, then consider backing these up as well.
Database Backups
We’re setting a couple of variables to control the automatic backups. The backups will be placed in the /data/backups folder, next to our user-uploaded Galaxy data.
Hands On: Configuring PostgreSQL Backups
Edit group_vars/dbservers.yml and add some variables to configure PostgreSQL backups:
--- a/group_vars/dbservers.yml
+++ b/group_vars/dbservers.yml
@@ -5,3 +5,7 @@ postgresql_objects_users:
 postgresql_objects_databases:
   - name: "{{ galaxy_db_name }}"
     owner: "{{ galaxy_user_name }}"
+
+# PostgreSQL Backups
+postgresql_backup_dir: /data/backups
+postgresql_backup_local_dir: "{{ '~postgres' | expanduser }}/backups"
This will set up our backups to run as a cron job.
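A backup job that silently stops running is easy to miss, so a periodic freshness check can be worthwhile. The following is a minimal sketch under stated assumptions: the function name backup_is_fresh is hypothetical, the threshold is yours to choose to match your backup schedule, and the check only looks at modification times rather than at the layout or validity of the backups themselves.

```shell
#!/usr/bin/env bash
# backup_is_fresh DIR HOURS: succeed if DIR contains at least one entry
# modified within the last HOURS hours, fail otherwise.
backup_is_fresh() {
    local dir="$1" hours="$2"
    [ -d "$dir" ] || return 1
    # -mmin -N matches entries modified less than N minutes ago;
    # -quit stops at the first match, so large backup trees are not fully walked.
    [ -n "$(find "$dir" -mindepth 1 -mmin "-$((hours * 60))" -print -quit)" ]
}
```

For example, `backup_is_fresh /data/backups 48 || echo "PostgreSQL backup looks stale" >&2` could be run from cron or from a monitoring check.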
Data Backup
With Galaxy it is technically only necessary to back up your inputs, as the downstream files should, in theory, be re-creatable thanks to the reproducibility of Galaxy.
In practice, some groups choose either not to back up at all, or to back up everything, often to extremely cheap and slow storage like Glacier or a tape library.
Most groups choose to implement this as a custom cron job, e.g.
post_tasks:
  - name: Setup backup cron job
    ansible.builtin.cron:
      name: "Backup User Data"
      minute: "0"
      hour: "5,2"
      job: "rsync -avr /data/galaxy/ backup@backup.example.org:/backups/$(date -I)/"
People who, let’s say, care strongly about backups will often insist that you need to version files. This is unnecessary in the Galaxy case, as files are essentially Write Once Read Many (WORM), which is a really good file storage practice. Files can get removed, so it isn’t a true WORM strategy of the kind you’d use for e.g. audit logs, but it is close. Since files never get changed, keeping multiple versions is unnecessary.
Please consider communicating very well with your users what the data backup policy is.
Comment: Got lost along the way?
If you missed any steps, you can compare against the reference files, or see what changed since the previous tutorial.
If you’re using git to track your progress, remember to add your changes and commit with a good commit message!
Restoration
Sometimes failures happen! We’re sorry you have to read this section.
Restoring the Database
This procedure is more complicated. You can read about the restoration procedure in the associated PR.
This step assumes you have pre-existing backups in place. You must check this first:
ls /data/backups/
If you have backups, you’re ready to restore:
# Stop Galaxy, you do NOT want galaxy to connect mid-restoration in case it
# tries to modify the database.
sudo systemctl stop galaxy
# Stop the database
sudo systemctl stop postgresql
# Ensure that it is stopped
sudo systemctl status postgresql
# Begin the restore procedure by becoming postgres:
sudo su - postgres
# Move the current, live database to a backup location just in case:
mkdir /tmp/test/
# ====
# NOTE THAT THIS NUMBER MAY BE DIFFERENT FOR YOU!
# You will need to change 12 to whatever version of postgres you're running
# in every subsequent command
# ====
mv /var/lib/postgresql/12/main/* /tmp/test/
# Add backup
rsync -av /data/backups/YOUR_LATEST_BACKUP/ /var/lib/postgresql/12/main
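# Add the restore_command to postgresql.auto.conf in the restored data directory.
# The path below is an example; point it at the WAL directory of your backup:
# restore_command = 'cp "/tmp/backup/current/wal/%f" "%p"'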
$EDITOR ./12/main/postgresql.auto.conf
# Touch a recovery file
touch /var/lib/postgresql/12/main/recovery.signal
# Leave the postgres shell, then continue as your own user (with sudo rights)
exit
sudo systemctl restart postgresql
sudo systemctl status postgresql
# Restart Galaxy
sudo systemctl start galaxy
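Because the version number in the paths above varies between systems, it may be worth confirming it before moving anything. The sketch below assumes the Debian/Ubuntu layout used in this tutorial (/var/lib/postgresql/VERSION/main) and relies on the PG_VERSION file that PostgreSQL keeps in each data directory; the function name pg_versions is hypothetical.

```shell
#!/usr/bin/env bash
# pg_versions BASE: print the major version of every PostgreSQL data
# directory found under BASE (default: /var/lib/postgresql).
pg_versions() {
    local base="${1:-/var/lib/postgresql}" f
    for f in "$base"/*/main/PG_VERSION; do
        # When nothing matches, the glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        cat "$f"
    done
}
```

Run as a user that can read the data directory (for example postgres); the number it prints is the one to substitute for 12 in the restore commands.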
If you encounter issues, we suggest reading Lucille’s log of her experiences restoring, as you might run into similar problems.
Restoring Galaxy
Restoring Galaxy is easy via Ansible. You may want to ensure users cannot log in during the restore, for example by disabling the routes in nginx.
ansible-playbook galaxy.yml
And if you are following best practices, you probably have your tools stored in a YAML file to use with Ephemeris:
shed-tools install -g https://galaxy.example.org -a <api-key> -t our_tools.yml
Restoring User Data
This should simply be a matter of rsyncing your data from the backup location back into /data/galaxy.
