Question
· Jan 28, 2024

External backup - exceeding the maximum duration specified

Hi, I was wondering whether anyone has already dealt with this issue:
"System has been suspended for over X seconds, exceeding the maximum duration specified. Allowing system activity to resume. Any ongoing backup has presumably failed. Next InterSystems IRIS backup must be a full one"

Our backup system, Commvault, runs automatically. Once you get this message, how do you tell it that the next backup should be a full one?

thanks,

 

Eyal

Product version: IRIS 2023.3

Hi Ambrogio,

here are my scripts:
Pre:

#!/bin/bash

 

LOG_DIR=~/Commvault_backup
LOG_FILE=$LOG_DIR/backup-log_$(date +'%d-%m-%y').txt
mkdir -p $LOG_DIR 2>/dev/null
touch $LOG_FILE
if hostname | grep -q data; then   # run only on hosts whose name contains "data"
  #Freeze Write Daemon
  echo -e "##################################\n"  | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
  echo "Freezing IRIS Write Daemon"  | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
  iris session iris -U%SYS "##Class(Backup.General).ExternalFreeze(,,,,,,,,,480)"
  status=$?
  if [[ $status -eq 5 ]]; then
    echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] IRIS WD IS FROZEN Performing backup (UTC time! +03:00) STATUS = $status (need to be 5) "  | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
    # Poll until the last line of messages.log reports a compressed journal file
    while true; do
      if [ "$(tail -n 1 /irissys/data/IRIS/mgr/messages.log | grep "Journal File Compression" | awk '{print $8}')" = "Compressed"  ]; then
        echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] Running... "  | tee -a $LOG_FILE  >>  /irissys/data/IRIS/mgr/messages.log
        echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] Starting Backup... (UTC time! +03:00) "  | tee -a $LOG_FILE  >>  /irissys/data/IRIS/mgr/messages.log
        break
      fi
      sleep 1   # avoid a busy loop while polling the log
    done
  elif [[ $status -eq 3 ]]; then
    echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] IRIS WD FREEZE FAILED  (UTC time! +03:00) STATUS = $status (need to be 5)"  | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
    exit 1
  fi
  echo 
else 
   echo -e "Not data Pod" | tee -a $LOG_FILE

fi

The question is: how do I tell Commvault that its backup actually failed?

thanks,

Eyal

Post:

#!/bin/bash
LOG_DIR=~/Commvault_backup
LOG_FILE=$LOG_DIR/backup-log_$(date +'%d-%m-%y').txt
mkdir -p $LOG_DIR 2>/dev/null
touch $LOG_FILE
if hostname | grep -q data; then   # run only on hosts whose name contains "data"
  #Thaw Write Daemon
  echo -e "\nThaw Write Daemon" | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
  iris session iris -U%SYS "##Class(Backup.General).ExternalThaw()"
  status=$?
  if [[ $status -eq 5 ]]; then
    while [ true ]; do  
      if [ "$(tail -n 1 /irissys/data/IRIS/mgr/messages.log | grep "Backup.General.ExternalThaw: System resumed" | awk '{print $6,$7}')" = "System resumed"  ]; then
        echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] IRIS WD IS THAW! (UTC time! +03:00) STATUS = $status (need to be 5) "  | tee -a $LOG_FILE  >>  /irissys/data/IRIS/mgr/messages.log
        break
      else
        echo "wait"
        sleep 1   # avoid a busy loop while polling the log
      fi
    done
  elif [[ $status -eq 3 ]]; then
    echo -e "$(date +'%m/%d/%y-%T.%3N') (Wizards) [Backup.Event] [Commvault Backup] IRIS WD THAW FAILED  (UTC time! +03:00) STATUS = $status (need to be 5)"  | tee -a $LOG_FILE  >>  /irissys/data/IRIS/mgr/messages.log
    exit 1
  fi
  echo -e "##################################\n"  | tee -a $LOG_FILE  >> /irissys/data/IRIS/mgr/messages.log
else 
  echo -e "Not data Pod" | tee -a $LOG_FILE
fi

 

Your scripts are a bit complex, but the key point is that the backup pre- and post-commands should return 0 when they complete correctly.

That is how the backup software knows whether the pre-command (the preparation for the backup) completed successfully.
If the return code is 0, the backup can start saving data.
When the backup completes, Commvault runs the post-command.

The post-command should also return 0 if it completed correctly.

In your scripts, it would be better to call exit 0 explicitly on the success paths.
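
As a rough illustration of the point about return codes, here is a minimal sketch that maps the status of the freeze call to the exit code the backup tool expects. The helper name exit_code_for_status is hypothetical; the only thing taken from the scripts above is that 5 is treated as success and 3 as failure.

```bash
#!/bin/bash
# Hypothetical helper (not part of IRIS or Commvault): print the exit code
# the backup tool should see for a given "iris session" status.
exit_code_for_status() {
  local status=$1
  if [ "$status" -eq 5 ]; then
    echo 0    # 5 is the value the scripts above treat as success
  else
    echo 1    # 3 (failure) or any unexpected value: report failure
  fi
}

# Possible use at the end of the pre-script:
#   iris session iris -U%SYS "##Class(Backup.General).ExternalFreeze(,,,,,,,,,480)"
#   status=$?
#   exit "$(exit_code_for_status "$status")"
```

With this, the pre-script never falls through without an explicit exit code, including the case where the status is neither 3 nor 5.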

Yes, but only if it really fails.

Commvault already knows the outcome of the backup, because it is the one performing it.
The pre-script puts the database into freeze mode so that you can take a snapshot or a backup.

If the pre-script succeeds, it should return 0 so that Commvault can start the backup process.

After the backup completes, Commvault launches the post-script so that the database can return to normal operation (thaw).

If the post-script succeeds, it must return 0 so that Commvault knows the process has completed.
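
Coming back to the original question, one possible approach is for the post-script to check messages.log for the "exceeding the maximum duration specified" message and return non-zero when it finds it. This is only a sketch: the helper name freeze_timed_out and the start_line bookkeeping are hypothetical, and whether Commvault then marks the job as failed or schedules a full backup depends on how the pre/post commands are configured on the Commvault side, which this thread does not confirm.

```bash
#!/bin/bash
# Hypothetical helper: succeed (return 0) if the log, read from line
# $start_line onward, shows that IRIS resumed on its own because the
# freeze outlived its maximum duration.
freeze_timed_out() {
  local log_file=$1 start_line=${2:-1}
  tail -n "+$start_line" "$log_file" \
    | grep -q "exceeding the maximum duration specified"
}

# Possible use (paths copied from the scripts above):
#   pre-script, before the freeze:
#     wc -l < /irissys/data/IRIS/mgr/messages.log > ~/Commvault_backup/start_line
#   post-script, after the thaw:
#     start_line=$(( $(cat ~/Commvault_backup/start_line) + 1 ))
#     if freeze_timed_out /irissys/data/IRIS/mgr/messages.log "$start_line"; then
#       exit 1   # the backup taken during this window cannot be trusted
#     fi
```

Reading only from the recorded line onward avoids tripping on a timeout message left over from an earlier backup.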

I think you should reengineer your entire backup process with those steps in mind.

Having an instance frozen for 8 minutes is not good, in my experience.

Have you considered moving to snapshot-based external backups, using Shadow Copies on Windows or LVM snapshots on Linux? This reduces the freeze time to the time it takes to create the snapshot. Commvault can then back up the snapshot volume while IRIS continues running unfrozen.
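
For the Linux case, the flow could look roughly like the sketch below: freeze, take an LVM snapshot, thaw immediately, and only then let the backup tool read the snapshot. The volume group, logical volume, snapshot name and size are hypothetical placeholders, the function names are made up for illustration, and the status value 5 is taken from the scripts earlier in this thread. Mounting the snapshot for Commvault and removing it afterwards are not shown.

```bash
#!/bin/bash
# Sketch only: VG, LV, SNAP and SNAP_SIZE are hypothetical placeholders.
VG=vg_iris
LV=lv_data
SNAP=iris_snap
SNAP_SIZE=10G

freeze() { iris session iris -U%SYS "##Class(Backup.General).ExternalFreeze()"; }
thaw()   { iris session iris -U%SYS "##Class(Backup.General).ExternalThaw()"; }
take_snapshot() {
  lvcreate --snapshot --size "$SNAP_SIZE" --name "$SNAP" "/dev/$VG/$LV"
}

# Freeze, snapshot, thaw. Returns 0 only if all three steps succeeded.
snapshot_backup() {
  local freeze_rc snap_rc thaw_rc
  freeze; freeze_rc=$?
  if [ "$freeze_rc" -ne 5 ]; then
    echo "freeze failed (status $freeze_rc)" >&2
    return 1
  fi
  take_snapshot; snap_rc=$?
  thaw; thaw_rc=$?            # always thaw, even if the snapshot failed
  [ "$snap_rc" -eq 0 ] && [ "$thaw_rc" -eq 5 ]
}

# Possible use as the pre-command:
#   snapshot_backup || exit 1
#   exit 0
```

IRIS is frozen only for the duration of the lvcreate call, so the 480-second limit should no longer be a concern.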