The MONITOR process (also called the Caché Monitor) scans the messages in your cconsole.log file and sends you emails based on the severity of those messages. The MONITOR is configured using the ^MONMGR utility from the terminal.
The MONITOR should not be confused with the similarly named System Monitor, which checks a variety of system health and performance metrics and can log messages regarding them to the cconsole.log, where they can then be scanned by the MONITOR.
The process starts automatically at Caché startup and scans your cconsole.log for new messages at a regular interval, by default every 10 seconds. All you need to do is enter your email settings in the ^MONMGR menu. To open it, run this command from the %SYS namespace: "do ^MONMGR". From there the configuration menu is fairly self-explanatory, and, should you disagree, there's a whole chapter of documentation on it if you want to know more.
You can configure the details for your email server and recipients. You can also change the severity level of the messages you'll get emails about. Messages in the cconsole.log are marked as level 0, 1, 2, or 3 (from least severe to most severe), and you specify the minimum level you'd like to be notified about. Level 0 messages are merely informational, and level 3 messages indicate a fatal problem. The default level for emails is 2, which should be fine for many environments. You can review your cconsole.log to check the severity of the messages you may want notifications for. The following example shows a message with a severity level of 2 (the number that follows the process ID in parentheses):
01/25/16-14:35:28:684 (7750) 2 Backup failed to take over as primary
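If you want to check severities with a script rather than by eye, the line above can be picked apart with a regular expression. The following Python is only a sketch: the layout (timestamp, process ID in parentheses, severity digit, message text) is inferred from that single sample line, and `LogEntry` and `parse_line` are illustrative names, not part of Caché.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, inferred from the one sample line in this post:
#   MM/DD/YY-HH:MM:SS:mmm (pid) severity message text
LINE_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}:\d{3}) "
    r"\((?P<pid>\d+)\) (?P<severity>[0-3]) (?P<text>.*)$"
)


class LogEntry(NamedTuple):
    stamp: str      # timestamp exactly as written in the log
    pid: int        # process ID from the parentheses
    severity: int   # 0 (informational) through 3 (fatal)
    text: str       # the message itself


def parse_line(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None if the line does not fit the assumed layout."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return LogEntry(m["stamp"], int(m["pid"]), int(m["severity"]), m["text"])


entry = parse_line(
    "01/25/16-14:35:28:684 (7750) 2 Backup failed to take over as primary"
)
```

Lines that do not fit the pattern return None instead of raising, since a real log may contain lines in other shapes.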
You can also change how often the MONITOR scans the log, though the default of 10 seconds should be fine in the vast majority of cases.
Processes that generate 3 or more messages in less than a minute have their notifications suspended for an hour so that you are not inundated with emails about that process’s messages (this setting is not configurable).
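To make the suspension rule concrete, here is a rough Python model of it. This is not the MONITOR's actual implementation, which is not shown here. It assumes that the third message inside the window is itself suppressed and that messages arriving during a suspension are not counted; the `Throttle` class and its constants are illustrative names.

```python
from collections import defaultdict, deque

BURST_COUNT = 3        # "3 or more messages"
BURST_WINDOW = 60.0    # "in less than a minute", in seconds
SUSPEND_FOR = 3600.0   # "suspended for an hour", in seconds


class Throttle:
    """Per-process notification throttle modelled on the rule described above."""

    def __init__(self):
        self._recent = defaultdict(deque)  # pid -> timestamps of recent messages
        self._suspended_until = {}         # pid -> time the suspension ends

    def should_notify(self, pid, now):
        """Return True if a message from pid at time now (seconds) should be emailed."""
        if now < self._suspended_until.get(pid, 0.0):
            return False  # still inside the one-hour suspension
        recent = self._recent[pid]
        recent.append(now)
        # Keep only messages that are less than a minute old.
        while recent and now - recent[0] >= BURST_WINDOW:
            recent.popleft()
        if len(recent) >= BURST_COUNT:
            # Assumption: the message that trips the limit is suppressed too.
            self._suspended_until[pid] = now + SUSPEND_FOR
            recent.clear()
            return False
        return True
```

Each process is tracked separately, so a noisy process does not silence alerts from the others.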
You can read the full documentation for the MONITOR here:
http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GCM_monitor
Samples of the Menu:
%SYS>do ^MONMGR
1) Start/Stop/Update MONITOR
2) Manage MONITOR Options
3) Exit
Option? 2
1) Set Monitor Interval
2) Set Alert Level
3) Manage Email Options
4) Exit
Option? 3
1) Enable/Disable Email
2) Set Sender
3) Set Server
4) Manage Recipients
5) Set Authentication
6) Test Email
7) Exit
The "whole chapter of documentation" link is broken.... It should probably be the following link (which I notice you mention later on...):
http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=...
Should be fixed now. Thanks!
Related to this and alerting in general - in my organization, and I'm sure in many others, you may want to send these types of alerts as pages or text messages. This can be done by adding a recipient email such as 2065551234@txt.att.net. One drawback is that when glancing over a list of emails like this it can be difficult to tell who a particular number belongs to (the same goes for normal emails that are short or otherwise ambiguous). One thing I have found helpful is that you can enter emails in the format "Aric West" <2065551234@txt.att.net>, with quotes around the name and angle brackets around the actual email, just as is often seen in email clients. This format works well in the "to", "from", "reply to", etc. fields in ^MONMGR, ^%SYSMONMGR, the task manager, Ensemble's EnsLib.EMail.AlertOperation, and so on throughout Caché and Ensemble.
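For anyone unfamiliar with that address format, this is how a standard mail library splits it into a display name and an address. It is shown in Python purely as an illustration of the format; it says nothing about how Caché parses the field internally.

```python
from email.utils import parseaddr

# The quoted part is the display name; the part in angle brackets is the address.
name, address = parseaddr('"Aric West" <2065551234@txt.att.net>')
```

The display name is only a label for readers; the mail is still delivered to the address inside the angle brackets.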
Is there a way to exclude a certain time period, such as when a person typically gets false positives or expected behavior (e.g., a false positive for a WD freeze because a backup is happening), while still alerting at other times (like a WD freeze during off hours that has nothing to do with a backup)?
Everything I've seen is on for everything or off for everything.
Unfortunately not through ^MONMGR.
Since it's just reporting on what's appearing in the messages (or cconsole) log above a certain severity level, an external job that periodically scans the file is a potential solution.
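A minimal sketch of what such an external job could look like, in Python. Everything here is an assumption made for illustration: the line layout is inferred from the single sample line in the post, the quiet window of 01:00 to 03:00 is arbitrary, and the default `freeze` pattern is a placeholder because the exact wording of the WD freeze message is not shown in this thread. Reading the file on a schedule and sending the email are left out.

```python
import re
from datetime import datetime, time

# Assumed layout: MM/DD/YY-HH:MM:SS:mmm (pid) severity message text
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}):\d{3} \((\d+)\) ([0-3]) (.*)$"
)

MIN_SEVERITY = 2          # same idea as the MONITOR alert level
QUIET_START = time(1, 0)  # hypothetical backup window start
QUIET_END = time(3, 0)    # hypothetical backup window end
# Placeholder: replace with the real text of the message you expect during backups.
DEFAULT_QUIET_PATTERN = re.compile(r"freeze", re.IGNORECASE)


def in_quiet_window(t):
    """True if time-of-day t falls inside the quiet window (which may wrap midnight)."""
    if QUIET_START <= QUIET_END:
        return QUIET_START <= t < QUIET_END
    return t >= QUIET_START or t < QUIET_END


def alerts(lines, quiet_pattern=DEFAULT_QUIET_PATTERN):
    """Return (when, severity, text) for each line that should trigger an alert."""
    out = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # not in the assumed layout; skip it
        when = datetime.strptime(m.group(1), "%m/%d/%y-%H:%M:%S")
        severity, text = int(m.group(3)), m.group(4)
        if severity < MIN_SEVERITY:
            continue
        # Suppress only the expected message, and only during the quiet window.
        if in_quiet_window(when.time()) and quiet_pattern.search(text):
            continue
        out.append((when, severity, text))
    return out
```

Filtering on both the time of day and the message text means an unrelated severe message during the backup window still gets through.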
Thanks. That's what I thought but wanted a second opinion.
Maybe InterSystems could add some scanning features to the ^%SYSMONMGR/^MONMGR apps? *puts bug into InterSystems' ear*