Message Router Troubleshooting

The Message Router provides a way to consume and store large volumes of messages of different types from the various Endpoint Privilege Management services. For example, the pbmasterd and pblogd services use it to write event data to the event log and to other integrated products such as BeyondInsight and Solr. This improves performance under heavy load while reducing the load on the system. However, it also makes the overall system more complex: while the service is running smoothly, no maintenance is required, but if there is a problem with the service, other vital services may be affected.

The pbadmin command is used to monitor normal statistics for the Message Router.

This example output is typical, and shows the Message Router is running and is receiving requests.
pbadmin -P --info --msgs
{
  "pid": 8486,                             /* Process ID of the Message Router */
  "num_svcs": 2,                           /* The number of services it provides */
  "curr_clnts": 0,                         /* The number of currently connected clients */
  "tot_clnts": 3487,                       /* The total number of clients connected this session */
  "total_msgs": 10635,                     /* The total number of messages routed */
  "svcs": [                                /* Per service statistics */
  {
    "name": "session authenticate",        /* The name of the service */
    "last_active": "2018-10-25 10:13:14",  /* Last event seen by that service */
    "restarts": 1,                         /* The number of process restarts */
    "pid": 8487,                           /* The process ID of the Service Router */
    "queue_len": 0,                        /* The current number of messages in the queue */
    "maxq_len": 200,                       /* The maximum number of entries in the queue */
    "replies": 0,                          /* (not applicable for authentication service) */
    "maxq_sz": 0,                          /* The maximum queue size in bytes */
    "requests": 3487                       /* The number of messages routed by this service */
  },
  {
    "name": "event log",                   /* The name of the service */
    "last_active": "2018-10-25 10:13:01",  /* Last event seen by that service */
    "restarts": 1,                         /* The number of process restarts */
    "pid": 8488,                           /* The process ID of the Service Router */
    "queue_len": 0,                        /* The current number of messages in the queue */
    "maxq_len": 200,                       /* The maximum number of entries in the queue */
    "replies": 0,                          /* The number of replies to services */
    "maxq_sz": 102408,                     /* The maximum queue size in bytes */
    "requests": 0                          /* The number of messages routed by this service */
  } ]
}
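
As a rough sketch of how these statistics could be checked automatically, the following Python function takes the JSON text printed by pbadmin -P --info --msgs and reports any service whose queue is filling up. It assumes the real output is plain JSON (the /* ... */ annotations in the example are explanatory only). The function name check_router_stats and the 80% threshold are illustrative choices, not part of the product.

```python
import json


def check_router_stats(stats_text, threshold=0.8):
    """Return warnings for services whose queue is close to full.

    stats_text is assumed to be the plain JSON printed by
    `pbadmin -P --info --msgs`; threshold is a hypothetical fill ratio.
    """
    stats = json.loads(stats_text)
    warnings = []
    for svc in stats.get("svcs", []):
        max_len = svc.get("maxq_len", 0)
        queue_len = svc.get("queue_len", 0)
        # Skip services that report no maximum, to avoid dividing by zero.
        if max_len > 0 and queue_len / max_len >= threshold:
            warnings.append(f"{svc.get('name', '?')}: queue {queue_len}/{max_len}")
    return warnings


# Made-up sample data using the field names from the example output.
sample = json.dumps({
    "pid": 8486,
    "svcs": [
        {"name": "session authenticate", "queue_len": 0, "maxq_len": 200},
        {"name": "event log", "queue_len": 180, "maxq_len": 200},
    ],
})
print(check_router_stats(sample))  # ['event log: queue 180/200']
```

In practice the input text would come from running the pbadmin command itself; the sample values here are invented to show a queue that is 90% full.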

If the error 6101.53 Error accessing Message Router statistics - No such file or directory is displayed, the Message Router is not running, and should be restarted to continue normal operations.

The restservice <yes|no> setting in pb.settings controls whether pblighttpd-svc starts the REST service. When it is set to no, the REST service is not started, but the Message Router is still allowed to run.

If the Message Router is not running and messages need to be routed to the various services, the requests are queued as files in the Message Router directory (normally /opt/pbul/msgrouter), similar to those in the following example:

# ls -l /opt/pbul/msgrouter/
total 100
-rwx------ 1 root root  5482 Oct 25 10:26 wq_0004
-rwx------ 1 root root 10251 Oct 25 10:26 wq_0255
-rwx------ 1 root root 10251 Oct 25 10:26 wq_0501
-rwx------ 1 root root  5482 Oct 25 10:26 wq_0550
-rwx------ 1 root root 10251 Oct 25 10:26 wq_0684
-rwx------ 1 root root  5482 Oct 25 10:26 wq_0755
-rwx------ 1 root root 10251 Oct 25 10:26 wq_0785
-rwx------ 1 root root  5482 Oct 25 10:26 wq_0858
-rwx------ 1 root root  5482 Oct 25 10:26 wq_0869
-rwx------ 1 root root 10251 Oct 25 10:26 wq_0912

These Message Router Write Queue files are used for temporary storage when the Message Router is unavailable or under severe load. Once the Message Router is available again, it consumes these files and stores the data in the appropriate databases.

This presents a significant difference from previous EPM-UL functionality in that event logs and other integrated products are not updated while the Message Router is unavailable. The data is stored securely in the Message Router Write Queues until it can be processed in the normal manner when the Message Router is available again.
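
Because a growing number of write queue files suggests the Message Router has been unavailable or overloaded, it can be useful to watch for them. The sketch below counts the wq_* files and their total size. It assumes the default directory /opt/pbul/msgrouter and the wq_ file name prefix shown in the listing; the function name pending_write_queues is hypothetical, and reading the directory will likely require root privileges.

```python
from pathlib import Path


def pending_write_queues(router_dir="/opt/pbul/msgrouter"):
    """Return (file_count, total_bytes) for wq_* files in router_dir."""
    directory = Path(router_dir)
    # A missing directory is treated as "nothing queued".
    if not directory.is_dir():
        return (0, 0)
    files = [p for p in directory.glob("wq_*") if p.is_file()]
    return (len(files), sum(p.stat().st_size for p in files))


if __name__ == "__main__":
    count, size = pending_write_queues()
    print(f"{count} write queue file(s) pending, {size} bytes")
```

A count that stays above zero after the Message Router has been restarted may indicate that the queued data is not being consumed and that the REST log should be checked.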

The Message Router logs all errors to the REST log in the specified directory, so we recommend monitoring this log regularly. One warning that may appear is: WARNING: Out of free slots for the Message Router. Consider increasing 'messagerouterqueuesize' to avoid slowdown.

This warning is not critical; it indicates that the Message Router is under increased load and would run faster if the setting were increased. If your system displays this warning regularly and often experiences heavy load, we recommend increasing the messagerouterqueuesize setting in steps of roughly 25%. Note that this uses more memory on the system, because larger shared memory queues are used.
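
The tuning advice above can be sketched in a few lines of Python: count how often the warning appears in the REST log, and compute a value roughly 25% larger than the current setting. The helper names and the starting value of 1000 are purely illustrative; the actual default and units of messagerouterqueuesize are not stated here, and the log lines are passed in by the caller because the log location depends on your configuration.

```python
import math

# Fragment of the warning text quoted in this section.
WARNING_TEXT = "Out of free slots for the Message Router"


def count_slot_warnings(log_lines):
    """Count log lines that contain the 'out of free slots' warning."""
    return sum(1 for line in log_lines if WARNING_TEXT in line)


def next_queue_size(current, step=0.25):
    """Return the current value increased by roughly 25%, rounded up."""
    return math.ceil(current * (1 + step))


# Invented log excerpt for illustration.
log = [
    "WARNING: Out of free slots for the Message Router. "
    "Consider increasing 'messagerouterqueuesize' to avoid slowdown.",
    "some other log line",
]
print(count_slot_warnings(log))  # 1
print(next_queue_size(1000))     # 1250 (1000 is a hypothetical current value)
```

Raising the setting in small steps and rechecking the log after each change keeps the additional shared memory use as low as possible.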

pbconfigd has a --call option. This option takes a JSON string parameter and processes it as if the call had been made over REST. This allows the specific calls required for licensing and Message Router queuing to be made.
