Alert for Stuck System Monitor Tasks?

Is it possible to get alerts for stuck system monitor tasks?

We’re trying to upgrade how we manage our environments, and it’d really be nice to know when a task is stuck. Currently we (IT / SysAdmins) either get lucky and notice something, or a user lets us know that Epicor is slow (nah, never happens!), and we go looking.

Is there any way to interface Epicor with remote monitoring and management platforms?

Most tasks take seconds, a few take minutes, and outside of MRP Regen and a few finance processes, everything is done within an hour. I figure we could set a cut-off limit of X minutes, and if it’s exceeded, and email gets sent to the IT Team or Epicor Admin…

I thought this was discussed before, but I could find the thread with my (lack-of) search kung-fu here, or in IDEAS.

Not sure whether I posted it here or not, but at my last company I had an SSIS task running every 5 mins - it would perform a simple SQL query against the SysTask table. It wasn’t looking for something that had run too long, but instead looking for any task that had a schedule of “Immediate Run” but wasn’t active and the start time was more than 1 minute ago. This was good to indicate early that the Task Agent had failed.

You could use a similar approach but modify the SQL query to instead look for certain long running tasks and then send an email.

You could write this into an Epicor Function and schedule that to run, but obviously it’s subject to the task agent being responsive whereas SSIS is independent.

3 Likes

Mark,

That’s brilliant! We’ll have to try that.

I do this in SQL like @markdamen did. I check for any reports longer than 10 minutes every 5 minutes with certain things and users excluded. The second query just checks for MRP Regen hanging with no activity.

If (select count(*)
from [EpicorProduction].[Ice].[SysTask]
where EndedOn is null and TaskType='report' 	
		and TaskDescription not like 'BOM Listing%' and TaskDescription not like 'material%'  and TaskDescription not like 'Inspection Pending%'  
		and TaskDescription not like 'AP Invoice Balance%'  and TaskDescription not like '%Ledger%' and TaskDescription not like '%Method Master%'
		and TaskDescription not like 'Inventory Usage%' and TaskDescription not like 'Inventory WIP%' and TaskDescription not like 'Open DMR%'  
		and TaskDescription not like 'Stock Status%' and TaskDescription not like 'Work In Process%' and TaskDescription not like '%PartWhereUsed%'  
		and SubmitUser not in ('vluk01','jgil01','nkha01','syem01','acal01','acct01','nkoc01') and TaskStatus <> 'Error' and TaskStatus <> 'CANCELED' 
		and  (DateDiff(minute, [StartedOn],  GetutcDate()) > 10)) >= 1
begin
EXEC msdb.dbo.sp_send_dbmail

@profile_name = 'admin',
@recipients = 'me@domain.com ;usupport@domain.com',
@subject = 'Epicor Report over 10 minutes - Check Display All Tasks and then cancel in Epicor System Monitor',
@query = 'select SubmitUser,UserPIDInfo,TaskDescription, StartedOn
from [EpicorProduction].[Ice].[SysTask]
where EndedOn is null and TaskType=''report'' 
		and TaskDescription not like ''BOM Listing%'' and TaskDescription not like ''material%'' and TaskDescription not like ''Inspection Pending%''  
		and TaskDescription not like ''AP Invoice Balance%''   and TaskDescription not like ''%Ledger%''  and TaskDescription not like ''%Method Master%''
		and TaskDescription not like ''Inventory Usage%'' and TaskDescription not like ''Inventory WIP%'' and TaskDescription not like ''Open DMR%''  
		and TaskDescription not like ''Stock Status%'' and TaskDescription not like ''Work In Process%'' and TaskDescription not like ''%PartWhereUsed%''  
		and SubmitUser not in (''vluk01'',''jgil01'',''nkha01'',''syem01'',''acal01'',''acct01'',''nkoc01'') and TaskStatus <> ''Error'' and TaskStatus <> ''CANCELED'' 
		and  (DateDiff(minute, [StartedOn],  GetutcDate()) > 10)'


end

MRP hung

If (select count(*)
from [EpicorProduction].[Ice].[SysTask]
where EndedOn is null and 
		TaskType='PROCESS' 	
		and TaskDescription like 'Process MRP%'  and TaskStatus <> 'Error' and TaskStatus <> 'CANCELED' 
		and  DateDiff(minute, LastActivityOn,  GetutcDate()) > 10 or
		(DateDiff(minute, startedon,  GetutcDate()) > 300 and LastActivityOn is null)) >= 1
begin
EXEC msdb.dbo.sp_send_dbmail

@profile_name = 'admin',
@recipients = 'me@domain.com ;usupport@domain.com',
@subject = 'Epicor - MRP hung over 10 minutes - Stop Task in System Monitor',
@query = 'select SubmitUser,UserPIDInfo,TaskDescription, StartedOn
from [EpicorProduction].[Ice].[SysTask]
where EndedOn is null and 
		TaskType=''PROCESS'' 	
		and TaskDescription like ''Process MRP%''  and TaskStatus <> ''Error'' and TaskStatus <> ''CANCELED'' 
		and  (DateDiff(minute, LastActivityOn,  GetutcDate()) > 10 or
		(DateDiff(minute, startedon,  GetutcDate()) > 300 and LastActivityOn is null))
		
		'
--@attach_query_result_as_file = 1,
--@query_attachment_filename = 'Misc Invoices.txt'

end
2 Likes

The great thing about the solutions from @markdamen and @gpayne is that they don’t rely on the Task Agent to work. Because, if the Task Agent is hung up… :person_shrugging:

You can also monitor the system with REST calls if you don’t have access to the SQL Server, like some Cloud Users.

Awesome! Thank you @markdamen, @gpayne, and @Mark_Wonsil. This will help for sure.

@Mark_Wonsil - Do you have any examples on how to do it with Rest? We should be able to integrate that into the RMM platform.

If you follow Mark’s suggestion and use a function, then you can call Epicor Functions via REST.

One possible capability with functions is you might be able to check the Process ID of a Task and see if it is still running. I haven’t tried it but I’m sure that @klincecum knows a way.

The other REST method is using the System Montior Task Service endpoint:

To REALLY crank up a solution, one could keep a table of stats for tasks (low, high, average times) and compare the current running jobs to identify some potential candidates for further action.

2 Likes