Submit a ticketCall us

Systems Monitoring for Dummies
Our new eBook will teach you the fundamentals and help you create monitors and alerts that are effective, meaningful, and actionable. Monitoring is more than a checkbox on your to-do list. This free eBook will give you practical advice to help you succeed in all aspects of monitoring – discovery, alerting, remediation, and troubleshooting. Don’t miss out on this indispensable resource for newbies, experienced IT pros, and everyone in between. Register Now.

Home > Success Center > Database Performance Analyzer (DPA) > SQL Server long running jobs alert may fire false positives if a SQL Job was started and not allowed to finish

SQL Server long running jobs alert may fire false positives if a SQL Job was started and not allowed to finish

Table of contents
No headers

In DPA 9.0 and higher, the SQL Server Long Running Jobs alert may fire false positives if a SQL Server job was started and not allowed to finish.

To work around this issue, create a custom SQL alert that excludes SQL Server jobs that are started but not completed and will never be allowed to finish. 
 

Create a new Custom SQL Alert – Multiple Numeric Return

Replace the following values in the below SQL Statement text with the values to evaluate:

  • {Historical_Time_Frame} – Time frame in days used to calculate historical mean and standard deviation
    Note: Ensure this is a negative in the query, otherwise it will only look to see if there are executions in the future
  • {Minimum_Executions} – Minimum number of job executions required in historical timeframe for job to be evaluated
  • {Minimum_Job_Duration} – Minimum historical average job duration in seconds required for a job to be evaluated

 

The alert returns the following information:

JobName, AvgRuntime, ThresholdRunTime         CurrentRunTime

LongerJob Avg:240.3333 Threshold:241.397         243

 

Ensure that any values configured for Alert Levels and Recipients are Less than or Equal to the {Minimum_Job_Duration} becasue how long the current execution has taken is the evaluation parameter for this (CurrentRunTime).  Different jobs may have different minimums and any result should be considered abnormal.  SolarWinds recommends setting a Minimum Value of 1 and no maximum value for the alert level desired from this alert.

The Notification Policy for this Alert should be set to:  Notify when level is not normal

Custom SQL Alert Multiple Numeric Return

SQL Statement:

DECLARE @currently_running_jobs TABLE (

    job_id UNIQUEIDENTIFIER NOT NULL

    ,last_run_date INT NOT NULL

    ,last_run_time INT NOT NULL

    ,next_run_date INT NOT NULL

    ,next_run_time INT NOT NULL

    ,next_run_schedule_id INT NOT NULL

    ,requested_to_run INT NOT NULL

    ,request_source INT NOT NULL

    ,request_source_id SYSNAME NULL

    ,running INT NOT NULL

    ,current_step INT NOT NULL

    ,current_retry_attempt INT NOT NULL

    ,job_state INT NOT NULL

    )

 

--capture details on jobs

INSERT INTO @currently_running_jobs

EXECUTE master.dbo.xp_sqlagent_enum_jobs 1,''

 

;WITH JobHistData AS (

                           SELECT job_id,

                                    date_executed=msdb.dbo.agent_datetime(run_date, run_time),

                                secs_duration=run_duration/10000*3600 + run_duration%10000/100*60 + run_duration%100

 

                          FROM msdb.dbo.sysjobhistory with (NOLOCK)

                          WHERE step_id=0                                              -- Job Outcome

                          AND run_status=1                                             -- Succeeded

                          AND msdb.dbo.agent_datetime(run_date, run_time) >= DATEADD(DD, -{Historical_Time_Frame} ,GETDATE())

)      

-- Only jobs run in last X days

                     -- Job Stats (from Job History, get avg duration and avg2stdDev (threshold) filtering by job count and avg duration)

            ,JobHistStats AS (

                            SELECT job_id,

                                    AvgDuration=avg(secs_duration*1.),

                             AvgPlus2StDev=(avg(secs_duration*1.)+2*stdevp(secs_duration) )

                      FROM JobHistData

                      GROUP BY job_id

-- How many runs in history need to have occurred before this alert will fire

                      HAVING count(*)>= {Minimum_Executions}

--Minimum length of time (in seconds) a query must have previously run to be included

                      AND avg(secs_duration*1.)>= {Minimum_Job_Duration} )

 

-- Grab currently running jobs that exceed threshold and return for alerting

         SELECT

              JobName=j.name + ' Avg: ' + CAST(AvgDuration AS VARCHAR) + ' Threshold: ' + CAST(AvgPlus2StDev as VARCHAR),

SecondsElapsed=DATEDIFF(SS, act.start_execution_date, GETDATE())

 

                      FROM JobHistData jd with (NOLOCK)

                      JOIN JobHistStats jhs with (NOLOCK) on jd.job_id=jhs.job_id

                      JOIN msdb..sysjobs j with (NOLOCK) on jd.job_id=j.job_id

                      JOIN msdb..sysjobactivity AS act with (NOLOCK) ON act.job_id = jd.job_id

                      JOIN msdb..syssessions sess with (NOLOCK) on act.session_id = sess.session_id  -- discard jobs without a session
                      JOIN ( SELECT MAX(agent_start_date) AS max_agent_start_date FROM msdb..syssessions ) sess_max on sess.agent_start_date = sess_max.max_agent_start_date  -- only look at the active session

                      AND act.stop_execution_date IS NULL         -- if no stop date, then job is considered running

                      AND act.start_execution_date IS NOT NULL    -- job must have started

                      WHERE secs_duration>AvgPlus2StDev                  -- want to grab jobs running longer than 2StdDev from mean

                     AND DATEDIFF(SS, act.start_execution_date, GETDATE()) > AvgPlus2StDev  -- want jobs that started recently

                      GROUP BY jd.job_id, j.name, AvgDuration, AvgPlus2StDev,act.start_execution_date;

Last modified

Tags

Classifications

Public