Uncontrolled Growth on the OpsMgr DB
All last week the OpsMgr DB was growing. It turned out that the grooming job could not run at midnight.
Some Symptoms
1. The RMS was red with many DB-related errors.
2. Many objects were grey.
3. Shortly after midnight the OpsMgr DB went into recovery mode.
4. An event was logged saying the grooming had failed.
The stored procedure p_PartitioningAndGrooming was failing when it ran p_AlertGrooming.
In short: there was not enough disk space (and therefore log space) to run the grooming job.
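To confirm that log space is the bottleneck, something like the following can be run on the SQL Server hosting the database. This is only a sketch: it assumes the default database name OperationsManager, and xp_fixeddrives is an undocumented procedure, so treat its output as a rough guide.
----------------------------
-- Percentage of each database's transaction log currently in use
DBCC SQLPERF(LOGSPACE);

-- Free space in MB on each fixed drive of the server (undocumented procedure)
EXEC master..xp_fixeddrives;

-- Size, maximum size and growth settings of the OpsMgr data and log files
USE OperationsManager;
EXEC sp_helpfile;
----------------------------
If the log is close to 100% used and the drive holding it has little free space, the grooming job has no room to work.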
The solution I came up with:
p_AlertGrooming builds a huge temporary table containing all alerts that are to be deleted, and the log file needed about 5 GB to hold it.
I increased the DaysToKeep setting for alerts, so that each run had fewer alerts to delete and needed less log space, then re-ran the cleanup.
By then decreasing the value one day at a time and re-running the job after each change, I was able to reduce the DB size in small steps.
Last Friday the job ran normally.
----------------------------
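-- Raise the alert retention so this run only deletes alerts older than 30 days
UPDATE dbo.PartitionAndGroomingSettings
SET DaysToKeep = 30
WHERE ObjectName = 'Alert';
-- Discard the inactive log without backing it up (this breaks the log backup chain;
-- TRUNCATE_ONLY works on SQL Server 2005 but was removed in SQL Server 2008)
BACKUP LOG OperationsManager WITH TRUNCATE_ONLY;
-- Groom alerts using the new DaysToKeep value
EXEC p_AlertGrooming;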
----------------------------
I started at a value of 30 and then reduced it, one day at a time, back down to the original value of 7.
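The manual step-down could also be scripted as a loop. The following is a minimal sketch rather than what I actually ran: it assumes SQL Server 2005 (where TRUNCATE_ONLY is still available), the default database name OperationsManager, and the start and end values of 30 and 7 used above. The variable @days is just an illustrative name.
----------------------------
USE OperationsManager;

-- Note the current setting before changing anything
SELECT ObjectName, DaysToKeep
FROM dbo.PartitionAndGroomingSettings
WHERE ObjectName = 'Alert';

DECLARE @days int;
SET @days = 30;            -- starting retention (assumed, as in the manual run)

WHILE @days >= 7           -- 7 was the original DaysToKeep value
BEGIN
    -- Each pass only has to delete roughly one more day of alerts
    UPDATE dbo.PartitionAndGroomingSettings
    SET DaysToKeep = @days
    WHERE ObjectName = 'Alert';

    -- Free up log space before the next grooming pass
    BACKUP LOG OperationsManager WITH TRUNCATE_ONLY;

    EXEC p_AlertGrooming;

    SET @days = @days - 1;
END
----------------------------
When the loop finishes, DaysToKeep is left at 7. Because truncating the log breaks the log backup chain, a full backup afterwards is a sensible precaution if the database uses the full recovery model.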
Keywords:-
OpsMgr, SCOM, 2007, Database, Logfile, Growth, Uncontrolled.