Turning a “pain-relieving pill” request into a cure!
Just this week, a major local retail group with over 200 outlets turned to us for urgent help. Their software support vendors had advised them to replace and expand their disk storage infrastructure immediately or face continuing and escalating performance problems in their SQL Server operations. When they tried to understand the frequent performance degradation, their existing database performance monitoring tool, SolarWinds, showed repetitive patterns of heavy disk traffic, with readings of 10K IOPS or more, equating to hundreds of gigabytes of data volume. But seeing the problem in SolarWinds was one thing; finding the cause was something the analysis tool could not do.
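To put the "10K IOPS" figure in perspective, here is a rough back-of-the-envelope sketch of how an IOPS reading translates into data volume. It assumes each I/O moves a single 8 KB SQL Server page and that the rate is sustained for a full hour; neither the actual I/O size nor the time window was part of the customer's readings, so treat the result as an order-of-magnitude illustration only. The function name `data_volume_gib` is ours, purely for illustration.

```python
# Assumption: every I/O transfers exactly one 8 KB SQL Server data page.
# Real I/O sizes vary (read-ahead can issue much larger requests),
# so this gives a conservative lower bound.
PAGE_SIZE_BYTES = 8 * 1024


def data_volume_gib(iops: float, seconds: float,
                    io_size_bytes: int = PAGE_SIZE_BYTES) -> float:
    """Approximate data moved, in GiB, for an IOPS rate sustained over `seconds`."""
    return iops * io_size_bytes * seconds / 1024 ** 3


if __name__ == "__main__":
    # Assumption: the 10K IOPS peak is held for one hour.
    one_hour = data_volume_gib(10_000, 3600)
    print(f"{one_hour:.1f} GiB per hour at 10K IOPS")
```

Even under these conservative assumptions, an hour at that rate already runs into the hundreds of gigabytes, which is consistent with the volumes reported above.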
The solution the vendors recommended would cost many thousands of dollars, would take several months to implement, and would impact operations severely while being installed. Worse, the problem would persist until the whole process was finished. In desperation, they asked us if there was any short-term solution that could carry them over until all the new hardware and software were in place, tested and operational. They were looking for a ‘pill’ to ease the pain until the ‘surgery’.
To provide that short-term remedy, we had our flagship DPM tool installed, up and running within 15 minutes of their request. Immediately, the pattern of heavy disk traffic was obvious, as shown here in these two images.
Up to the point of our intervention, the cycle of extremely high IOPS levels is clear. But AimBetter allows drill-down to the level below, to see what caused this, as seen here …
Taking the drill-down one level further, AimBetter enabled analysis of the single query that was generating the disk traffic, with full exposure of the calling app, the resources it used, the plan being executed, and much more. Within a few minutes, we could recommend a patch to the code, and the effects were immediately apparent on the right-hand side of the top picture. IOPS dropped from frequent peaks of around 10K to barely one-quarter of that, and the average level fell to around one-tenth of where it was prior to our fix.