What Caused That Load Spike?

Every now and then, the number of Apache processes suddenly increases, the load average spikes, and then everything goes back to normal. In rare cases, the same thing happens but the load average spikes far higher, all queries appear to lock up, and the server must be rebooted. I am looking for ways to determine what causes this. I should note that it happens extremely rarely and has never shown up in a load test.

On the MySQL end, I use SHOW PROCESSLIST to try to figure out what’s causing the issue. However, sometimes there are just 150 queries in there doing nothing (occasionally nothing but SELECTs). I’m guessing it’s either a locking issue or too much disk access.
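One way to make a crowded process list easier to read is to group it by state and pull out the longest-running statements. The following is a minimal sketch, not a definitive tool. It assumes the rows of SHOW FULL PROCESSLIST have already been fetched as dictionaries keyed by the standard column names (Id, Command, Time, State, Info). The function name, the 30-second threshold, and the sample rows are all made up for illustration.

```python
from collections import Counter


def summarize_processlist(rows, slow_seconds=30):
    """Group processlist rows by (Command, State) and collect slow statements.

    `rows` is assumed to be a list of dicts using the SHOW PROCESSLIST
    column names. `slow_seconds` is an arbitrary illustrative threshold.
    """
    by_state = Counter()
    slow = []
    for row in rows:
        command = row.get("Command") or ""
        state = row.get("State") or ""
        by_state[(command, state)] += 1
        # Idle connections report Command == "Sleep"; skip them here.
        if command != "Sleep" and int(row.get("Time") or 0) >= slow_seconds:
            slow.append(row)
    # Longest-running first: the oldest statement is a good first suspect.
    slow.sort(key=lambda r: int(r.get("Time") or 0), reverse=True)
    return by_state, slow


# Invented sample data, not taken from a real server.
sample_rows = [
    {"Id": 1, "Command": "Query", "Time": 95, "State": "Sending data", "Info": "SELECT ..."},
    {"Id": 2, "Command": "Query", "Time": 90, "State": "Locked", "Info": "UPDATE ..."},
    {"Id": 3, "Command": "Query", "Time": 88, "State": "Locked", "Info": "SELECT ..."},
    {"Id": 4, "Command": "Sleep", "Time": 300, "State": "", "Info": None},
]

by_state, slow = summarize_processlist(sample_rows)
for (command, state), count in by_state.most_common():
    print(count, command, state or "(no state)")
```

If most rows share a lock-related state, the statement with the largest Time value that is not itself waiting is a reasonable place to start looking. If the states are mostly I/O related, disk access becomes the more likely suspect.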

On the web server end, it’s a little more difficult. Ideally I’d like to know which URL was originally requested in the hung Apache process. Does anyone know how to figure this out? The server runs Fedora Core release 5 (Bordeaux).
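One possible approach, sketched here under assumptions, is to have Apache log the child process ID (`%P`) and the time taken in microseconds (`%D`) for each request, then look for the slowest entries around the time of the spike. The sketch assumes a hypothetical log format of `%h %l %u %t "%r" %>s %b %P %D`; the function name and the sample lines are invented. Note that the access log is written when a request finishes, so a request that never completes will not show up this way.

```python
import re

# Matches the assumed format: %h %l %u %t "%r" %>s %b %P %D
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'(?P<pid>\d+) (?P<micros>\d+)$'
)


def slowest_requests(lines, top=5):
    """Return up to `top` (seconds, pid, request_line) tuples, slowest first."""
    parsed = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # Skip lines that do not follow the assumed format.
        seconds = int(match.group("micros")) / 1_000_000
        parsed.append((seconds, int(match.group("pid")), match.group("request")))
    parsed.sort(reverse=True)
    return parsed[:top]


# Invented sample lines for illustration only.
sample_lines = [
    '10.0.0.1 - - [01/Jan/2007:12:00:00 +0000] "GET /index.php HTTP/1.1" 200 512 4321 150000',
    '10.0.0.2 - - [01/Jan/2007:12:00:01 +0000] "GET /report.php?id=7 HTTP/1.1" 200 2048 4322 42000000',
    'not a log line',
]

for seconds, pid, request in slowest_requests(sample_lines):
    print(f"{seconds:8.2f}s pid={pid} {request}")
```

The process ID in each entry can then be matched against the Apache processes seen in `ps` or `top` during the spike.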
