Archive for the ‘Sys Admin’ Category

MySQL Problems Update

Tuesday, June 15th, 2004

I’m still having trouble with our MySQL server. It stops responding at least once a day. I’ve tried just about everything I can think of. I even dumped all the databases, dropped them and imported all the data again. No luck. I think I’ll probably have to install the debug edition and try and figure out what’s locking it up.

MySQL Woes

Tuesday, June 8th, 2004

We’ve been having some trouble with our MySQL server running on the HP-UX machine. It has crashed multiple times over the last few days. When it crashes it becomes completely unresponsive. No clients can connect and any clients that were connected don’t respond. I tried keeping a few windows open to monitor the processlist and extended-status but they didn’t help a whole lot when the server crashed (all values seemed normal when the crashes occurred). I was even unable to shut down the server. I tried running the script to stop the server but the server wouldn’t respond to any kill commands. Basically the only thing I could do was kill the process with a “kill -9” (NEVER A GOOD THING).

I’ve been watching the logs and I haven’t found anything out of the ordinary. Nothing significant in the error log. Nothing in the binary log and nothing in the query log file. I assumed it was one particular query that was killing the server but the last queries in the log files never matched. I checked the system log files and couldn’t find anything suspicious except for one thing. The server was restarted a few days ago for unknown reasons. I put in a ticket to engineering asking why the server had been restarted (assuming they’d made some changes). I got a response back indicating the server had crashed. I suspected the database might have some corruption problems due to the kill -9’s but it wasn’t until I learned the server had crashed that I thought corruption could be the cause of the problem.

I decided to take the server down and run myisamchk. It found quite a few errors. I ran it again with the --recover option. Just to be safe I then ran a CHECK TABLE [table name] EXTENDED on all the tables (MyISAM and InnoDB) to verify they all had an “OK” status. They did. Hopefully this solved the problem. We’ll have to see how things go tomorrow.
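
For anyone who wants to script that last check instead of typing it per table, here’s a rough sketch. It assumes the table names arrive one per line on standard input (for example from the mysql client); the function name check_table_sql is made up for illustration and the database name in the usage note is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: read table names on stdin (one per line) and print a
# CHECK TABLE ... EXTENDED statement for each one. Blank lines are skipped.
check_table_sql() {
    while IFS= read -r table; do
        [ -n "$table" ] || continue
        printf 'CHECK TABLE `%s` EXTENDED;\n' "$table"
    done
}
```

Something like mysql -N -e 'SHOW TABLES' somedb | check_table_sql | mysql somedb would then run the checks, where somedb stands in for the real database name and any login options are left out.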

MySQL Straight_Join

Thursday, May 20th, 2004

I learned the benefit of using the “straight_join” keyword while working on a complex query at work today. For some reason the query would only complete if I ran it from the mysql client on the MySQL server itself. All remote connections would simply die. Actually they didn’t die, they would hang. I watched the processlist as the query was running. The status indicated “copying to tmp table”. Eventually the query would disappear from the processlist but no results or information would return and the client appeared to still be waiting for a response. I tried increasing the tmp_table_size variable but that didn’t help. I suspect it has something to do with the tmp directory (possibly not enough space available or something like that).

Anyway, I was able to work around the problem by rewriting the query using the straight_join keyword. Apparently MySQL isn’t necessarily good at choosing the join order in complex queries. With straight_join, MySQL joins the tables in the order they’re listed in the query. By placing the table I assumed to be the least common denominator first and specifying straight_join I was able to cut a few minutes off the query’s run time. The new query also completed successfully on the remote clients. Now…if I could just figure out why the first query hung.
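
To give an idea of the shape of such a query, here’s an illustrative sketch wrapped in a small shell function so it can be piped into the mysql client. This is not the query from work: every table and column name below is hypothetical, as is the function name.

```shell
#!/bin/sh
# Print an illustrative query. All table and column names are hypothetical.
# SELECT STRAIGHT_JOIN asks MySQL to join the tables in the order they are
# listed in the FROM clause instead of letting the optimizer reorder them.
straight_join_example() {
    cat <<'SQL'
SELECT STRAIGHT_JOIN e.event_name, a.last_name, d.amount
FROM events AS e                 -- table expected to be smallest goes first
INNER JOIN donations AS d ON d.event_id = e.id
INNER JOIN alumni AS a ON a.id = d.alumni_id
WHERE e.event_year = 2004;
SQL
}
```

Putting EXPLAIN in front of the SELECT, with and without STRAIGHT_JOIN, is a quick way to see whether the join order actually changes.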

Root on RAID 1 With Debian Woody

Saturday, May 15th, 2004

When I reconfigured my server using Debian I wanted the same hard drive RAID setup I had with Red Hat. Basically a completely redundant file system that would boot off either drive in the event of a failure. I found a number of howtos (this one, and this one). I followed both sets of instructions but still could not get the system to boot. I was 95% sure it was a problem with my initrd.img file since I was getting cramfs errors and a kernel panic with “kill init”. So, for the benefit of those who have the same problem, here’s what I did.

(more…)

Dangerous Grep

Sunday, May 9th, 2004

If you’re in the root folder of your filesystem, don’t type grep -r "some stuff to search for" *. Why? Because you’ll get a message like “grep: memory exhausted” and your system won’t respond (at least it won’t respond to ssh, http, smtp, etc). Actually, it appears it just crashes the network port. Executing a simple ifdown eth0 followed by an ifup eth0 from the console appeared to solve the problem. Then again, if you’re like me and your server is housed 20 minutes away and this happens at midnight it’s a little inconvenient to go access the server.
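
A more defensive way to run that kind of search might look like the sketch below. It assumes GNU grep (for the -I option), and the function name safe_grep is made up. I can’t promise it avoids every memory problem, only that it keeps grep away from device nodes, /proc and files it recognizes as binary.

```shell
#!/bin/sh
# Hypothetical wrapper: search only regular files under a starting directory.
#   find -xdev     stays on one filesystem, so a mounted /proc is not entered
#   find -type f   skips device nodes, sockets and FIFOs
#   grep -I        treats binary files as if they contained no match
#   grep -H        always prints the file name with each matching line
safe_grep() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -xdev -type f -exec grep -I -H -e "$pattern" {} +
}
```

For example, safe_grep "some stuff to search for" /etc limits the search to /etc.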

Why does this happen? Well, apparently grep keeps appending to its buffer until it comes across a newline character ‘\n’, at which point it clears the buffer and starts reading the next line. This is a problem when it tries to search a large binary file (like something in /dev or /proc). I should have known better and excluded /dev and /proc to begin with, or even better run the command inside a subfolder such as /etc (which is probably all I needed to search). I’ll remember that next time. Still, after this, I’m nervous about executing a grep command. What if it comes across some large binary file I didn’t anticipate? I’m thinking of writing a simple shell script and adding it to cron to run every fifteen minutes and cycle the network interface if it can’t ping the gateway. I think that should take care of it, but that solution seems a little hackish. Anyway, I’m open to suggestions for a better one.
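
The watchdog idea could be sketched roughly like this. The gateway address is a placeholder, the function names are invented, and the exit-status comment assumes the iputils version of ping.

```shell
#!/bin/sh
# Sketch of the cron watchdog idea. GATEWAY and IFACE are placeholders and
# can be overridden from the environment.
GATEWAY=${GATEWAY:-192.168.1.1}   # assumed gateway address, adjust to suit
IFACE=${IFACE:-eth0}

gateway_reachable() {
    # Send three probes; iputils ping exits 0 if at least one reply arrives.
    ping -c 3 "$GATEWAY" >/dev/null 2>&1
}

cycle_interface() {
    ifdown "$IFACE"
    ifup "$IFACE"
}

watchdog() {
    if gateway_reachable; then
        return 0
    fi
    # Leave a trace in syslog so the restarts can be counted later.
    logger "gateway $GATEWAY unreachable, cycling $IFACE"
    cycle_interface
}
```

Saved as a script with a final line that calls watchdog, it could run from root’s crontab with an entry such as */15 * * * * /usr/local/sbin/net-watchdog.sh (the path is hypothetical). It’s still a hack, since it treats the symptom and not the cause.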

Some Subtle Changes

Saturday, May 8th, 2004

I finally got around to making some upgrades and changes I’ve been planning for a while. Since Red Hat 9.0 had reached the end of its life cycle and I wasn’t interested in moving to Fedora, I decided this was a good time to move to Debian. Initially I planned to bring up a temporary system while I rebuilt the main server but I decided I might as well just get it over with. So, after I backed up the server I wiped it clean and started from scratch.

The installation was a little tricky since I was using a multiprocessor system with a RAID 1 setup (software). Getting both processors working was simple. I used apt-get to install the SMP kernel and that was it. The RAID was quite a bit more involved and troublesome but I’ll save that “Howto” for another time.

For the most part I’ve built the system using the Debian packages. I wanted to use Exim as my SMTP server but I also wanted it to use MySQL to store the configurations for virtual domains and email usernames. Since that wasn’t possible with the Debian Exim package I built it from source by following these instructions. I also built Qpopper from source using the same instructions. The instructions are OK but seemed a little incomplete in parts. Basically I just looked at the Exim and Qpopper configuration files (included at the end of the instructions), commented out the Amavis sections and that was about it. It’s working perfectly.

I also made some changes to my URLs. I moved my weblog from http://www.daylate.com/blog/ to http://www.daylate.com/. So basically that URL is for my blog. I also have another URL, http://www.jwholmes.com, where I moved my picture gallery and such. The best thing is, all the old links work, thanks to mod_rewrite.

So, if you’ve noticed the site has been down a little bit in the past week, this is why. Now that I’ve worked through all the kernel issues it should be up and running for a long time (I hope, knock on wood). Debian really is cool, by the way.

New MySQL Book for the Library

Thursday, April 29th, 2004

The High Performance MySQL book, by Jeremy Zawodny and Derek Balling, finally arrived from Amazon. There are very few tech books I can read from cover to cover but this is one of them. I’m three chapters into it and I have a decent list of things I want to try/change with our MySQL installation at work.

Our installation at work is version 3.23 running on an HP-UX system (first on the list is upgrading to 4.0). I wouldn’t call our implementation trivial. The alumni database has several tables with over a million records. I inherited the implementation when I was hired a little over 8 months ago and I’ve been surprised at how fast everything runs. Especially since I know there wasn’t any real optimization work done when it was designed and implemented. Our alumni database is merely a copy of the actual one, which is running on an IBM AS/400 DB2 system. Today I was running a few queries comparing data in our tables with the tables on the DB2 system. Most of the queries ran against our system, with a few against the DB2 system. Just out of curiosity I decided to reverse the queries, executing them against the DB2 system. The same page (set of queries) loaded about 3 to 4 times faster using our MySQL system. I’m sure the bandwidth limitations between our web server and the DB2 system had something to do with it, but I think it’s still safe to say MySQL outperformed it.

Anyway, with all that said there are a lot of things I’d like to do to increase our performance and reliability. Replication, load balancing, backup and a few other things. The type of things this book was written for. We have another HP-UX machine sitting in the datacenter, unused. Perfect for experimenting. If you’re someone like me, who uses MySQL and knows enough to make things work but wants to take it to the next level, this is the book to read. Also, Jeremy has written some good articles on MySQL in Linux Magazine recently that are worth reading.

Strange IIS Behaviour

Sunday, March 28th, 2004

Unfortunately the only “supported” web servers offered by our IT department are servers running Windows and IIS. As a result I have to deal with unreliable performance, and when we do have a problem it’s basically impossible to debug. When I place a call to engineering about a problem the answer I always get is: “uhhh…ok, well we’ll reboot the server and see if that fixes it.” Brilliant solution.

Well, anyway the other day I noticed a strange problem. I wrote a script to email me any web site errors that occur. Previously the errors would only be submitted if the person who received the error clicked “submit error report.” I was curious to see how many people don’t bother so I decided to submit the reports automatically. The result: I think people clicked “submit” roughly 10 to 20% of the time.

Anyway, I was looking through the errors and I noticed quite a few created by search engines scanning the site (broken link errors and such). However, I noticed a few errors on an area of the site that supposedly no longer existed. I checked the server and sure enough, the files that were causing the errors no longer existed. Just out of curiosity I pointed a browser at the page and was confused when content was returned. Most of the graphics weren’t visible, since they’d been deleted, but all the text was there. I couldn’t figure out why a file I had deleted months ago was still being served so I checked the directory once again to verify that yes, the file was deleted. It was; nowhere to be found. I checked some of the other files I’d deleted in that folder and they were also still being served up. I figured it was a caching problem, but after clearing the IIS cache and clicking “expire content immediately”, neither of which worked, I was stumped. So, I decided to create a file with the same name that would redirect to the home page. I refreshed my browser and sure enough it worked. Then I deleted the file just to see if it would still be served up. Strangely enough this time it actually deleted the file and I got a “file not found” error. Strange strange strange.