Webserver status update.

It’s clear that our current hosting solution is like trying to fit 10 pounds of shit into a 5 pound sack, as my mother is often wont to say, but we’re managing to limp along. Some days are better than others, but the performance is clearly not what I’d like it to be, mainly because we’re trying to run in an environment that is less than the minimum recommended for the package we’re using.

In blocking some of the more aggressive search engine crawlers out there, we’ve also managed to block Bloglines from scanning our RSS feeds, so folks who were keeping up with us via that conduit are missing out, and it’s not clear which block we need to remove to restore that functionality.

Occasionally when trying to view the site you may get an error message instead, indicating that the script has used up all the available memory. When that happens, simply hitting reload should be enough to get the site to render. We’re just bumping against our limits and causing the script to crap out on occasion, is all.

In addition to that, I need to have our new hosting providers set up a reverse DNS mapping so that email notifications can be sent to AOL and Roadrunner, as those are currently failing.

I’ve held off on putting the ticket in because I’m kicking around whether to tough it out here by stepping up to the next available package or to restart the search for a better hosting solution. I’m also debating whether to stick with ExpressionEngine or see if I can make use of one of the other packages out there that aren’t as resource intensive. Of course, not being overly familiar with what the minimum requirements of other packages happen to be doesn’t make that choice any easier. I could end up switching to something else only to find it’s just as bad under the load SEB generates as EE is.

Anyway, just wanted to let you guys know that while we may be limping along at the moment, we’re not ignoring the situation in the least. Options are still being considered and alternatives are being explored. Bear with us; this could take a while.

11 thoughts on “Webserver status update.”

  1. I was messing around behind the scenes a bit, but I haven’t had time to do something decisive. From what I can tell, today we’ve seen another instance of overabundant search engines.

    My personal headache is that even instructing the webserver to tell them to bugger off doesn’t do squat against the sheer number of requests. Something I wanted to set up and held off on is a throttling module. I suppose it’s time to go and do it.

    At the risk of being repetitive, I see but two long-term solutions: reducing SEB’s memory and CPU footprint, which almost certainly involves a painful migration to another script, and/or shunting search engines into a search-engine-optimized clone of SEB. If somebody has any ideas, don’t be shy.

  2. You could try changing Apache to run on say port 81 and then set up Squid as a proxy server on port 80 to retrieve the pages from Apache. That way you should drastically reduce the CPU and memory usage as instead of Apache, PHP, MySQL having to recompile each page, Squid can just pull its cached copy straight off the hard drive.

    Other optimisation techniques include making semi-dynamic content (such as the “SE Comments” on the left) be generated every 10 minutes to a plain HTML file that is then included on the page (again, instead of generating it each time), tweaking the MySQL configuration, stripping PHP “to the bare bones” (do you really need all the modules that are compiled in?), tweaking Apache’s MaxClients setting, disabling KeepAlive in Apache, and a few other speed tweaks.

  3. Thanks for the suggestions, Richy.

    You could try changing Apache to run on say port 81 and then set up Squid as a proxy server on port 80 to retrieve the pages from Apache. That way you should drastically reduce the CPU and memory usage as instead of Apache, PHP, MySQL having to recompile each page, Squid can just pull its cached copy straight off the hard drive.

    I’m receptive to that idea.

    EE does its own caching, but I have no idea how effective it is. On top of that, I can see that it’s vastly more expensive to use EE instead of Squid, which is optimized for that job.

    Depending on the kind of access control Squid is capable of, it may also be possible to punt on unwanted spiders well before they hit Apache.

    Other optimisation techniques include making semi-dynamic content (such as the “SE Comments

  4. Richy said:

    You could try changing Apache to run on say port 81

    Just a technicality, but I beg to differ.

    IANA reserves ports 0-1023 as “well-known” ports that cannot be reassigned. Port 81 is reserved for “HOSTS2 Name Server.” Here is the list of (well-known ports)

    I’m not too familiar with Squid, but for the same reason I listed above, you probably cannot set up Squid on port 80.

    HTTP Proxy servers are commonly set up on 8080. 

    Probably the most difficult thing about the setup you suggested is actually getting your traffic to route through the alternate port.

    Elwed said:

    to tell them to bugger off doesn’t do squat against the sheer number of requests.

    Not much you can do about that, except to make sure that those requests take as little time to execute as possible. As I suggested before, and you have reiterated, the dynamic content is ultimately what is slowing you down and causing your resource usage to explode:

    SEB said:

    Page rendered in 8.5140 with 66 SQL queries.

    Sorry to rain on your parade… I really don’t have any ideas to suggest other than rendering more content statically.

    Shawn.

  5. Not at this time, no. We made the jump to a Virtual Private Server solution elsewhere. It’s just not quite as robust as we really need.

  6. It’s time to get more serious about performance testing. The primary objective is to harden against traffic spikes, a secondary objective is to improve performance under normal load.

    Since I can’t recreate the exact environment, a few simplifying assumptions are necessary. Specifically, using an elderly PC with RAM limited to the guaranteed allotment to the VPS and no swap should be close enough – just as long as performance differences scale proportionally to the live VPS.

    To properly regress and benchmark sample configurations, it looks like feeding siege URL lists culled from the server logs will do nicely.

    Now the trick will be finding the time to do all of that…

  7. Les, have you talked to Nevin at Pmachine hosting? I bet he would have some great advice for you. Maybe he would even give you a screaming deal for hosting.

  8. Not as of yet, but only because I’ve looked at the packages they offer and my needs are way beyond that. I have talked about what requirements are needed with Paul, though, and that’s how I learned I had undershot a bit.
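
For anyone who wants to try Richy’s suggestion above of regenerating semi-dynamic content every 10 minutes to a plain HTML file, here is a rough sketch of the idea in Python. It is not how EE or SEB actually does anything: the function names, the file path, and the `render` callable are all made up for illustration, and a real setup would more likely run something like this from cron.

```python
import os
import tempfile
import time

MAX_AGE_SECONDS = 10 * 60  # the "every 10 minutes" from Richy's comment


def fragment_is_stale(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the cached fragment is missing or older than max_age."""
    now = time.time() if now is None else now
    try:
        return now - os.path.getmtime(path) > max_age
    except OSError:
        # No file yet (or it is unreadable), so it needs to be built.
        return True


def refresh_fragment(path, render, max_age=MAX_AGE_SECONDS):
    """Rebuild the fragment only when it is stale; return True if rebuilt.

    `render` is a hypothetical callable that returns the HTML string, for
    example the database-backed code behind a recent-comments sidebar.
    """
    if not fragment_is_stale(path, max_age):
        return False
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file and rename it into place, so a page that includes
    # the fragment never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(render())
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file owner-only
    os.replace(tmp_path, path)
    return True
```

The page template would then include the static file instead of running the queries on every request, so the expensive work happens at most once per interval no matter how many visitors or crawlers show up.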

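Elwed’s plan of feeding siege URL lists culled from the server logs could be sketched along these lines. This assumes the logs are in Apache’s Common or Combined Log Format, which the thread never confirms, and the function name and base URL are invented for the example. It keeps only GET requests that returned 200 and orders them by how often they were requested.

```python
import re
from collections import Counter

# Matches the request and status fields of Apache Common/Combined Log Format,
# e.g. ... "GET /index.php HTTP/1.1" 200 2326
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[0-9.]+" (\d{3}) ')


def cull_urls(log_lines, base_url, limit=None):
    """Return full URLs for successful GET requests, most requested first."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            hits[match.group(1)] += 1
    # base_url should point at the test box, not the live site.
    return [base_url.rstrip("/") + path for path, _ in hits.most_common(limit)]
```

Writing the result to a file, one URL per line, gives something siege can read with its `-f` option. Note that this version throws away the request counts beyond using them for ordering, so it will not reproduce the real traffic mix.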