Low Memory Computing

There is a seemingly unstoppable trend in computing toward giving applications ever more memory. When we run into a performance bottleneck, one of the easiest fixes is usually to “add more RAM.” At the same time, there is a trend toward hosting websites on virtual private servers. This trend isn’t new: businesses have been consolidating production servers onto virtual servers for a while now. The “new thing” is that more and more people can get their own slice of a real server from a hosting company. I just moved this blog, Urban Pug, and Clean Your Microfiber to a small virtual server from Quantact.com for a very affordable price. We’re at the point where you can split a decent server 80-100 ways and still give everyone decent performance.

There is a problem, however. If you put 8 gigs of RAM in a system with, say, two high-powered Xeon or Opteron processors, you can split the CPU cycles up and guarantee everyone a minimum amount of performance. This part is straightforward. The host system gives all guests as much CPU as they want, but when there’s contention, a “fair” mechanism limits each guest to ensure even distribution of CPU cycles. RAM is different: you can guarantee each guest a fixed amount, but you can’t “burst” RAM the way you can with CPU cycles. The guest operating system can’t just be told, “hey, you have more RAM now.”

That leaves us with pretty damn fast virtual servers that are set to run with small amounts of RAM. The problem THIS creates is that standard applications such as MySQL and Apache, in their modern default configurations, expect a certain amount of RAM to be present. When it isn’t there, you can easily run out of RAM and start swapping to disk a lot. If you’re in this situation, you might actually be better off limiting the applications in some way so they use less RAM (and thus potentially run slower under certain conditions).

It’s not a “win-win” situation. If you need a big fast server, you’ll still have to get a big fast server. If you just need something medium-to-small, it’s possible to do this on the cheap and still get good performance.

So, how do we do this? Basically, it comes down to limiting the RAM MySQL uses and limiting the RAM and number of processes Apache uses (or using other applications altogether, like lighttpd). In my next few posts, I’ll discuss what I’ve learned from tweaking my own VPS, and hopefully get some feedback on how to do a better job.
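To make the “limit the number of processes” idea concrete, here is a rough back-of-the-envelope sketch in Java. Every number in it is a made-up assumption (a 256 MB VPS, 96 MB held back for MySQL and the OS, 20 MB per web server process), and the class and method names are hypothetical. Measure your own server before trusting any of it.

```java
// Rough sizing sketch: how many web server processes fit in a RAM budget?
// All figures used in main() are illustrative assumptions, not measurements.
final class RamBudget {
    private RamBudget() {}

    /**
     * Returns how many worker processes fit once the reserved memory
     * (database, OS, and so on) is subtracted from the total.
     */
    static int maxWorkers(int totalMb, int reservedMb, int perWorkerMb) {
        if (perWorkerMb <= 0) {
            throw new IllegalArgumentException("perWorkerMb must be positive");
        }
        int availableMb = totalMb - reservedMb;
        // Integer division rounds down, which errs on the side of not swapping.
        return Math.max(0, availableMb / perWorkerMb);
    }

    public static void main(String[] args) {
        // Hypothetical VPS: 256 MB total, 96 MB reserved, 20 MB per process.
        System.out.println(maxWorkers(256, 96, 20)); // prints 8
    }
}
```

The result is the sort of figure you would then use as the process cap in the web server’s own configuration, trading peak concurrency for staying out of swap.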

Illegal XML Characters

When you’re working with XML, there are certain characters that are considered “illegal.” The following is a potentially incomplete list of such characters; I’m not an expert on text, Unicode, UTF-8, or the like. All I know is that I have to filter these characters out of any string that I send in a SOAP call in Java.

These are the Java character codes, provided for convenience in a Java array.

private static final char[] XML_ILLEGALS = new char[] {
    0x00, //0
    0x01, //1
    0x02, //2
    0x03, //3
    0x04, //4
    0x05, //5
    0x06, //6
    0x07, //7
    0x08, //8
    0x0B, //11
    0x0C, //12
    0x0E, //14
    0x0F, //15
    0x10, //16
    0x11, //17
    0x12, //18
    0x13, //19
    0x14, //20
    0x15, //21
    0x16, //22
    0x17, //23
    0x18, //24
    0x19, //25
    0x1A, //26
    0x1B, //27
    0x1C, //28
    0x1D, //29
    0x1E, //30
    0x1F //31
};
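For what it’s worth, here is a minimal sketch of how the filtering itself might look. The class and method names (`XmlSanitizer`, `stripIllegal`) are made up for illustration. Instead of searching the array, it uses a range check that covers exactly the same code points as XML_ILLEGALS above: everything below 0x20 except tab (0x09), line feed (0x0A), and carriage return (0x0D). It does not handle the other code points XML 1.0 excludes, such as 0xFFFE, 0xFFFF, and unpaired surrogates.

```java
// Hypothetical helper: strips the control characters listed in XML_ILLEGALS.
final class XmlSanitizer {
    private XmlSanitizer() {}

    /** True for code points below 0x20 other than tab, line feed, and carriage return. */
    static boolean isIllegal(char c) {
        return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
    }

    /** Returns a copy of the input with the illegal characters removed. */
    static String stripIllegal(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!isIllegal(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The BEL character (0x07) is dropped; the tab is kept.
        System.out.println(stripIllegal("bad\u0007char\tok"));
    }
}
```

Dropping the characters silently is only one option. Depending on the service on the other end of the SOAP call, replacing them with a space or rejecting the string outright may be a better fit.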

Note that when doing RSS, there are other characters, like the ampersand, that are illegal as-is and need to be escaped through encoding, but not all encoding is supported everywhere.
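As a sketch of the escaping step, the helper below replaces the five characters that have predefined entities in XML itself (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`). The class name `XmlEscaper` is hypothetical, and this is not a replacement for a real XML library. It just shows the basic idea.

```java
// Hypothetical helper: escapes the five characters XML has predefined entities for.
final class XmlEscaper {
    private XmlEscaper() {}

    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Pugs & <microfiber>"));
        // prints: Pugs &amp; &lt;microfiber&gt;
    }
}
```

Note that running this twice on the same string double-escapes it (`&amp;` becomes `&amp;amp;`), so it should be applied exactly once, at the point where raw text goes into the feed.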

For example, “Oslash” referenced by the W3C here has an acceptable encoding. However, when it appears in an RSS XML feed parsed by Firefox, it’s “not ok.”
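One likely explanation is that XML itself predefines only five named entities (amp, lt, gt, quot, apos). Names like Oslash come from the HTML entity sets, so an XML parser has no definition for them unless the document’s DTD declares one. A possible workaround, and not necessarily the resolution I’ll end up with, is to write such characters as numeric character references, which need no declaration. Oslash is U+00D8, so it becomes `&#216;`. The class name `NumericRefs` below is made up for this sketch.

```java
// Hypothetical helper: rewrites every code point above 0x7E as a numeric
// character reference, leaving printable ASCII untouched.
final class NumericRefs {
    private NumericRefs() {}

    static String toNumericRefs(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7E) {
                // Decimal form of the code point, e.g. 216 for U+00D8.
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumericRefs("\u00D8l")); // prints: &#216;l
    }
}
```

This only deals with non-ASCII characters. The ampersand and friends still need to be escaped separately, and that has to happen before this step so the `&` in each numeric reference is not escaped again.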

Resolution to follow.