Illegal XML Characters

When you’re working with XML, certain characters are considered “illegal”. The following is a potentially incomplete list of them; I’m not an expert on text, Unicode, UTF-8, or the like. All I know is that I have to filter these characters out of any string that I send in a SOAP call in Java.

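These are the Java character codes, provided for convenience in a Java array. They cover every code below 0x20 except tab (0x09), line feed (0x0A), and carriage return (0x0D), which XML 1.0 allows.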

private static final char[] XML_ILLEGALS = new char[] {
0x00, //0
0x01, //1
0x02, //2
0x03, //3
0x04, //4
0x05, //5
0x06, //6
0x07, //7
0x08, //8
0x0B, //11
0x0C, //12
0x0E, //14
0x0F, //15
0x10, //16
0x11, //17
0x12, //18
0x13, //19
0x14, //20
0x15, //21
0x16, //22
0x17, //23
0x18, //24
0x19, //25
0x1A, //26
0x1B, //27
0x1C, //28
0x1D, //29
0x1E, //30
0x1F //31
};
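
To show how the filtering might look in practice, here is a minimal sketch. The class name XmlIllegalFilter and its methods are made up for illustration and are not part of any SOAP library. Instead of searching the array, it uses a range check that matches exactly the same codes (everything below 0x20 except 0x09, 0x0A, and 0x0D).

```java
// Hypothetical helper, not taken from any library: removes the control
// characters listed in the array from a string before it goes into XML.
final class XmlIllegalFilter {

    // True for 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F, the same codes as the array.
    static boolean isIllegal(char c) {
        return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
    }

    // Returns a copy of the input with every illegal character dropped.
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!isIllegal(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling XmlIllegalFilter.strip(value) on each string just before it is placed in the SOAP request would be one way to use it. Like the array, it only deals with the low control characters and makes no claim to cover everything XML forbids.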

Note that when doing RSS, there are other characters, like the ampersand, that are only legal once they have been escaped through encoding, but not every encoding is supported everywhere.

For example, “Oslash”, referenced by the W3C here, has an acceptable encoding. However, when it appears in an RSS XML feed parsed by Firefox, it’s “not ok”.
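
One possible workaround, not necessarily the resolution this post has in mind: named entities such as Oslash come from HTML, while XML itself predefines only five (amp, lt, gt, quot, apos). Numeric character references are part of XML itself, so they do not depend on the parser knowing the name. The sketch below escapes the five special characters and writes everything outside ASCII as a numeric reference. The class name XmlTextEscaper is hypothetical, and whether this fixes the Firefox case described above is an assumption that has not been tested here.

```java
// Hypothetical helper: escapes text for use inside an XML or RSS element.
final class XmlTextEscaper {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);   // handles surrogate pairs
            i += Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    if (cp < 0x80) {
                        out.append((char) cp);                  // plain ASCII stays as-is
                    } else {
                        out.append("&#").append(cp).append(';'); // decimal numeric reference
                    }
            }
        }
        return out.toString();
    }
}
```

With this, the character Ø (U+00D8) is written as a decimal numeric reference rather than as the named entity. The sketch does not remove the illegal control characters from the list above, so that filtering would still have to happen separately.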

Resolution to follow.

Tips for Debian

Here are two things that I think are important, or at least they became important for me, when configuring a Debian server.

First off, the Debian box I’m configuring is behind a Linksys NAT firewall/router.
Secondly, the users of the system are not on the same network (they are one IP address off in the ‘private’ network, separate from the ‘public’ network).

Ok, now for the problems: logging in with ssh is SLOW and sending mail out is SLOW.

The solutions:

1) Add “UseDNS no” to your /etc/ssh/sshd_config file, then restart ssh. This stops sshd from performing a reverse DNS lookup on each host that logs in. The “slowness” comes from there being no DNS entry: the lookup has to time out before sshd will let the user log in. If the problem is still not fixed, try upgrading to a newer version of your software; I’ve heard of problems with PAM authentication mucking with things (and this has been fixed).

2) In /etc/exim4/exim4.conf.template, comment out the line that says “host_lookup = *”. Then run update-exim4.conf and restart exim. This does the same thing as the sshd trick. You can also try changing “rfc1413_query_timeout” to “0s”.
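
As a rough sketch of what the two files might contain after these changes (the surrounding lines and exact layout are assumptions and will differ between Debian releases):

```
# /etc/ssh/sshd_config
UseDNS no

# /etc/exim4/exim4.conf.template
# host_lookup = *
rfc1413_query_timeout = 0s
```

Neither change takes effect until the corresponding service has been restarted, as described in the steps above.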

Hope this helps someone!