HTML & more
Monday, June 25th, 2007

If you’ve ever coded anything in raw HTML, you have likely come across the need to escape a special character now and then. For example, you should never use a bare ampersand (&), greater-than (>), or less-than (<) symbol in raw HTML, because those symbols are special control characters. Instead you use coded character entities, sometimes called ampersand-escape sequences because they look like an ampersand, followed by a code, followed by a semicolon (see what the raw ampersand is used for?).
So, to get an ampersand you would put &amp; in your source. There are a handful of commonly used escape sequences like this one, such as &quot; = ", &lt; = <, and &gt; = >. These mnemonic escapes are nice, but they don’t cover everything. To get at the rest, you use &#NNN; where NNN is the decimal character code in the ISO-8859-1 character set (strictly, the Unicode code point, which matches ISO-8859-1 for codes up to 255) …. uh, yeah whatever that means. Sometimes I think it would be easier to just look at a list and find the code for the character you want.
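As a rough illustration of the mnemonic escapes above, here is a minimal sketch of a shell function that runs text through awk and swaps the four special characters for their entities. The name escape_html is made up for this example and is not a standard tool. The ampersand has to be replaced first, otherwise the ampersands introduced by the other replacements would get escaped a second time.

```shell
# escape_html: hypothetical helper name, used only for illustration.
# Reads text on stdin and writes it with &, <, > and " replaced by entities.
escape_html() {
  awk '{
    # In an awk replacement string a bare & means "the matched text",
    # so "\\&" is needed to produce a literal ampersand.
    gsub(/&/, "\\&amp;")    # must run first
    gsub(/</, "\\&lt;")
    gsub(/>/, "\\&gt;")
    gsub(/"/, "\\&quot;")
    print
  }'
}
```

For example, piping the line `a < b && "c" > d` through escape_html should give `a &lt; b &amp;&amp; &quot;c&quot; &gt; d`.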
Here’s an awk script that generates a primitive HTML list of the characters for codes up to 1000:
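awk 'BEGIN {
    print "<html><body>";
    # Codes below 32 are control characters with nothing visible to show.
    for (i = 32; i <= 1000; i++) {
        # &amp;# renders as a literal "&#", so each line shows the code, then the character.
        printf "&amp;#%03d; = &#%03d;<br/>\n", i, i;
    }
    print "</body></html>";
}' > ampersand.html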

