Tuesday, May 12, 2015

XML syntax

I have always been annoyed by imperfections on webpages, e.g., weird characters. One of the websites I use a lot has the following defect:
while the other one doesn't:
I have always wanted to figure out what causes this problem.

Last night, I discovered a new feature in my browser: view source (forgive my ignorance). So naturally I looked at the source code of both websites. At first I couldn't figure out the reason, because of my lack of knowledge of XML:

XML has a feature called entity references: to avoid errors when parsing special characters in XML, such characters are replaced with entity references, e.g., '<' -> '&lt;'
There are five predefined entity references: &lt; (<), &gt; (>), &amp; (&), &apos; ('), and &quot; (")
source: http://www.w3schools.com/xml/xml_syntax.asp
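
To see the replacement in action, here is a minimal sketch using Python's standard library (my choice of language, not something either website uses). The sample string and the names EXTRA, REVERSE, raw, encoded and decoded are made up for illustration. Note that escape() only handles &, < and > by default, so the two quote characters are passed in as an extra mapping.

```python
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; the two quote characters
# have to be supplied as an extra mapping.
EXTRA = {'"': "&quot;", "'": "&apos;"}
REVERSE = {entity: char for char, entity in EXTRA.items()}

# Hypothetical text containing all five special characters.
raw = 'if a < b & b > c: print("it\'s fine")'

# Character -> entity reference (what a site should do before
# putting text into XML).
encoded = escape(raw, EXTRA)

# Entity reference -> character (what a parser does when reading it back).
decoded = unescape(encoded, REVERSE)

print(encoded)
print(decoded == raw)
```

The round trip gives back the original text, which is the whole point: the entity reference only exists in the source, and the reader should see the character.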
So the answer to the previous question is: the first website displays the entity reference itself instead of the character it stands for. I believe a small code change should fix this problem, and I hope they make it.
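
I don't know what the first website's code actually looks like, but one common way to end up with a visible entity reference is escaping the same text twice. The sketch below is only a guess at that scenario, again in Python with the standard html module; the title string and variable names are invented for the example.

```python
from html import escape, unescape

# Hypothetical page title containing special characters.
title = "Tom & Jerry <remastered>"

# Escaped once: the correct form to put in the page source.
once = escape(title, quote=False)

# Escaped twice by mistake: every '&' from the first pass
# gets escaped again.
twice = escape(once, quote=False)

# A browser decodes exactly one level of entity references.
shown_once = unescape(once)    # reader sees the real characters
shown_twice = unescape(twice)  # reader sees the entity references

print(shown_once)
print(shown_twice)
```

If this is what is happening, the fix would be to escape the text exactly once before it goes into the page.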

I don't have much knowledge of XML, but here are some useful resources:

W3Schools
W3C

I will probably dive a little deeper into the topic, e.g., namespaces: http://www.w3.org/1999/xhtml/

P.S. All copyrights are reserved by the websites from which I took the snapshots. Nevertheless, they are great websites. :)


