Standards for Markup Languages

You may have noticed that markup languages all seem to follow a similar set of rules. In part this is because they are used to do similar things. There is more to it than that, though: in many cases the languages are more similar to one another, and more compatible, than the similarity of their tasks alone would explain.

The main reason for the similarity is that many markup languages are defined using the same standard. That standard is the Standard Generalized Markup Language (SGML), which is a standard for defining markup languages. Those who define a markup language using SGML have all sorts of issues taken care of for them in advance. The language follows known rules that other markup languages already use, and so it can reuse processing that has already been developed for those languages, which makes implementing a new markup language easier.

Often those developing a new markup language are unaware of SGML and reinvent the wheel: they develop the language completely from scratch, using their own rules for how they expect it to work, and those rules may or may not work in practice. There is nothing hugely wrong with this. A non-standard markup language is usually good enough to test the feasibility of an idea, and it can be converted to follow the standard once the concept has been proven and needs to be developed properly to handle the issues that were overlooked.

Such was the case with HTML (HyperText Markup Language). When it was first created it didn't follow the standard the way that other markup languages of the time did. The benefits of doing so were soon realised, though, and HTML was rewritten to follow the SGML standard. That version was released as HTML 2.

The rapid growth of the web and its HTML pages introduced the concept of markup languages to a lot of people who had never come across them before, and it was soon realised that there were lots of other places where similar markup languages could be used. One problem, though, is that SGML is fairly complicated, because it has to cater for all sorts of complexities in markup languages, and many of those wanting to develop new markup languages were not experienced enough to use it properly.

There was an obvious demand for something simpler than SGML: a way of developing markup languages that follow a smaller set of rules. XML (eXtensible Markup Language) was developed to fill this gap.

XML was itself derived from SGML and is basically a subset of the SGML rules, chosen to make it easier to develop further useful markup languages. Among the simplifications, XML requires that tags always be enclosed in < and > and that every tag be closed. The result was a standard for developing new markup languages that look similar to HTML but follow slightly different, stricter rules, which makes writing code to process them simpler.
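
The "every tag must be closed" rule can be illustrated with a short Python sketch. It assumes the standard library's xml.etree.ElementTree parser as a stand-in for any generic XML processor; the helper name is_well_formed and the two sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for unclosed or mismatched tags, among other errors.
        return False


# Hypothetical samples: the same paragraph written two ways.
closed = "<p>First line<br/>second line</p>"    # every tag is closed
unclosed = "<p>First line<br>second line</p>"   # HTML-style <br>, never closed

print(is_well_formed(closed))
print(is_well_formed(unclosed))
```

The first string is accepted and the second is rejected, because the parser reaches </p> while <br> is still open. Since the rule is the same for every XML language, one parser can do this check for all of them.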

Once XML became available it was realised that if HTML were redesigned to use the XML rules, it would then be compatible with all the other new markup languages that could be developed from XML. HTML 4 was the current version at the time, and so a new variant of HTML that follows the XML rules was developed from it. The new HTML that is valid XML was called XHTML, and version one of that markup language was released in January 2000, shortly after HTML 4.01.

What all this means is that, because of the development of all these different standards, we now have a markup language called XHTML which is somewhat similar to HTML and which can also be used in conjunction with other markup languages based on XML for all sorts of different uses. The only thing holding back all this additional cross compatibility is that not all of the popular browsers support XHTML served as XML, and it therefore cannot be relied on for web pages published on the internet. That limits its cross compatibility to situations where you have control over which web browsers will be used to access the page. Of course, you also need other programs that can handle the other XML languages built into your content in order to make use of that content in other places.
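
As a rough sketch of what using XHTML "in conjunction with" another XML language looks like, the Python below parses a small XHTML page with an SVG drawing embedded in it. The sample document and the helper count_by_namespace are invented for illustration; the two namespace URIs are the ones W3C defines for XHTML and SVG, and the parser is again the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical page: XHTML with an SVG drawing embedded in the body.
document = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Mixed content</title></head>
  <body>
    <p>A circle drawn with SVG:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <circle cx="50" cy="50" r="40" />
    </svg>
  </body>
</html>"""


def count_by_namespace(root):
    """Count the elements in the tree that belong to each XML namespace."""
    counts = {}
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace}localname".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}")[0]
        else:
            namespace = ""
        counts[namespace] = counts.get(namespace, 0) + 1
    return counts


root = ET.fromstring(document)
counts = count_by_namespace(root)
print(counts[XHTML_NS], "XHTML elements,", counts[SVG_NS], "SVG elements")

# A program that only cares about the drawing can pick out the SVG part.
circle = root.find(f".//{{{SVG_NS}}}circle")
print("circle radius:", circle.get("r"))
```

A single parse handles both languages, and the namespace on each element tells a program which language that element belongs to, so each program can process the parts it understands and skip the rest.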

The latest development for HTML actually moves in the opposite direction. The group developing what they hope will be the next version of HTML (which they call HTML 5) is working without regard for the SGML standard for how markup languages should work, having decided that following that standard is not important.


This article was written by Stephen Chapman, Felgall Pty Ltd.
