From what I have seen, most people do not bother learning to write HTML in accordance with the standards. Even fewer try to use it correctly by choosing the semantically appropriate tags for their specific content. Most people write HTML that just works, despite its use of long-dead tags and its wrapping of content in completely the wrong tag in order to get it to display the way they want in specific browsers.
Most web pages are written in something that might be HTML 3.2 if it weren't for the various proprietary tags that the page uses.
The cause of this is that very few people actually attend a course to learn how to write HTML properly. They therefore miss out on the reasons why it is best to follow the standards and use semantically correct tags. Of course there are exceptions where an alternative approach might be required, but you need to know the rules very well before you can properly identify the rare instances where it might be appropriate to break them, and most people simply never learn the rules that ought to be followed in writing HTML.
It is because most people use what works rather than doing things properly that those working on HTML 5 have had to ignore the fact that HTML 4 listed a number of tags that were to be removed completely and no longer supported. Those tags cannot simply be dropped from HTML 5 because most people still use them, even though their removal was announced back in 1997. Not only have the pages that existed back then not been updated to remove the tags, but millions of new web pages have been created since then that use tags that were flagged for removal.
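As an illustration of the kind of tags in question, the sketch below shows a fragment built from font and center, two of the tags HTML 4 flagged, followed by one way of getting a similar result with the presentation moved into a stylesheet. The class name "notice", the text and the style values are hypothetical, chosen only for this example.

```html
<!-- Flagged in HTML 4: font and center describe appearance, not content -->
<center><font color="red" size="4">Site closed for maintenance</font></center>

<!-- One possible replacement: the markup says "this is a paragraph",
     and the stylesheet says how it should look.
     In a real page the style rule would normally sit in the head
     or in an external stylesheet rather than next to the paragraph. -->
<style type="text/css">
  /* "notice" is a hypothetical class name used only for this sketch */
  p.notice { color: red; font-size: 1.2em; text-align: center; }
</style>
<p class="notice">Site closed for maintenance</p>
```

The exact size and colour are approximate equivalents rather than pixel-for-pixel matches; the point is only that the second version carries no tags that were flagged for removal.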
As a result of this continuing use of garbage code, which works in some browsers but almost certainly produces a mess in others (something the web page owners don't know about because they never checked their page in those browsers), browsers are now forced to continue to support that garbage. If one browser drops support and others do not, people will switch to the browser that tries to make sense of the garbage rather than sticking with the one that has abandoned the long-dead tags. Of course there is always the possibility that markup might be so poor that, while it works in one version of a browser, it ceases to work in the next version of the same browser because the browser bug that allowed the poor markup to work has been fixed.
Of course browsers also need to support the standards. The latest versions of all browsers support the HTML 4 standard and so can understand web pages that use that version, and will all process it the same way (unlike non-standard markup, which different browsers may treat differently). The biggest problem here is that some old browsers take a long time to die. Internet Explorer 6, for example, was the most standards-compliant browser at the time it was introduced but was still used by a significant number of people five years after it had become the least standards-compliant browser. Other languages don't suffer from this problem, as the author of the code usually has more control over the environment that it runs in.
While a markup language such as HTML is extremely simple when compared to any programming language, that doesn't mean that you can learn it properly simply by examining the markup that other people have used. Just because they copied it from somewhere else, where it worked in whichever versions of the various browsers were around when the original page was written, doesn't mean that it is even remotely close to being the best way of coding it, or even that it will work at all now. I often see people asking for help where their web page doesn't work properly in the latest web browsers (but where it presumably works fine in some antiquated browser that may or may not still have a significant number of people using it).
By learning HTML properly and using the standard tags in a semantic way you also make your web page source easier for search engines to read and understand. This means that the page is likely to be listed much higher in the search results than the same page content would be if the HTML used was a jumbled mess. While the search engine is primarily concerned with the content, the HTML is what tells the search engine what the content is.
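A small sketch may make the difference clearer. The first fragment below picks tags for the way they happen to display, while the second marks up the same (hypothetical) content according to what it actually is. Only the second tells a search engine, or anything else reading the source, that there is a heading followed by a list.

```html
<!-- Tags chosen for appearance: bold text standing in for a heading,
     blockquote used only to get an indent, br used to fake list items -->
<p><b>Opening Hours</b></p>
<blockquote>
  Monday to Friday: 9am to 5pm<br>
  Saturday: 10am to 2pm
</blockquote>

<!-- Tags chosen for meaning: a heading and an unordered list -->
<h2>Opening Hours</h2>
<ul>
  <li>Monday to Friday: 9am to 5pm</li>
  <li>Saturday: 10am to 2pm</li>
</ul>
```

The heading level used here is an assumption; the right level depends on where the fragment sits in the rest of the page.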
The entire purpose of HTML is to identify what the various pieces of content are. It needs to clearly identify which parts are headings, paragraphs, lists, tabular data etc. HTML 5 goes one step further in this direction by introducing specific tags to identify larger portions of the content as page headers and footers, navigation, articles, asides, sections etc., which will make the significant part of the content easier to identify. Converting a page to make proper use of these new tags will be much easier if the semantically correct tags are already being used for the smaller components of the page. A semantically coded HTML 4 page will generally use divs with specific ids or classes to identify those larger portions while waiting for the next standard to be completed and to introduce them properly into HTML. Pages that do not use semantic markup will require a complete rewrite in order to be able to make use of these new tags. People might still decide to use them in such pages, but they will not be of much use if the rest of the tags are meaningless with respect to their content.
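To show how small that conversion can be, here is a rough sketch of a semantically coded HTML 4 page body and the same content using the HTML 5 tags. The id values, link and text are hypothetical; an actual page may well use different names for its divs.

```html
<!-- Semantically coded HTML 4: divs with ids stand in for the larger regions -->
<div id="header"><h1>Site Name</h1></div>
<div id="nav">
  <ul><li><a href="/">Home</a></li></ul>
</div>
<div id="article">
  <h2>Article Title</h2>
  <p>Article text.</p>
</div>
<div id="footer"><p>Contact details.</p></div>

<!-- The same content with the HTML 5 tags: only the wrappers change -->
<header><h1>Site Name</h1></header>
<nav>
  <ul><li><a href="/">Home</a></li></ul>
</nav>
<article>
  <h2>Article Title</h2>
  <p>Article text.</p>
</article>
<footer><p>Contact details.</p></footer>
```

Because the headings, paragraphs and lists inside each region were already marked up correctly, nothing inside the wrappers needs to be touched. Any stylesheet rules attached to the old ids would of course need to be pointed at the new tags.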
This article was written by Stephen Chapman, Felgall Pty Ltd.