Just about every computer language allows things that you are better off avoiding. When a language is first created it can be impossible to say which of its constructs will prove useful and which will only add unnecessary complexity. Languages also develop over time, so constructs that were initially useful can end up becoming something you should avoid.
Only by actually using a language for a period of time does it become obvious how best to achieve certain results. Unfortunately not everyone gets to share in this knowledge, so the not so good parts of any language still get used by those whose experience hasn't taught them the better techniques. In some cases books have been produced to advise people which parts of a language should and shouldn't be used, based on experience of how the language works. Such books depend on the experience of those who write them, and so are only really useful where the author has properly identified how the language ought to be defined and can therefore label all the other parts as bad. Unfortunately not all authors have that experience, and not everyone who writes in the language will learn from the book even when it does provide accurate information.
While HTML is a markup language rather than a programming language, the same distinction between good and bad usage applies to it as to other languages.
One mistake that people make is thinking that because their page validates against the latest standard (HTML 4.01 strict at the time of writing), it is necessarily using the most appropriate HTML. The mistake arises because the HTML standards are not written for web page authors. They are written for the people who write browsers, to tell them which tags and attributes they should support. To work properly, browsers need to at least try to make sense of bad HTML as well as process good HTML, and so the standard includes both the good and the bad. The absolute worst parts of prior versions of HTML were relegated to the transitional version of HTML 4, which was finalized as HTML 4.01 in 1999. This was meant to give browser writers time to make sure that their browsers supported the new standard correctly, and to give web page authors time to replace those worst tags with the better alternatives that the new version introduced, once the browsers supported them.
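As an illustration of the gap between valid and appropriate, here is a minimal sketch of a page that should pass validation against the HTML 4.01 strict DTD while telling a browser almost nothing about what its content is. The page content is invented for the example.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<title>Opening hours</title>
<!-- The html, head and body tags are all omitted, which the standard permits. -->
<!-- A heading faked with presentational tags that the strict DTD still allows. -->
<div><big><b>Opening hours</b></big></div>
<!-- A list faked with a line break inside a generic container. -->
<div>Monday to Friday: 9am to 5pm<br>
Saturday: 10am to 2pm</div>
```

Every tag used here is in the strict standard, so a validator has nothing to complain about, yet the heading is not marked up as a heading and the list is not marked up as a list.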
Writing web pages completely in HTML 4 without using any of its bad parts will only be possible once browsers that do not fully support the standard cease to be used by any significant percentage of web users. Until then, some of the bad, convoluted approaches that HTML 4 allows and treats as valid are still required to support those using antiquated browsers that don't fully support the standard (such as IE7).
The obvious next step, once all the browsers in use fully support the standard, would be to define a new standard that includes only the good parts of HTML and relegates the bad parts to a new transitional version, giving web page authors the additional time needed to restructure their pages around the most effective HTML constructs. Unfortunately, many of the people writing web pages do not understand the reasons for writing good HTML. They produce any old junk HTML that happens to work now, and completely disregard that future changes will take many times as long to implement as they would if the page had been done the right way. These people don't care whether their pages are done right as long as they work, since they are simply creating the pages for other people and then leaving those people to worry about future changes. I have lost count of the number of people who have asked me how to make a minor change to a web site that they just paid someone to build, where it would be quicker and easier to start over and do it properly than to make the minor update to the junk code that they unknowingly bought.
The lack of a newer version of HTML following on from HTML 4.01, one that marks the bad constructs as deprecated so that people can move toward good HTML, doesn't prevent you from taking that next step yourself. It just makes things slightly harder, as there is no validator to tell you when you have used bad HTML in your page and need to rewrite it. Since the changes needed to avoid bad HTML have less to do with which tags exist than with when they should be used, writing such a validator would not be easy anyway.
So what changes do we need to make to get rid of the bad HTML?
Well, the key is to write your HTML semantically, so that the tag you use for each piece of content is the one that most appropriately describes it. In addition, you should treat any tags that the standard defines as optional as mandatory. Only optional attributes should remain optional, to be used only where required, since it is obvious where a missing attribute should go but not so obvious where a missing tag should go.
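A minimal sketch of what that can look like in HTML 4.01 strict follows. The page content is invented for the example. Each piece of content uses the tag that describes it, and every tag that the standard lets you omit (html, head, body, tbody, and the closing tags for p, li, tr, th and td) is written out in full.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Opening hours</title>
</head>
<body>
<!-- A heading is marked up as a heading, not as bold text in a div. -->
<h1>Opening hours</h1>
<!-- A paragraph has both its opening and closing tag. -->
<p>The shop is open at the following times.</p>
<!-- Tabular data goes in a table, with the optional tbody written out. -->
<table summary="Opening and closing times for each part of the week">
<thead>
<tr><th scope="col">Days</th><th scope="col">Hours</th></tr>
</thead>
<tbody>
<tr><td>Monday to Friday</td><td>9am to 5pm</td></tr>
<tr><td>Saturday</td><td>10am to 2pm</td></tr>
</tbody>
</table>
<!-- A list of items is a list, and each li is explicitly closed. -->
<ul>
<li>Closed on Sundays</li>
<li>Closed on public holidays</li>
</ul>
</body>
</html>
```

Attributes such as lang, summary and scope appear only because they add something here. Leaving them off would not make it unclear where any element starts or ends, which is why attributes can stay optional while tags should not.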
Getting your page not only to use valid HTML 4.01 strict but also to use only the good parts of that version of HTML is the best thing you can do, for a number of reasons. Your pages will be far easier to maintain and will work better for more visitors, and they will also be appropriately coded for use with future versions of HTML. Apart from the few good parts of HTML that have been mistakenly flagged for removal from the next version (eg the rev attribute), a page using good HTML 4 should also validate against future HTML versions, since it makes no sense whatever to remove the good parts, while the bad parts ought to be, and hopefully eventually will be, removed. You will then be able to make simple modifications to your page to implement the good new features of those future versions (such as the datalist tag, which makes combo boxes in forms possible) while ignoring any bad HTML that may have been mistakenly added by those involved in the standards who do not properly understand how markup languages are supposed to work.
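To give an idea of how small such a modification can be, here is a hedged sketch of a combo box built with datalist, assuming a browser that supports the newer version of HTML. The form action /survey, the field names and the option values are all hypothetical and chosen only for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Choose a browser</title>
</head>
<body>
<!-- The action URL is a placeholder, not a real endpoint. -->
<form action="/survey" method="post">
<p>
<label for="browser">Favourite browser</label>
<!-- The list attribute ties the text field to the datalist with that id. -->
<input type="text" id="browser" name="browser" list="browser-names">
<!-- Suggestions only: the visitor may still type any other value. -->
<datalist id="browser-names">
<option value="Firefox"></option>
<option value="Opera"></option>
<option value="Safari"></option>
</datalist>
</p>
<p><button type="submit">Send</button></p>
</form>
</body>
</html>
```

The closing option tags are written out even though the standard allows them to be omitted, in keeping with treating optional tags as mandatory. A browser that does not recognise datalist should still present an ordinary text field, so the rest of the page does not need to change.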
This article written by Stephen Chapman, Felgall Pty Ltd.