Write HTML, the HTML Way (Not the XHTML Way) | CSS-Tricks

Write HTML, the HTML Way (Not the XHTML Way) | CSS-Tricks

You may not use XHTML (anymore), but when you write HTML, you may be more influenced by XHTML than you think. You are very likely writing HTML, the XHTML way.

What is the XHTML way of writing HTML, and what is the HTML way of writing HTML? Let’s have a look.

In the 1990s, there was HTML. In the 2000s, there was XHTML. Then, in the 2010s, we switched back to HTML. That’s the simple story.

You can tell by the rough dates of the specifications, too: HTML “1” 1992, HTML 2.0 1995, HTML 3.2 1997, HTML 4.01 1999; XHTML 1.0 2000, XHTML 1.1 2001; “HTML5” 2007.

XHTML became popular when everyone believed XML and XML derivatives were the future. “XML all the things.” For HTML, this had a profound effect: The effect that we learned to write it the XHTML way.

The XHTML way is well-documented, because XHTML 1.0 describes in great detail in its section on “Differences with HTML 4”:

Does this look familiar? With the exception of marking CDATA content, as well as dealing with SGML exclusions, you probably follow all of these rules. All of them.

Although XHTML is dead, many of these rules have never been questioned again. Some have even been elevated to “best practices” for HTML.

That is the XHTML way of writing HTML, and its lasting impact on the field.

One way of walking us back is to negate the rules imposed by XHTML. Let’s actually do this (without the SGML part, because HTML isn’t based on SGML anymore):

Let’s remove the esoteric things; the things that don’t seem relevant. This includes XML whitespace handling, CDATA sections, doubling of attribute values, the case of pre-defined value sets, and hexadecimal entity references:

Peeling away from these rules, this looks a lot less like we’re working with XML, and more like working with HTML. But we’re not done yet.

“Documents may not be well-formed” suggests that it was fine if HTML code was invalid. It was fine for XHTML to point to wellformedness because of XML’s strict error handling. But while HTML documents work even when they contain severe syntax and wellformedness issues, it’s neither useful for the professional — nor our field — to use and abuse this resilience. (I’ve argued this case before in my article, “In Critical Defense of Frontend Development.”)

The HTML way would therefore not suggest “documents may not be well-formed.” It would also be clear that not only end, but also start tags aren’t always required. Rephrasing and reordering, this is the essence:

How does this look like in practice? For start and end tags, be aware that many tags are optional. A paragraph and a list, for example, are written like this in XHTML:

In HTML, however, you can write them using only this code (which is valid):

Developers also learned to write void elements, like so:

This is something XHTML brought to HTML, but as the slash has no effect on void elements, you only need this:

In HTML, you can also just write everything in all caps:

It looks like you’re yelling and you may not like it, but it’s okay to write it like this.

When you want to condense that link, HTML offers you the option to leave out certain quotes:

As a rule of thumb, when the attribute value doesn’t contain a space or an equal sign, it’s usually fine to drop the quotes.

Finally, HTML–HTML — not XHTML–HTML — also allows to minimize attributes. That is, instead of marking an element as required and read-only, like this:

Images Powered by Shutterstock