[ Love Thy HTML : Ge Ricci ]

Love Thy HTML

December 20, 2012

The web is about data sharing, about content. If we focus on content during all the phases of our web development we are sure to produce not just a fair and fine-tuned site for a great user experience, but also an accessible, adaptive and future-friendly content source.

Everything, from web design to JavaScript behavior, is at the service of content, and when we talk about content on the web, we're talking primarily about HTML.

Not long ago, I stumbled upon a job ad looking for web designers, and something in the job description made me shiver a little: they were literally looking for young people, arguing that, since young people were born with the web, they'd naturally understand it better than anyone.

Is this a bad start? Don't get me wrong here. I don't think there's anything wrong with young people, and it is not because I'm a web dinosaur that I don't like them. It is just that I'm sure that being born with the web doesn't mean you know it. It means you're familiar with it, it means you know what you can do with it, but in no way does it mean you understand it better.

I don't really know the quality of the training given to those who want to design or develop for the web front end, but I know that many of us, me included, learned HTML and CSS design by ourselves, getting what we could from the web itself and developing our work habits as our time and curiosity allowed.

I truly believe that is a fine and healthy way to learn, but there's a problem: the essence of the web and of HTML is not the thing people are looking for, so it is not learned, and the way the web will evolve may suffer from it.

We “old web guys” must act as preachers among young people, preaching exactly what the web is about, because they are building the future of the web today, and let's face it, I'm afraid we're losing ground. So many of us are preaching good web practices and trying to make newbies look at it the right way, but to some of those bold young people, our speech sounds like the boring moral talk of an old-fashioned mother explaining why classical music is priceless.

Somehow the web has become all about looks, awesome effects and powerful APIs, but this merely scratches the surface of the front end. Front-end developers are eager to take on new challenges and learn new ways of doing smashing things for the web, and they really do, but what exactly are they missing?

Just take, for example, the wonderful CSS frameworks that pop up everywhere on the web, made by a new, motivated generation of web developers. First of all, they're "CSS frameworks", which, for me, is a weird thing in itself: why would anybody need a CSS framework? Why is there so much fuss about CSS and not a second look at HTML? Worse, why do we sacrifice the HTML for the sake of the CSS design? Is this happening because some of us simply don't get CSS design, and those frameworks or JavaScript plug-ins make us feel “safer”?

On one side we have the happy recklessness of youth; on the other side we have those who simply don't get CSS design. So what could be better than a framework that gives you a whole choice of class names that can be assigned to any HTML tag (precisely any) and that will suddenly transform your span or div into a beautifully shaped button?

Surely, there's nothing wrong with it if you don't care for accessibility, or web evolution, or data sharing, or reworking, or hours of debugging, or CSS optimization, or... There's nothing wrong except for the fact that we're forgetting what the web is all about: it is about content sharing.

Yes, some CSS frameworks, along with JavaScript functions or plug-ins, are spreading bad, unstructured and non-accessible markup. And the spiral goes deeper down. But everybody is happy. Who cares if we class a button "button" and if our content can no longer be identified by anything but a class? Who cares if our form fields don't have labels, as long as they look nice? Today, divs and spans are becoming the new lingua franca, and I keep wondering why some people are still working on resources like microformats or microdata.
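
To make the point about labels concrete, here is a minimal sketch of an unlabeled field next to a labeled one. The field name and id (`email`) and the class names are made up for illustration and are not taken from any particular framework.

```html
<!-- Unlabeled: the text beside the field is related to it only visually -->
<div class="field">
  <span class="label">Email</span>
  <input type="text" name="email">
</div>

<!-- Labeled: the for/id pair ties the label to the field, so clicking
     the label focuses the input and assistive technology can announce
     what the field is for -->
<p>
  <label for="email">Email</label>
  <input type="email" id="email" name="email">
</p>
```

Both versions can be styled to look exactly the same; only the second one carries the relation between the text and the field in the markup itself.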

HTML is about content

Think back. Think again. Are you sure you know what HTML is?

Above all, HTML was the answer to a major digital content issue. Before HTML, we had non-exploitable digital content in multiple proprietary formats, and we were drowning in a soup of style instructions meant to be interpreted by a printer.

Bunches of instructions for font, alignment, color, etc. were mixed in with our carefully chosen content in a way that only printers (and the text editor in use) could interpret.

Needless to say, if your text editor was MSWord and somebody else had WordPerfect, files could not be easily exchanged. You could send your document already printed, so whoever got it could simply retype everything with WordPerfect (why? To have content in digital format, of course!), or you could send them the content in ASCII format. There would be no retyping, but your contact would lose all the information about content structure ("Is this a title? Where does my paragraph begin and where does it end?"), and would have to go through the entire document to "tag" content by its role or, for the careless ones, to change the style of the content as they believed necessary in order to give the document some structure.

So off they would go, trying to figure out the structure of your content in order to assemble something that resembled the printed version. Maybe retyping wasn't such a bad idea after all.

We didn't realize then how much paper, and time, and money, and fingers we were wasting by doing and redoing things just to keep our content in some digital format (or another). But, let's face it, what's the advantage of having it digital if it can't be easily accessed, shared, stored or reused? We were still printing everything anyway…

Luckily for us, along with HTTP and the concept of browsers, HTML arrived to offer us content meant to be shared by everyone, targeting the right problem: instead of trying to fix the side effects of data sharing, it proposed a straightforward, non-proprietary language meant to be a standard in a controlled, standardized universe, the web.

The big picture foreseen was simple: digital content would be created to be kept digital and to be digitally exploitable. We would separate content from its appearance and we would label our content based on its role: this is a title, this is a sub-title, this is a paragraph, this is a list of items, etc.
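
As a small illustration of labeling content by its role, the fragment below marks up each piece of content with the tag that names what it is. The text is placeholder content invented for the example.

```html
<h1>This is a title</h1>
<h2>This is a sub-title</h2>
<p>This is a paragraph.</p>
<ul>
  <li>This is an item in a list</li>
  <li>This is another item in the same list</li>
</ul>
```

Nothing here says how any of it should look; the tags only say what each piece of content is.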

Once something is labeled, we know what we're supposed to do with it, don't we? Provided, obviously, that the labels used are standard. Take, for example, the laundry care symbols. Today, those symbols are (almost) worldwide standards and instruct us on how to deal with our laundry. A quick look at them and we are able to sort clothes that can be washed at 60 degrees from those that can't, or those that are suitable for the dryer from those that are not, etc.

That's pretty much how it is with HTML tags. Each tag gives a specific role (the care “symbol”) to a portion of content (our clothes), so every "HTML interpreter" can understand and sort those tags to do whatever it has to do with them. And this “whatever” is not limited to displaying content (even if it is quite helpful for that too). It can be used to give extra meaning and logic to our content, regardless of how it will look.

HTML tags that give this extra meaning to our content are what we call “semantic tags”:

Se-man-tic, adjective. 1: of or relating to meaning in language (Merriam-Webster)

We then have the digital content and the indicator of its role (our HTML semantic tag). Content is now ready to be accessed and shared by any simple system that is able to process those tags.

What if my HTML is not semantic?

Well, let's take a look at this portion of markup (an example based on oh-so-many existing examples):

<div class="theplugin">
  <div class="borderTop"></div>
  <div class="title">
    This is the title
  </div>
  <div class="content">
    <span>This is the content</span>
    <div class="button">
      Close
    </div>
  </div>
  <div class="borderBottom"></div>
</div>

Hey, how cool is that? We have divs! And, of course, we have plenty of classes to handle them... in one way or another. I especially like class names like “title” or “button”: why bother trying to figure out which tag to use for a title (!?) or for a button? It is not as if HTML had tags for those...

So much for sarcasm! Let's get to the facts: what's wrong with this code? First of all, we have just divs and spans; no semantic tag is used here. Since we don't have semantic tags to give the content a role, we can't differentiate one portion of content from another. It means that, lacking proper tags, we have to differentiate our content by adding classes to our divs. So, instead of using an <hX> tag for a title, we have a div with a class named “title”. If the markup itself can't make this difference, we cannot access this content in an optimized way. For example, this div classed “title” cannot benefit from the styles that would cascade to a proper <hX> tag in the style sheet.

We end up with complex markup for simple content that has no semantics and yet demands a logical structure.

If we look at this portion of markup in a browser, we'll see something like:

This is the title

This is the content

Close

What do we see there? Just three simple strings of text, with no relation to each other and no way to distinguish between them. Worst of all: we have a div classed “button” that I'm pretty sure should be a link.

If content is not correctly labeled, it cannot be correctly interpreted, as in this case, by the browser. In fact, the browser doesn't “see” the content here. It just displays the divs without any attempt to interpret what they hold.

We can of course force this interpretation visually by means of style, but it will stay as it is, only visual. We can also force content behavior by adding a layer of JavaScript, and suddenly our div becomes a clickable element, but it will stay as it is, a div.

When content cannot be recognized or interpreted, it cannot be easily targeted or correctly exchanged either. Moreover, if this block of content is not structured so as to mirror the relations between its different pieces of data other than visually, it cannot benefit from powerful resources such as document outlines.
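
For comparison, here is one way the same block could be written with semantic tags. It is only a sketch: the heading level, the `section` wrapper and the choice of a `button` element for "Close" are assumptions made for the example (if "Close" actually navigates somewhere, an `a` element with an `href` would be the better fit), and the `theplugin` class is kept purely as a styling hook.

```html
<section class="theplugin">
  <!-- A real heading: it has a role and takes part in the document outline -->
  <h2>This is the title</h2>

  <!-- A paragraph instead of an anonymous span -->
  <p>This is the content</p>

  <!-- A native control: focusable and keyboard-operable without extra script -->
  <button type="button">Close</button>
</section>
```

The top and bottom borders can be drawn from the style sheet on the `section` itself, so the empty `borderTop` and `borderBottom` divs are no longer needed.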

The image is nice, the design is cool. But again, we're talking about digital data and content exchange, not about nice Photoshop posters.

Web Design is about content

Design, in its essence, is about logic, structure and meaning. Being also a painter myself, I believe the main difference between design and pure art is that design has an objective purpose and it is meant to solve a problem, while art is a free expression of more or less unconscious feelings.

Web design is at the service of usability and of content, and it is an integral and important part of the user experience.

Can you imagine designing a page without content? No, of course you can't. Focusing on content is the only way of getting the best from our design. It is what allows us to give coherence and visual hierarchy to our page, from a macro or micro perspective. It is only by considering our content that we're able to ensure readability and to promote interest.

Above all, when we're talking about web design, the fact that we focus on content opens us up to the opportunity to embrace the web as its own medium. [A Dao of Web Design, John Allsopp]

The content-first method

Forget "mobile-first", forget “desktop-first”. Think “content-first”.

Adopting a content-first approach will require us to rethink the way we work a little. And the good news, I like to believe, is that this new way of working will improve not only the quality of our sites, but also the quality of our designs.

I must confess that it is not right for me to say that this is a new way of working. In fact, this approach takes its cues from points of view defended back in 2009, as we'll see later.

For web design, the "content-first" approach has two parts: the creative design process and the web design integration.

When designing a site, our first inputs may be the design brief, our client's identity, the site goals, etc. Those inputs will allow us to start our creative process. We'll study color palettes, font families, textures and effects (or the absence of them), images, etc. All those elements put together will create a whole visual universe meant to answer the issues presented in our first inputs.

Yes, you got it right: we're not worried about content yet. Our concerns during the creative process are free from content, but strongly tied to the more general, macro goals of the site. At this point, we're not yet able to concern ourselves with page layout, content structure or hierarchy, and we don't want to. This is a purely creative search, even if it is focused on answering a problem.

At a stage of the project when content is not yet defined, we designers are not just sitting around waiting; we can move forward by choosing and validating the look and feel of the future site.

The right tool for the right job, right? So I wouldn't kill Photoshop too soon, because it is a powerful and very useful tool during the creative search stage. In this stage we're playing, we're searching, we're in a fully creative process, and Photoshop can be an ally then, if we keep in mind that, yes, we'll no longer need it once our content is defined.

When the content of the site is finally defined, let's say, in a wireframe form, then we're prepared for web design.

As Andy Clark once put it, “We aren't designing photocopies of web pages, we're designing web pages” (An Event Apart, Chicago 2009, On Designing with Photoshop).

You see, that takes us back to 2009! And he is still right today. I'd say more than ever, because if we stick to our content-first approach, we have another good reason to do so.

That's exactly the moment when the fusion between the web design process and the web page integration is at its peak, and yes, we must go straight to the HTML to do our design if we really want to “embrace the web as its own medium” and to make content rule all.

Choose a good HTML editor (not a WYSIWYG one, please!) and start by analyzing your content. Remember that each portion of content has a purpose and a role, choose the appropriate structure and tags for it and then, and only then, start your CSS design using your style tiles as reference.

I like to build a whole HTML page before starting to work on my CSS. That helps me avoid verbose or useless markup, and it allows me to focus on the logical structure of my content, independently of how it will look later.
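
As a rough sketch of what such an HTML-first page might look like, here is a bare document written before any style sheet exists. The page title, link targets and text are placeholders invented for the example; the only point is that the structure already reads logically without CSS.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Page title</title>
</head>
<body>
  <header>
    <h1>Site name</h1>
    <nav>
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/articles/">Articles</a></li>
      </ul>
    </nav>
  </header>

  <article>
    <h2>Article title</h2>
    <p>First paragraph of the article.</p>
  </article>

  <footer>
    <p>Contact and copyright information.</p>
  </footer>
</body>
</html>
```

Opened in a browser as it is, a page like this is plain but perfectly readable, which is a reasonable test that the markup describes the content rather than its appearance.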

One thing to keep in mind is to analyze your content inputs carefully. If you have a wireframe as a starting point, you can identify and sort the specific blocks of content so as to foresee how to optimize your code, avoiding too many exceptions or inconsistencies between pages.

Conclusion

The web is an amazing universe. It is open, democratic, welcoming. It gives space for somebody like my mother, who knows almost nothing about computers, to be proud to publish her recipes without much effort. At the same time, it allows web professionals to create powerful web applications that will be used by thousands. But, of course, there's a difference here. The responsibility that we web professionals have is much greater, and that responsibility lies not just in the amount of time and money spent on our work, but also in respecting it, our users and our clients.

If we disregard HTML in favor of the CSS design or of JavaScript, we're just doing things wrong, and as web professionals we can't miss such a basic standard.

Having the content as a focus during all phases of our project will help us craft our HTML in a logical, exploitable way, which will lead us to an effective CSS design and to a smooth, natural content behavior.