In the sense used here, a "Straw Man Argument" is a misrepresentation of the opposing view, set up in such a way that it is easy to demolish. This "set-up" is meant to bring the opponent's position into disrepute, in the hope of avoiding having to address the real arguments.
There are several Straw Man Arguments that appear on the WWW authoring newsgroups so often that they are no longer funny.
First, however, the HTML Purist Fallacy.
Before we start counting, let's examine the myth of the "HTML Purist". No-one has captured a real live HTML Purist that genuinely does express the views attributed to him - or her - in the Straw Man Arguments, so we can't put one "on the stand" and question them as to their beliefs. Nevertheless, the myth of the "HTML Purist" is one that has persisted since quite early in the life of the WWW: although the details have changed, the underlying principles remain. Well, I am one of the people accused of being an HTML Purist, so, in default of finding a genuine witness, you're now getting my views on the matter.
otherwise known as the "Conservative Subset of HTML/1"[b] fallacy.
The HTML Purist stands accused of wanting authors to only use the HTML "Lowest Common Denominator"[c] of all available browsers, wanting only to use HTML in such a way that it "looks the same on all browsers".
The second claim is particularly absurd, since HTML purists want HTML documents to be accessible also to speaking browsers, as well as to indexing robots, where they self-evidently don't "look the same" as on mainstream browsers.
The HTML language is designed, from the start, to represent the logical structure of the content (emphasis, blockquote, cite etc.) which can be depicted in different ways according to the situation in which it's being rendered - graphics display, with or without auto image loading; character mode display; speaking machine; etc. - last but by no means least the indexing robot, without which your page is effectively becalmed on the vastness of the WWW without hope of rescue.
"HTML Purists" look to the browser developers to develop increasingly attractive and novel ways to present the content[d]. They consider that the HTML author's job is primarily to create, organise and mark up their content, whatever it may be, such that each one of a whole range of browsers can present the content to the best of its ability, covering between them a whole range of different presentation situations. They were frustrated that the browsers' inherent ability to present well-structured HTML seemed to have stagnated somewhere around the heyday of NCSA Mosaic (and even gone backwards in some respects), while the browser makers added more and yet more glitzy effects that had little or nothing to do with the quality of their HTML rendering. Newcomers then approach HTML as if it were a DTP page design language, missing the point of the flexibility that was designed-in to HTML from the start.
So, vendors added more and more presentation-specific hacks to HTML, resulting in the awful compromise that was HTML/3.2. The upshot was that the HTML Purists were frustrated at the poverty of design exhibited by browsers, and at browsers' relative inability to adapt to different browsing environments and user choices, while at the same time the "DTP designer" crowd was frustrated at HTML's inability to guarantee the precise appearance that they had convinced themselves they needed; and nobody felt that they were doing a good job on the WWW. So sad.
But HTML4 reversed the trend by deprecating presentation-specific attributes in HTML, and re-establishing the long-standing principle of structural markup in HTML, with presentation proposals delegated to stylesheet(s). On the browser front too, we have seen improvements, with browsers increasingly doing a good job of appealing to their users - whereas previously there were rumours that browsers were being designed primarily to deliver uncomplaining readers to wealthy advertisers.
This "HTML Purist" encourages you, as author, to specify presentation as much as you find appropriate, provided your message still makes sense when the presentation details fail[e].
It simply isn't true to claim that HTML purists tell you to avoid all newly-defined markups. Often the newly-defined markups have useful fallback behaviours, and so those browsers that implement the new markup get the benefit in terms of improved presentation, while those that don't can still access your content or message.
HTML Purists do not for a moment claim that appearance isn't important - far from it: some of them indeed have long been campaigning for better stylesheets support in browsers. They are happy for you to improve the appearance of your pages for your favourite browsing situation(s) - just so long as they are capable of graceful fallback in other browsing situations, without impairing the content of your message[g].
also known as the "Netscape Defaults for Everybody" fallacy.
The HTML Purist stands accused of wanting everyone to see HTML documents with black text on a mid-grey background, just like Netscape's browser, and NCSA Mosaic before it, had always done when installed straight out of the box.
HTML Purists consider that every reader has the right to choose how they read their text. That means, the freedom to choose the author's proposals, if any, or to reject them if the reader finds them inappropriate. The problem with printed books, for instance, is that everyone has to make do with the same font size, the same paper colour..., irrespective of their choices or abilities. One of the benefits of the WWW is that readers are relieved of this problem, for example if they suffer from poor eyesight or colour-blindness.
Nevertheless, pages can be made more visually interesting, to those readers that have no problems with it. The principle is to find ways that work well, for those readers who can and wish to take advantage of them, without impairing the results for others. HTML markup techniques can (and, in recognition of the reality of access on the WWW, "should") be assessed for their ability to fall back gracefully, when viewed in a wide range of browsing situations and settings. To take a concrete example: the HTML/3.2 BODY color attributes could be used effectively, so long as the author specified all of the color attributes, since browsers typically allow the reader to insist on their own color configuration if they need to; in CSS there is a similar best-practice principle of specifying explicit colours either for both text and background, or for neither, at any given specificity.
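The "all of the color attributes, or none" rule for BODY lends itself to a mechanical check. What follows is only a minimal sketch, assuming Python's standard html.parser module; the function name missing_body_colours is made up for this illustration, and the BODY background image attribute is deliberately not considered.

```python
from html.parser import HTMLParser

# The five colour attributes that HTML 3.2 defines on BODY.
BODY_COLOUR_ATTRS = {"bgcolor", "text", "link", "vlink", "alink"}


class BodyColourCheck(HTMLParser):
    """Records which colour attributes appear on the first BODY tag."""

    def __init__(self):
        super().__init__()
        self.found = None  # stays None if the document has no BODY tag

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names before calling us.
        if tag == "body" and self.found is None:
            self.found = {name for name, _ in attrs} & BODY_COLOUR_ATTRS


def missing_body_colours(html):
    """Return a sorted list of the colour attributes the author left out.

    An empty list means all of them were specified, or none were:
    the two cases that fall back gracefully.
    """
    checker = BodyColourCheck()
    checker.feed(html)
    found = checker.found or set()
    if not found:
        return []
    return sorted(BODY_COLOUR_ATTRS - found)


risky = '<BODY BGCOLOR="#000000" TEXT="#FFFFFF">'
safe = ('<BODY BGCOLOR="#FFFFFF" TEXT="#000000" '
        'LINK="#0000EE" VLINK="#551A8B" ALINK="#FF0000">')
print(missing_body_colours(risky))  # ['alink', 'link', 'vlink']
print(missing_body_colours(safe))   # []
```

In the "risky" case the author's black background could meet a reader's own dark link colours, which is exactly the situation the rule is meant to avoid.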
"HTML purists" can point to the mention of style sheets already in the HTML2.0 specification - even one of the earliest web browsers used a stylesheet to govern its presentation (see screenshot) - wonder why it has taken so much time and effort to get them deployed; can explain that, when used appropriately, they are the most reliable way to propose a specific presentation for those readers able to take advantage, without the risk of inadvertently impairing the presentation for those who cannot; and wonder why the browser makers took sooooo long to get started on implementing them.
Finally, most of the "HTML Purists" known to me do not actually like mid-grey backgrounds, and are quite baffled that this remained the installation default for so long. Of course, they support every reader's democratic right to select a mid-grey background, if they happen to be one of those few who really do prefer it!
HTML purists have no objection to you specifying a colour scheme if you wish, so long as you do it in a way which can fall-back gracefully when circumstances are against it. HTML purists equally defend an author's right to specify no colour scheme, although many of them find it curious how long the installation default remained mid-grey in spite of so many complaints about it.
also known as "HTML purists use no other media".
Disclaimer: This document consists almost entirely of boring text. If you think that proves anything, there's this bridge you might be interested in buying...
The HTML purist calls attention, however, to the fact that text is the one medium that can be made accessible to all browsing situations: it can be presented on a character-mode display, on a graphic display, it can be printed out for later consideration, it can be input to a Brailler, fed to a speaking machine for a vehicle driver or telephone caller, or to blind readers. By comparison, images cannot be read-out over the telephone, nor are they very accessible to a blind reader, nor can they be indexed by robots in any content-related fashion; audio cannot be perceived without appropriate equipment, nor by a deaf reader, nor is it always convenient (in a library, for example). In short, text is the most widely accessible of all the media, as well as being the reliable way to get your material indexed by the web robot services.
HTML Purists recognize that readers may access your pages in more than one way (from the multimedia station at home over dial-up; from the now-quite-old PC their company has at the office; from the top-end workstation in the design office; WebTV and similar TV-based appliances; from a laptop with an inadequate display, dialling up over an overpriced hotel phone line; palmtop/cellphone combo; from the public library...); and the trend is clearly towards ever more diverse browsing situations. If you repel your readers, keep them pointlessly waiting, or stick a load of unusable junk onto their display, without justifiable reason as far as they can see, what's the chance that they'll bother with your pages even when they're in the situation that fits your demands? Instead, they might be so annoyed as to complain about you to their friends and colleagues, something that you would do well to avoid.
On the other hand, if you welcome them with the basic information that they need, they'll be realistic enough to understand that your other excitements (VRML, whatever) were inaccessible to them for justifiable reasons, and might want to revisit later when they're in a position to enjoy them.
In recognition of realities, HTML Purists encourage you to make full use of all appropriate media for your purpose, but to ensure that the core of your message be available as well-marked-up text, so that it can be perceived by any reader in any reasonable browsing situation.
Those pages with meaningful text will be the ones that count at the robot indexers. Those who welcomed the indexing robot with 'helpful' messages like "get a proper browser", "your browser does not support frames" etc. are right there on the record: you surely don't want to join them?
The HTML "Purists" are really "Pragmatists". They have seen what happens when over-ambitious pages collapse in a heap, in browsing situations that are just a little outside of what the author expected. They have some familiarity with the HTML specifications and drafts, and the actual browsers that are out there, and they have developed some idea of how to enhance their pages for a high-end browsing situation without causing the page to become inaccessible to other situations.
They would, indeed, love to be able to rely on the full range of the many useful HTML constructs that have been drafted over the years, but they recognize that it is unrealistic to do so. They are authoring for the World Wide Web, for the readers that are out there, with their various abilities, using the variety of browser/versions that readers use; they are not authoring exclusively for some vendor-narrow or 21-inch-screen-24-bit-color readership. They understand how to use optional enhancements, in ways that do not harm those unable to use those enhancements; and try to avoid relying on a feature without which the content may make no sense, or become misleading. They make their authoring decisions accordingly.
[a] "gray", for USA readers ;-)
[b] There never was an "HTML/1" - the original HTML didn't have a number at the time, and HTML 2.0 was the first version to be numbered and codified.
[c] Lowest Common Denominator is another of those terms borrowed from a specialist field (in this case of course mathematics), where they have a precise technical meaning, and misused to mean something quite different, just for the sake of sounding impressive. To get the sense that's intended in common parlance, the correct mathematical term would be "Highest Common Factor".
[d] Most browsers have really been very primitive, lacking even the most obvious user-friendly refinements such as Overviews; multiple windows into a corpus of knowledge (e.g. a user-oriented "frame" view, rather than the crude and limited author-oriented "frames" "design" introduced by Netscape); user-driven access to footnotes and asides; well-designed print formatting; etc., all based on fitting one, well-authored well-structured HTML document to each of many different browsing situations and user choices, optionally with hints taken from an author-provided style sheet.
"Purists" consider that the HTML author's job is primarily to marshall their content, and mark it up honestly according to its logical structure. This can then be presented beautifully, in those browsing situations that are capable of it, by supplying appropriate stylesheet(s), without harming the accessibility of the content to more-unusual browsing situations. Many newcomers seem blissfully unaware of what could be possible, and simply take it for granted that HTML is nothing more than another DTP facility for them to spend their time and effort "designing" page layouts, one at a time, when good stylesheet design would leverage its designer's ability across a whole corpus of work, and good browser design could leverage the browser designer's expertise across the whole WWW.
Purists do not for a moment decry good graphic design, indeed they welcome it when used in genuinely effective ways. All too often, though, on the WWW, "graphic design" is applied in misguided ways that only work in a limited range of browsing situations, and are actively hostile to effective browsing in other situations.
[e] Don't say "click on the green text" - you have no idea whether the text is green for every reader. In fact, don't say "click on" at all; make the actual purpose of the link into the link's active text, leaving the sense to flow naturally. As an old but still relevant Web Style Guide at the W3C said, "Don't mention the mechanics". There are some good reasons for this, quite apart from it being good style: some browsers and indexers can give a summary of the links on a page, and it really isn't helpful when the summary reads something like this:
- Click here
- Click here
- and here
without any indication in the active text of the purpose of each link. You say you've never seen such a useful feature in a browser? Well, Microsoft provided this as an add-on for IE5, for example; and recent versions of Opera (e.g. 7.01) include such a feature as standard, making it possible to show a link list in at least two different ways: on the "links panel", or by means of the "View" > "Links in Frame" menu.
See also the next footnote, for a specific problem of this kind.
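The kind of link summary described in this footnote is easy to imitate. The following is a rough sketch only, assuming Python's standard html.parser module; it is not how any particular browser or indexer builds its list, and the names link_summary, vague_links and the VAGUE_PHRASES list are invented for the illustration.

```python
from html.parser import HTMLParser

# Hypothetical list of link texts that say nothing about the link's purpose.
VAGUE_PHRASES = {"click here", "here", "and here", "this link"}


class LinkLister(HTMLParser):
    """Collects the active text of each <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []       # finished link texts, in document order
        self._current = None  # text fragments of the link being read, if any

    def handle_starttag(self, tag, attrs):
        # Only anchors with an href are links; <a name=...> is skipped.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Join the fragments and collapse runs of whitespace.
            self.links.append(" ".join("".join(self._current).split()))
            self._current = None


def link_summary(html):
    """Return the active text of every link on the page, in order."""
    lister = LinkLister()
    lister.feed(html)
    return lister.links


def vague_links(html):
    """Return the link texts that give no hint of the link's purpose."""
    return [text for text in link_summary(html)
            if text.lower() in VAGUE_PHRASES]


page = ('<p><a href="a.html">Click here</a> for prices; '
        '<a href="b.html">opening hours</a> are listed too.</p>')
print(link_summary(page))  # ['Click here', 'opening hours']
print(vague_links(page))   # ['Click here']
```

A summary made only of the first kind of entry tells the reader nothing; the second kind still makes sense when read on its own.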
[f] For dealing with adverse viewing situations, browsers typically offer the user the ability to override the author's colour specifications. However, Netscape versions up to and including Netscape 4 are/were notorious in allowing the author's font color specification to take precedence over the user's "use my colors" option: thus, if the author's text colour happens to match the user's choice of background, the text disappears. It's kind-of poetic that font color markup was at its most risky on the browsers from the very vendor that introduced this extension. Style sheets, even in the face of buggy implementations, are a safer way to handle colour proposals.
[g] ..and if your material is of the kind where that genuinely isn't possible (calligraphy, say) then you really should be asking yourself whether HTML is the appropriate medium for your material. There's nothing wrong, in appropriate cases, with using HTML merely as a thin deposit of "hyperglue" to paste together the various media that you use for your graphic designs or whatever, but this isn't what I'm talking about when I'm seriously discussing the "authoring of HTML for the WWW".
[h] There's a curious superstition around that the "purist" insistence on syntactically valid HTML means that client-side scripting is ruled out. This is entirely wrong, but the propagation of factually-incorrect myths and superstitions seems to be an essential part of the "anti-purist" approach.
[?] If you aren't using a CSS-aware browser, that's no problem: this page has been designed to work with or without, although naturally I think it's better with the style sheet. By now, you will probably have worked out that these numerous footnotes were intended as a bit of an academic-type joke.
And the background image isn't a "straw man", it's a Corn Dolly.
Original materials © Copyright 1994 - 2006 by A.J.Flavell