References, Tutorials, Structure, Tags
Administrative
Decide which HTML editor to use, and get FTP or another file upload/download mechanism working.
Upper case or lower case? Lower case for tags and attributes! Quotes or no quotes for attributes? Put all attributes in quotes! Close all tags! Conform to the latest CSS, HTML and XHTML specifications!
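As a quick illustration of these rules, here is the same small fragment written both ways. The file name and the class name are made up for the example.

```html
<!-- Discouraged: upper-case tags, unquoted attributes, unclosed elements -->
<P ALIGN=center>See the <A HREF=notes.html>notes</A>
<UL>
<LI>first item
</UL>

<!-- Preferred: lower case, quoted attribute values, every element closed -->
<p class="centered">See the <a href="notes.html">notes</a></p>
<ul>
  <li>first item</li>
</ul>
```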
Online tutorials and references for HTML, XHTML, XML
- Internationalization (I18n), Localization (L10n), Standards, and Amusements has many useful links.
- Build your website in 6 steps, a comprehensive article at ZDNET
- Cyber Aspect - has helpful information for web page builders, including feature articles, hardware and software reviews, book reviews, industry news, and more
- Developer Zone at Project Cool - many reference pages on different topics: HTML, XML, JavaScript, DHTML, Graphics, etc.
- Netscape Navigator Gold Authoring Guide
- A Guide to Creating Web Sites with HTML, CGI, Java, JavaScript, Graphics at WDVL
- HTML 4.01 at W3C -- the official document
- HTML School at w3schools.com
- HTML - The HyperText Markup Language at WDVL
- HTML 4.01 Tags at WDVL
- WebSIG's XHTML document, with links to other references
- A Beginner's Guide to HTML at NCSA
- HTML Station at december.com
- List of HTML 4 Elements at december.com
- HTML Card, Tags & Attributes One-page comprehensive reference at visibone
- Web Page 101 at ZDNET
- ZDNet Web Page Developer's Tag Library: Information about HTML and the browsers that support it
- Bare Bones Guide to HTML 4.0 available as ASCII version, downloadable version in ZIP format, and HTML-formatted, also in various languages. By Kevin Werbach
- The WWW Help Page by Kevin Werbach offers help on common problems in creating HTML documents.
- Kevin Werbach's links to some of the excellent resources available on the Web
- Boogie Jack's Web Depot with HTML and CSS tutorials, graphics tutorials, cut-and-paste JavaScript, sound effects, software reviews, computer tips, news, guest articles, web page graphics, fonts, links to freeware & shareware downloads, and more.
- www.reallybig.com largest directory of Web building resources on the internet!
- Web Wise Wizard, a web authors and webmasters toolbox, is rated highly by Google, and has all kinds of information in addition to HTML, CSS, JavaScript and links to other sources of information.
- Programming & Scripting Languages Help
- Bobby evaluates web pages for accessibility.
- XML Essentials at XML.com (part of the O'Reilly Network) is a good place to start learning about XML. Includes a Syntax Checker and a Resource Guide.
HTML and CSS code Validators
Remember to put a Document Type Declaration at the very top of the HTML document, of the form (the text after PUBLIC is case sensitive!):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
or for a page containing a <frameset> use:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/frameset.dtd">
to make the validator happy (the second line with the URL seems optional).
- W3C HTML Validator
- W3C CSS Validator
- HTML Toolbox at NetMechanic
- HTML Tidy: an application for HTML correction, formatting, and conversion to XHTML
- Online Web front-end to HTML Tidy, specifically for converting individual HTML documents to XHTML form. Note: there is a problem with special characters, so if you use them, check that the conversion did not replace the codes with the characters.
- W3C® Link Checker checks for "link rot" and gives a report.
- Xenu's Link Sleuth (free) checks Web sites for broken links. Link verification is done on "normal" links, images, frames, plug-ins, backgrounds, local image maps, style sheets, scripts and java applets.
Other useful online references
Web Content Accessibility Guidelines 1.0. The Checklist has detailed guidelines of what to check for.
HTML Document Structure
- The strict doctype declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
is used if all formatting is in Cascading Style Sheets (CSS). That is, <font> and <table> tags are not used to control how the browser displays the documents.
- The transitional doctype declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
is used when you need to use presentational markup in your document. Most of us will be using the transitional DTD for quite some time, because we don't want to limit our audience to users with browsers that support CSS.
- The frameset declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/frameset.dtd">
is used when your documents have frames.
- At the very beginning and at the very end of the document, we declare the file is HTML <html>[whole document]</html>
- The head contains administrative information that is not rendered, but which helps or instructs the browser how to render the body, such as JavaScript and CSS declarations, document title, various meta tags, etc. <head>[Head section]</head>
- The title is the only part of the Head that is visible - it appears in the upper left corner of the Browser <title>[Title, appears in the upper left of Window]</title>. It is a required element.
- Meta tags are put in the head, with structure like <meta name="robots" content="[noindex|index],[nofollow|follow]">. There are many meta tags that can be used. All META tags must contain:
- a content="" (text string that provides the value information for the META tag, always enclosed in quotes) attribute and
- either a name="" (defines the kind of META tag, always enclosed in quotes, usually containing information for the browser to interpret)
- or HTTP-EQUIV="" (tells the server to include the name="" and value="" pair in the MIME document header passed to the Web browser, used to control or direct the actions of the browser) attribute.
- <meta name="ROBOTS" content="NONE"> well-behaved robots will ignore the page. <meta name="ROBOTS" content="NOARCHIVE"> Google will not archive this page.
- One good meta tag to include is the owner or contact for the web page, e.g.:
<meta name="contact" content="Hutchings, Stan">
- To switch off the IE6 annoying Pictures pop-up toolbar feature for an entire page or site, the following META tag will do the trick:
<meta http-equiv="imagetoolbar" content="no" />
- I18n Guy has a good introduction for the most common META tags.
- A character encoding declaration became a requirement with the HTML 4.01 specification. Generally, UTF-8 (8-bit UCS/Unicode Transformation Format) is a good choice: it can represent any character in the Unicode standard and is backwards compatible with ASCII. However, if you find you need to support the Asia-Pacific character sets, then use either a language-specific encoding or UTF-16 (most browsers will handle UTF-16, just not as well as UTF-8).
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
- The Body contains the parts rendered by a browser, and the Tags that instruct the browser how to render them <body>[Body section]</body>
- New for XHTML:
- All element and attribute names must be in lowercase
- All attribute values must be quoted
- All non-empty elements must be terminated (In addition to the <p> element, this also applies to list elements which are often left unterminated: <li></li>, <dt></dt>, and <dd></dd>)
- Elements must nest, not overlap
- All documents must have a doctype declaration
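Putting the pieces of this section together, a minimal page might look like the sketch below. It uses the transitional doctype and the meta tags shown above; the title, heading, and paragraph text are placeholders.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html lang="en">
<head>
  <!-- Character encoding declaration; put it early in the head -->
  <meta http-equiv="content-type" content="text/html; charset=UTF-8">
  <!-- Required: shown in the browser's title bar, not in the page -->
  <title>Page title goes here</title>
  <meta name="robots" content="index,follow">
  <meta name="contact" content="Hutchings, Stan">
</head>
<body>
  <!-- Everything rendered by the browser goes in the body -->
  <h1>Main heading</h1>
  <p>First paragraph of content.</p>
</body>
</html>
```

For XHTML the same skeleton applies, but an XHTML doctype is needed and empty elements are closed with a trailing slash, for example <meta ... /> and <br />.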
Body Attributes
The BODY of a document contains the document's content. The content may be rendered by a user agent (e.g. a browser) in a variety of ways. For example, for visual browsers, you can think of the BODY as a canvas where the content appears: text, images, colors, graphics, etc. For audio user agents, the same content may be spoken. Since style sheets are now the preferred way to specify a document's presentation, the presentational attributes of BODY have been deprecated.
- bgcolor="#FFFFFF" sets the background color (default white)
- text="color" sets the text color (default black)
- background="URL" sets a background image (default is none)
- link="color" sets the color of unvisited links (default blue)
- alink="color" sets the color of the active link
- vlink="color" sets the color of visited links (default purple in most browsers)
- onload="Script" runs a JavaScript program when the page loads
- onunload="Script" runs a JavaScript program when you leave the page
- Since the background attributes are deprecated, see W3's advice for using CSS instead of attributes.
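As a rough sketch, the deprecated BODY attributes listed above map onto CSS like this. The style block goes in the head; the image path is hypothetical and the colors are only examples.

```html
<style type="text/css">
  body {
    background-color: #ffffff;                  /* replaces bgcolor */
    color: #000000;                             /* replaces text */
    background-image: url("images/paper.gif");  /* replaces background */
  }
  a:link    { color: #0000ff; }  /* replaces link */
  a:visited { color: #800080; }  /* replaces vlink */
  a:active  { color: #ff0000; }  /* replaces alink */
</style>
```

The order link, visited, active matters: when several selectors match, the later rule wins.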
Tags (Controls) format the Content
- (Note: many of these are deprecated in HTML 4.01; see CSS Replacements for alternatives)
- Headings <h#>Heading Text</h#> # = 1 - 6, 1 is largest, 4 is "normal text" size
- Bigger text <big>Bigger text</big> Note: not formally deprecated in HTML 4.01, but discouraged; use CSS instead
- Smaller text <small>Smaller text</small> Note: not formally deprecated in HTML 4.01, but discouraged; use CSS instead
- Paragraph <p>Paragraph text</p> align="left|right|center|justify" It cannot contain block-level elements. Note: align is a deprecated attribute, use CSS instead (especially for inline elements, such as IMG).
- Break <br>; to clear alignment, <br clear="left|right|all">. End tag: forbidden. Use <br /> in XHTML.
- Center <center>Text to center</center> Note: deprecated tag, use CSS instead
- Bold <b>Text to Bold</b> Note: not formally deprecated in HTML 4.01, but discouraged; use CSS instead
- Strong <strong>use when a page element needs to be emphasized, an important keyword or topic</strong>. Strong and emphasis are inline structural tags used to indicate the most important text on the page, but bold and italic are HTML formatting tags.
- Italic <i>Text to italicize</i> Note: not formally deprecated in HTML 4.01, but discouraged; use CSS instead
- Emphasis <em>use when a page element needs to be emphasized, an important keyword or topic</em>
- Citation <cite>Citation text</cite>
- <address>text using the address tag</address> (note: ADDRESS is a block element)
- Blockquote for longer quotations. This is not an inline tag, so it will create a line before and after the text.
<blockquote cite="http://www.mycom.com/tolkien/twotowers.html"> Text in blockquote (which could be the section referenced in the cite attribute). </blockquote>
- Preformatted text preserves the original spaces and tabs (HTML otherwise collapses spaces and tabs, destroying some formatting that may be desired). No width is set by default. It is a block element, so preceding and following text is separated with a line.
<pre width="#">Text which should have its spacing and formatting preserved</pre>
- Abbreviations WWW <abbr title="World Wide Web">WWW</abbr>. abbr is not supported in all browsers (if you don't see a dotted line under the abbreviation, it isn't supported by yours).
- Acronyms WWW <acronym title="World Wide Web">WWW</acronym>. On mouse-over, the expanded definition in title is popped up. Acronym is not supported the same in all browsers. Use CSS to ensure cross-browser compatibility; for example, <style type="text/css"> acronym {cursor:help; border-style:dotted; border-width:1px;}</style>, or put into external CSS file (as is in spaugstyle.css).
- Computer Code (displayed in a monospace font) <code>Code text</code>
- Underline <u>Underlined text</u> Note: deprecated tag, use CSS instead
- Strike-through <s>Strike-through text</s> Note: deprecated tag, use CSS instead
- Insert or delete <ins>inserted text</ins> and <del>deleted text</del> are used to mark up sections of the document that have been inserted or deleted with respect to a different version of the document. They can contain either block-level or inline elements.
- Superscript <sup>Superscript text</sup>; for example, E = mc<sup>2</sup> renders as E = mc². Note: sup is not deprecated; a CSS alternative is <span style="vertical-align:super; font-size:60%">2</span>
- Subscript <sub>subscript text</sub>; for example, H<sub>2</sub>O renders as H₂O
- Literal text (teletype) <tt>Fixed font text (re-defined by CSS for this document)</tt> Note: discouraged in HTML 4.01, use CSS instead; e.g. font-size:90%; font-family:'courier new', courier, monospace; color:#006633; background:transparent;
- 92 characters can be pasted as-is: 52 letter characters (26 lower-case letters, 26 upper-case letters), 10 digits, and the following 30 punctuation marks:
` - = [ ] \ ; * , . / ~ ! @ # $ % ^ ( ) _ + { } | : ? | " '
- Character references in HTML may appear in two forms: numeric character references (either decimal &#D; or hexadecimal &#XH; format) and character entity references. Any "special" character will need to be described in one of three ways:
- using the numeric syntax "&#D;", where D is a decimal number, refers to the ISO 10646 decimal character number D or using the syntax "&#xH;" or "&#XH;", where H is a hexadecimal number referring to the ISO 10646 hexadecimal character number H;
- using character entity references (these use symbolic names so that authors need not remember code positions; character references within comments have no special meaning, they are comment data only). See www.bbsinc.com/iso8859.html (good visual display) for more information.
- as a very last resort, as a small .GIF image, which is useful in some cases, such as hiding email addresses from bots.
See also ISO 8859-1 (Latin-1) Characters List and Web Characters Reference at Visibone, which includes thumbnails of all special HTML characters as they appear on your system. Another reference is the Unicode Homepage. If you want the alpha name for a character (not all characters have a name), look at www.visibone.com/htmlref/char/ceralpha.htm.
Note: the ellipsis was previously written &#133;, but now &#8230; or &hellip; is required; the m-dash was previously written &#151;, and now &#8212; or &mdash; is required. Here's a checked box, <span style="font-family:wingdings">þ</span>, using the font-family:wingdings style. Unfortunately, not all browsers support these codes; if you see something strange when they are rendered, yours doesn't!
- OS support for other languages - Windows 2000 uses Unicode natively (UTF-16 little-endian) and supports it completely. The Windows 2000 MultiLanguage Version (MUI) allows users, for the first time, to select the language of the User Interface (dialog boxes, menus, HTML help files, etc.). Windows XP provides all of the Windows 2000 multilingual experience to home users for the first time; all flavors of Windows XP (Personal, Professional, and Server) provide the same level of international support.
- You can open an RTF-formatted document of the Special characters up to ASCII 255, which includes the number and the character entity (if it exists).
- The following 4 characters should be written with their entity names to avoid confusion on the browser's part: ampersand (&) &amp;; greater-than symbol (>) &gt;; less-than symbol (<) &lt;; quotation marks (") &quot;. Another character worth remembering is the non-breaking space, &nbsp;. Browsers usually don't care about white space, but accurate spacing can be achieved by specifically encoding your spaces as non-breaking spaces.
- Horizontal rule <hr width="#%" align="left|right|center" size="#">
- Comment <!-- a comment is not rendered by a browser -->
- Non-breaking space &nbsp; or &#160; acts as a non-breaking space where no line break is wanted; also, a "space holder" to add space.
- Division <div name="">Division text</div> This is a structure division, usually used for block elements (styled here as 10 pt blue Arial).
- Span tag to enclose <span id="bldblu">a short inline (not block) section</span> for CSS formatting (not supported by some early browsers)
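To tie a few of the items above together, here is a short sketch that replaces some deprecated presentation tags with CSS classes on span and p, and shows the three ways of writing a special character. The class names are invented for the example.

```html
<style type="text/css">
  .centered   { text-align: center; }             /* instead of <center> */
  .underlined { text-decoration: underline; }     /* instead of <u> */
  .struck     { text-decoration: line-through; }  /* instead of <s> */
</style>

<p class="centered">Centered with CSS</p>
<p><span class="underlined">underlined</span> and
   <span class="struck">struck through</span></p>

<!-- The four characters that should always be written as entities -->
<p>Fish &amp; chips; 3 &lt; 5 and 5 &gt; 3; she said &quot;hello&quot;.</p>

<!-- One character three ways: entity name, decimal, hexadecimal -->
<p>Copyright sign: &copy; &#169; &#xA9;</p>

<!-- A non-breaking space keeps the number and its unit on one line -->
<p>Distance: 10&nbsp;km</p>
```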
Website structure
Folder hierarchy is optional. It can make some housekeeping chores easier, but hyperlinks become a little more complicated. Duplicate the hierarchy of the website onto your hard drive, and test the links. Suggestion: create separate folders for images, sounds, multimedia, and styles, as applicable, and in addition a separate subdirectory for each major topic your website will contain.
Font Attributes
Note: the FONT tag is now officially deprecated in HTML 4.0, even though it is still very common. It is better to control various font attributes with Cascading Style Sheets (CSS) instead.
Font <font face="times new roman">Times New Roman text</font> Use of FONT is risky; don't use it unless necessary. If you designate fonts, make sure to indicate second and third choices that are common: face="Verdana,Arial,Helvetica". Common PC fonts: Verdana, Arial, Helvetica, "Courier New", Palatino, "Times New Roman"; Common Unix fonts: Helvetica, Times; Common Macintosh fonts: Helvetica, Courier, Palatino, Times, Verdana, Arial. Generic: sans-serif, serif, cursive.
Font color
Note: the color attribute of the FONT tag is now officially deprecated in HTML 4.0, even though it is still very common. It is better to control various color attributes with Cascading Style Sheets (CSS) instead.
Color <font color="#000000|color_name">Text to be colored</font> The 16 common colors with names and their sRGB values (the name or value can be used in the quoted value for color):
black="#000000"; green="#008000"; silver="#c0c0c0"; lime="#00ff00"; gray="#808080"; olive="#808000"; white="#ffffff"; yellow="#ffff00"; maroon="#800000"; navy="#000080"; red="#ff0000"; blue="#0000ff"; purple="#800080"; teal="#008080"; fuchsia="#ff00ff"; aqua="#00ffff".
In addition, there are another 124 named colors that most browsers understand and can display. These include the standard colors and some with more exotic names, such as cornflower blue, papayawhip, and peachpuff. There are swatches of the 140 colors and their names at MountainDragon's Web site.
An important note on color: currently web browsers only share 216 common colors. When designing key elements in your web site you should stay within the 216-color palette. If you go outside the 216-color palette, you start to use colors that do not exist within that browser, and the browser has to mix the colors that do not exist. Some displays will distort the tiny dots used to create the "dithered" color to the point where the image is so speckled that it does not appear to be a solid color. This makes text very hard to read if it is placed over the dithered color. You should always use a browser-safe color when using solid color as a design element, or as a background for text. Use yellow and red sparingly, and only in areas where you want the visitor to focus. For a good display of the 216-color browser-safe palette, visit the Color Lab at VisiBone. For an extended color chart, see HYPE's 360+ Color Specifier. For a lot more information on using color, visit About.com and search on web colors. See also the article Splash Color onto Your Web Pages by Darrell Elmore at ZDNET. John Buck suggests Web Designer Community at SitePro Central, which is also interesting to play with: the sliders let you adjust the color to your liking, then you can use the Hex RGB code. There are other website-building tools on its home page. Found on digg.
Size <font size="#">Font whose size is changed</font> 1=smallest, 7=largest. Relative is +1|2|3|4 or -1|2|3|4. The actual size depends on browser and user-set preferences. See http://www.netmechanic.com/news/vol3/design_no8.htm
Combined face, color, size <font size="1" color="#000000" face="times new roman">affected text</font>.
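Since FONT is deprecated, the combined example above could be approximated with a CSS class, as in this sketch. The class name is made up, and x-small is only a rough stand-in for size="1", because the actual rendered size depends on the browser and user preferences.

```html
<style type="text/css">
  .fineprint {
    font-family: "Times New Roman", Times, serif; /* face, with fallbacks */
    color: #000000;                               /* color */
    font-size: x-small;                           /* roughly size="1" */
  }
</style>

<p class="fineprint">affected text</p>
```

Listing fallbacks and ending with a generic family (serif here) follows the same advice as giving second and third choices in the face attribute.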
Font Type, Face and White Space
Here is some advice from Gregory McNamee -- see the full article at www.webreview.com/2001/03_23/designers/index01.shtml. Choose your font to match the purpose of your content. For long pieces of text, a serif face is preferred; for headlines, display type, pull quotes, click-here-to-purchase callouts, and any other material that you want your reader to linger over, use a sans serif font. Some modern digital sans serif fonts -- Arial, Optima, Verdana, and Helvetica -- also have the roundness and easily distinguished letterforms of the serif counterparts, and they're well suited for on-screen presentation of all kinds of writing.
Most serif fonts are designed to be optimally readable at 10 to 14 points; most sans serif fonts can be reduced a point or two below that and remain legible.
Allow ample margins around your text -- at least an inch, or even two, on all sides. Ideally, legible paragraphs will have only ten to twelve words per line, a feat best accomplished not by blowing up the type to larger sizes (use large type only to emphasize headlines and other important elements), but by affording your paragraphs plenty of breathing space.
Allow plenty of white space both in the margins and between lines of type. This between-line spacing (called "leading") helps the type stand out against the background. Another rule of thumb is to allow leading of at least 25 percent of the type size; that is, with 10 point type use at least 2.5 points of leading, with 12 point type use at least 3 points.
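One way to express this advice in CSS is sketched below. The class name, the font choice, and the 32em line length are assumptions for illustration, not values from the article; 32em is only a rough approximation of ten to twelve words per line, and older browsers may ignore max-width.

```html
<style type="text/css">
  .article {
    font-family: Georgia, "Times New Roman", serif; /* serif for long text */
    font-size: 12pt;
    line-height: 1.25;  /* 12pt type + 3pt leading = 15pt line (25 percent) */
    margin: 1in;        /* at least an inch on all sides */
    max-width: 32em;    /* keeps lines short without enlarging the type */
  }
</style>

<div class="article">
  <p>Long passages of body text go here.</p>
</div>
```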
Accessibility
Consider how your site will appear on a PDA, to a blind person's browser, to a cell phone, etc. Accessibility needs to be considered and coded into your site.
- To have your pages checked, go to Bobby and enter the URL of the page(s) you want checked. Consider whether the suggestions are necessary for your purpose.
- Identify the language of the text: <html lang="EN">. The value of the LANG attribute must be set to one of the ISO 639 language codes.
- Add a title="purpose and content of frame" attribute to each FRAME element to describe the purpose and content of the FRAME.
- Provide alternative content for each SCRIPT that conveys important information or functionality. For example, after a JavaScript that gives today's scores,
<NOSCRIPT>
<P>To access today's scores, <A href="scores.html">visit our text-only version.</A></P>
</NOSCRIPT>
- Separate adjacent links with more than whitespace. Images or bulleted or numbered lists are good choices. "Whitespace characters", such as spaces, line breaks, carriage returns, and paragraph breaks, are not sufficient.
- Use the ABBR and ACRONYM elements to denote and expand any abbreviations and acronyms that are present. For example,
<ABBR title="California">CA</ABBR>
<ACRONYM title="World Wide Web">WWW</ACRONYM>
- Avoid specifying a new window as the target of a link with the target="_blank" or target="_new" attributes of links unless users know that a new window will open.
- If you use color to convey information, make sure the information is also represented another way.
- Add a description to a FRAME if the TITLE does not describe its contents. Use the attribute longdesc="filename.txt" to link to a verbose explanation.
- Mark up any quotations with the Q and BLOCKQUOTE elements.
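Several of the points above can be seen together in the fragments below. This is only a sketch: the file names, URLs, and text are hypothetical, and the frameset fragment belongs in a separate frameset document rather than in a page body.

```html
<!-- Adjacent links in a list, so they are separated by more than whitespace -->
<ul>
  <li><a href="scores.html">Scores</a></li>
  <li><a href="schedule.html">Schedule</a></li>
</ul>

<!-- Abbreviations and acronyms expanded with title -->
<p>The office in <abbr title="California">CA</abbr> maintains the
<acronym title="World Wide Web">WWW</acronym> pages.</p>

<!-- Short quotations use q, longer ones use blockquote -->
<p>The style guide says <q>close all tags</q>.</p>
<blockquote cite="http://www.example.com/source.html">
  <p>A longer quoted passage goes here.</p>
</blockquote>

<!-- Tell the user when a link opens a new window -->
<p><a href="help.html" target="_blank">Help (opens in a new window)</a></p>

<!-- In a frameset document: title and longdesc on each frame -->
<frameset cols="20%,80%">
  <frame src="nav.html" title="Site navigation links"
         longdesc="nav-description.txt">
  <frame src="main.html" title="Main content">
</frameset>
```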
