
A Beginner's Guide to HTML Part I: (a brief reference)


You can't get too far in ASP without an intimate knowledge of HTML, so this tutorial will take a newbie through the ABC's of HTML...one step at a time. It's also a great reference for pros who forget how to use little known tags!
By pubs@ncsa.uiuc.edu

Original Author: Found on the World Wide Web


A Beginner's Guide to HTML


This is a primer for producing documents in HTML, the hypertext markup
language used on the World Wide Web. This guide is intended to be an
introduction to using HTML and creating files for the Web. Links are
provided to additional information. You should also check your local
bookstore; there are many volumes about the Web and HTML that could be
useful.


   * Getting Started

        o Terms to Know

        o What Isn't Covered

        o HTML Version

   * HTML Documents

        o What an HTML Document Is

        o Tags Explained

        o The Minimal HTML Document

        o A Teaching Tool

   * Markup Tags

        o HTML

        o HEAD

        o TITLE

        o BODY

        o Headings

        o Paragraphs

        o Lists

        o Preformatted Text

        o Extended Quotations

        o Addresses

        o Forced Line Breaks/Postal Addresses

        o Horizontal Rules

   * Character Formatting

        o Logical Versus Physical Styles

        o Escape Sequences

   * Linking

        o Relative Pathnames Versus Absolute Pathnames

        o URLs

        o Links to Specific Sections

        o Mailto

   * Inline Images

        o Image Size Attributes

        o Aligning Images

        o Alternate Text for Images

        o Background Graphics

        o Background Color

        o External Images, Sounds, and Animations

   * Tables

        o Table Tags

        o General Table Format

        o Tables for Nontabular Information

   * Fill-out Forms

   * Troubleshooting

        o Avoid Overlapping Tags

        o Embed Only Anchors and Character Tags

        o Do the Final Steps

        o Commenting Your Files

   * For More Information

        o Style Guides

        o Other Introductory Documents

        o Additional Online References



----------------------------------------------------------------------------


                             
Getting Started


Terms to Know


WWW  World Wide Web

Web  World Wide Web

SGML
     Standard Generalized Markup Language--a standard for describing
     markup languages

DTD  Document Type Definition--this is the formal specification of a
     markup language, written using SGML

HTML
     HyperText Markup Language--HTML is an SGML DTD

     In practical terms, HTML is a collection of platform-independent
     styles (indicated by markup tags) that define the various
     components of a World Wide Web document. HTML was invented by Tim
     Berners-Lee while at CERN, the European Laboratory for Particle
     Physics in Geneva.


What Isn't Covered


This primer assumes that you:


   * know how to use NCSA Mosaic or some other Web browser

   * have a general understanding of how Web servers and client browsers
     work

   * have access to a Web server (or that you want to produce HTML
     documents for personal use in local-viewing mode)


HTML Version


This guide reflects the most current specification at the time of
writing--HTML Version 2.0--plus some additional features that have been
widely and consistently implemented in browsers. Future versions and new
features for HTML are under development.


                              
HTML Documents


What an HTML Document Is


HTML documents are plain-text (also known as ASCII)
files that can be

created using any text editor (e.g., Emacs or vi on UNIX machines; BBEdit on

a Macintosh; Notepad on a Windows machine). You can also use word-processing

software if you remember to save your document as "text only with line

breaks."


Tags Explained


An element is a fundamental component of the structure
of a text document.

Some examples of elements are heads, tables, paragraphs, and lists. Think of

it this way: you use HTML tags to mark the elements of a file for your

browser. Elements can contain plain text, other elements, or both.


To denote the various elements in an HTML document, you
use tags. HTML tags

consist of a left angle bracket (<), a tag name, and a right angle bracket

(>). Tags are usually paired (e.g., <H1> and </H1>) to start and
end the tag

instruction. The end tag looks just like the start tag except a slash (/)

precedes the text within the brackets. HTML tags are listed below.


Some elements may include an attribute, which is
additional information that

is included inside the start tag. For example, you can specify the alignment

of images (top, middle, or bottom) by including the appropriate attribute

with the image source HTML code. Tags that have optional attributes are

noted below.
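
As a small illustration of an attribute inside a start tag, the sketch
below aligns an inline image with the top of the surrounding text. The
file name logo.gif is a placeholder assumed for this example; it is not a
file referred to elsewhere in this guide.

    <IMG SRC="logo.gif" ALIGN=TOP>

Here SRC and ALIGN are both attributes of the image tag, and TOP could be
swapped for MIDDLE or BOTTOM.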


NOTE: HTML is not case sensitive. <title> is
equivalent to <TITLE> or

<TiTlE>. There are a few exceptions noted in Escape Sequences below.


Not all tags are supported by all World Wide Web
browsers. If a browser does

not support a tag, it (usually) just ignores it.


The Minimal HTML Document


Every HTML document should contain certain standard HTML
tags. Each document

consists of head and body text. The head contains the title, and the body

contains the actual text that is made up of paragraphs, lists, and other

elements. Browsers expect specific information because they are programmed

according to HTML and SGML specifications.


Required elements are shown in this sample bare-bones
document:


    <html>

    <head>

    <TITLE>A Simple HTML Example</TITLE>

    </head>

    <body>

    <H1>HTML is Easy To Learn</H1>

    <P>Welcome to the world of HTML.

    This is the first paragraph. While short it is

    still a paragraph!</P>

    <P>And this is the second paragraph.</P>

    </body>

    </html>


The required elements are the <html>,
<head>, <title>, and <body> tags (and

their corresponding end tags). Because you should include these tags in each

file, you might want to create a template file with them. (Some browsers

will format your HTML file correctly even if these tags are not included.

But some browsers won't! So make sure to include them.)
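
A template file along these lines might look like the following sketch.
The title, heading, and paragraph text are placeholders assumed for
illustration; replace them with your own content each time you start a
new document.

    <HTML>
    <HEAD>
    <TITLE>Replace with your document title</TITLE>
    </HEAD>
    <BODY>
    <H1>Replace with your main heading</H1>
    <P>Replace with your first paragraph.</P>
    </BODY>
    </HTML>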


Click to see the formatted version of the example. A
longer example is also

available but you should read through the rest of the guide before you take

a look. This longer-example file contains tags explained in the next

section.


A Teaching Tool


To see a copy of the file that your browser reads to
generate the

information in your current window, select View Source (or the equivalent)

from the browser menu. The file contents, with all the HTML tags, are

displayed in a new window.


This is an excellent way to see how HTML is used and to
learn tips and

constructs. Of course, the HTML might not be technically correct. Once you

become familiar with HTML and check the many online and hard-copy references

on the subject, you will learn to distinguish between "good" and
"bad" HTML.


Remember that you can save a source file with the HTML
codes and use it as a

template for one of your Web pages or modify the format to suit your

purposes.


                             
   Markup Tags


HTML


This element tells your browser that the file contains HTML-coded
information. The file extension .html also indicates that this is an HTML
document and must be used. (If you are restricted to 8.3 filenames, e.g.,
LeeHome.htm, use only .htm for your extension.)


HEAD


The head element identifies the first part of your
HTML-coded document that

contains the title. The title is shown as part of your browser's window (see

below).


TITLE


The title element contains your document title and
identifies its content in

a global context. The title is displayed somewhere on the browser window

(usually at the top), but not within the text area. The title is also what

is displayed on someone's hotlist or bookmark list, so choose something

descriptive, unique, and relatively short. A title is also used during a

WAIS search of a server.


For example, you might include a shortened title of a
book along with the

chapter contents: NCSA Mosaic Guide (Windows): Installation. This tells the

software name, the platform, and the chapter contents, which is more useful

than simply calling the document Installation. Generally you should keep

your titles to 64 characters or fewer.
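
Coded in the head of a document, the title from this example would
probably be written along these lines (a minimal sketch; only the head
portion of the file is shown):

    <HEAD>
    <TITLE>NCSA Mosaic Guide (Windows): Installation</TITLE>
    </HEAD>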


BODY


The second--and largest--part of your HTML document is
the body, which

contains the content of your document (displayed within the text area of

your browser window). The tags explained below are used within the body of

your HTML document.


Headings


HTML has six levels of headings, numbered 1 through 6,
with 1 being the most

prominent. Headings are displayed in larger and/or bolder fonts than normal

body text. The first heading in each document should be tagged <H1>.


The syntax of the heading element is:

<Hy>Text of heading </Hy>

where y is a number between 1 and 6 specifying the level of the heading.


Do not skip levels of headings in your document. For
example, don't start

with a level-one heading (<H1>) and then next use a level-three
(<H3>)

heading.
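
The sketch below shows one way a heading outline can step down a level at
a time. The heading text is invented for illustration and does not come
from any real document.

    <H1>Fruit Growers' Handbook</H1>
    <H2>Apples</H2>
    <H3>Planting</H3>
    <H3>Harvesting</H3>
    <H2>Grapes</H2>

Each <H3> sits under an <H2>, and each <H2> sits under the single <H1>,
so no level is skipped on the way down. Returning from <H3> to <H2> to
start a new section is fine.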


Paragraphs


Unlike documents in most word processors, carriage
returns in HTML files

aren't significant. So you don't have to worry about how long your lines of

text are (better to have them fewer than 72 characters long though). Word

wrapping can occur at any point in your source file, and multiple spaces are

collapsed into a single space by your browser.


In the bare-bones example shown in the Minimal HTML
Document section, the

first paragraph is coded as


    <P>Welcome to the world of HTML.
    This is the first paragraph.
    While short it is
    still a paragraph!</P>


In the source file there is a line break between the
sentences. A Web

browser ignores this line break and starts a new paragraph only when it

encounters another <P> tag.


Important: You must indicate paragraphs with <P> elements. A browser
ignores any indentations or blank lines in the source text. Without <P>
elements, the document becomes one large paragraph. (One exception is text
tagged as "preformatted," which is explained below.) For example, the
following would produce the same paragraph formatting as the first
bare-bones HTML example (only the heading text differs):


    <H1>Level-one heading</H1> <P>Welcome to the world of HTML. This is the
    first paragraph. While short it is still a
    paragraph! </P> <P>And this is the second paragraph.</P>


To preserve readability in HTML files, put headings on
separate lines, use a

blank line or two where it helps identify the start of a new section, and

separate paragraphs with blank lines (in addition to the <P> tags). These

extra spaces will help you when you edit your files (but your browser will

ignore the extra spaces because it has its own set of rules on spacing that

do not depend on the spaces you put in your source file).


NOTE: The </P> closing tag can be omitted. This is
because browsers

understand that when they encounter a <P> tag, it implies that there is an

end to the previous paragraph.
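
As a sketch of the shorthand described in this note, the two paragraphs
from the bare-bones example could be written without end tags. Each new
<P> start tag implicitly closes the paragraph before it.

    <P>Welcome to the world of HTML.
    This is the first paragraph. While short it is
    still a paragraph!
    <P>And this is the second paragraph.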


Using the <P> and </P> as a paragraph container means that you can center
a paragraph by including the ALIGN=alignment attribute in your source file.


    <P ALIGN=CENTER>
    This is a centered paragraph. [See the formatted version below.]
    </P>


                      This is a centered paragraph.


Lists


HTML supports unnumbered, numbered, and definition
lists. You can nest lists

too, but use this feature sparingly because too many nested items can get

difficult to follow.


Unnumbered Lists


To make an unnumbered, bulleted list,


  1. start with an opening list <UL> (for unnumbered list) tag

  2. enter the <LI> (list item) tag followed by the individual item; no
     closing </LI> tag is needed

  3. end the entire list with a closing list </UL> tag


Below is a sample three-item list:


    <UL>

    <LI> apples

    <LI> bananas

    <LI> grapefruit

    </UL>


The output is:


   * apples

   * bananas

   * grapefruit


The <LI> items can contain multiple paragraphs.
Indicate the paragraphs with

the <P> paragraph tags.
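
For instance, a list item holding two paragraphs might be coded as in the
sketch below. The item text is invented for illustration.

    <UL>
    <LI> <P>Apples keep well in cold storage.</P>
         <P>Store them away from other fruit.</P>
    <LI> <P>Bananas ripen quickly at room temperature.</P>
    </UL>

Both paragraphs of the first item appear under a single bullet.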


Numbered Lists


A numbered list (also called an ordered list, from which the tag name
derives) is identical to an unnumbered list, except it uses <OL> instead
of <UL>. The items are tagged using the same <LI> tag. The following HTML
code:


    <OL>

    <LI> oranges

    <LI> peaches

    <LI> grapes

    </OL>


produces this formatted output:


  1. oranges

  2. peaches

  3. grapes


Definition Lists


A definition list (coded as <DL>) usually consists of alternating a
definition term (coded as <DT>) and the definition of that term (coded as
<DD>). Web browsers generally format the definition on a new line.


The following is an example of a definition list:


    <DL>
    <DT> NCSA
    <DD> NCSA, the National Center for Supercomputing Applications,
         is located on the campus of the University of Illinois
         at Urbana-Champaign.
    <DT> Cornell Theory Center
    <DD> CTC is located on the campus of Cornell University in Ithaca,
         New York.
    </DL>


The output looks like:


NCSA
     NCSA, the National Center for Supercomputing Applications, is located
     on the campus of the University of Illinois at Urbana-Champaign.

Cornell Theory Center
     CTC is located on the campus of Cornell University in Ithaca, New York.


The <DT> and <DD> entries can contain
multiple paragraphs (indicated by <P>

paragraph tags), lists, or other definition information.
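
As a rough sketch of a definition that holds more than plain text, the
<DD> entry below contains a paragraph followed by an unnumbered list. The
term and its items are invented for illustration.

    <DL>
    <DT> Citrus
    <DD> <P>Fruit with a thick rind, for example:</P>
         <UL>
         <LI> oranges
         <LI> grapefruit
         </UL>
    </DL>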


The COMPACT attribute can be used when your definition terms are very
short. If, for example, you are showing some computer options, the options
may fit on the same line as the start of the definition.


<DL COMPACT>

<DT> -i

<DD>invokes NCSA Mosaic for Microsoft Windows using the

initialization file defined in the path

<DT> -k

<DD>invokes NCSA Mosaic for Microsoft Windows in kiosk mode

</DL>


The output looks like:


-i   invokes NCSA Mosaic for Microsoft Windows using the initialization
     file defined in the path.

-k   invokes NCSA Mosaic for Microsoft Windows in kiosk mode.


Nested Lists


Lists can be nested. You can also have a number of
paragraphs, each

containing a nested list, in a single list item.


Here is a sample nested list:


    <UL>

    <LI> A few New England states:

        <UL>

        <LI> Vermont

        <LI> New Hampshire

        <LI> Maine

        </UL>

    <LI> Two Midwestern states:

        <UL>

        <LI> Michigan

        <LI> Indiana

        </UL>

    </UL>


The nested list is displayed as


   * A few New England states:

        o Vermont

        o New Hampshire

        o Maine

   * Two Midwestern states:

        o Michigan

        o Indiana


Preformatted Text


Use the <PRE> tag (which stands for
"preformatted") to generate text in a

fixed-width font. This tag also makes spaces, new lines, and tabs

significant (multiple spaces are displayed as multiple spaces, and lines

break in the same locations as in the source HTML file). This is useful for

program listings, among other things. For example, the following lines:


    <PRE>

      #!/bin/csh

      cd $SCR

      cfs get mysrc.f:mycfsdir/mysrc.f

      cfs get myinfile:mycfsdir/myinfile

      fc -02 -o mya.out mysrc.f

      mya.out

      cfs save myoutfile:mycfsdir/myoutfile

      rm *

    </PRE>


display as:


      #!/bin/csh

      cd $SCR

      cfs get mysrc.f:mycfsdir/mysrc.f

      cfs get myinfile:mycfsdir/myinfile

      fc -02 -o mya.out mysrc.f

      mya.out

      cfs save myoutfile:mycfsdir/myoutfile

      rm *


The <PRE> tag can be used with an optional WIDTH
attribute that specifies

the maximum number of characters for a line. WIDTH also signals your browser

to choose an appropriate font and indentation for the text.
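
A preformatted block with the WIDTH attribute might look like the sketch
below. The value 40 and the table contents are arbitrary choices made for
this illustration, and how a given browser acts on WIDTH may vary.

    <PRE WIDTH=40>
    Item        Quantity
    apples            12
    bananas            6
    </PRE>

Because spaces are significant inside <PRE>, the two columns stay aligned
when displayed.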


Hyperlinks can be used within <PRE> sections. You
should avoid using other

HTML tags within <PRE> sections, however.


Note that because <, >, and & have special
meanings in HTML, you must use

their escape sequences (&lt;, &gt;, and &amp;, respectively) to
enter these

characters. See the section Escape Sequences for more information.
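
For example, a line of program code containing all three special
characters could be entered as in this sketch (the code fragment itself is
invented for illustration):

    <PRE>
    if (a &lt; b &amp;&amp; b &gt; c)
        max = b;
    </PRE>

A browser displays the first line as: if (a < b && b > c)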


Extended Quotations


Use the <BLOCKQUOTE> tag to include lengthy quotations in a separate block
on the screen. Most browsers change the margins for the quotation to
separate it from surrounding text.


In the example:


    <BLOCKQUOTE>
    <P>Omit needless words.</P>
    <P>Vigorous writing is concise. A sentence should contain no
    unnecessary words, a paragraph no unnecessary sentences, for the
    same reason that a drawing should have no unnecessary lines and a
    machine no unnecessary parts.</P>
    --William Strunk, Jr., 1918
    </BLOCKQUOTE>


the result is:


     Omit needless words.

     Vigorous writing is concise. A sentence should contain no
     unnecessary words, a paragraph no unnecessary sentences, for the
     same reason that a drawing should have no unnecessary lines and a
     machine no unnecessary parts.

     --William Strunk, Jr., 1918


Addresses


The <ADDRESS> tag is generally used to specify the
author of a document, a

way to contact the author (e.g., an email address), and a revision date. It

is usually the last item in a file.


For example, the last line of the online version of this guide is:


    <ADDRESS>
    A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu / revised April 96
    </ADDRESS>


The result is:

A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu / revised April 96


NOTE: <ADDRESS> is not used for postal addresses.
See "Forced Line Breaks"

below to see how to format postal addresses.


Forced Line Breaks/Postal Addresses


The <BR> tag forces a line break with no extra (white) space between
lines. Using <P> elements for short lines of text such as postal addresses
results in unwanted additional white space. For example, with <BR>:


    National Center for Supercomputing Applications<BR>
    605 East Springfield Avenue<BR>
    Champaign, Illinois 61820-5518<BR>


The output is:


National Center for Supercomputing Applications

605 East Springfield Avenue

Champaign, Illinois 61820-5518


Horizontal Rules


The <HR> tag produces a horizontal line the width
of the browser window. A

horizontal rule is useful to separate sections of your document. For

example, many people add a rule at the end of their text and before the

<address> information.
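
A document ending of that kind could be sketched as follows. The closing
paragraph is placeholder text assumed for this example; the address line
is the one shown in the Addresses section.

    <P>This is the last paragraph of the document.</P>
    <HR>
    <ADDRESS>
    A Beginner's Guide to HTML / NCSA / pubs@ncsa.uiuc.edu / revised April 96
    </ADDRESS>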


You can vary a rule's size (thickness) and width (the
percentage of the

window covered by the rule). Experiment with the settings until you are

satisfied with the presentation. For example:


<HR SIZE=4 WIDTH="50%">


displays as:

                  
--------------------------------------


                   
        Character Formatting


HTML has two types of styles for individual words or sentences: logical
and physical. Logical styles tag text according to its meaning, while
physical styles indicate the specific appearance of a section. For example,
in the preceding sentence, the words "logical styles" were tagged as a
"definition." The same effect (formatting those words in italics) could
have been achieved via a different tag that tells your browser to "put
these words in italics."


NOTE: Some browsers don't attach any style to the <DFN>
tag, so you might

not see the indicated phrases in the previous paragraph in italics.
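
The two approaches can be compared side by side in the sketch below. The
first line uses the logical <DFN> tag and the second uses the physical <I>
tag; in a browser that renders <DFN> in italics, both lines look the same.

    <DFN>Logical styles</DFN> tag text according to its meaning.
    <I>Logical styles</I> tag text according to its meaning.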


Logical Versus Physical Styles


If physical and logical styles produce the same result
on the screen, why

are there both?


In the ideal SGML universe, content is divorced from
presentation. Thus SGML

tags a level-one heading as a level-one heading, but does not specify that

the level-one heading should be, for instance, 24-point bold Times centered.

The advantage of this approach (it's similar in concept to style sheets in

many word processors) is that if you decide to change level-one headings to

be 20-point left-justified Helvetica, all you have to do is change the

definition of the level-one heading in your Web browser. Indeed many

browsers today let you define how you want the various HTML tags rendered

on-screen.


Another advantage of logical tags is that they help
enforce consistency in

your documents. It's easier to tag something as <H1> than to remember that

level-one headings are 24-point bold Times centered or whatever. For

example, consider the <STRONG> tag. Most browsers render it in bold text.

However, it is possible that a reader would prefer that these sections be

displayed in red instead. Logical styles offer this flexibility.


Of course, if you want something to be displayed in
italics (for example)

and do not want a browser's setting to display it differently, use physical

styles. Physical styles, therefore, offer consistency in that something you

tag a certain way will always be displayed that way for readers of your

document.


Try to be consistent about which type of style you use.
If you tag with

physical styles, do so throughout a document. If you use logical styles,

stick with them within a document. Keep in mind that future releases of HTML

might not support physical styles, which could mean that browsers will not

display physical style coding.


Logical Styles


<DFN>
     for a word being defined. Typically displayed in italics. (NCSA
     Mosaic is a World Wide Web browser.)

<EM>
     for emphasis. Typically displayed in italics. (Consultants cannot
     reset your password unless you call the help line.)

<CITE>
     for titles of books, films, etc. Typically displayed in italics. (A
     Beginner's Guide to HTML)

<CODE>
     for computer code. Displayed in a fixed-width font. (The <stdio.h>
     header file)

<KBD>
     for user keyboard entry. Typically displayed in plain fixed-width
     font. (Enter passwd to change your password.)

<SAMP>
     for a sequence of literal characters. Displayed in a fixed-width
     font. (Segmentation fault: Core dumped.)

<STRONG>
     for strong emphasis. Typically displayed in bold. (NOTE: Always check
     your links.)

<VAR>
     for a variable, where you will replace the variable with specific
     information. Typically displayed in italics. (rm filename deletes the
     file.)
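
In source form, a few of the parenthetical examples above might be coded
as in the following sketch. The formatting of the original examples is
not visible in plain text, so the choice of which words carry each tag is
an assumption made for illustration.

    <P><DFN>NCSA Mosaic</DFN> is a World Wide Web browser.</P>
    <P>Consultants <EM>cannot</EM> reset your password unless you
    call the help line.</P>
    <P>Enter <KBD>passwd</KBD> to change your password.</P>
    <P><STRONG>NOTE:</STRONG> Always check your links.</P>
    <P><CODE>rm</CODE> <VAR>filename</VAR> deletes the file.</P>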


Physical Styles


<B>  bold text

<I>  italic text

<TT>
     typewriter text, i.e., a fixed-width font
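
A short sketch using all three physical styles in one sentence (the
sentence is invented for illustration):

    <P>This sentence has <B>bold</B>, <I>italic</I>, and
    <TT>typewriter</TT> text.</P>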


Escape Sequences (a.k.a. Character Entities)


Character entities have two functions:


   * escaping special characters

   * displaying other characters not available in the plain ASCII
     character set (primarily characters with diacritical marks)


Three ASCII characters--the left angle bracket (<),
the right angle bracket

(>), and the ampersand (&)--have special meanings in HTML and therefore

cannot be used "as is" in text. (The angle brackets are used to
indicate the

beginning and end of HTML tags, and the ampersand is used to indicate the

beginning of an escape sequence.) Double quote marks may be used as-is but a

character entity may also be used (&quot;).


To use one of the three characters in an HTML document,
you must enter its

escape sequence instead:


&lt;

     the escape sequence for <

&gt;

     the escape sequence for >

&amp;

     the escape sequence for &
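
As an illustration, the sketch below uses all three escape sequences in
ordinary body text. The sentence is invented for this example.

    <P>Use the &lt;P&gt; tag to start a paragraph, and write
    AT&amp;T with an escaped ampersand.</P>

A browser displays this as: Use the <P> tag to start a paragraph, and
write AT&T with an escaped ampersand.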


Additional escape sequences support accented characters, such as:


&ouml;
     the escape sequence for a lowercase o with an umlaut: ö

&ntilde;
     the escape sequence for a lowercase n with a tilde: ñ

&Egrave;
     the escape sequence for an uppercase E with a grave accent: È


You can substitute other letters for the o, n, and E
shown above. Check this

online reference for a longer list of special characters.


NOTE: Unlike the rest of HTML, the escape sequences are
case sensitive. You

cannot, for instance, use &LT; instead of &lt;.
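
The sketch below puts a few accented-character sequences to work,
including two formed by substituting other letters (&uuml; and &Agrave;).
The words are chosen only for illustration.

    <P>El Ni&ntilde;o, K&ouml;ln, Z&uuml;rich, &Agrave; la carte</P>

A browser displays this as: El Niño, Köln, Zürich, À la carte. Note that
&Agrave; and &agrave; are different sequences, since case determines
whether the uppercase or lowercase letter is produced.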


                        
          Linking


The chief power of HTML comes from its ability to link
text and/or an image

to another document or section of a document. A browser highlights the

identified text or image with color and/or underlines to indicate that it is

a hypertext link (often shortened to hyperlink or link).


HTML's single hypertext-related tag is <A>, which
stands for anchor. To

include an anchor in your document:


  1. start the anchor with <A (include a space after the A)

  2. specify the document you're linking to by entering the parameter
     HREF="filename" followed by a closing right angle bracket (>)

  3. enter the text that will serve as the hypertext link in the current
     document

  4. enter the ending anchor tag: </A> (no space is needed before the end
     anchor tag)


Here is a sample hypertext reference in a file called
US.html:


    <A HREF="MaineStats.html">Maine</A>


This entry makes the word Maine the hyperlink to the
document

MaineStats.html, which is in the same directory as the first document.


Relative Pathnames Versus Absolute Pathnames


You can link to documents in other directories by
specifying the relative

path from the current document to the linked document. For example, a link

to a file NYStats.html located in the subdirectory AtlanticStates would be:


    <A HREF="AtlanticStates/NYStats.html">New York</A>


These are called relative links because you are
specifying the path to the

linked file relative to the location of the current file. You can also use

the absolute pathname (the complete URL) of the file, but relative links are

more efficient in accessing a server.


Pathnames use the standard UNIX syntax. The UNIX syntax for the parent
directory (the directory that contains the current directory) is "..".
(For more information consult a beginning UNIX reference text such as
Learning the UNIX Operating System from O'Reilly and Associates, Inc.)


If you were in the NYStats.html file and were referring
to the original

document US.html, your link would look like this:


    <A HREF="../US.html">United States</A>


In general, you should use relative links because:


  1. it's easier to move a group of documents to
another location (because

     the relative path names will still be valid)

  2. it's more efficient connecting to the server

  3. there is less to type
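
To pull the relative links from this section together, the sketch below
assumes the layout implied by the examples: US.html and MaineStats.html in
one directory, and NYStats.html in its AtlanticStates subdirectory. The
last link, from NYStats.html to MaineStats.html, does not appear in the
examples above and is added here for illustration.

    <!-- Links placed in US.html -->
    <A HREF="MaineStats.html">Maine</A>
    <A HREF="AtlanticStates/NYStats.html">New York</A>

    <!-- Links placed in AtlanticStates/NYStats.html -->
    <A HREF="../US.html">United States</A>
    <A HREF="../MaineStats.html">Maine</A>

If the whole group of files is moved together, all four links remain
valid because none of them names the directory that contains US.html.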


However, use absolute pathnames when linking to documents that are not
directly related. For example, consider a group of documents that comprise
a user manual. Links within this group should be relative links. Links to
other documents (perhaps a reference to related software) should use full
path names. This way, if you move the user manual to a different directory,
none of the links would have to be updated.
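
A link of the second kind might look like the sketch below. The URL is a
made-up placeholder on the reserved example.com domain, not a real
software page.

    <A HREF="http://www.example.com/software/index.html">Related software</A>

Because the full URL is spelled out, this link keeps working no matter
where the user manual files are moved.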

About this post

Posted: 2002-06-01
By: ArchiveBot