The Design and Construction of eBooks, by Steve Thomas

The Structure of a Book

The books are formatted using HTML and a style sheet. The HTML takes care of the structure — the division of a book into chapter, section, paragraph etc. — while the style sheet takes care of the presentation — how each structural component should look on your screen or printer.

Major Parts

Books typically consist of a number of parts, which are well described by the Chicago Manual of Style and similar guides; we follow these fairly closely. Essentially, a book is divided into three major parts, called the front matter, body and back matter, and each of these in turn consists of zero or more subordinate parts, as follows [1]:

In this collection, the above structure is coded in HTML using the DIV tag along with a “class” attribute named after the part which the text enclosed by the DIV represents. For example, a preface would be represented as:

<div class="preface">
     . . .
</div>

Note that a class attribute in HTML is most often used to refer to a style defined in a style sheet. In our case this is often not so: most of the class names mentioned above have no style associated with them. They are used to describe structure rather than presentation.

In many cases, there will be multiple occurrences of a division — for example, the great majority of novels contain multiple chapters — and these are distinguished with the “id” attribute.

<div id="chapter1" class="chapter">
     . . .
</div>
<div id="chapter2" class="chapter">
     . . .
</div>
     . . . 
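
As a rough sketch of how the id attribute lets individual divisions be addressed, the fragment below pairs a preface and two chapter divisions with a list of links. The “contents” class and the link list are assumptions made for illustration; only the “preface” and “chapter” classes and the chapter ids come from the examples above.

```html
<!-- Hypothetical contents list: each link targets a division by its id. -->
<div class="contents">
  <p><a href="#chapter1">Chapter 1</a></p>
  <p><a href="#chapter2">Chapter 2</a></p>
</div>

<div class="preface">
  <p>Text of the preface.</p>
</div>

<div id="chapter1" class="chapter">
  <p>Text of the first chapter.</p>
</div>

<div id="chapter2" class="chapter">
  <p>Text of the second chapter.</p>
</div>
```

Because an id must be unique within a document, a link such as #chapter2 resolves to exactly one division, whereas class="chapter" can be shared by all of them.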

[1] This list should be regarded as “flexible”: it is not definitive, and additional classes might be added as deemed appropriate at the time.


Headings

Headings are encoded using either h3 or h4 tags. Major headings (the heading for one of the divisions defined above, such as the heading for a preface or a chapter title) are encoded using h3. Minor headings (sub-headings or section headings within a chapter) are encoded using h4. Rarely, I’ve also used h5 as a sub-sub-heading.

So, for example, the start of chapter 1 of Richard Burton’s Pilgrimage to Al-Medinah and Meccah is encoded like this:

<div id="chapter1" class="chapter" title="CHAPTER I.">
<div class="header">
<h3>CHAPTER I.</h3>
<p>A few words concerning      . . . 

Sometimes, I’ve used h2 for a “super-heading”, where a chapter is the first of a major section, usually labelled “Book” or “Volume” or “Part”. Such headings are omitted if they were merely an artefact of the printed work, e.g. where the work was printed as two volumes and the chapter numbering was continuous across the two volumes. If numbering of chapters began again from 1 with the second volume, then I’ve retained the “Volume” header rather than renumbering chapters, in order to avoid confusion with references.
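
The heading levels described above might combine as in the following sketch. The id, the titles and the section headings are placeholders invented for illustration, and the exact nesting used in the collection may differ.

```html
<!-- h2: super-heading, used only where a chapter opens a major section -->
<div id="volume2-chapter1" class="chapter">
  <h2>Volume II</h2>
  <!-- h3: major heading, here the chapter title -->
  <h3>Chapter I.</h3>
  <p>Opening paragraph of the chapter.</p>

  <!-- h4: minor heading for a section within the chapter -->
  <h4>First Section</h4>
  <p>Text of the section.</p>

  <!-- h5: rarely used sub-sub-heading -->
  <h5>A Sub-section</h5>
  <p>Text of the sub-section.</p>
</div>
```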


Paragraphs

The most used component of any book is the paragraph, which is simply encoded with the p tag. Mostly the p tag is used without attributes, except for the first paragraph of each chapter, where I use the “dropcap” class.
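
The style actually attached to “dropcap” is not given here. One common way to produce a drop capital, offered only as an assumption about how such a class could be styled, is the ::first-letter pseudo-element:

```html
<style>
  /* Assumed rule, not the collection's actual style sheet:
     float the first letter and enlarge it so the text wraps around it. */
  p.dropcap::first-letter {
    float: left;
    font-size: 2.8em;
    line-height: 0.9;
    padding-right: 0.08em;
  }
</style>

<div id="chapter1" class="chapter">
  <h3>CHAPTER I.</h3>
  <p class="dropcap">The first paragraph of the chapter carries the class.</p>
  <p>Later paragraphs use the p tag without attributes.</p>
</div>
```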


Tables

Tables are quite straightforward, but a bit of a pain to code, because every cell has to be formatted separately. In theory, this shouldn’t be necessary, because the colgroup and col tags should make it possible to style all cells in a column together, but unfortunately the use of classes on these tags is not yet well supported. For example:

| This is the left column. The text should be justified and aligned to the top, while the next column should be right justified and aligned to the bottom, and use 30% of the table width. | 1,000 |
| Second row. | 2,000 |

So, I’ve defined a number of “atoms”, or simple classes defining single properties to use in styling table cells, which makes it a little less tedious:

.tc { text-align:center; }
.tr { text-align:right; }
.vat { vertical-align:top; }
.vab { vertical-align:bottom; }

So, for example, for a column of numbers, you might code each cell with <td class="tr vab">, which will align the text to the right and bottom of the cell.
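
Put together, the two-row example table above might be marked up as follows, with the atoms applied cell by cell. This is only a sketch: the “tj” class for justified text is a hypothetical addition in the same spirit (it is not among the atoms listed), and the column width is set with an inline style on the col tag, since width is one of the few properties that browsers do apply to columns.

```html
<style>
  .tr  { text-align:right; }
  .vat { vertical-align:top; }
  .vab { vertical-align:bottom; }
  .tj  { text-align:justify; }  /* hypothetical atom, not in the list above */
</style>

<table>
  <colgroup>
    <col>
    <col style="width:30%">
  </colgroup>
  <tr>
    <td class="tj vat">This is the left column.</td>
    <td class="tr vab">1,000</td>
  </tr>
  <tr>
    <td class="tj vat">Second row.</td>
    <td class="tr vab">2,000</td>
  </tr>
</table>
```

Every cell repeats its classes, which is the tedium described above; the atoms merely keep each repetition short.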

table { margin:1em auto; }
th { font-weight:normal; } /* override browser default */
.bt { border-top:1px solid gray!important; }
.br { border-right:1px solid gray!important; }
.bb { border-bottom:1px solid gray!important; }
.bl { border-left:1px solid gray!important; }

table.tb1 { border:1px solid gray; }
table.tb1 tr td { border:1px dotted gray; }
table.tb1 tr th { border:1px dotted gray; }

table.nb { border:none; }
table.nb tr th { border:none; }
table.nb tr td { border:none; }
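
As an illustration of how these rules might combine, the sketch below draws a bordered table with class “tb1” and uses the “bt” atom to rule off a totals row. The figures and the row labels are made up for the example.

```html
<style>
  .tr { text-align:right; }
  .bt { border-top:1px solid gray!important; }
  table { margin:1em auto; }
  th { font-weight:normal; }
  table.tb1 { border:1px solid gray; }
  table.tb1 tr th { border:1px dotted gray; }
  table.tb1 tr td { border:1px dotted gray; }
</style>

<table class="tb1">
  <tr><th>Item</th><th>Amount</th></tr>
  <tr><td>First item</td><td class="tr">1,000</td></tr>
  <tr><td>Second item</td><td class="tr">2,000</td></tr>
  <!-- bt replaces the dotted top border with a solid rule; its
       !important is what lets it win over the more specific tb1 rule -->
  <tr><td class="bt">Total</td><td class="tr bt">3,000</td></tr>
</table>
```

This also suggests why the border atoms carry !important: a single-class selector such as .bt would otherwise lose to table.tb1 tr td, which is more specific.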

Other structural components

Various other components are catered for using class attributes, which can be applied either to a single P tag or, where the component has multiple paragraphs, to a DIV tag. The most common of these is the “quote”, which is not text in quotation marks but a passage quoted within the body text and set off from it; this is essentially the same as the blockquote. Similar classes exist for verse, the parts of plays, footnotes etc.
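
For instance, the “quote” class might be applied in either of the two ways just described. The sample text is placeholder content.

```html
<!-- Single-paragraph quotation: the class goes on the p tag itself -->
<p>The author then cites an earlier traveller:</p>
<p class="quote">A short quoted passage of one paragraph.</p>

<!-- Multi-paragraph quotation: a div carries the class and wraps the paragraphs -->
<div class="quote">
  <p>First paragraph of a longer quoted passage.</p>
  <p>Second paragraph of the same passage.</p>
</div>
```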

Minimal HTML

As a deliberate policy, our web books use only a limited subset of the full range of HTML encoding. In particular, presentation tags such as B (bold), U (underline) and I (italic) are not used; tags relating to forms and frames are not used; other deprecated tags are not used. Also, although BLOCKQUOTE is used, current practice is to replace this with a <div class="quote"> block.
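
The text does not say what takes the place of the presentational tags. A plausible reading, offered as an assumption rather than a description of the collection's actual markup, is that emphasis is carried by semantic tags, while the BLOCKQUOTE replacement follows the pattern named above:

```html
<!-- Avoided: presentational and older markup -->
<!--
<p>He called it <i>remarkable</i> and <b>dangerous</b>.</p>
<blockquote><p>A quoted passage.</p></blockquote>
-->

<!-- Assumed equivalents: em and strong for emphasis, div.quote for the quotation -->
<p>He called it <em>remarkable</em> and <strong>dangerous</strong>.</p>
<div class="quote">
  <p>A quoted passage.</p>
</div>
```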

All of our web books should conform to strict HTML5.
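
A minimal page that should pass an HTML5 validator, with one structural division in the style described earlier, might look like the sketch below. The title and the style sheet name book.css are placeholders.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Title of the Book</title>
  <!-- Hypothetical style sheet name -->
  <link rel="stylesheet" href="book.css">
</head>
<body>
  <div id="chapter1" class="chapter">
    <h3>CHAPTER I.</h3>
    <p>Text of the chapter.</p>
  </div>
</body>
</html>
```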

Last updated Tuesday, January 26, 2016 at 23:27