This is not surprising once you know how these texts are created: usually, someone creates a machine-readable plain-text version from a print edition, either by scanning it and running OCR, or sometimes by simply typing it in. I have also heard of someone dictating a text using speech recognition software. Whatever the method, errors are bound to occur. (OCR typically has a 2% error rate, even with the best software, due to imperfections in the original. On a typical 200-word page, this equates to about four errors per page.)

Once a “raw” plain-text file has been created from OCR, it gets spell-checked. Spell-checking only gets you so far: “he” and “be” are both valid spellings, yet confusing “h” and “b” is one of the most common OCR errors. After spell-checking, the text is proof-read. If you’ve ever done proof-reading, you’ll know that it is a boring and thankless task. Proof-reading will find the majority of errors, but it is always possible that some will remain. This is where you, the reader, come in:

If you do find errors, either in the text or in formatting, please let us know and we’ll do our best to fix them.

