Recreating “The Effects of Nuclear Weapons” for the Web

Published 1 month, 2 weeks past

In my previous post, I wrote about a way to center elements based on their content, without forcing the element to be a specific width, while preserving the interior text alignment.  In this post, I’d like to talk about why I developed that technique.

Near the beginning of this year, fellow Web nerd and nuclear history buff Chris Griffith mentioned a project to put an entire book online: The Effects of Nuclear Weapons by Samuel Glasstone and Philip J. Dolan, specifically the third (1977) edition.  Like Chris, I own a physical copy of this book, and in fact, the information and tools therein were critical to the creation of HYDEsim, way back in the Aughts.  I acquired it while in pursuit of my degree in History, for which I studied the Cold War and the policy effects of the nuclear arms race, from the first bombers to the Strategic Defense Initiative.

I was immediately intrigued by the idea and volunteered my technical services, which Chris accepted.  So we started taking the OCR output of a PDF scan of the book, cleaning up the myriad errors, re-typing the bits the OCR mangled too badly to just clean up, structuring it all with HTML, converting figures to PNGs and photos to JPGs, and styling the whole thing for publication, working after hours and in odd down times to bring this historical document to the Web in a widely accessible form.  The result of all that work is now online.

That linked page is the best example of the technique I wrote about in the aforementioned previous post: as a Table of Contents, none of the lines actually get long enough to wrap.  Rather than figuring out the exact length of the longest line and centering based on that, I just let CSS do the work for me.
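For anyone who skipped that post, the idea can be sketched roughly like this. It's an approximation, not a copy of the site's actual stylesheet, and the `.toc` class name is a placeholder I made up for illustration.

```css
/* Hypothetical selector: .toc stands in for whatever wraps the Table of Contents. */
.toc {
	/* Let the box be no wider than its longest unwrapped line... */
	max-inline-size: max-content;
	/* ...then center the box itself; the text inside keeps its own alignment. */
	margin-inline: auto;
}
```

Since none of the lines wrap, the box shrinks to the longest entry and the auto margins split the leftover space evenly.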

There were a number of other things I invented (probably re-invented) as we progressed.  Footnotes appear at the bottom of pages when the footnote number is activated through the use of the :target pseudo-class and some fixed positioning.  It’s not completely where I wanted it to be, but I think the rest will require JS to pull off, and my aim was to keep the scripting to an absolute minimum.
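A minimal sketch of that footnote pattern might look like the following. The `.footnote` class, the `#fn1` ID, and the cosmetic values are all assumptions for illustration, not the project's real markup or styles.

```css
/* Assumed markup (hypothetical, not taken from the book's source):
   <a href="#fn1">1</a> … <aside class="footnote" id="fn1">Note text.</aside> */

/* Only the footnote whose ID matches the URL fragment gets pinned;
   every other footnote stays wherever it sits in the document flow. */
.footnote:target {
	position: fixed;
	inset-block-end: 0;   /* stick to the bottom of the viewport */
	inset-inline: 0;      /* stretch across the full width */
	padding: 1em;
	background: white;
	border-block-start: 1px solid;
}
```

Activating the footnote number changes the URL fragment, which is all `:target` needs; no scripting is involved.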

LaTeX and MathJax made writing and rendering this sort of thing very easy.

I couldn’t keep the scripting to zero, because we decided early on to use MathJax for the many formulas and other mathematical expressions found throughout the text.  I’d never written LaTeX before, and was very quickly impressed by how compact and yet powerful the syntax is.

Over time, I do hope to replace the MathJax-parsed LaTeX with raw MathML for both accessibility and project-weight reasons, but as of this writing, Chromium lacks even halfway-decent MathML support, so we went with the more widely-supported solution.  (My colleague Frédéric Wang at Igalia is pushing hard to fix this sorry state of affairs in Chromium, so I do have hopes for a migration to MathML… some day.)

The figures (as distinct from the photos) throughout the text presented an interesting challenge.  To look at them, you’d think SVG would be the ideal image format. Had they come as vector images, I’d agree, but they’re raster scans.  I tried recreating one or two in hand-crafted SVG and quickly determined the effort to create each was significant, and really only worked for the figures that weren’t charts, graphs, or other presentations of data.  For anything that was a chart or graph, the risk of introducing inaccuracies was too high, and again, each would have required an inordinate amount of effort to get even close to correct.  That’s particularly true considering that without knowing what font face was being used for the text labels in the figures, they’d have to be recreated with paths or polygons or whatever, driving the cost-to-recreate astronomically higher.

So I made the figures PNGs that are mostly transparent, except for the places where there was ink on the paper.  After any necessary straightening and some imperfection cleanup in Acorn, I then ran the PNGs through the color-index optimization process I wrote about back in 2020, which got them down to an average of 75 kilobytes each, ranging from 443KB down to 7KB.

At the 11th hour, still secretly hoping for a magic win, I ran them all through to see if we could get automated savings.  Of the 161 figures, exactly eight of them were made smaller, which is not a huge surprise, given the source material.  So, I saved those eight for possible future updates and plowed ahead with the optimized PNGs.  Will I return to this again in the future?  Probably.  It bugs me that the figures could be better, and yet aren’t.

It also bugs me that we didn’t get all of the figures and photos fully described in alt text.  I did write up alternative text for the figures in Chapter I, and a few of the photos have semi-decent captions, but this was something we didn’t see all the way through, and like I say, that bugs me.  If it also bugs you, please feel free to fork the repository and submit a pull request with good alt text.  Or, if you prefer, you could open an issue and include your suggested alt text that way.  By the image, by the section, by the chapter: whatever you can contribute would be appreciated.

Those image captions, by the way?  In the printed text, they’re laid out as a label (e.g., “Figure 1.02”) and then the caption text follows.  But when the text wraps, it doesn’t wrap below the label.  Instead, it wraps in its own self-contained block, with the text fully justified except for the last line, which is centered.  Centered!  So I set up the markup and CSS like this:

<figure>
	<img src="…" alt="…" loading="lazy">
	<figcaption>
		<span>Figure 1.02.</span> <span>Effects of a nuclear explosion.</span>
	</figcaption>
</figure>

figure figcaption {
	display: grid;
	grid-template-columns: max-content auto;
	gap: 0.75em;
	justify-content: center;
	text-align: justify;
	text-align-last: center;
}

Oh CSS Grid, how I adore thee.  And you too, CSS box alignment.  You made this little bit of historical recreation so easy, it felt like cheating.

Look at the way it’s all supposed to line up on the ± and one number doesn’t even have a ± and that decimal is just hanging out there in space like it’s no big deal.  LOOK AT IT.

Some other things weren’t easy.  The data tables, for example, have a tendency to align columns on the decimal place, even when most but not all of the numbers are integers.  Long, long ago, it was proposed that text-align be allowed a string value, something like text-align: '.', which you could then apply to a table column and have everything line up on that character.  For a variety of reasons, this was never implemented, a fact which frosts my windows to this day.  In general, I mean, though particularly so for this project.  The lack of it made keeping the presentation historically accurate a right pain, one I may get around to writing about, if I ever overcome my shame.  [Editor’s note: he overcame that shame.]
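To make the problem concrete, here is one common workaround, sketched under loud assumptions: it is not necessarily what we did for the book, and the markup and class names (`num`, `int`, `frac`) are invented for the example.  The trick is to split each number at the decimal point and give both halves a fixed width.

```css
/* Assumed markup, with no whitespace between the two spans:
   <td class="num"><span class="int">12</span><span class="frac">.5</span></td>
   <td class="num"><span class="int">7</span><span class="frac"></span></td> */
td.num {
	text-align: center;                 /* center the pair within the cell */
	white-space: nowrap;
	font-variant-numeric: tabular-nums; /* equal-width digits, where the font supports them */
}
td.num span {
	display: inline-block;              /* lets each half take a fixed width */
}
td.num .int {
	inline-size: 4ch;                   /* assumed: room for the longest integer part */
	text-align: right;                  /* digits butt up against the decimal point */
}
td.num .frac {
	inline-size: 3ch;                   /* assumed: room for the point plus the longest fraction */
	text-align: left;
}
```

Because both halves have fixed widths, the decimal point lands at the same horizontal position in every row, even in rows with no fraction at all.  The cost is extra markup in every single cell, which is exactly the sort of thing a string value for text-align would have spared us.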

There are two things about the book that we deliberately chose not to faithfully recreate.  The first is the font face.  My best guess is that the book was typeset using something from the Century family, possibly Century Schoolbook (the New version of which was a particular favorite of mine in college).  The very-widely-installed Cambria seems fairly similar, at least to my admittedly untrained eye, and furthermore was designed specifically for screen media, so I went with body text styling no more complicated than this:

body {
	font: 1em/1.35 Cambria, Times, serif;
	hyphens: auto;
}

I suppose I could have tracked down a free version of Century and used it as a custom font, but I couldn’t justify the performance cost in both download and rendering speed to myself and any future readers.  And the result really did seem close enough to the original to accept.

The second thing we didn’t recreate is the printed-page layout, which is two-column.  That sort of layout can work very well on the book page; it almost always stinks on a Web page.  Thus, the content of the book is rendered online in a single column.  The exceptions are the chapter-ending Bibliography sections and the book’s Index, both of which contain content compact and granular enough that we could get away with the original layout.
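For the sections that do keep two columns, CSS multi-column layout is the obvious tool.  Here's a rough sketch, assuming hypothetical `.bibliography` and `.index` class names and list-item entries rather than the project's actual selectors:

```css
/* Hypothetical class names for the chapter-ending Bibliography and the Index. */
.bibliography,
.index {
	columns: 2;        /* flow the content into two columns */
	column-gap: 2em;   /* assumed gutter width */
}

/* Keep each entry from being split across the column break. */
.bibliography li,
.index li {
	break-inside: avoid;
}
```

Short, granular entries are what make this tolerable on screen; long running prose in columns forces readers to scroll down and back up again.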

There’s a lot more I could say about how this style or that pattern came about, and maybe someday I will, but for now let me leave you with this: all these decisions are subject to change, and open to input.  If you come up with a superior markup scheme for any of the bits of the book, we’re happy to look at pull requests or issues, and to act on them.  It is, as we say in our preface to the online edition, a living project.

We also hope that, by laying bare the grim reality of these horrific weapons, we can contribute in some small way to making them a dead and buried technology.
