meyerweb.com

Archive: 'XSLT' Category

Migration Patterns

I’ve fielded a few questions about my experience migrating from Movable Type to WordPress, so I thought I’d address that subject for anyone else who might be interested.  I didn’t migrate from Movable Type.  I’ve never run Movable Type.  Okay?  That’s not saying anything for or against MT.  I’ve just never used it.

What I was using before setting up WordPress was a completely hand-built system where I authored entries in an XML format of my own devising, in which every entry for a given period (say, all of 2004) was sitting in the same file.  Once I wrote a new entry, I’d pour the XML file through a set of XSLT scripts to generate the latest posts, monthly archive pages, and RSS feeds.  This was accomplished with some dirt-simple shell scripts I’d put together.  Here’s what the main script looked like:

#!/bin/bash
MONTH="$(date +%Y%m)"
echo $MONTH
xsltproc -o latest.html xslt/latest.xsl archive.xml
xsltproc -o rss20.xml xslt/rss20.xsl archive.xml
xsltproc -o rss091.xml xslt/rss091.xsl archive.xml
xsltproc -o $MONTH.html -stringparam chunk $MONTH xslt/chunker.xsl archive.xml

Anyone familiar with xsltproc will see what I’m doing at a glance.  For the rest of you, here’s a quick explanation.  The first xsltproc... line runs xsltproc using the script at xslt/latest.xsl (relative to the shell script) against the file archive.xml, writing the result to the file latest.html.  That’s it.  Nothing very fancy, but it worked well enough when I created the system.
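
For anyone who finds a table easier to read than four shell lines, here is a rough Python sketch of the same invocations expressed as argument lists.  The function name build_commands is made up for illustration; the file names and the chunk parameter come straight from the script above.  It only builds the commands and does not run xsltproc.

```python
from datetime import date


def build_commands(today, archive="archive.xml"):
    """Return the xsltproc argument lists that mirror the shell script."""
    month = today.strftime("%Y%m")  # same value as $(date +%Y%m)
    return [
        # -o OUTPUT STYLESHEET SOURCE: transform SOURCE with STYLESHEET into OUTPUT
        ["xsltproc", "-o", "latest.html", "xslt/latest.xsl", archive],
        ["xsltproc", "-o", "rss20.xml", "xslt/rss20.xsl", archive],
        ["xsltproc", "-o", "rss091.xml", "xslt/rss091.xsl", archive],
        # The last call also hands the stylesheet a string parameter named
        # "chunk" holding the current year and month, e.g. "200403".
        ["xsltproc", "-o", f"{month}.html", "-stringparam", "chunk", month,
         "xslt/chunker.xsl", archive],
    ]
```

Calling build_commands(date(2004, 3, 15)) yields four lists, the last of which writes 200403.html.  Each list could be handed to subprocess.run if xsltproc is installed.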

Why did I abandon my lovingly crafted system for an installable package?  A combination of factors, none of which would have been enough on its own.

  • The monthly archives were starting to get too heavy.  For example, the archive page for March 2004 is 88KB, and that doesn’t count any style sheets, images, or other external resources that would have to be loaded on top of that.  Back in the day, a month’s worth of posts would be maybe 15KB of HTML and text.  Heck, all of my posts from 1999 and 2000 total a whopping 32KB of HTML source.  My total posting from December 1999 through 2001 is about the same amount of data as the posts for March 2004.  So I needed more flexibility in terms of post archiving, which meant things like per-post archives, which I didn’t really want to have to try to support via XSLT.
  • I wanted to offer post commenting from time to time, and also have a system that managed pingbacks and trackbacks.  I had very little interest in figuring out how to implement my own commenting system, so weblogging software was my best choice.
  • An ability to search through the post archives was something I wanted to add, but for some reason I don’t want to do it through Google.  I’m still not sure why… I just don’t.

So why did I migrate to WordPress, specifically?  After looking at a number of packages, I decided that WordPress just fit me the best.  I still don’t like having to go through a Web interface to write posts, as I’ve gotten very used to authoring in BBEdit, but that was going to be a hurdle no matter what.  (And I do often write up a post in BBEdit before simply pasting it into the Web interface.)  Here are some specific reasons:

  • By default, WordPress generates valid XHTML files.  Thus, the process of adapting its code to generate the valid HTML I wanted was a lot less painful than it would have been with a package that doesn’t generate valid markup by default.
  • Similarly, WordPress is set up to handle site presentation via CSS, so it was a trivial matter for me to replace their default styles with my own.
  • I like that WordPress is not only open source, but users are encouraged to hack on it and share their hacks, which I’ve already started to do.  This was enough like a hand-built system to make me happy.  I think of it as a stock car that I can tune and tinker with to my heart’s content.
  • The new “Import via RSS” feature in WordPress 1.2 made sucking all of my back posts into the system really, really easy.  I just had to create a full-content RSS file containing every post I’d ever written—pretty easy, given that I had them all stored in XML—and then point the RSS importer at the file.  Well, two files, actually, but it still made the whole process very smooth.  It read the publication dates, categories, and everything else of note in a matter of milliseconds.  In fact, it was so easy I felt no regret about blowing away my test-site import and doing it again for the public site.
  • It certainly didn’t hurt that one of the primary forces behind WordPress is Matt Mullenweg, a fellow GMPG founder, so I knew that if I really got stuck I could ask him for help.

So that’s why I switched away from my home-brewed system and onto WordPress.  So far, aside from the occasional bouts of swearing at obscure MySQL and PHP syntax, neither of which has been anywhere near as migraine-inducing as XSLT syntax was, I’ve had no significant reason to regret the change.

Now you know… and knowing is half the battle.

Sign, Sign, Everywhere A Sign

So I’m on the book-signing schedule at SXSW04 as part of a five-person signature cage match that will last until only one person is left standing!  Er, or something.  Actually, I assume they’re going to kick us out of there by 1:15pm to clear enough space for all of Cory Doctorow’s screaming fans.  But hey, if you have a book you want to have signed by any of us, bring it along.  I imagine you’ll also be able to buy Eric Meyer on CSS at the Borders booth where the signing will be held, as well as any of the other books listed.  This signing comes just fifteen minutes after the panel in which I’m participating, so it looks like I’ll have to dash from one to the other.

When they asked me if I was game for a book signing, I did recommend that they get copies of Eric Meyer on CSS because it seemed beyond scummy to have them stock up the first edition of Cascading Style Sheets: The Definitive Guide when the second edition will be coming out within a week or two of the signing.  Hey, I’m lookin’ out for ya.  Now all I have to do is think up some witty phrases to inscribe.

(If you’re in the Austin area but aren’t going to be attending SXSW04, you can still drop by and heckle us for free by getting an iF! pass.)

In the past three weeks, I’ve tried to hack (with varying levels of success) XSLT, Perl, and JavaScript.  Since I’m no better than a middling-fair programmer in any of those languages, I suppose some confusion was inevitable, but it seems like it’s always XSLT that gets me.  Thankfully, Chriztian Steinmeier provided a solution for my XSLT problem.  The way that templates get called and nest and interact with each other continues to befuddle me, but I hope that it will one day make a modicum of sense.

Hack and Slash

Back in January, I reacted to Peter Nederlof’s whatever:hover by musing that it would be nice to see behaviors used to extend IE in other useful ways, like adding generated content support and so on.  Dean Edwards, regardless of whether he saw what I had to say, is doing that and more with his in-progress IE7 behavior suite.  Can you say, “will add support for attribute selectors, multiclass selectors, and adjacent-sibling selectors to IE/Win?”  Oh yeah… I thought so.  And that’s just the beginning.  He has generated content on his list of things that will be supported, and a whole lot more besides.  The behavior is currently alpha, but it’s everything I could possibly have hoped for and more.  I’m going to be keeping a close eye on Dean’s work, and will be putting it to use as it moves out of beta.

In a similar vein, Dean also created an XBL binding to let Gecko-based browsers use Microsoft behaviors.  I think he just might be a genius.  Thankfully, he’s using his powers for good instead of evil.

Hopefully, I can get one of you XSLT gurus to do the same on my behalf.  I have a problem that’s proven beyond my ability to grasp.  Basically, I have a list of events that include start and end dates; here’s the basic markup that drives it.  I can get a list of upcoming events, no problem; I just pass in information about the current date when I run the XSLT and do comparisons.  What I need is a list of recent events, where “recent” is defined as occurring within the past three months, even if those months straddle a New Year.  I also want to get at least the most recent event even if it didn’t happen within the last three months.  And, of course, I want any results sorted in reverse chronological order.  I cannot figure out how to do all that in XSLT.  Any pointers or takers for this one?  I could really use some help.

(Yeah, yeah, doing it with some database or other would be a snap.  I’m trying to do it in XSLT.  Think of it as a creative design constraint.)
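
To pin down the selection rule itself, here is a minimal sketch in Python rather than XSLT, so it does not satisfy the constraint.  The Event shape and the names stamp and recent_events are assumptions made for illustration, since the real event markup isn’t reproduced here.  It also assumes that an event counts as past once its end date is before today.  The trick is to count months on one continuous scale, so a window that straddles a New Year needs no special case.

```python
from datetime import date
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical event record; the real markup may differ."""
    name: str
    start: date
    end: date


def stamp(d):
    """Map a date to one sortable number: months since year 0, then the day."""
    return (d.year * 12 + d.month) * 100 + d.day


def recent_events(events, today, window_months=3):
    """Past events from the last few months, newest first.

    Falls back to the single most recent past event when the window is empty.
    """
    past = sorted((e for e in events if e.end < today),
                  key=lambda e: e.end, reverse=True)
    # Subtracting 100 per month moves the cutoff back whole calendar months,
    # even across a year boundary.
    cutoff = stamp(today) - window_months * 100
    recent = [e for e in past if stamp(e.end) >= cutoff]
    return recent or past[:1]
```

Because stamp is plain arithmetic on year, month, and day, the same comparison should carry over to XPath 1.0 number expressions once the date parts are available, though that part is untested here.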

On a totally different note, here’s an interesting pair of articles from SF Gate: Gay marriage momentum stuns both backers and foes and Where Is My Gay Apocalypse?  Thanks to Jeff Veen and Simon Willison for the pointers.

Entity Errors

Okay, XSLT folks, here’s one for you.  I had the following fragment in my journal recipes, with some whitespace to make everything more readable:

<xsl:element name="a" use-attribute-sets="plink">
  <![CDATA[&para;]]>
</xsl:element>

The goal was to get output something like this:

<a ... >&para;</a>

Instead, what I got was this:

<a ... >&amp;para;</a>

So how do I reach my goal?  And please don’t tell me that I should just generate the character directly.  It isn’t the result I want, even though I’m doing it now as a stopgap measure.  All I want to know is how to get what I want, if for no other reason than it will clarify more about XSLT, XML, and how they handle CDATA and entities.  Several Google searches turned up nothing useful, but it may very well be that I wasn’t using the right search terms.

Update: Curtis Pew pointed me to an XML.com article that led to the answer.  It is:

<xsl:element name="a" use-attribute-sets="plink">
  <xsl:text disable-output-escaping="yes">&amp;para;</xsl:text>
</xsl:element>

The article talks about defining a custom entity, which I tried but failed to do, even though I’ve done it successfully in the past for less ambitious purposes.  xsltproc kept complaining that the namespace prefix xsl wasn’t defined, even though everything else in the recipe didn’t seem to have that problem.  Moving the xsl:text into the recipe itself was the answer.  Thanks to Curtis for the lead!
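
As for why the CDATA version came out escaped: a CDATA section makes the ampersand an ordinary character, and an XML serializer then has to escape it on the way out.  The small Python sketch below shows that default behavior with the standard library’s ElementTree.  It is only an illustration of escaping, not of XSLT, and ElementTree has no counterpart to disable-output-escaping, which tells an XSLT serializer to skip this escaping step.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "&para;" is six ordinary characters, not an entity reference.
link = ET.fromstring('<a><![CDATA[&para;]]></a>')
cdata_text = link.text

# Writing the element back out escapes the ampersand so the result stays
# well-formed, which is where "&amp;para;" comes from.
serialized = ET.tostring(link, encoding="unicode")

# Typing &amp;para; directly in the source yields the very same text node.
escaped_text = ET.fromstring('<a>&amp;para;</a>').text
```

Here cdata_text and escaped_text are the same string, and serialized contains the escaped form seen in the unwanted output.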

Reduced to Efficiency

I’ve been trying to catch up on e-mail.  Astonishingly, after only a couple of days of sustained effort, I’ve managed to get to the point where I’m only two weeks behind on my Inbox!  This is to a large degree because I’ve been sending out terse responses, for the most part, and pointing people to css-discuss in case they need more help.  Out of the 300 – 400 messages that arrive every day, once I strip away the listserv traffic and ditch the spam, I’m generally left with anywhere from three to twenty pieces of mail sitting in my Inbox.  The average is somewhere just below ten.

So if you’re thinking about asking me for help with understanding CSS, my best advice is to go join css-discuss.  As much as I love to help people out, you’re more likely to get much quicker and more complete answers from the community than from me, especially this summer, which is shaping up to be one of the busiest of my life.

As if in answer to all of my past grumblings about XSLT being all icky and bloated and clumsy, Simon brings word of the Parsimonious XML Shorthand Language (or PXSL, pronounced “pixel”).  This language basically turns XML syntax inside out, introduces indentation sensitivity, and ends up with a smaller and much less cluttered language.  Consider this example, which I nicked straight from the PXSL documentation:

<xsl:template match="/">
  <xsl:for-each select="//*/@src|//*/@href">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

Here’s the PXSL version:

template /
  for-each //*/@src|//*/@href
    value-of .
    text <<&#10;>>

Okay, maybe not as compact as I would like, but it’s still a lot better than the XSLT version.  True, it still has to use XPath, so the line-noise quotient isn’t as close to zero as it should be, and it won’t do anything about the template-nesting rules XSLT imposes for no apparent reason.  It is also true that PXSL is dependent on indentation, and that never makes me happy, being a veteran of BASIC (where there was no indentation), PASCAL (where it didn’t matter), and HTML (ditto).  If I were programming Python already, I probably wouldn’t bat an eye, but guess why I’m not programming in Python?

The fascinating part to me is that, if you dig far enough into the document (which isn’t actually all that long), PXSL was originally designed “to reduce the verbosity of XSLT stylesheets.”  Ay-men, brother!  I do have to wonder about its whitespace handling, though.  Fortunately, when I’m ready to learn more I can find out about it via the PXSL Community site, which employs both XHTML and CSS for layout, including a styled unordered list to set up the navigation.  Most excellent.

What a great way to start a week!

Transformed Transforms

Thanks to the power of the Internet, I am now less annoyed at XSLT.  Chriztian Steinmeier wrote to suggest I try xml:space, something I hadn’t previously come across.  So the template now looks like this:

<xsl:template match="/archive" xml:space="preserve">
<div id="thoughts">
<h3>
<span>Thoughts From Eric</span>
</h3>
<xsl:apply-templates select="//entry" />
</div>
</xsl:template>

Ah, much better!  On the other hand, I discovered when I applied xml:space to another portion of my XSLT, it broke an xsl:choose structure.  I had to split one template up into three to sneak around that particular limitation, which some would say is a strength of the technology, since it forced me to further modularize my template.

If I’d been sufficiently determined to avoid splitting up that particular template, I could have used an idea sent in by Hugo Lopes.  He wrote to suggest that I could use custom-defined entities, like so:

<!DOCTYPE stylesheet [
<!ENTITY sp "<xsl:text> </xsl:text>">
<!ENTITY cr "<xsl:text>
</xsl:text>">
]>
<xsl:template match="/archive">
<div id="thoughts">&cr;
<h3>&sp;<span>Thoughts From Eric</span>&sp;</h3>&cr;
<xsl:apply-templates select="//entry" />&cr;
</div>
</xsl:template>

It’s a lot less ugly than what I had yesterday, I’ll agree, but in this particular situation xml:space is a better route for me to take.  Still, it’s an interesting solution to the problem, and a technique I’ll definitely keep in mind for future XSLT projects.  Thanks to Hugo and Chriztian for the help!

XSLTorture

Three days ago, in the process of finishing up the transition to the new design(s), I discovered a new reason to dislike XSLT—and it’s not exactly like I was lacking for reasons to do so before that.  So what aroused my ire this time around?  Whitespace.  Suppose you have the following XSLT template:

<xsl:template match="/archive">
<div id="thoughts">
<h3>
<span>Thoughts From Eric</span>
</h3>
<xsl:apply-templates select="//entry" />
</div>
</xsl:template>

Further suppose you want to preserve those whitespace returns in and around the elements, in order to keep the resulting HTML clean and readable without being forced to indent all the elements.  You can very easily force source indentation with xsl:output, but if you indent elements then you indent everything, including inline elements, and that drives me crazy.  In addition, having the whitespace avoids strange layout bugs in certain Web browsers that shall remain nameless, but are not IE/Win, surprisingly enough.

There is, so far as I could discover, only one way to ensure that whitespace is preserved in the HTML output.  It isn’t xsl:preserve-space, which will only preserve the whitespace found in the source XML.  No, apparently the answer is to use xsl:text as follows:

<xsl:template match="/archive">
<div id="thoughts">
<xsl:text>
</xsl:text>
<h3>
<xsl:text>
</xsl:text>
<span>Thoughts From Eric</span>
<xsl:text>
</xsl:text>
</h3>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="//entry" />
<xsl:text>
</xsl:text>
</div>
</xsl:template>

God, that’s ugly.  Really ugly.  Much uglier than XSLT’s inherently verbose clumsiness in template handling, which is what made me dislike it in the first place.  If there’s a better way to do what I did above, someone please let me know so I can share it with everyone else and feel a little less annoyed.  Thanks.

Don’t misunderstand: I really like what XSLT can do.  It’s just the syntax and what I find to be thoroughly weird limitations (such as the above) that I abhor.
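
The limitation comes from XSLT 1.0’s rule for whitespace in stylesheets: a text node made up only of whitespace is dropped unless its parent is xsl:text or the nearest enclosing xml:space attribute says "preserve".  Here is a rough Python emulation of that rule on a parsed stylesheet, using the standard library’s ElementTree.  The function name strip_stylesheet_whitespace is made up, and this is a sketch of the rule as I understand it, not how any XSLT processor is implemented.

```python
import xml.etree.ElementTree as ET

XSL_TEXT = "{http://www.w3.org/1999/XSL/Transform}text"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WS = " \t\r\n"  # the four characters XML treats as whitespace


def strip_stylesheet_whitespace(elem, preserve=False):
    """Drop whitespace-only text nodes roughly the way XSLT 1.0 does."""
    setting = elem.get(XML_SPACE)
    if setting is not None:
        preserve = setting == "preserve"  # nearest xml:space wins
    keep_own_text = preserve or elem.tag == XSL_TEXT
    if not keep_own_text and elem.text is not None and not elem.text.strip(XML_WS):
        elem.text = None
    for child in elem:
        strip_stylesheet_whitespace(child, preserve)
        # A child's tail is text that belongs to this element, so this
        # element's setting decides whether it survives.
        if not preserve and child.tail is not None and not child.tail.strip(XML_WS):
            child.tail = None
```

Run against the first template above, every line break between elements disappears.  Run against the xsl:text version, the line breaks inside each xsl:text survive, which is why that ugly template produces the output I wanted.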
