meyerweb.com

Skip to: site navigation/presentation
Skip to: Thoughts From Eric

Archive: 'Tools' Category

Better PDF File Size Reduction in OS X

One of the things you discover as a speaker and, especially, a conference organizer is this:  Keynote generates really frickin’ enormous PDFs.  Seriously.  Much like Miles O’Keefe, they’re huge.  We had one speaker last year whose lovingly crafted and beautifully designed 151-slide deck resulted in a 175MB PDF.

Now, hard drives and bandwidth may be cheap, but when you have four hundred plus attendees all trying to download the same 175MB PDF at the same time, the venue’s conference manager will drop by to find out what the bleeding eyestalks your attendees are doing and why it’s taking down the entire outbound pipe.  Not to mention the network will grind to a nearly complete halt.  Whatever you personally may think of net access at conferences, at this point, not providing net access is roughly akin to not providing functioning bathrooms.

So what’s the answer?  ShrinkIt is fine if the slides use lots of vectors and you’re running Snow Leopard.  If the slides use lots of bitmapped images, or you’re not on Snow Leopard, ShrinkIt can’t help you.

If the slides are image-heavy, then you can always load the PDF into Preview and then do a “Save As…” where you select the “Reduce File Size” Quartz filter.  That will indeed drastically shrink the file size—that 175MB PDF goes down to 13MB—but it can also make the slides look thoroughly awful.  That’s because the filter achieves its file size reduction by scaling all the images down by at least 50% and to no more than 512 pixels on a side, plus it uses aggressive JPEG compression.  So not only are the images infested with compression artifacts, they also tend to get that lovely up-scaling blur.  Bleah.

I Googled around a bit and found “Quality reduced file size in Mac OS X Preview” from early 2006.  There I discovered that anyone can create their own Quartz filters, which was the key I needed.  Thus armed with knowledge, I set about creating a filter that struck, in my estimation, a reasonable balance between image quality and file size reduction.  And I think I’ve found it.  That 175MB PDF gets taken down to 34MB with what I created.

If you’d like to experience this size reduction for yourself (and how’s that for an inversion of common spam tropes?) it’s pretty simple:

  1. Download and unzip Reduce File Size (75%).  Note that the “75%” relates to settings in the filter, not the amount of reduction you’ll get by using it.
  2. Drop the unzipped .qfilter file into ~/Library/Filters in Leopard/Snow Leopard or /Library/PDF Services in Lion.  (Apparently no ~ in Lion.)

Done.  The next time you need to reduce the size of a PDF, load it up in Preview, choose “Save As…”, and save it using the Quartz filter you just installed.

If you’re the hands-on type who’d rather set things up yourself, or you’re a paranoid type who doesn’t trust downloading zipped files from sites you don’t control (and I actually don’t blame you if you are), then you can manually create your own filter like so:

  1. Go to /Applications/Utilities and launch ColorSync Utility.
  2. Select the “Filters” icon in the application’s toolbar.
  3. Find the “Reduce File Size” filter and click on the little downward-arrow-in-gray-circle icon to the right.
  4. Choose “Duplicate Filter” in the menu.
  5. Use the twisty arrow to open the duplicated filter, then open each of “Image Sampling” and “Image Compression”.
  6. Under “Image Sampling”, set “Scale” to 75% and “Max” to 1280.
  7. Under “Image Compression”, move the arrow so it’s halfway between the rightmost marks.  You’ll have to eyeball it (unless you bust out xScope or a similar tool) but you should be able to get it fairly close to the halfway point.
  8. Rename the filter to whatever will help you remember its purpose.

As you can see from the values, the “75%” part of the filter’s name comes from the fact that two of the filter’s values are 75%.  In the original Reduce File Size filter, both are at 50%.  The maximum size of images in my version is also quite a bit bigger than the original’s—1280 versus 512—which means that the file size reductions won’t be the same as the original.

Of course, you now have the knowledge needed to fiddle with the filter to create your own optimal balance of quality and compression, whether you downloaded and installed the zip or set it up manually—either way, ColorSync Utility has what you need.  If anyone comes up with an even better combination of values, I’d love to hear about it in the comments.  In the meantime, share and enjoy!

Translations

Update 2 Aug 11: apparently there have been changes in Lion—here’s an Apple forum discussion of the problem.  There are two workarounds described in the thread: either to open and save files with ColorSync Utility itself, or to copy the filter to another folder in your Library (or install it there in the first place, above).

Update 27 Mar 12: edited the Lion install directory to remove an errant ~ .  Thanks to Brian Christiansen for catching the error!

Inspector Scrutiny

It’s been said before that web inspectors—Firebug, Dragonfly, the inspectors in Safari and Chrome, and so forth—are not always entirely accurate.  A less charitable characterization is that they lie to us, but that’s not exactly right.  The real truth is that web inspectors repeat to us the lies they are told, which are the same lies we can be told to our faces if we ask directly.

Here’s how I know this to be so:

body {font-size: medium;}

Just that.  Apply it to a test page.  Inspect the body element in any web inspector you care to fire up.  Have it tell you the computed styles for the body element.  Assuming you haven’t changed your browser’s font sizing preferences, the reported value will be 16px.

You might say that that makes sense, since an unaltered browser equates medium with “16”.  But as we saw in “Fixed Monospace Sizing“, the 16px value is not what is inherited by child elements.  What is inherited is medium, but web inspectors will never show you that as a computed style.  You can see it in the list of declared styles, which so far as I can tell lists “specific values” (as per section 6.1 of CSS2.1).  When you look to see what’s actually applied to the element in the “Computed Styles” view, you are being misled.

We can’t totally blame the inspectors, because what they list as computed styles is what they are given by the browser.  The inspectors take what the browser returns and prettify it for us, and give us ways to easily alter those values on the fly, but in the end they’re just DOM inspectors.  They don’t have a special line into the browser’s internal data.  Everything they report comes straight from the same DOM that any of us can query.  If you invoke:

var obj = document.getElementsByTagName('body')[0];
alert(getComputedStyle(obj,null).getPropertyValue('font-size'));

…on a document being given the rule I mentioned above, you will get back 16px, not medium.

This fact of inspector life was also demonstrated in “Rounding Off“.  As we saw there, browsers whose inspectors report integer pixel values also return them when queried directly from the DOM.  This despite the fact that it can be conclusively shown that those same browsers are internally storing non-integer values.

Yes, it might be possible for an inspector to do its own analysis of properties like font-size by checking the element’s specified values (which it knows) and then crawling up the document tree to do the same to all of the element’s ancestors to try to figure out a more accurate computed style.  But what bothers me is that the browser reported computed values that simply aren’t accurate in the first place.  it seems to me that they’re really “actual values”, not “computed values”, again in the sense of CSS2.1:6.1.  This makes getComputedStyle() fairly misleading as a method name; it should really be getActualStyle().

No, I don’t expect the DOM or browsers to change this, which is why it’s all the more important for us to keep these facts in mind.  Web inspectors are very powerful, useful, and convenient DOM viewers and editors, essentially souped-up interfaces to what we could collect ourselves with JavaScript.  They are thus limited by what they can get the browser to report to them.  There are steps they might take to compensate for known limitations, but that requires them to second-guess both what the browser does now and what it might do in the future.

The point, if I may be so bold, is this:  never place all your trust in what a web inspector tells you.  There may be things it cannot tell you because it does not know them, and thus what it does tell you may on occasion mislead or confuse you.  Be wary of what you are told—because even though all of it is correct, not quite all of it is true, and those are always the lies that are easiest to believe.

Pseudo-Phantoms

In the course of a recent debugging session, I discovered a limitation of web inspectors (Firebug, Dragonfly, Safari’s Web Inspector, et al.) that I hadn’t quite grasped before: they don’t show pseudo-elements and they’re not so great with pseudo-classes.  There’s one semi-exception to this rule, which is Internet Explorer 8’s built-in Developer Tool.  It shows pseudo-elements just fine.

Here’s an example of what I’m talking about:

p::after {content: " -\2761-"; font-size: smaller;}

Drop that style into any document that has paragraphs.  Load it up in your favorite development browser.  Now inspect a paragraph.  You will not see the generated content in the DOM view, and you won’t see the pseudo-element rule in the Styles tab (except in IE, where you get the latter, though not the former).

The problem isn’t that I used an escaped Unicode reference; take that out and you’ll still see the same results, as on the test page I threw together.  It isn’t the double-colon syntax, either, which all modern browsers handle just fine; and anyway, I can take it back to a single colon and still see the same results.  ::first-letter, ::first-line, ::before, and ::after are all basically invisible in most inspectors.

This can be a problem when developing, especially in cases such as having a forgotten, runaway generated-content clearfix making hash of the layout.  No matter how many times you inspect the elements that are behaving strangely, you aren’t going to see anything in the inspector that tells you why the weirdness is happening.

The same is largely true for dynamic pseudo-classes.  If you style all five link states, only two will show up in most inspectors—either :link or :visited, depending on whether you’ve visited the link’s target; and :focus.  (You can sometimes also get :hover in Dragonfly, though I’ve not been able to do so reliably.  IE8’s Developer Tool always shows a:link even when the link is visited, and doesn’t appear to show any other link states.  …yes, this is getting complicated.)

The more static pseudo-classes, like :first-child, do show up pretty well across the board (except in IE, which doesn’t support all the advanced static pseudo-classes; e.g., :last-child).

I can appreciate that inspectors face an interesting challenge here.  Pseudo-elements are just that, and aren’t part of the actual structure.  And yet Internet Explorer’s Developer Tool manages to find those rules and display them without any fuss, even if it doesn’t show generated content in its DOM view.  Some inspectors do better than others with dynamic pseudo-classes, but the fact remains that you basically can’t see some of them even though they will potentially apply to the inspected link at some point.

I’d be very interested to know what inspector teams encountered in trying to solve this problem, or if they’ve never tried.  I’d be especially interested to know why IE shows pseudo-elements when the others don’t—is it made simple by their rendering engine’s internals, or did someone on the Developer Tool team go to the extra effort of special-casing those rules?

For me, however, the overriding question is this: what will it take for the various inspectors to behave more like IE’s does, and show pseudo-element and pseudo-class rules that are associated with the element currently being inspected?  And as a bonus, to get them to show in the DOM view where the pseudo-elements actually live, so to speak?

(Addendum: when I talk about IE and the Developer Tool in this post, I mean the tool built into IE8.  I did not test the Developer Toolbar that was available for IE6 and IE7.  Thanks to Jeff L for pointing out the need to be clear about that.)

Announcing Followerlap

Last week, I got an interesting inquiry from Velda Christensen:

@meyerweb *wondering just how many of your followers follow @zeldman and vice-versa*

I had no idea.  Furthermore, I didn’t know of a tool that could tell me.  So I built one: Followerlap.

As it turned out, the Twitter API made answering the specific question pretty ridiculously easy, thanks to followers/ids.  All it takes is two API requests, one for each username.  The same would be true of friends/ids, on top of which I suspect I’ll fairly shortly build a tool quite similar to Followerlap.

Since I announced Followerlap’s existence on (where else?) Twitter, I’ve had a few repeated (and not unexpected) bits of feedback.

  • Why not list the common followers?  Because followers/ids returns a list of numeric IDs.  Resolving those IDs as usernames would require one API hit per ID.  If there are 15 followers in common, that’s not such a big deal, but if there are 1,500, well, I’ll run out of my hourly allotment of API requests very quickly.  Maybe there’s a better way; if so, I’d love to hear about it, because that would be a great addition.

  • Why can’t I find out how many people follow both Stephen Fry and Shaquille O’Neal?  Past a certain number of followers, somewhere in the 200,000–250,000 range, the API just dies.  You can’t even count on getting a consistent error message back.  There are ways around this, but I didn’t want to stress the API that way, so it just fails.  Sorry.

  • How can I link to a specific comparison?  At the moment, you can’t.  I hope to make that happen soon, but I decided that a tool this simple should have a similarly simple launch.  Ship early, ship often, right?  Anyway, it’s on the list of things to add soon.  Use the new “Livelink to this result” link underneath a result.  (See update below for more.)

So that’s Followerlap.  Any other questions?  I’ll do my best to answer them in the comments, though for a number of reasons I may be slow to respond.

Update 6 Jul 09: as noted in the edited point above, livelinking of comparison results is now, um, live.  So now you can pass around results like the number of people who follow both God and the Devil (thanks to Paul M. Watson for coming up with that one!).  I call this “livelinking” because hitting a result URL will get you the very latest results for that particular comparison.  “Permalinking” to me implied that it would link to a specific result at a specific time, which the tool doesn’t do and very likely never, ever will.

Seeking Math Help

So I have this equation that’s great for finding one term.  Problem is, I need to solve for another term that’s scattered all across the right side.  I’m hoping someone here has the mad algebra skills I managed to lose in the two decades since I last took a math class and can help me out.

Here’s the original equation:

Q = (3.07 × F × Y × (1 + 1.4 × ((D/V) × e(-2 × D/V)))) / D2

I want to be able to solve for D, not Q; in other words, have a single D on the left and everything else on the right of the equation.  F, Y, and V are all variable terms; the e is the classic irrational constant.  I tried for quite a while to do this and ran very firmly aground.  The best I managed was this minor simplification:

Q = (3.07 × F × Y × (1 + 1.4 × (D / (V × e(2 × D/V)))) / D2

…and even that assumes that I did things correctly.  Here’s the original equation in pretty shoulda-done-it-in-MathML-but-oh-well form:

I can shuffle the chairs around, as it were, but never really get anywhere close to having a single D on the left.  “But it’s so easy!“, you may well be shouting.  That’s why you’re working for Google and I’m not.

I remember having questions just like this on tests back in college: “Given this equation, solve for blah”.  It’s been too long, though, and in all honesty I was never that great at this sort of thing in the first place. Help, please?

[Update 14 Jan 09]: several commenters have shown that what I’m trying to do is impossible.  Frustrating, but that’s math for you.  Looks like I’ll have to take another approach.

MW Latest Tweet 1.1b2

Now available: MW Latest Tweet 1.1b2.  The only real difference between this version and the previous is better auto-link routines, thanks largely to a PHP4-ified version of Joseph Scott‘s recently released MakeItLink PHP class.  I tightened up some related code as well, thanks to my newfound understanding of just what the heck a “callback function” actually does, and how it can be useful.  And anonymous functions, too!

Also, there is an “enter debug mode” link at the bottom of the administrative panel for the plugin.  It’s very cleverly matched with an “exit debug mode” link when you’re in debug mode.  These links do just what they sound like they should do.  Debug mode itself, introduced in the previous beta, is unchanged (except maybe cosmetically).

In case anyone’s interested in seeing how I use the text-replacement strings on  meyerweb, here’s what I have in that textarea, formatted slightly for readability:


<div class="panel">
<h4>Recently Tweeted</h4>
<p class="more">
<a href="http://twitter.com/%%USER_SCREEN_NAME%%">see more</a>
</p>
<p>
%%TEXT%% <small>–tweeted %%CREATED_AT%%</small>
</p>
</div>

There were some reports of incompatibility between this plugin and early WordPress 2.7 betas.  Word is it’s working fine with the latest beta.  I probably won’t fix any incompatibilities until 2.7 final ships, but if anyone spots something they absolutely know will be a problem in 2.7 final, let me know.  Thanks!

MW Latest Tweet 1.1b1

There’s a new beta of MW Latest Tweet available.  It does four new things.  Four and a half if you count the new options setting as a half.

  1. All the files are in the mw_latest_tweet directory now, instead of having the plugin PHP outside of that directory like 1.0 did.  Yeah, I know, that should’ve been the case all along.  Sorry!  Learning on the job here.

    If you’re upgrading from 1.0, you should probably delete the 1.0 file and directory outright before uploading the 1.1b1 directory.  Alternatively, you should be able to upload 1.1b1, deactivate 1.0, activate 1.1b1, and then delete just the 1.0 PHP file.  I haven’t tried that, so I don’t know if it will actually work, but it seems like it should.

  2. URLs within a tweet are turned into hyperlinks for easy clickin’.  To go with this new feature, there’s a new option on the settings page to either shorten displayed URLs, like twitter.com does, or to not shorten them.  The default is to shorten, which means any URL 29 or more characters long gets shortened to 27 characters and gains a trailing ellipsis.  Again, like Twitter does it—although I used an ellipsis entity and not three periods.

    Note that if you upgrade from 1.0 to 1.1b1, this setting may default to “No” instead of “Yes”.  I’m not sure why, but it’s a pretty low-priority item right now.

  3. On a related note, @names are autolinked as well.  I’m using the pattern [A-Za-z0-9_] since that’s what Twitter says are valid characters for a username even though if you type in a grawlix on the signup page it will tell you, in nice bold green letters, that it’s available.

  4. If you want to see everything the plugin has cached, append &debug to the end of the plugin’s setting page URL and hit return.  You’ll get the settings page with a dump of the cached data at the end.  This is clumsy and will be much less so before 1.1 final.  I’m thinking click a link, enter debug mode.  Probably won’t go all AJAXy, though you never know.

So that’s the state of things.  Let me know if anything breaks.

Excerpts Exacted; Shielding the Admin

In response to my request, the indomitable Hamish Macpherson has created NeverForgetcerpt, a plugin for WordPress 2.5+ that will warn you if you’re about to publish a post that lacks an excerpt.  I’m already using it on meyerweb and it’s working like a charm.  He’s also expressed interest in the idea of a plugin that does that and also warns you if you forgot to add tags or categories, so stay tuned.  Meantime, all hail Hamish!

I have another plugin request, but in this case I’m looking for help in modifying something I’ve already done.  Or half-done, maybe.

I don’t know about you, but I get a lot of comment spam.  As I type this sentence, Akismet has stopped 837,806 spam attempts in total.  A false positive makes it past Akismet and my other defenses to land in the moderation queue about once every four days, on average.

Some of those false positives are really, really, really easy to spot, and they get marked as spam in order to help improve the recognition algorithms.  Others are hard to evaluate just by looking at the comment.  Many are trackbacks from sites in langauges I can’t read, and others that I can read look legit enough.  In such cases, I usually go visit the author’s URL to see if it looks spammy or not.

Now, the way I used to do this was to right-click on the blog link, copy the URL of the target, open a new browser tab, and paste the URL into the address bar.  Why?  To prevent my WP admin URL from landing in the referer logs of a potentially unscrupulous site owner.  But sometimes I forget to do all that, and just click.  I figured, well, why not stop fighting the tendency to just click and write a plugin that routes all outbound links through a redirect service?

So I did.  You can grab it for yourself if you want, but if you do, understand that it’s pretty clunky right now.  Which is the part I’d like help fixing.

The heart of the plugin is simplicity itself:

if (is_admin_page()) {
	add_filter('get_comment_author_url','_mw_obscurify',5);
}

function _mw_obscurify($url) {
	if ($url) return 'http://google.com/url?q=' . $url;
}

There’s a little more to it than that (specifically, the routine is_admin_page(), which I got from someone else’s plugin and wish now I could remember whose it was) but that’s the core.  So any time the URL of a comment author is fetched, it’s prepended to turn it into a Google redirect.

That’s true for both href values and displayed URLs, though, which is the clunky part.  The end result is that on comments from the aforementioned mighty Hamish, for example, I get the following markup on the “Comments” page:

<a href="http://google.com/url?q=http://hamstu.com">

http://google.com/url?q=http://hamstu.com</a>

What I’d very much prefer is:

<a href="http://google.com/url?q=http://hamstu.com">

http://hamstu.com</a>

Or even:

<a href="http://google.com/url?q=http://hamstu.com">
hamstu.com</a>

So what I’d like to know is if there’s any way to make that happen short of rewriting and replacing get_comment_author_url, which I’d prefer not to do since it could change in future versions of WordPress and I’m not particularly interested in turning a basic plugin into a continuing maintenance headache.  I mean, I will if absolutely necessary, but I’d like to find a better way if there is one.  Thus the request for help.

Also, are there better redirect strategies than using Google the way I have?  It’s very slightly annoying that I have to click through the Google redirect page, and though I absolutely understand why they do that, I’d love to find an automatic redirect that wouldn’t expose my referer to the target site.  Anyone know of one, or have a related sharp idea?

December 2014
SMTWTFS
November  
 123456
78910111213
14151617181920
21222324252627
28293031  

Archives

Feeds

Extras