Posts from 2005

WP-Gatekeeper

Published 19 years, 9 months past

In my post on rel="nofollow", I mentioned the use of easily human-comprehensible challenge questions like “What is Eric’s first name?” as a way to defeat spambots.  There were two points made in the comments that I had considered but hadn’t brought up, given that they were tangential to the point of the post.  They were:

  1. Spammers could set up a database of questions and answers used on sites.  They might or might not share it with each other, but the point is that if I set up “What is Eric’s first name?” as the sole challenge, the human running the spambot could build the ability to answer the question into the spambot, thus defeating it.  Quite true.
  2. In order to make it more difficult to do this, there could be a set of challenges from which one is picked randomly.  So I might have three challenges asking for the first names of myself, Kat, and Carolyn.  Every time a comment form is delivered to a browser, one of the three challenges, picked at random, is included.  This would make it more difficult for a human spammer, since he (or she) would have to find all of the challenge questions, work out the responses, and build them all into a database, keyed to each site’s domain.
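
To make the second point concrete, here is a minimal sketch of a randomly chosen challenge.  It is written in Python for brevity and is not the PHP that WP-Gatekeeper uses; the `CHALLENGES` table, the key strings, and both function names are hypothetical, made up purely for illustration.

```python
import random

# Hypothetical challenge pool: opaque key -> (question, accepted answer).
CHALLENGES = {
    "k1": ("What is Eric's first name?", "eric"),
    "k2": ("What is Kat's first name?", "kat"),
    "k3": ("What is Carolyn's first name?", "carolyn"),
}


def pick_challenge(rng=random):
    """Pick one challenge at random to embed in a freshly served comment form."""
    key = rng.choice(sorted(CHALLENGES))
    question, _answer = CHALLENGES[key]
    return key, question


def check_answer(key, response):
    """Return True only if the response matches the challenge identified by key."""
    if key not in CHALLENGES:
        return False  # unknown or retired key: reject the comment
    _question, answer = CHALLENGES[key]
    return response.strip().lower() == answer
```

The form would carry the key (not the answer) in a hidden field, so the server knows which question was asked when the comment comes back.  A bot that has only harvested one question-and-answer pair will fail whenever a different challenge is served.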

So over the weekend, I built as a proof of concept (and also as an exercise in learning more about how PHP, MySQL, and WordPress work) a WordPress package to do what is described in the second point above.  It’s called WP-Gatekeeper, available from my WordPress Tools page, and if you’re brave you can give it a try.  Why brave?  Because the installation involves hacking a few WP files and adding a new entry to the admin menu, not to mention firing up a plugin.  And if you do it in the wrong order, you can break commenting for a short period.  There are DIY installation instructions on the WP-Gatekeeper page, for those who still want to proceed.  You also need to be brave because if you install it, you’re running code written—well, actually, adapted—by someone with only beginner-to-intermediate PHP skills.  I’ve been testing it locally and everything seems fine, but this is even more “use at your own risk” software than usual.  Got it?  Good.

Accordingly, WP-Gatekeeper is currently considered beta software.  I’m making it available now in the hopes that people more experienced than I with PHP and WordPress can take a look, hack on the code, and make it more efficient and the whole package easier to install.  I’m already aware that in WP 1.5, adding the admin page is much easier and doesn’t require hacking files, but I wrote WP-Gatekeeper in 1.2 and want it to work there, since that’s the latest public version.  Thus, any optimizations should work in 1.2.  When 1.5 (or whatever the next version number is) comes out, then I’ll worry about it.

Of course, there’s still nothing that prevents a spammer from registering questions and answers into a database, but the admin page makes it easy for a blogger to add, remove, modify, and re-key the challenges.  That will make tracking them more difficult, so long as a blogger puts effort into maintaining the list of challenges.  It gets back, in the end, to maintaining your blog.  The more maintenance you put into something, the better its shape will stay.

I’m also interested in suggestions for how the overall system could be made harder to bypass with a bot, and easier for a WP admin to run.  One feature I plan to add before going final is the ability to have the keys replaced on a regular basis, with the interval (daily/weekly/monthly/etc.) set by the admin.  The other driving consideration here is that the system should be fully capable of working even if JavaScript is disabled.  It’s an accessibility thing; just go with me on this.  (Accessibility is the main reason I did this rather than install an image CAPTCHA solution, as it happens.)
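
As a rough sketch of the scheduled key replacement, again in Python rather than the plugin's PHP: this assumes each challenge is stored under an opaque key, which is only one reading of "re-key".  The `INTERVALS` table, the function names, and the 30-day approximation of "monthly" are all hypothetical.

```python
import secrets
from datetime import datetime, timedelta

# Hypothetical interval names; "monthly" is approximated as 30 days.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
}


def rekey(challenges):
    """Return the same questions and answers stored under fresh random keys."""
    return {secrets.token_hex(8): qa for qa in challenges.values()}


def rekey_if_due(challenges, last_rekeyed, now, interval="weekly"):
    """Re-key once the admin-chosen interval has elapsed; otherwise change nothing."""
    if now - last_rekeyed >= INTERVALS[interval]:
        return rekey(challenges), now
    return challenges, last_rekeyed
```

Because the check runs entirely on the server, nothing here depends on JavaScript being enabled in the browser.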

Got feedback?  Let’s hear it.


More Spam To Follow

Published 19 years, 10 months past

So… rel="nofollow".  Now there’s a way to deny Google juice to things that are linked.  Will it stop comment spam?  That’s what I first thought, but I’ve come to realize that it’ll very likely make the problem worse.  In the last few hours, I’ve been hearing things that support this conclusion.

First, the by-now required disclaimer: I think it’s great that Google is making a foray into link typing, and I don’t think they should reverse course.  For that matter, it would be nice if they paid attention to VoteLinks as well, and heck, why not collect XFN values while they’re at it?  After all, despite what Bob DuCharme thinks, the rel attribute hasn’t been totally ignored these past twelve years.  There is link typing out there, and it’s spreading.  Why not allow people to search their network of friends?  It’s another small step toward Google Grid… but I digress.

The point is this: rather than discourage comment spammers, nofollow seems likely to encourage them to new depths of activity.  Basically, Google’s move validates their approach: by offering bloggers a way to deny Google juice, Google has acknowledged that comment spam is effective.  This doesn’t mean the folks at Google are stupid or evil.  In their sphere of operation, getting comment spam filtered out of search results is a good thing.  It improves their product.  The validation provided to spammers is an unfortunate, possibly even unanticipated, side effect.

There is also the possibility, as many have said, that nofollow will harm the Web and Google’s results, because blindly applying a nofollow to every comment-based link will deny Google juice to legitimate, interesting stuff.  That might be true if nofollow is used like a sledgehammer, but there are more nuanced solutions aplenty.  One is to apply nofollow to links for the first week or two after a comment is posted, and then remove it.  As long as any spam is deleted before the end of the probation period, it would be denied Google juice, while legitimate comments and links would eventually get indexed and affect Google’s results (for the better).
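
The probation idea might look something like the following.  This is a minimal Python sketch, not code from any blogging tool: the 14-day `PROBATION` value and the function names are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from html import escape

# Assumed probation length, picked from the "week or two" suggested above.
PROBATION = timedelta(days=14)


def link_rel(posted_at, now, probation=PROBATION):
    """Return the rel value for a comment's links, or None once probation is over."""
    return "nofollow" if now - posted_at < probation else None


def render_link(url, text, posted_at, now):
    """Render an anchor tag, adding rel="nofollow" only during the probation period."""
    rel = link_rel(posted_at, now)
    rel_attr = f' rel="{rel}"' if rel else ""
    return f'<a href="{escape(url, quote=True)}"{rel_attr}>{escape(text)}</a>'
```

Since the attribute is decided at render time from the comment's timestamp, nobody has to go back and edit old comments.  A link simply stops carrying nofollow once the comment has outlived its probation.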

In such a case, though, we’re talking about a managed blog—exactly the kind of place where comment spam had the least impact anyway.  Sure, occasionally the Googlebot might pick up some spam links before the spam was removed from the site, but in general spam doesn’t survive on managed sites long enough to make that much of a difference.

Like Scoble, where I might find nofollow of use would be if I wanted to link to the site of a group or person I severely disliked in order to support a claim or argument I was making.  It would be a small thing, but still useful on a personal level.  (I’d probably also vote-against the target of such a link, on the chance that one day indexers other than Technorati’s would pay attention.)

No matter what, the best defense against comment spam will be to prevent it from ever appearing in the first place.  There are of course a variety of methods to accomplish this, although most of them seem doomed to fail sooner or later.  I’m using three layers of defense myself, the outer of which is currently about 99.9% effective in preventing spam from ever hitting the moderation queue, let alone making it onto the site.  One day, the layer’s effectiveness will very suddenly drop to zero.  The second layer was about 95% effective at catching spam when it was the outer layer, and since it’s content-based will likely stay at that level over time.  The final layer is a last-ditch picket line that only works in certain cases, but is quite effective at what it does.

So what are these layers, exactly?  I’m not telling.  Why not?  Because the longer these methods stay off the spammers’ radar, the longer the defenses will be effective.  Take that outer layer I talked about a moment ago: I know exactly how it could be completely defeated, and for all time.  Think I’m about to explain how?  You must be mad.

The only spam-blocking method I can think of that has any long-term hope of effectiveness is the kind that requires a human brain to circumvent.  As an example, I might put an extra question on my comment form that says “What is Eric’s first name?”  Filling in the right answer gets the post through.  (As Matt pointed out to me, Jeremy Zawodny does this, and that’s where I got the idea.)  That’s the sort of thing a spambot couldn’t possibly get right unless it was specifically programmed to do so for my site—and there’s no reason why any spammer would bother to program a bot to do so.  That would leave only human-driven spam, the kind that’s copy-and-pasted into the comment form by an actual human, and nothing besides having to personally approve every single post will be able to stop that completely.

So, to sum up: it’s cool that Google is getting hip to link typing, even though I don’t think the end result of this particular move is going to be everything we might have hoped.  More active forms of spam defense will be needed, both now and in the future, and the best defense of all is active management of your site.  Spammers are still filthy little parasites, and ought to be keelhauled.  In other words: same as it ever was.  Carry on.


NEOAC Talk

Published 19 years, 10 months past

Just a quick note for people in the Cleveland area: I’ll be giving a talk this Saturday, 22 January, at the meeting of the North East Ohio Apple Corps in Strongsville.  The topic will be turning your Macintosh into a powerful Web development environment using resources, scripts, tricks, and tools available for free.  If you’re interested, drop by, and if you need directions, check out the NEOAC Web site.  I’m told that there will be donuts.  Mmm…. donuts.


Signs of Intelligence

Published 19 years, 10 months past

This morning, Carolyn told me quite clearly that she wanted some yogurt for breakfast.  Technically, what she said was “more baby”, but I knew what she meant.

How did a 13-month-old manage to tell me what she wanted?  By using sign language.  Kat and I have been teaching her Baby Signs, which is a simplified version of American Sign Language.  I’m given to understand that Baby Signs figure in the plot of the recent movie Meet The Fockers, but don’t let that sour you on the idea.  The amazing thing is that it really does work, if you’re willing to put in time and effort.

At this point, we’re actually looking more to real ASL signs than we are to the Baby Signs vocabulary when teaching Carolyn new signs.  I think the real utility of Baby Signs is that it gets you started where it makes the most sense: teach your baby signs like “food”, “water”, “more”, and “all done”.  This allows the child to communicate their wants and needs long before they ever become verbal.  It works because motor skills advance more quickly than verbal skills do.  I’ll be very interested to see if Carolyn retains the signing as she grows up, or if she’s able to pick up secondary languages more easily.

Carolyn’s first sign was “hat”, which of course didn’t help at all with deducing her needs, but it was still incredible to witness.  I was actually there when she figured it out.  She was looking through a Baby Signs board book while I stood watching.  She stared very intently at a picture of a baby signing “hat”, and then put her hand to the side of her head, just like in the picture.  My jaw dropped, but I managed to keep quiet.  She did it a couple more times, then looked up at me.  That’s when I showered the praise.  It only took a day or two to teach her that the actual sign for “hat” is to pat the hand on the head, not just place it there, like she saw in the picture.  Now Carolyn signs “hat” whenever she sees a picture of one of her grandfathers, because they both wear hats all the time.

Her signing vocabulary is now about thirty words, and she’s actually devised two signs of her own—which means, unfortunately, that we have no idea what she’s trying to say when she makes them!  But we’ll figure it out eventually.

As for how “more baby” means “I want yogurt”, that’s because we quickly noticed that when Carolyn signs “more” what she really means is “I want”.  As for “baby”, the yogurt we feed her has a picture of a baby on each container.  One day she walked over to the refrigerator, patted the door, and signed “baby”.  Then she had to do it a few more times while the rest of us scratched our heads and said things like, “The refrigerator’s not a baby, sweetie.  What are you trying to say?” before finally figuring it out.

Sometimes I think she’s smarter than we are.

So if you’re a new parent or a parent-to-be, I strongly recommend that you try this with your own baby.  When a baby starts waving bye-bye, that’s when they’re ready to start learning sign language.  (We started earlier than that, hoping to lay a foundation, and may or may not have been wasting our time.)  It will help reduce frustration, and therefore tantrums, because you’ll be better able to meet their needs when they have them.  The system isn’t perfect, of course: any baby that gets too upset will be unable to communicate with anything besides tears.  It’s still a great thing when your toddler comes into the room and signs “food” long before the hunger starts making her cranky.

I wonder if the children of deaf parents, whether they themselves are deaf or not, have long benefitted (temperamentally and intellectually) from signing, and nobody outside the deaf culture bothered to notice.


Uncensored Caption Text

Published 19 years, 10 months past

While watching a movie on TV this evening—all right, it was Volcano on the apparently mis-named cable channel American Movie Classics—I was amused to discover that the “bad” words had been edited from the audio track, but left completely intact in the closed captioning.  What, is there some kind of assumption that being hard of hearing also makes one hard to offend?


S5 1.1b3

Published 19 years, 10 months past

Well, there was time off for the holidays, but now S5 is back and ready to increment its beta number.  So, without too much ado: S5 1.1b3 (248KB ZIP file).  Here’s the current testbed presentation, for those who just want to play around with it.  Because of the long holiday break, I want to add another beta round or two just to work out as many kinks as possible.  So this isn’t the last version before going final on 1.1; still, I’m interested in any problems that people encounter.

There’s really only one notable change from the previous version.  I incorporated Jordan Liggitt’s “type slide number” code into this version.  Why his, when others have done similar things?  Because his version was well-marked with comments, and thus easy for me to figure out what he’d done and how he’d done it.  So here’s how it works:

  • If the user types a number (multi-digit is allowed), the script stores the number.  Inputting any non-number key clears the entered number.
  • If the user hits Enter/Return while there is a number stored, the slide show jumps to that slide.  Any attempt to jump directly to a slide past the end of the slide show results in no action, although the number is still cleared.
  • Hitting any of the “Next” or “Previous” keys while there is a number entered causes the slide show to skip the number entered in the appropriate direction.  Thus, entering “3” and hitting the space bar would jump forward three slides; entering 5 and hitting Page Up would jump backward five slides.  Skipping past the end of the slide show will drop you on the title slide, which is something I’m thinking about changing, though I’m not entirely certain in what way.
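
The three rules above can be modelled as a small state machine.  This is a toy Python sketch, not the JavaScript that S5 ships: the key names, the class, and the choice to treat slide 0 as the title slide are all assumptions, as is clamping at the first slide when skipping backward too far (the rules above don't say what happens there).

```python
class SlideShow:
    """Toy model of typed-number navigation; slide 0 is assumed to be the title slide."""

    DIGITS = set("0123456789")
    NEXT_KEYS = {"Space", "PageDown"}  # hypothetical key names
    PREV_KEYS = {"PageUp"}

    def __init__(self, slide_count):
        self.slide_count = slide_count
        self.current = 0
        self.typed = ""  # digits entered so far

    def press(self, key):
        if key in self.DIGITS:
            self.typed += key  # multi-digit numbers accumulate
            return
        number = int(self.typed) if self.typed else None
        self.typed = ""  # any non-digit key clears the stored number
        if key == "Enter":
            # Jump directly; a target past the end results in no action.
            if number is not None and number < self.slide_count:
                self.current = number
        elif key in self.NEXT_KEYS:
            self._skip(number or 1)  # assumption: no number means a single step
        elif key in self.PREV_KEYS:
            self._skip(-(number or 1))

    def _skip(self, delta):
        target = self.current + delta
        if target >= self.slide_count:
            target = 0  # skipping past the end lands on the title slide
        self.current = max(target, 0)  # assumption: clamp at the first slide
```

Restricting which keys trigger the "jump n slides" behavior would then be a matter of trimming `NEXT_KEYS` and `PREV_KEYS`, or of checking the stored number only for some of them.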

I’m mulling over which keys should invoke which jumping behavior.  For example, a couple of times I’ve typed a slide number and then hit the space bar to advance directly to that slide.  Instead, I jumped forward by that number, which is correct but obviously not what I was subconsciously expecting.  So I’m thinking about further restricting the keys that trigger the “jump n slides” behavior.  Anyone have suggestions based on other slide show software?

At this stage, I’m likely to put off adding the multiple-author meta that I toyed with in earlier versions.  The general need is still there, but I’m just not able to think the problem through with the kind of clarity I want.  It will have to wait for another day.  I’m also dithering a bit about the licensing, though at this point I’m leaning pretty heavily toward using Expat.  My hesitation is largely based on my very desire to make the right choice so that I never, ever have to worry about it again, you know?

Anyway, as always, feedback is welcome.


Tabular Weirdness

Published 19 years, 10 months past

Recently I was doing some table styling for a client and ran into what I can only call tabular weirdness.  There were two different things that I stumbled across, and interestingly, they were the kinds of problems you wouldn’t be likely to encounter in layout tables.  These would come up much more often in data tables.

In the first case, the general idea was to put some space between the tables and the surrounding material, but as these were data tables, they came with captions.  So I of course put the caption text in caption elements.  That’s when things started to get inconsistent.

To be more precise, the problems began after I left Safari to check the page in other browsers. In Safari, you see, the caption’s element box is basically made a part of the table box.  It sits, effectively, between the top table border and the top margin.  That allows the caption’s width to inherently match the width of the table itself, and causes any top margin given to the table to sit above the caption.  Makes sense, right?  It certainly did to me.

However, according to section 17.4 of CSS2.1 and the figure that accompanies it, the caption sits entirely outside the table’s box, and that includes the table’s margin.  The two are still tied together by the generation of an anonymous box, but the upshot is that if you give the table left and right margins, then the caption does not follow suit.  If you give the table a top margin, it pushes the caption away from the table. This is the behavior evinced by Firefox 1.0, and as unintuitive as it might be, it’s what the specification demands.

The second piece of strangeness was found in IE/Win.  What I’d done was simply said that some cell borders should be solid—nothing more complicated than border-bottom: 1px solid.  The idea was that it would, as borders do, pick up the foreground color of the cell, but IE/Win had other ideas.  As best I could tell, the borders were a light gray.  You can see it happen in the testcase I constructed to create the images in this entry.  Explicitly specifying a border color fixes the problem, of course, but it was a bit of weirdness I thought I’d pass along in case anyone runs into the same thing.


Mickey Prints

Published 19 years, 10 months past

Since Kat and I were going to be visiting Florida so often last year and this, and therefore we of course had to visit Disney World a lot, we decided to buy annual passes.  I was quite interested that when you buy an annual pass, the Disney folks take the prints of your right hand’s first and second fingers.  That data is associated with the card; whether it’s encoded onto the card’s strip or not, I don’t know.  But either way, some of your biometric data is associated with your Disney pass.  When you enter the park, you run the pass through the turnstile and stick your fingers into a reader.  If the fingers don’t match the card, you can’t get in, so you can’t share an annual pass with anyone else.

Now, suppose the Disney database stores that biometric data.  Now they have that data tied to a credit card number, purchasing patterns in the parks, probably a home address and phone number, and so on.  Interesting.  Guess what?  As of 2 January 2005, Disney is doing that for all passes: day passes, park hopper passes, all kinds of passes.  Every kind of pass.  Get a pass, get your fingers scanned.  (Okay, yes, you can opt out and be required to show photo ID, but how many people will bother?)

That’s a whole lot of biometric data associated with a whole lot of consumer data.  Interesting, don’t you think?

