
Archive: 'Projects' Category

Survey Mapping

An anonymized copy of the data collected in the 2008 Survey has been turned over to some professional statisticians, as we did last year, and we’re waiting to hear back from them before moving on to writing the full report.  But there’s no reason we can’t have a little fun while we wait, right?

So, calling all mapping ninjas: here’s a 136KB zip archive containing two tab-separated text files listing the countries and postcodes supplied by takers of the survey.  Before anyone has a privacy-related aneurysm, though, let me explain how they’re structured.

One of the two files is sorted alphabetically by country, with the postcodes as the second “column of data” (it’s country name, tab, postcode).  The second is the reverse: it’s sorted alphabetically by postcode, with the country names following each postcode.  This sorting should break any association they might have with the released data set, given that we won’t be including the postcodes in the released set.  (More on that in a moment.)

A word of warning: though I cleaned out some of the more obvious cases of people heaping abuse on us for even daring to ask the question, I can’t guarantee that the data set is perfectly clean.  There may be drops of bile here and there along with the usual collection of mistyped postcodes.  I know there’s at least one bit of obvious humor that I chose to leave in, so enjoy that when you find it.

We have two reasons to release this data this way at this point.  The first is to see what people do with it—heatmaps, perhaps, or one of those proportion-distortion maps, or a list of top-ten global postcodes or cities (or both).  Hey, go crazy!  I’d love to see a number of Google Maps/Yahoo! Maps/OpenMap/whatever mashups with this data.  That would be awesome.

The second reason is to ask for help with an API challenge.  Like I said, we’re not including the postcodes in the released data set.  What I would like to do instead is translate the postcodes into administrative regions (states, provinces, etc.) and put those in the data set.  That way, we can include things like “Ohio” and “British Columbia” and “Oaxaca”—thus providing a little better geographic granularity, which was an area of weakness in the 2007 survey.

Thanks to a couple of articles I’ve read, I know how to do this for a single postcode.  But how does one do it for 26,457 postcode-and-country combinations without submitting every single postcode as a separate request?  I’ve yet to see an explanation, and maybe there isn’t one, but I’d like to know either way.  And if someone does come up with a way, please show the work instead of just spitting out the result!  I’m hoping to learn a few things from the solution, but I obviously can’t do that without seeing the code.

One note: in cases where a postcode isn’t recognized or some kind of error is returned, I’d like to have a little dash or “ERR” or something put in the result file.  That way we can get a handle on what percentage of the responses were resolvable.  Thanks.
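
(To give a sense of what I mean, here’s a rough sketch of the lookup loop in Python, offered as a starting point rather than the answer I’m looking for.  It assumes the free GeoNames postal code search service, that “demo” gets replaced with a real GeoNames username, that the country names have already been mapped to the ISO codes GeoNames expects, and an input file named country-postcode.txt in the country-tab-postcode layout described above.  Since I don’t know of a true batch endpoint, it just loops politely, caches duplicate lookups, and writes “ERR” on any failure, per the note above.)

    import csv
    import json
    import time
    from urllib.parse import urlencode
    from urllib.request import urlopen

    GEONAMES = "http://api.geonames.org/postalCodeSearchJSON?"

    def lookup_region(country_code, postcode, username="demo"):
        """Return the administrative region for a postcode, or "ERR"."""
        query = urlencode({
            "country": country_code,  # ISO code such as "US" or "CA"
            "postalcode": postcode,
            "maxRows": 1,
            "username": username,  # a real GeoNames account name goes here
        })
        try:
            with urlopen(GEONAMES + query, timeout=10) as response:
                matches = json.load(response)["postalCodes"]
            return matches[0]["adminName1"] or "ERR"
        except Exception:
            return "ERR"  # unrecognized postcode, network hiccup, whatever

    cache = {}  # 26,457 rows, but far fewer unique country/postcode pairs

    with open("country-postcode.txt", encoding="utf-8") as infile, \
         open("country-region.txt", "w", encoding="utf-8") as outfile:
        for country, postcode in csv.reader(infile, delimiter="\t"):
            pair = (country, postcode)
            if pair not in cache:
                cache[pair] = lookup_region(country, postcode)
                time.sleep(1)  # free GeoNames accounts are rate-limited
            outfile.write(f"{country}\t{postcode}\t{cache[pair]}\n")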

Anyway, map and enjoy!

Survey Halfway

Okay, so yes, I posted about this two weeks ago and haven’t said anything since, but still: we’re halfway to the close of this year’s survey, so if you haven’t already done so, please devote ten minutes to taking it now!  You’ll make your voice heard along with literally thousands of fellow web professionals, hobbyists, and other people who make websites.  My copy of Excel is already weeping at the thought of having to crunch all that data, and I think that if there’s one thing on which we can all agree, it’s that anything which makes Excel cry is a good thing.

I TOOK IT! and so should you: The Survey for People Who Make Websites

Thank you.

I Took the 2008 ALA Survey

Ladies and gentlemen, it’s back, bigger and better than ever.  Please read Jeffrey’s wonderful introduction, and then start answering!  It shouldn’t take much more than 10 minutes to complete (it took me 6 minutes, 44 seconds, but who’s counting?).

I TOOK IT! and so should you: The Survey for People Who Make Websites

Last year, we had an astonishing 32,831 responses; I can only imagine where we’ll end up this year.  And just as with last year, we will report our findings and release an anonymized raw data set.

The more people who take the survey, the better the results will be, so please—post the link on any relevant sites, mailing lists, discussion boards, or other communities.  Print up flyers and post them around your town.  Anything we can do to get the word out!

Thank you.

The Really Perfect Ringtone

When I saw a couple of people link to “the perfect iPhone ringtone” last week, I had that sinking feeling that comes from being beaten to the punch.  I knew I should have stayed up an extra hour that one night and just gotten it done!

But wait, hold it, never mind, cancel the panic parade: it was not, in fact, the perfect ringtone.  Crisis averted!  Still, the sinking feeling lingered, reminding me of what could have been, so last night I sat down and got it done.  Now I bring to you the absolutely most perfect ringtone ever.

Feel free to preview it using that link, if you really feel that’s necessary, but frankly you should just charge ahead and download the .m4r AAC for instant ringtoniness.  If for some reason you’d rather have the audio source and do your own ringtone conversions, you can get the same file as a .m4a AAC or a comfy old .mp3.  And for all you completists, there’s a .zip archive of all three formats.

Go.  Ring.  Enjoy.

Expressive Sculptor

For those of you using Microsoft Expression Web, a free pre-release trial version of CSS Sculptor for Expression Web was announced by the WebAssist folks on Wednesday at MIX08.  So now you don’t have to put up with those snooty Dreamweaver users throwing you the mëtäl hörns every chance they get—throw ’em right back!  Røck!

If you’re curious about CSS Sculptor, I posted in some detail about it when it was first released in August 2007, and there’s of course plenty of enthusiastic copy about it on the WebAssist site.

One thing that’s different about the Expression version as compared to the Dreamweaver version is that it doesn’t have an “Apply” button to apply the input CSS to the preview window.  Instead, changes are instantly reflected in the little preview.  It’ll be interesting to see how users react to that, since it could mean that the previewed design shatters as the CSS is updated, and then snaps back together upon further changes.  Is that good or bad tool usability?  Hard to say; it could scare people into undoing the shatter-change and never pushing forward, but it could also help users more quickly gain a deeper understanding of CSS by seeing how things come apart and then go back together.  I guess we’ll find out!

Out of Order

Apologies to anyone who tried visiting meyerweb in the very near past and found it broken.  I’d noticed that suddenly all kinds of comment spam were getting past Akismet and landing in the moderation queue, and was just preparing to ask the spam-fighters about it when I discovered that the blog portions of the site were throwing a PHP error about not being able to find a function I’d written into a plugin.

At which point I discovered that all my WordPress plugins had been deactivated.  I know I didn’t do that, so how they all got turned off remains a bit of a mystery to me.  I’ve turned all the ones I need back on, and things appear to be back to normal.

So Akismet wasn’t being evaded by the spam: it was simply switched off.  Good thing my non-plugin defenses caught everything that poured in during the outage.  Which, come to think of it, must all have been direct-submit spam, since there wouldn’t have been a comment form available on the entire site.  So what they were really avoiding was my direct-submission defensive plugin, not Akismet.

Well, either way, other defensive measures protected the site, so all’s well there.  I’m certainly not thrilled about the site having been largely offlined for a short period, and again, my apologies to anyone who got blocked from information they wanted.

This episode has actually given me cause to reconsider my usual preference to put site navigation at the end of the document source.  When the PHP failed, the navigation was never served up.  Had I put it at the top of the page, it would’ve been present even though the blog posts were failing.  Getting to the static areas of the site would have been possible.  Due to my structural choices, a script failure dramatically affected the usability of the site as a whole.

Something worth thinking about as I slowly work on improving the organization of meyerweb.

Survey Analysis Service

During our analysis of the responses to the Web Design Survey, one of the things I thought seriously about doing was dumping the whole dataset into a database and building a web front end to query it.  Then I remembered that as a back-end developer, I’m an excellent book author.  I know some MySQL and PHP, but I’m right in that sour spot of knowing enough to make the development process slow and error-prone, yet not enough to design the project correctly from the outset.  So I stuck to Excel and the like, which can be cumbersome but are quickly learned.

I was a little sad, though.  I’d had the thought that if I built an interface to the survey results, it could be released publicly once we were done.  With such a tool, anyone could generate their own pivot tables without having to learn the process in Excel (or deal with Excel’s handling of enormous data files).  That seemed like a really good thing.

Well, the dataset is public now.  So how about one or more of you super-sharp developer types, the ones who didn’t check any of the boxes on the question about gaps in your back-end coding skills, doing what I could not?

The basic scope of the project would be to list the various data points (gender, ethnicity, age bracket, salary, geographic location, perception of bias, etc., etc., etc.) and let a user pick the two they want to analyze against.  So if someone wanted a table showing the breakdown of gender by ethnicity, they would pick one to go on the top and the other to go on the left.  The table generated would give those numbers.  I’d have it spit out raw numbers, but allowing the user to optionally get the results as percentages might be a nice touch.  Though then you’d have to let the user say which way the percentages are calculated: by column, or by row.
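
(To make the idea concrete, here’s a minimal sketch of that two-way table in Python with pandas; the file and column names are hypothetical stand-ins for whatever the released dataset actually uses.)

    import pandas as pd

    # Hypothetical file and column names; the real dataset's will differ.
    df = pd.read_csv("survey2007-anonymized.csv")

    # Raw counts: one data point across the top, the other down the left.
    counts = pd.crosstab(df["ethnicity"], df["gender"])

    # Optional percentages; the user has to pick the direction.
    pct_by_row = pd.crosstab(df["ethnicity"], df["gender"], normalize="index")
    pct_by_col = pd.crosstab(df["ethnicity"], df["gender"], normalize="columns")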

For extra deepness, one could also filter the results based on the value for a third data point.  With that sort of feature, one could get the breakdown of gender by ethnicity for only the EU respondents.  I might do it by letting the user click on a data point and then pick the specific filtering value via a dropdown.  Maybe three radio buttons: one for top, one for left, and one for filter.  Or, heck, do a whole Web 2.0 drag-n-drop interface.  That part’s not important.  What matters is giving anyone the ability to easily get numbers out of the massive dataset.
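
(In the same hypothetical sketch, the filter is just a row selection before the table gets built; “location” and “European Union” are made-up stand-ins for the real column name and value.)

    import pandas as pd

    df = pd.read_csv("survey2007-anonymized.csv")  # as in the sketch above

    # Keep only the EU respondents, then cross-tabulate as before.
    eu_only = df[df["location"] == "European Union"]
    eu_counts = pd.crosstab(eu_only["ethnicity"], eu_only["gender"])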

The only real challenge I can foresee is where questions allowed more than one answer, like the location of work question and the skill questions.  In the dataset, they’re just comma-separated value lists.  Those would need to somehow be broken out into subtables or Boolean columns or something.  The actual structure of the solution interests me a whole lot less than simply having one.
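
(One possible way to do that breaking-out, again with made-up names: pandas can turn a comma-separated answer column into Boolean columns directly.)

    import pandas as pd

    df = pd.read_csv("survey2007-anonymized.csv")  # hypothetical file name

    # Turn an answer like "CSS, HTML, JavaScript" into one True/False
    # column per skill, then bolt those columns onto the main table.
    skill_flags = df["skills"].str.get_dummies(sep=", ").astype(bool)
    df = df.join(skill_flags)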

I’m quite sure this is the kind of thing a real programmer could create in about a day.  As I am not a real programmer any more, it would take me a month or four.  Let’s not wait.  Anyone out there able to take the idea and run with it?

Digging Into the Data

One of the practical reasons we released the anonymized data sets from the 2007 Web Design Survey was that we knew we couldn’t ask every possible question, let alone report on the results.  For that matter, we knew we wouldn’t even be able to come up with every possible question.  It’s one thing to approach this enormous mountain of data with a specific question in mind; those questions always seem obvious to the questioner.  In that case, there’s a clear path to the summit.  But we didn’t come at this with a specific angle in mind.  We just wanted to know what the profession looks like.  So we not only didn’t have a clear path to the summit, we didn’t even have a summit to reach.  Instead, we had thousands.  The tyranny of choice came down on our heads like a, well, like a falling mountain.

So the obvious choice was to release the data for others to analyze in search of their own summits, and I’m really glad to see people already doing so.  One gentleman is looking to produce an analysis of UK respondents, for example.  Others are asking specific questions and getting surprising results.

For example, Rebekah Ahrens grabbed a copy of the dataset and pulled out the answer to a straightforward question: what’s the gender distribution for the various age groups?  What she found was that almost without exception, the younger the age group, the smaller the percentage of women.  Here’s a chart showing the results she found in graphic form.

Wow.  What is causing that?  It’s a pronounced enough pattern that I initially wondered if it was somehow an artifact of the analysis method.  Several times during the authoring of the report I’d think I’d found some amazing and previously unsuspected trend… only to discover I’d divided some numbers by the wrong total, or charted column-wise when the table was row-oriented.  Mistakes along those lines.  They happen.  But I really don’t think that’s the case here.
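
(For anyone who wants to double-check numbers like these, the computation itself is short.  In a hypothetical pandas version, with invented file and column names, the row-versus-column choice that makes for exactly the kind of mistake I describe above is a single argument.)

    import pandas as pd

    df = pd.read_csv("survey2007-anonymized.csv")  # hypothetical file name

    # Percentage of each gender within every age bracket.  Normalizing by
    # "index" divides across each row (age group); normalizing by "columns"
    # would divide by the wrong total and answer a different question.
    by_age = pd.crosstab(df["age_group"], df["gender"], normalize="index")
    print(by_age * 100)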

Now, the really important question is why this pattern exists, and that’s where the data fail us—we can’t get the numbers to reveal all the forces that went into their collection.  There are any number of reasons why this pattern might exist.  I thought of three hypotheses in quick succession, and I’m sure there are many more of equal or greater plausibility.

  1. Younger women didn’t hear about the survey, and so didn’t take it.
  2. Women are losing interest in the field, instead heading into other career paths, and so those who have stuck with the field longer are more prevalent.
  3. Increasing margins of error at the low and high ends of the age spectrum reduce the confidence of the numbers to the point that we can’t draw any conclusions.

Remember, these are all hypotheses, any one of which could be true or not.  So how would we go about proving or disproving them?

  1. Conduct a survey of women in the field to find out whether they answered the survey, whether they know others who did, their ages and the ages of those others, and so on.  Difficult to undertake, but not impossible.
  2. Ditto #1, although a possibly useful followup analysis would be to look at the gender distributions by longevity in the field and then cross-reference the two results.
  3. Get someone who is a statistician to figure out the likely margins of error and see whether that might explain things.  I’d do it, but I have no idea how, though there’s a rough sketch of the standard calculation just after this list.  I would tend to be skeptical of this as an explanation given the clear trend, but I suppose it is possible.
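
(What I can offer is the textbook calculation for a single proportion, the normal approximation found in any introductory stats reference; the respondent counts below are invented purely for illustration, so don’t read anything into them.)

    import math

    def margin_of_error(p, n, z=1.96):
        """95% margin of error for a sample proportion (normal approximation)."""
        return z * math.sqrt(p * (1 - p) / n)

    # Invented example: an age bracket with 215 respondents, 16% of them women.
    moe = margin_of_error(0.16, 215)
    print(f"16% plus or minus {moe * 100:.1f} points")  # about 4.9 points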

I can, however, create the gender-by-longevity chart mentioned in #2 there.

Check it out: above six years’ longevity, women are consistently more represented (compared to the overall average) than they are below six years’ longevity.  The only exception is a spike at “1 year or less”.  Is that enough to explain the trend Rebekah spotted?  It doesn’t look like it to me, but then I’m not a statistician.  I also wonder a bit about the spikes at the edges of the longevity spectrum.

I’m not trying to propose an explanation here, because I don’t have one.  I don’t even have an unsubstantiated belief as to what’s happening here.  I know just enough to know that I don’t know enough to know the answer.  What I’m saying is this: the great thing is that anyone can do this sort of analysis; even better, having done so, we can start to figure out what questions we need to be asking of ourselves and each other.
