Thoughts From Eric Archive

Peerless Whisper

Published 1 year, 1 month past

What happened was, I was hanging out in an online chatter channel when a little birdy named Bruce chirped about OpenAI’s Whisper and how he was using it to transcribe audio.  And I thought, Hey, I have audio that needs to be transcribed.  Brucie Bird also mentioned it would output text, SRT, and WebVTT formats, and I thought, Hey, I have videos I’ll need to upload with transcription to YouTube!  And then he said you could run it from the command line, and I thought, Hey, I have a command line!

So off I went to install it and try it out, and immediately ran smack into some hurdles I thought I’d document here in case someone else has similar problems.  All of this took place on my M2 MacBook Pro, though I believe most of the below should be relevant to anyone trying to do this at the command line.

The first thing I did was what the GitHub repository’s README recommended, which is:

$ pip install -U openai-whisper

That failed because I didn’t have pip installed.  Okay, fair enough.  I figured out how to install that, setting up an alias of python for python3 along the way, and then tried again.  This time, the install started and then bombed out:

Collecting openai-whisper
  Using cached openai-whisper-20230314.tar.gz (792 kB)
  Installing build dependencies ...  done
  Getting requirements to build wheel ...  done
  Preparing metadata (pyproject.toml) ...  done
Collecting numba
  Using cached numba-0.56.4.tar.gz (2.4 MB)
  Preparing metadata (setup.py) ...  error
  error: subprocess-exited-with-error

…followed by some stack trace stuff, none of which was really useful until ten or so lines down, where I found:

RuntimeError: Cannot install on Python version 3.11.2; only versions >=3.7,<3.11 are supported.

In other words, the version of Python I have installed is too modern to run AI.  What a world.

I DuckDucked around a bit and hit upon pyenv, which is I guess a way of installing and running older versions of Python without having to overwrite whatever version(s) you already have.  I’ll skip over the error part of my trial-and-error process and give you the commands that made it all work:

$ brew install pyenv

$ pyenv install 3.10

$ PATH="~/.pyenv/shims:${PATH}"

$ pyenv local 3.10

$ pip install -U openai-whisper

That got Whisper to install.  It didn’t take very long.

At that point, I wondered what I’d have to configure to transcribe something, and the answer turned out to be precisely zilch.  Once the install was done, I dropped into the directory containing my MP4 video, and typed this:

$ whisper wpe-mse-eme-v2.mp4

Here’s what I got back.  I’ve marked the very few errors.

[00:00.000 --> 00:07.000]  In this video, we'll show you several demos showcasing multi-media capabilities in WPE WebKit,
[00:07.000 --> 00:11.000]  the official port of the WebKit engine for embedded devices.
[00:11.000 --> 00:18.000]  Each of these demos are running on the low-powered Raspberry Pi 3 seen in the lower right-hand side of the screen here.
[00:18.000 --> 00:25.000]  Infotainment systems and media players often need to consume digital rights-managed videos.
[00:25.000 --> 00:32.000]  They tell me, is Michael coming out?  Affirmative, Mike's coming out.
[00:32.000 --> 00:45.000]  Here you can see just that, smooth streaming playback using encrypted media extensions, or EME, with PlayReady 4.
[00:45.000 --> 00:52.000]  Media source extensions, or MSE, are used by many players for greater control over playback.
[00:52.000 --> 01:00.000]  YouTube TV has a whole conformance test suite for this, which WPE has been passing since 2021.
[01:00.000 --> 01:09.000]  The loan exceptions here are those tests requiring hardware support not available on the Raspberry Pi 4, but available for other platforms.
[01:09.000 --> 01:16.000]  YouTube TV has a conformance test for EME, which WPE WebKit passes with flying colors.
[01:22.000 --> 01:40.000]  Music
[01:40.000 --> 01:45.000]  Finally, perhaps most impressively, we can put all these things together.
[01:45.000 --> 01:56.000]  Here is the dash.js player using MSE, running in a page, and using Widevine DRM to decrypt and play rights-managed video with EME all fluidly.
[01:56.000 --> 02:04.000]  Music
[02:04.000 --> 02:09.000]  Remember, all of this is being played back on the same low-powered Raspberry Pi 3.
[02:27.000 --> 02:34.000]  For more about WPE WebKit, please visit WPE WebKit.com.
[02:34.000 --> 02:42.000]  For more information about EGALIA, or to find out how we can help with your embedded device needs, please visit us at EGALIA.com.  

I am, frankly, astonished.  This has no business being as accurate as it is, for all kinds of reasons.  There’s a lot of jargon and very specific terminology in there, and Whisper nailed pretty much every last bit of it, first time in, no special configuration, nothing.  I didn’t even bump up the model size from the default of small.  I felt a little like that Froyo guy in the animated Hunchback of Notre Dame meme yelling about sorcery or whatever.

True, the output isn’t absolutely perfect.  Let’s review the glitches in reverse order.  The last two errors, turning “Igalia” into “EGALIA”, seems fair enough given I didn’t specify that there would be languages other than English involved.  I routinely have to spell it for my fellow Americans, so no reason to think a codebase could do any better.

The space inserted into “WPEWebKit” (which happens throughout) is similarly understandable.  I’m impressed it understood “WebKit” at all, never mind that it was properly capitalized and not-spaced.

The place where it says Music and I marked it as an error: This is essentially an echoing countdown and then a white-noise roar from rocket engines.  There’s a “music today is just noise” joke in here somewhere, but I’m too hip to find it.

Whisper turning “lone” into “loan” doesn’t particularly faze me, given the difficulty of handling soundalike words.  Hell, just yesterday, I was scribing a conference call and mistakenly recorded “gamut” as “gamma”, and those aren’t even technically homophones.  They just sound like they are.

Rounding out the glitch tour, “Hey” got turned into “They”, which (given the audio quality of that particular part of the video) is still pretty good.

There is one other error I couldn’t mark because there’s nothing to mark, but if you scrutinize the timeline, you’ll see a gap from 02:09.000 and 02:27.000.  In there, a short clip from a movie plays, and there’s a brief dialogue between two characters in not-very-Dutch-accented English there.  It’s definitely louder and more clear than the 00:25.000 –> 00:32.000 bit, so I’m not sure why Whisper just skipped over it.  Manually transcribing that part isn’t a big deal, but it’s odd to see it perform so flawlessly on every other piece of speech and then drop this completely on the floor.

Before posting, I decided to give Whisper another go, this time on a different video:

$ whisper wpe-gamepad-support-v3.mp4

This was the result, with the one actual error marked:

[00:00.000 --> 00:13.760]  In this video, we demonstrate WPE WebKit's support for the W3C's GamePad API.
[00:13.760 --> 00:20.080]  Here we're running WPE WebKit on a Raspberry Pi 4, but any device that will run WPE WebKit
[00:20.080 --> 00:22.960]  can benefit from this support.
[00:22.960 --> 00:28.560]  The GamePad API provides a JavaScript interface that makes it possible for developers to access
[00:28.560 --> 00:35.600]  and respond to signals from GamePads and other game controllers in a simple, consistent way.
[00:35.600 --> 00:40.320]  Having connected a standard Xbox controller, we boot up the Raspberry Pi with a customized
[00:40.320 --> 00:43.040]  build route image.
[00:43.040 --> 00:48.560]  Once the device is booted, we run cog, which is a small, single window launcher made specifically
[00:48.560 --> 00:51.080]  for WPE WebKit.
[00:51.080 --> 00:57.360]  The window cog creates can be full screen, which is what we're doing here.
[00:57.360 --> 01:01.800]  The game is loaded from a website that hosts a version of the classic video arcade game
[01:01.800 --> 01:05.480]  Asteroids.
[01:05.480 --> 01:11.240]  Once the game has loaded, the Xbox controller is used to start the game and control the spaceship.
[01:11.240 --> 01:17.040]  All the GamePad inputs are handled by the JavaScript GamePad API.
[01:17.040 --> 01:22.560]  This GamePad support is now possible thanks to work done by Igalia in 2022 and is available
[01:22.560 --> 01:27.160]  to anyone who uses WPE WebKit on their embedded device.
[01:27.160 --> 01:32.000]  For more about WPE WebKit, please visit wpewebkit.com.
[01:32.000 --> 01:35.840]  For more information about Igalia, or to find out how we can help with your embedded device
[01:35.840 --> 01:39.000]  needs, please visit us at Igalia.com.  

That should have been “buildroot”.  Again, an entirely reasonable error.  I’ve made at least an order of magnitude more typos writing this post than Whisper has in transcribing these videos.  And this time, it got the spelling of Igalia correct.  I didn’t make any changes between the two runs.  It just… figured it out.

I don’t have a lot to say about this other than, wow.  Just WOW.  This is some real Clarke’s Third Law stuff right here, and the technovertigo is Marianas deep.


A Leap of Decades

Published 1 year, 1 month past

I’ve heard it said there are two kinds of tech power users: the ones who constantly update to stay on the bleeding edge, and the ones who update only when absolutely forced to do so.  I’m in the latter camp.  If a program, setup, or piece of hardware works for me, I stick by it like it’s the last raft off a sinking island.

And so it has been for my early 2013 MacBook Pro, which has served me incredibly well across all those years and many continents, but was sliding into the software update chasm: some applications, and for that matter its operating system, could no longer be run on its hardware.  Oh and also, the top row of letter keys was becoming unresponsive, in particular the E-R-T sequence.  Which I kind of need if I’m going to be writing English text, never mind reloading pages and opening new browser tabs.

Stepping Up

An early 2013 MacBook Pro sitting on a desk next to the box of an early 2023 MacBook Pro, the latter illuminated by shafts of sunlight.
The grizzled old veteran on the verge of retirement and the fresh new recruit that just transferred in to replace them.

So on Monday, I dropped by the Apple Store and picked up a custom-built early 2023 MacBook Pro: M2 Max with 38 GPU cores, 64GB RAM, and 2TB SSD.  (Thus quadrupling the active memory and nearly trebling the storage capacity of its predecessor.)  I went with that balance, or perhaps imbalance, because I intend to have this machine last me another ten years, and in that time, RAM is more likely to be in demand than SSD.  If I’m wrong about that, I can always plug in an external SSD.  Many thanks to the many people in my Mastodon herd who nudged me in that direction.

I chose the 14” model over the 16”, so it is a wee bit smaller than my old 15” workhorse.  The thing that surprises me is the new machine looks boxier, somehow.  Probably it’s that the corners of the case are not nearly as rounded as the 2013 model, and I think the thickness ratio of display to body is closer to 1:1 than before.  It isn’t a problem or anything, it’s just a thing that I notice.  I’ll probably forget about it soon enough.

Some things I find mildly-to-moderately annoying:

  • DragThing doesn’t work any more.  It had stopped being updated before the 64-bit revolution, never mind the shift to Apple silicon, so this was expected, but wow do I miss it.  Like a stick-shift driver uselessly stomping the floorboards and blindly grasping air while driving an automatic car, I still flip the mouse pointer toward the right edge of the screen, where I kept my DragThing dock, before remembering it’s gone.  I’ve looked at alternatives, but none of them seem like they’re meant as straight up replacements, so I’ve yet to commit to one.  Maybe some day I’ll ask Daniel to teach me Swift to I can build my own. (Because I definitely need more demands on my time.)
  • The twisty arrows in the Finder to open and close folders don’t have enough visual weight.  Really, the overall UI feels like a movie’s toy representation of an operating system, not an actual operating system.  I mean, the visual presentation of the OS looks like something I would create, and brother, that is not a compliment.
  • The Finder’s menu bar has no visually distinct background.  What the hell.  No, seriously, what the hell?  The Notch I’m actually okay with, but removing the distinction between the active area of the menu bar and the inert rest of the desktop seems… ill-advised.  Do not like.  HARK, A FIX: Cory Birdsong pointed me to “System Settings… > Accessibility > Display > Reduce Transparency”, which fixes this, over on Mastodon.  Thanks, Cory!
  • I’m not used to the system default font(s) yet, which I imagine will come with time, but still catches me here and there.
  • The alert and other systems sounds are different, and I don’t like them.  Sosumi.

Oh, and it’s weird to me that the Apple logo on the back of the display doesn’t glow.  Not annoying, just weird.

Otherwise, I’m happy with it so far.  Great display, great battery life, and the keyboard works!

Getting Migratory

The 2013 MBP was backed up nightly to a 1TB Samsung SSD, so that was how I managed the migration: plugged the SSD into the new MBP and let Migration Assistant do its thing.  This got me 90% of the way there, really.  The remaining 10% is what I’ll talk about in a bit, in case anyone else finds themselves in a similar situation.

The only major hardware hurdle I hit was that my Dell U2713HM monitor, also of mid-2010s vintage, seems to limit HDMI signals to 1920×1080 despite supposedly supporting HDMI 1.4, which caught me by surprise.  When connected to a machine via DisplayPort, even my 2013 MBP, the Dell will go up to 2560×1440.  The new MBP only has one HDMI port and three USB-C ports.  Fortunately, the USB-C ports can be used as DisplayPorts, so I acquired a DisplayPort–to–USB-C cable and that fixed the situation right up.

Yes, I could upgrade to a monitor that supports USB-C directly, but the Dell is a good size for my work environment, it still looks pretty good, and did I mention I’m the cling-tightly-to-what-works kind of user?

Otherwise, in the hardware space, I’ll have to figure out how I want to manage connecting all the USB-A devices I have (podcasting microphone, wireless headset, desktop speaker, secondary HD camera, etc., etc.) to the USB-C ports.  I expected that to be the case, just as I expected some applications would no longer work.  I expect an adapter cable or two will be necessary, at least for a while.

Trouble Brewing

I said earlier that Migration Assistant got me 90% of the way to being switched over.  Were I someone who doesn’t install stuff via the Terminal, I suspect it would have been 100% successful, but I’m not, so it wasn’t.  As with the cables, I anticipated this would happen.  What I didn’t expect was that covering that last 10% would take me only an hour or so of actual work, most of it spent waiting on downloads and installs.

First, the serious and quite unexpected problem: my version of Homebrew used an old installation prefix, one that could break newer packages.  So, I needed to migrate Homebrew itself from /usr/local to /opt/homebrew.  Some searching around indicated that the best way to do this was uninstall Homebrew entirely, then install it fresh.

Okay, except that would also remove everything I’d installed with Homebrew.  Which was maybe not as much as some of y’all, but it was still a fair number of fairly essential packages.  When I ran brew list, I got over a hundred packages, of which most were dependencies.  What I found through further searching was that brew leaves returns a list of the packages I’d installed, without their dependencies.  Here’s what I got:

automake
bash
bison
chruby
ckan
cmake
composer
ffmpeg
gh
git
git-lfs
httpd
imagemagick
libksba
lynx
minetest
minimal-racket
pandoc
php
php@7.2
python@3.10
ruby
ruby-install
wget
yarn

That felt a lot more manageable.  After a bit more research, boiled down to its essentials, the New Brew Shuffle I came up with was:


$ brew leaves > brewlist.txt

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/uninstall.sh)"

$ xcode-select --install

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

$ xargs brew install < brewlist.txt

The above does elide a few things.  In step two, the Homebrew uninstall script identified a bunch of directories that it couldn’t remove, and would have to be deleted manually.  I saved all that to a text file (thanks to Warp’s “Copy output” feature) for later study, and pressed onward.  I probably also had to sudo some of those steps; I no longer remember.

In addition to all the above, I elected to delete a few of the packages in brewlist.txt before I fed it back to brew install in the last step — things like ckan, left over from my Kerbal Space Program days  —  and to remove the version dependencies for PHP and Python.  Overall, the process was pretty smooth.  I just had to sit there and watch Homrebrew chew through all the installs, including all the dependencies.

Cleanup

Once all the reinstalls from the last step had finished, I was left with a few things to clean up.  For example, Python didn’t seem to have installed.  Eventually I realized it had actually installed as python3 instead of just plain python, so that was mostly fine and I’m sure there’s a way to alias one to the other that I might get around to looking up one day.

Ruby also didn’t seem to reinstall cleanly: there was a library it was looking for that complained about the chip architecture, and attempts to overcome that spawned even more errors, and none of them were decipherable to me or my searches.  Multiple attempts at uninstalling and then reinstalling Ruby through a variety of means, some with Homebrew, some other ways, either got me the same indecipherable erros or a whole new set of indecipherable errors.  In the end, I just uninstalled Ruby, as I don’t actually use it for anything I’m aware of, and the default Ruby that comes with macOS is still there.  If I run into some script I need for work that requires something more, I’ll revisit this, probably with many muttered imprecations.

Finally, httpd wasn’t working as intended.  I could launch it with brew services httpd start, but the resulting server was pointing to a page that just said “It works!”, and not bringing up any of my local hosts.  Eventually, I found where Homebrew had stuffed httpd and its various files, and then replaced its configuration files with my old configuration files.  Then I went through the cycle of typing sudo apachectl start, addressing the errors it threw over directories or PHP installs or whatever by editing httpd.conf, and then trying again.

After only three or four rounds of that, everything was up and running as intended  —  and as a bonus, I was able to mark httpd as a Login item in the Finder’s System Settings, so it will automatically come back up whenever I reboot!  Which my old machine wouldn’t do, for some reason I never got around to figuring out.

Now I just need to decide what to call this thing.  The old MBP was “CoCo”, as in the TRS-80 Color Computer, meant as a wry commentary on the feel of the keyboard and a callback to the first home computer I ever used.  That joke still works, but I’m thinking the new machine will be “C64” in honor of the first actually powerful home computer I ever used and its 64 kilobytes of RAM.  There’s a pleasing echo between that and the 64 gigabytes of RAM I now have at my literal fingertips, four decades later.

Now that I’m up to date on hardware and operating system, I’d be interested to hear what y’all recommend for good quality-of-life improvement applications or configuration changes.  Link me up!


Playlist Wishes

Published 1 year, 2 months past

Last week I published a CSS Wish List, which reminded me I have another wish list rattling around in my head, and I’d like to throw out into the world.  This one is for an MP3 player in the vein of WinAmp or Apple Music, but with slightly more DJ-like capabilities in its playlists.

To wit:

  • The ability to set the playback volume for a track in a playlist.  Right now, if I adjust the playback volume of a track to match the others on a playlist, I’m changing its volume in every context.  Let’s say I bump up the volume setting on Yes’s “Leave It” so it matches Army of Anyone’s “Leave It”.  Now, if I go back to play 90215, which is the Yes album containing their “Leave It”, that one track will be way louder than the rest of the album.  This is honestly ridiculous, and it ought to be easy to fix.  Just let me say “on this playlist, play this track at this volume”.  And no, I don’t want some automagic “volume stabilizer” to try to do it for me, because they’re never quite right.  They’re the autocorrect of autotuning.
  • The ability to define, to the millisecond, how much two successive tracks in a playlist overlap.  This is useful when trying to mix two songs so the beat of one plays off against the beat of the other, so you get a smoothly bumpin’ segue.  Ideally, you would do this by click/tap-dragging waveform visualizations of the two tracks, in order to be able to match up (or offset) the beat-peaks, and quickly play just the overlap (maybe with a few extra seconds to either side).
  • The ability to trim tracks in a playlist.  This is closely related to the previous wish, but here, you could trim off the silent bit at the beginning or end of a track when it’s played in the playlist you’re building.  When it’s in the context of its source album, the silences are left intact.  This could also be used to trim off actual sound, in case you wanted to start a verse in or something, but the primary use case would be saying “for this track in this playlist, skip the half-second of silence”.  Again, being able to work with waveform visualizers would help here.
  • The ability to define the duration of fade-ins and fade-outs for each track in a playlist.  Maybe, paired with the previous wish for overlap duration, I want to be able to fade one song out and the next song in during the overlap period.  Or maybe for only part of the overlap, or for longer than the period of overlap, or even when there is no overlap.  I know you can set crossfade durations in apps like Apple’s Music, but they’re global and that’s not okay.  Sometimes you want to say “no crossfades on this playlist except for these two segues, one of which is 5 seconds long and the other of which is 2.7 seconds”.
  • The ability to play back these playlists on my mobile device (iPhone, Android phone, whatever).  On desktop too, sure, but the primary use case here is on mobile: a carefully-made woodshop playlist, or a workout playlist, or whatever.
  • If I’m really dreaming, I’d also like to have this supported by all music apps, so I could port playlists from one app to another.  Having all this supported in some kind of streaming service, so I could allow other people to listen to my playlists with relatively ease, would be even more incredible.

To reiterate: those would all be things applied on a per-playlist basis.  The tracks would sound within their source albums as they were produced to sound, but could be audibly modified for playlist purposes in a non-destructive way.

I’ve occasionally done stuff like this in Audacity, which works okay, but it does mean I have to import all the tracks, decide where to set the track break during overlaps, and then render them into new tracks, essentially creating copies of the songs with new waveforms and maybe parts of other songs mixed in at the beginning or end.

That all seems wasteful when this stuff could probably be expressed using simple strings of text within a playlist file.  I mean, the UI for the playlist editing could be all graphical and cool and Audacity-like (or better), but the actual data could be represented in just a few bytes of text.  And maybe this is all possible in an application like Logic Pro — I wouldn’t know, having never used it — but it shouldn’t require a $300 piece of software to make these things possible.

Maybe this is something that would only appeal to old farts like me, them what made mixtapes when they were created using a dual-deck player using actual magnetic tapes.  But I bet some of the young’uns would find it fun and interesting as well.


CSS Wish List 2023

Published 1 year, 2 months past

Dave asked people to share their CSS wish lists for 2023, and even though it’s well into the second month of the year, I’m going to sprint wild-eyed out of the brush along the road, grab the hitch on the back of the departing bandwagon, and try to claw my way aboard. At first I thought I had maybe four or five things to put on my list, but as I worked on it, I kept thinking of one more thing and one more thing until eventually I had a list of (checks notes) sixt — no, SEVENTEEN?!?!?  What the hell.

There’s going to be some overlap with the things being worked on for Interop 2023, and I’m sure there will be overlap with other peoples’ lists.  Regardless, here we go.


Subgrid

Back in the day, I asserted Grid should wait for subgrid.  I was probably wrong about that, but I wasn’t wrong about the usefulness of and need for subgrid.  Nearly every time I implement a design, I trip over the lack of widespread support.

I have a blog post in my head about how I hacked around this problem for wpewebkit.org by applying the same grid column template to nested containers, and how I could make it marginally more efficient with variables.  I keep not writing it, because it would show the approach and improvement and then mostly be about the limitations, flaws, and annoyances this approach embodies.  The whole idea just depresses me, and I would probably become impolitic.

So instead I’ll just say that I hope those browser engines that have yet to catch up with subgrid support will do so in 2023.

Masonry layout

Grid layout is great, and you can use it to create a masonry-style layout, but having a real masonry layout mechanism would be better, particularly if you can set it on a per-axis basis.  What I mean is, you could define a bunch of fixed (or flexible) columns and then say the rows use masonry layout.  It’s not something I’m likely to use myself, but I always think the more layout possibilities there are, the better.

Grid track styles

For someone who doesn’t do a ton of layout, I find myself wanting to style grid lines a surprising amount.  I just want to do something like:

display: grid;
gap: 1em;
grid-rules-columns: 1px dotted red;

…although it would be much better to be able to style separators for a specific grid track, or even an individual grid cell, as well as be able to apply it to an entire grid all at once.

No, I don’t know exactly how this should work.  I’m an idea guy!  But that’s what I want.  I want to be able to define what separator lines look like between grid tracks, centered on the grid lines.

Anchored positioning

I’ve wanted this in one form or another almost since CSS2 was published.  The general idea is, you can position an element in relation to the edges of another element that isn’t a containing block.  I wrote about this a bit in my post on connector lines for wpewebkit.org, but another place it would have come in handy was with the footnotes on The Effects of Nuclear Weapons.

See, I wanted those to actually be sidenotes, Tufteee-styleee.  Right now, in order to make sidenotes, you have to stick the footnote into the text, right where its footnote reference appears  —  or, at a minimum, right after the element containing the footnote reference.  Neither was acceptable to me, because it would dork up the source text.

What I wanted to be able to do was collect all the footnotes as endnotes at the end of the markup (which we did) and then absolutely position each to sit next to the element that referenced them, or have it pop up there on click, tap, hover, whatever.  Anchored positioning would make that not just possible, but fairly easy to do.

Exclusions

Following on anchored positioning, I’d love to have CSS Exclusions finally come to browsers.  Exclusions are a way to mark an element to have other content avoid it.  You know how floats move out of the normal flow, but normal-flow text avoids overlapping them?  That’s an exclusion.  Now imagine being able to position an element by other means, whether grid layout or absolute positioning or whatever, and then say “have the content of other elements flow around it”.  Exclusions!  See this article by Rob Weychert for a more in-depth explanation of a common use case.

Element transitions

The web is cool and all, but you know how futuristic interfaces in movies have pieces of the interface sliding and zooming and popping out and all that stuff?  Element transitions.  You can already try them out in Chrome Canary, Batman, and I’d love to see them across the board.  Even more, I’d love some gentle, easy-to-follow tutorials on how to make them work, because even in their single-page form, I found the coding requirements basically impossible to work out.  Make them all-CSS, and explain them like I’m a newb, and I’m in.

Nested Selectors

A lot of people I know are still hanging on to preprocessors solely because they permit nested selectors, like:

main {
	padding: 1em;
	background: #F1F1F0;
	
	h2 {
		border-block-end: 1px solid gray;
	}
	p {
		text-indent: 2em;
	}
}

The CSS Working Group has been wrestling with this for quite some time now, because it turns out the CSS parsing rules make it hard to just add this, there are a lot of questions about how this should interact with pseudo-classes like :is(), there are serious concerns about doing this in a way that will be maximally future-compatible, and also there has been a whole lot of argument over whether it’s okay to clash with Sass syntax or not.

So it’s a difficult thing to make happen in native CSS, and the debates are both wide-ranging and slow, but it’s on my (and probably nearly everyone else’s) wish list.  You can try it out in Safari Technology Preview as I write this, so here’s hoping for accelerating adoption!

More and better :has()

Okay, since I’m talking about selectors already, I’ll throw in universal, full-featured, more optimized support for :has().  One browser doesn’t support compound selectors, for example.  I’ve also thought that maybe some combinators would be nice, like making a:has(> b) can be made equal to a < b.

But I also wish for people to basically go crazy with :has().  There’s SO MUCH THERE.  There are so many things we can do with it, and I don’t think we’ve touched even a tiny fraction of the possibility space.

More attr()

I’ve wanted attr() to be more widely accepted in CSS values since, well, I can’t remember.  A long time.  I want to be able to do something like:

p[data-size] {width: attr(data-width, rem);}

<p data-size="27">…</p>

Okay, not a great example, but it conveys the idea.  I also talked about this in my post about aligning table columns. I realize adding this would probably lead to someone creating a framework called Headgust where all the styling is jammed into a million data-*attributes and the whole of the framework’s CSS is nothing but property: attr() declarations for every single CSS property known to man, but we shouldn’t let that stop us.

Variables in media queries

Basically I want to be able to do this:

:root {--mobile: 35em;}

@media (min-width: var(--mobile)) {
	/* non-mobile styles go here */
}

That’s it.  This was made possible in container queries, I believe, so maybe it can spread to media (and feature?) queries.  I sure hope so!

Logical modifiers

You can do this:

p {margin-block: 1em; margin-inline: 1rem;}

But you can’t do this:

p {margin: logical 1em 1rem;}

I want to be able to do that.  We should all want to be able to do that, however it’s made possible.

Additive values

You know how you can set a bunch of values with a comma-separated list, but if you want to override just one of them, you have to do the whole thing over?  I want to be able to add another thing to the list without having to do the whole thing over.  So rather than adding a value like this:

background-clip: content, content, border, padding; /* needed to add padding */

…I want to be able to do something like:

background-clip: add(padding);

No, I don’t know how to figure out where in the list it should be added.  I don’t know a lot of things.  I just know I want to be able to do this.  And also to remove values from a list in a similar way, since I’m pony-wishing.

Color shading and blending

Preprocessors already allow you to say you want the color of an element to be 30% lighter than it would otherwise be.  Or darker.  Or blend two colors together.  Native CSS should have the same power.  It’s being worked on.  Let’s get it done, browsers.

Hanging punctuation

Safari has supported hanging-punctuation forever (where “forever”, in this case, means since 2016) and it’s long past time for other browsers to get with the program.  This should be universally supported.

Cross-boundary styles

I want to be able to apply styles from my external (or even embedded) CSS to a resource like an external SVG.  I realize this sets up all kinds of security and privacy concerns.  I still want to be able to do it.  Every time I have to embed an entire inline SVG into a template just so I can change the fill color of a logo based on its page context, I grit my teeth just that little bit harder.  It tasks me.

Scoped styling (including imports)

The Mirror Universe version of the previous wish is that I want to be able to say a bit of CSS, or an entire style sheet (embedded or external), only applies to a certain DOM node and all its descendants. “But you can do that with descendant selectors!” Not always.  For that matter, I’d love to be able to just say:

<div style="@import(styles.css);">

…and have that apply to that <div> and its descendants, as if it were an <iframe>, while not being an <iframe> so styles from the document could also apply to it.  Crazy?  Don’t care.  Still want it.

Linked flow regions(?)

SPECIAL BONUS TENTATIVE WISH: I didn’t particularly like how CSS Regions were structured, but I really liked the general idea.  It would be really great to be able to link elements together, and allow the content to flow between them in a “smooth” manner.  Even to allow the content from the second region to flow back into the first, if there’s room for it and nothing prevents it.  I admit, this is really a “try to recreate Aldus PageMaker in CSS” thing for me, but the idea still appeals to me, and I’d love to see it come to CSS some day.


So there you go.  I’d love to hear what you’d like to see added to CSS, either in the comments below or in posts of your own.


Minimal Dark Mode Styling

Published 1 year, 3 months past

Spurred by a toot in reply to a goofy dev joke I made, I’ve set up a dirty-rough Dark Mode handler for meyerweb:

@media (prefers-color-scheme: dark) {
	html body {filter: invert(1);}
	/* the following really should be managed by a cascade layer */
	html img, 
	html img.book.cover, 
	html img.book.cover.big, 
	html #archipelago a:hover img {filter: invert(1);}
	html #thoughts figure.standalone img {
	  box-shadow: 0.25em 0.25em 0.67em #FFF8;
	}
}

It’s the work of about five minutes’ thought and typing, so I suspect it’s teetering on the edge of Minimum Viable Product, but I’m not sure which side of that line it lands on.

Other than restructuring things so that I can use Cascade Layers to handle the Light and Dark Mode styling, what am I missing or overlooking?  <video> elements, I suppose, but anything else jump out at you as in need of correction?  Let me know!

(P.S. “Use only classes, never IDs” is not something that needs to be corrected.  Yes, I know why people think that, and this situation seems to be an argument in favor of that view, but I have a different view and will not be converting IDs to classes.  Thanks in advance for your forbearance.)


Styling a ‘pre’ That Contains a ‘code’

Published 1 year, 3 months past

I’ve just committed my first :has() selector to production CSS and want to share it, so I will!  But first, a little context that will feel familiar to nerds like me who post snippets of computer code to the Web, and have markup with this kind of tree structure (not raw source; we’d never allow this much formatting whitespace inside a <pre>):

<pre>
	<code>
		{{content goes here}}
	</code>
</pre>

It’s nicely semantic, indicating that the contents of the <pre> are in fact code of some kind, as opposed to just plain preformatted text like, say, output from a shell script or npm job, which isn’t code and thus should, perhaps, be styled in a distinct way.

Given cases like that, you’ve probably written rules that go a little something like this:

pre > code {
	display: block;
	background: #f1f1f1;
	padding: 1.33em;
	border-radius: 0.33em;
}

Which says: if a <code> element is the child of a <pre> element, then turn the <code> into a block box and give it some background color, padding, etc.

It works out fine, but it always feels a little fragile.  What if there are already <pre> styles that throw off the intended effect on code blocks?  Do we get into specificity wars between those rules and the code-block rules?  Find other ways to figure out what should be adjusted in which cases?  Those are probably manageable problems, but it would be better not to have them.

It’s also, when you step back for a moment, a little weird.  The <pre> is already a block box and the container for the code; why aren’t we styling that?  Because unless you wrote some scripting, whether server-side or client-side, to add a class to the <pre> in scenarios like this, there wasn’t a way to address it directly based on its structural contents.

There is now:

pre:has(> code) {
	background: #f1f1f1;
	padding: 1.33em;
	border-radius: 0.33em;
}

Now I’m styling any <pre> that has a <code> as a child, which is why I took out the display: block.  I don’t need it any more!

But suppose you have a framework or ancient script or something that inserts classed <span> elements between the <pre> and the <code>, like this:

<pre>
	<span class="pref">
		<code>
			{{content goes here}}
		</code>
	</span>
</pre>

First of all, ew, address the root problem here if at all possible.  But if that isn’t possible for whatever reason, you can still style the <pre> based on the presence of a <code> by removing the child combinator from the selector.  In other words:

pre:has(code) {
	background: #f1f1f1;
	padding: 1.33em;
	border-radius: 0.33em;
}

Now I’m styling any <pre> that has a <code> as a descendant  —  child, grandchild, great-great-great-great grandchild, whatever.

Which is not only more robust, it’s a lot more future-proof: even if some hot new front-end framework that sticks in <span> elements or something gets added to the site next year, this style will just keep chugging along, styling <pre> elements that contain <code> elements until long after that hot new framework has cooled to ash and been chucked into the bit-bucket.

There is one thing to keep in mind here, as pointed out by Emmanuel over on Mastodon: if you have a scenario where <pre> elements can contain child text nodes in addition to <code> blocks, the <pre> will still be styled in its entirely.  Consider:

<pre>
	{{some text is here}}
	<code>
		{{content goes here}}
	</code>
	{{or text is here}}
</pre>

pre:has(> code) and pre:has(code) will still match the <pre> element here, which means all of the text (both inside and outside the <code> elements)  will sit inside the light-gray box with the rounded corners.  If that’s fine for your use case, great!  If not, then don’t use :has() in this scenario, and stick with the pre > code {…} or pre code {…} approach of yore.  That will style just the <code> elements instead of the whole <pre>, as in the example at the beginning of this article.

As I write this, the code hasn’t gone into production on wpewebkit.org yet, but I think it will within the next week or two, and will be my first wide-production use of :has().  I feel like it’s a great way to close out 2022 and kick off 2023, because I am that kind of nerd.  If you are too, I hope you enjoyed this quick dive into the world of :has().


Highlighting Image Accessibility on Mastodon

Published 1 year, 4 months past

Some time ago, I published user styles to visually highlight images on Twitter that didn’t have alt text  —  this was in the time before the “ALT” badge and the much improved accessibility features Twitter deployed in the pre-Musk era.

With the mass Mastodon migration currently underway in the circles I frequent, I spend more time there, and I missed the quick visual indication of images having alt text, as well as my de-emphasis styles for those images that don’t have useful alt text.  So I put the two together and wrote a new user stylesheet, which I apply via the Stylus browser extension.  If you’d like to also use it, please do so!

/* ==UserStyle==
@name           mastodon.social - 12/14/2022, 9:37:56 AM
@namespace      github.com/openstyles/stylus
@version        1.0.0
@description    Styles images posted on mastodon.social based on whether or not they have alt text.
@author         @Meyerweb@mastodon.social, https://meyerweb.com/
==/UserStyle== */

@-moz-document regexp(".*\.social.*") {
	.media-gallery__item-thumbnail img:not([alt]) {
		filter: grayscale(1) contrast(0.5);
	}
	.media-gallery__item-thumbnail img:not([alt]):hover {
		filter: none;
	}
	.media-gallery__item-thumbnail:has(img[alt]) {
		position: relative;
	}
	.media-gallery__item-thumbnail:has(img[alt])::after {
		content: "ALT";
		position: absolute;
		bottom: 0;
		right: 0;
		border-radius: 5px 0 0 0;
		border: 1px solid #FFF3;
		color: #FFFD;
		background: #000A;
		border-width: 1px 0 0 1px;
		padding-block: 0.5em 0.4em;
		padding-inline: 0.7em 0.8em;
		font: bold 90% sans-serif;
	}
}

Because most of my (admittedly limited and sporadic) Mastodon time is spent on mastodon.social, the styles I wrote are attuned to mastodon.social’s markup.  I set things up so these styles should be applied to any *.social site, but only those who use the same markup mastodon.social uses will get the benefits.  pinafore.social, for example, has different markup (I think they’re using Svelte).

You can always adapt the selectors to fit the markup of whatever Mastodon instance you use, if you’re so inclined.  Please feel free to share your changes in the comments, or in posts of your own.  And with any luck, this will be a temporary solution before Mastodon adds these sorts of things natively, just as Twitter eventually did.


Addendum: It was rightly pointed out to me that Firefox does not, as of this writing, support :has() by default.  If you want to use this in Firefox, as I do, set the layout.css.has-selector.enabled flag in about:config to true.


How to Verify Site Ownership on Mastodon Profiles

Published 1 year, 4 months past

Like many of you, I’ve been checking out Mastodon and finding more and more things I like.  Including the use of XFN (XHTML Friends Network) semantics to verify ownership of sites you link from your profile’s metadata!  What that means is, you can add up to four links in your profile, and if you have an XFN-compliant link on that URL pointing to your Mastodon profile, it will show up as verified as actually being your site.

Okay, that probably also comes off a little confusing.  Let me walk through the process.

First, go to your home Mastodon server and edit your profile.  On servers like mastodon.social, there should be an “Edit profile” link under your user avatar.

Here’s what it looks like for me.  Yes, I prefer Light Mode.  No, I don’t want to have a debate about it.

I saw the same thing on another Mastodon server where I have an account, so it seems to be common to Mastodon in general.  I can’t know what every Mastodon server does, though, so you might have to root around to find how you edit your profile.  (Similarly, I can’t be sure that everything will be exactly as I depict it below, but hopefully it will be at least reasonably close.)

Under “Appearance” in the profile editing screen, which I believe is the default profile editing page, there should be a section called “Profile metadata”.  You’ll probably have to scroll a bit to reach it.  You can add up to four labels with content, and a very common label is “Web” or “Homepage” with the URL of your personal server.  They don’t all have to be links to sites; you could add your favorite color or relationship preference(s) or whatever

I filled a couple of these in for demonstration purposes, but now they’re officially part of my profile.  No, I don’t want to debate indentation preferences, either.

But look, over there next to the table, there’s a “Verification” section, with a little bit of explanation and a field containing some markup you can copy, but it’s cut off before the end of the markup.  Here’s what mine looks like in full:

<a rel="me" href="https://mastodon.social/@Meyerweb">Mastodon</a>

If I take this markup and add it to any URL I list in my metadata, then that entry in my metadata table will get special treatment, because it will mean I’ve verified it.

The important part is the rel="me", which establishes a shared identity.  Here’s how it’s (partially) described by XFN 1.1:

A link to yourself at a different URL. Exclusive of all other XFN values. Required symmetric.

I admit, that’s written in terse spec-speak, so let’s see how this works out in practice.

First, let’s look at the markup in my Mastodon profile’s page.  Any link to another site in the table of profile metadata has a me value in the rel attribute, like so:

<a href="https://meyerweb.com/" rel="nofollow noopener noreferrer me">

That means I’ve claimed via Mastodon that meyerweb.com is me at another URL.

But that’s not enough, because I could point at the home page of, say, Wikipedia as if it were mine.  That’s why XFN requires the relationship to be symmetric, which is to say, there needs to be a rel="me" annotated link on each end.  (On both ends.  However you want to say that.)

So on the page being pointed to, which in my case is https://meyerweb.com/, I need to include a link back to my Mastodon profile page, and that link also has to have rel="me".  That’s the markup Mastodon provided for me to copy, which we saw before and I’ll repeat here:

<a rel="me" href="https://mastodon.social/@Meyerweb">Mastodon</a>

Again, the important part is that the href points to my Mastodon profile page, and there’s a rel attribute containing me.  It can contain other things, like noreferrer, but needs to have me for the verfiication to work.  Note that the content of the link element doesn’t have to be the text “Mastodon”.  In my case, I’m using a Mastodon logo, with the markup looking like this:

<a rel="me" href="https://mastodon.social/@Meyerweb">
 	<img src="/pix/icons/mastodon.svg" alt="Mastodon">
</a>

With that in place, there’s a “me” link pointing to a page that contains a “me” link.  That’s a symmetric relationship, as XFN requires, and it verifies that the two pages have a shared owner.  Who is me!

Thus, if you go to my Mastodon profile page, in the table of my profile metadata, the entry for my homepage is specially styled to indicate it’s been verified as actually belonging to me.

The table as it appears on my profile page for me.  Your colors may vary.

And that’s how it works.

Next question: how can I verify my GitHub page?  At the moment, I’d have to put my Mastodon profile page’s URL into the one open field for URLs in GitHub profiles, because GitHub also does the rel="me" thing for its profile links.  But if I do that, I’d have to remove the link to my homepage, which I don’t want to do.

Until GitHub either provides a dedicated Mastodon profile field the way it provides a dedicated Twitter profile field, or else allows people to add multiple URLs to their profiles the way Mastodon does, I won’t be able to verify my GitHub page on Mastodon.  Not a huge deal for me, personally, but in general it would be nice to see GitHub become more flexible in this area.  Very smart people are also asking for this, so hopefully that will happen soon(ish).


Browse the Archive

Earlier Entries

Later Entries