A tale of two selectors in HTML and CSS
In the course of normal web development we're used to encountering weird quirks of various browsers, but this one left us flabbergasted. The root of the problem was calling document.querySelector on HTML elements whose ids had leading digits. Solving it required a simple regex:
var safe_id = id.replace(/^(\d)/, "\\3$1 ");
Join us in the Way-back Machine, as we travel back to recount the circumstances that led us to discover this particular quirk. In order to see why this solution works, there are multiple pieces of web infrastructure to unpack.
A mysterious error
We were working on a video site, and every now and then the player would crash (spectacularly) mid-playback. The sole piece of information we had to go on was this incredibly helpful error:
SyntaxError: DOM Exception 12
We wondered, bemused, what happened to the other 11 exceptions. Over time, we noticed that this happened when we were switching video tags (i.e. selecting elements) whose ids started with a number (e.g. <div id="5AF5634">why?</div>). We were programmatically generating ids by randomly selecting 8 alphanumeric characters. The randomness explains why we saw it fail intermittently, rather than every time. We were left with even more questions.
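To get a feel for how often random ids trip the bug, here is a rough sketch of such a generator. The original backend code is not shown, so randomId, startsWithDigit, and leadingDigitRate are hypothetical names, and the alphabet (0-9 plus A-Z, picked uniformly) is an assumption. Under that assumption, 10 of the 36 possible first characters are digits, so a bit more than a quarter of the generated ids would start with one.

```javascript
// Hypothetical id generator; the real backend code is not shown in the article.
// Assumes a uniform pick from 0-9 and A-Z (36 characters).
var ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

function randomId(length) {
  var id = "";
  for (var i = 0; i < length; i++) {
    id += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
  }
  return id;
}

// True when the id begins with a digit, which is the case that broke playback.
function startsWithDigit(id) {
  return /^\d/.test(id);
}

// Estimate the share of generated 8-character ids that start with a digit.
function leadingDigitRate(samples) {
  var hits = 0;
  for (var i = 0; i < samples; i++) {
    if (startsWithDigit(randomId(8))) {
      hits++;
    }
  }
  return hits / samples;
}
```

Calling leadingDigitRate(20000) should land close to 10/36 under these assumptions, which fits a failure that shows up often but not every time.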
We need to go deeper
document.getElementById("5AF5634") worked just fine. And the HTML 5 specification says that leading numeric characters in ids are copacetic.
Digging some more, we found that document.querySelector parses its argument using CSS selector syntax, which does not allow an identifier to start with an unescaped digit. So, the video playback would just stop, breaking the application every time it selected an improperly generated id.
It looked like we would need to replace the id generator, which was enmeshed in the rest of the backend code and would have taken several hours' work to change; we couldn't simply place a constraint on the id without serious surgery on the rest of the application.
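The mismatch between the two lookups can be reproduced with a small sketch. compareLookups is a hypothetical helper, and its doc parameter stands in for the browser's document so that the function can also be exercised against a stub outside a page.

```javascript
// Compare the two lookups for a single id.
// `doc` is expected to behave like the browser's `document`.
function compareLookups(doc, id) {
  var result = {
    byId: doc.getElementById(id) !== null, // HTML id rules: a leading digit is fine
    bySelector: false,
    error: null
  };
  try {
    // CSS selector rules: "#5AF5634" is not a valid selector, so browsers throw here.
    result.bySelector = doc.querySelector("#" + id) !== null;
  } catch (e) {
    result.error = e.name; // browsers report "SyntaxError" for an invalid selector
  }
  return result;
}
```

In a page containing <div id="5AF5634">, compareLookups(document, "5AF5634") would be expected to find the element by id while recording a SyntaxError for the selector lookup.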
A wild solution appears
However, there is a sneaky way to get around document.querySelector's limitation: translating a leading digit to its Unicode code point. A digit's Unicode code point is U+003[digit]. In a CSS selector you can write it as an escaped hexadecimal number, "\3[digit] ". The trailing space is not technically required, but a CSS escape absorbs up to six hexadecimal digits and then swallows one whitespace character that follows, so any hexadecimal digit or space directly after the escape gets consumed along with it. For example, "1\323" is not "123" (the escape is read as the single code point U+0323), but "1\32soda" works just fine because "s" is not a hexadecimal digit.
The regex call id.replace(/^(\d)/, "\\3$1 ") takes the leading digit of our id and rewrites it as a hexadecimal escape of its Unicode code point, with a trailing space in case the following character is a space or hexadecimal digit. So you only need to understand Unicode code points, hexadecimal encoding, and the facts that document.getElementById uses HTML 5 id rules and document.querySelector uses CSS id selector rules. Simple, right?
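To see the escape at work end to end, here is a minimal sketch that wraps the regex in a helper and pairs it with a simplified decoder for CSS escapes. The names safeId, decodeCssEscapes, and selectById are ours for illustration and not part of the original code base, and the decoder only approximates the escape rules of the CSS syntax (hexadecimal and single-character escapes).

```javascript
// Escape a leading digit so the id can be used in a CSS id selector.
// Same regex as above, wrapped in a (hypothetically named) helper.
function safeId(id) {
  return id.replace(/^(\d)/, "\\3$1 ");
}

// Simplified decoder for CSS escapes, for illustration only:
//  - a backslash, 1 to 6 hex digits, then one optional whitespace character
//  - a backslash followed by any other single character (except a newline)
function decodeCssEscapes(text) {
  var pattern = /\\([0-9a-fA-F]{1,6})[ \t\n\r\f]?|\\([^\n])/gu;
  return text.replace(pattern, function (match, hex, literal) {
    if (hex === undefined) {
      return literal;
    }
    var codePoint = parseInt(hex, 16);
    var isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff;
    if (codePoint === 0 || codePoint > 0x10ffff || isSurrogate) {
      return "\uFFFD"; // CSS maps such escapes to the replacement character
    }
    return String.fromCodePoint(codePoint);
  });
}

// Hypothetical lookup helper: `root` is anything with a querySelector method,
// such as the browser's `document`.
function selectById(root, id) {
  return root.querySelector("#" + safeId(id));
}

var escaped = safeId("5AF5634");           // the CSS text \35 AF5634
var roundTrip = decodeCssEscapes(escaped); // decodes back to 5AF5634
// Without the space, the escape swallows the hex digits that follow it.
var broken = decodeCssEscapes("\\35AF5634");
// The two examples from the text: \323 is one escape, \32 stops at the "s".
var combining = decodeCssEscapes("1\\323");
var soda = decodeCssEscapes("1\\32soda");
```

The id from earlier, 5AF5634, shows why the trailing space matters: every character after the leading 5 is a hexadecimal digit, so without the space the escape would absorb them. In a browser, selectById(document, "5AF5634") would then locate the element. Browsers that implement CSS.escape offer a general-purpose version of the same escaping.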
We went with this approach for the reasons outlined above. Confining our changes to the CSS selector layer of our application meant that ids were unchanged for the rest of the application. We could bolt this safe_id on, rather than having to perform major surgery throughout the code base. Our solution was purely additive, as opposed to altering existing functionality. This choice saved time and money.