[whatwg] Editorial: ASCII case-insensitive string comparison

When I read Anne van Kesteren's Encoding specification recently, I came across the following definition, borrowed from HTML5:

> Comparing two strings in an ASCII case-insensitive manner means comparing them exactly, code point for code point, except that the characters in the range U+0041 to U+005A (i.e. LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z) and the corresponding characters in the range U+0061 to U+007A (i.e. LATIN SMALL LETTER A to LATIN SMALL LETTER Z) are considered to also match.


The construction “are considered to also match” seems awkward here, since the intended meaning is clearly not that the characters match in addition to doing something else, as in “I don’t just want you to laugh but to also sing along” or “our face/tongue system allow[s] us to talk and eat – but also to sing and act”.

The most natural place for “also” is probably in front of “considered” (yielding “are also considered to match”).

(Another solution would be to remove the need for “also” by rewriting the phrase, for instance to something like “except that the characters in the range U+0041 to U+005A ([...] A to [...] Z) are considered equivalent to the corresponding characters in the range U+0061 to U+007A ([... a] to [... z])”.)
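
For what it is worth, the "considered equivalent" reading can be sketched in a few lines of Python. This is only an illustration of the wording, not anything taken from the Encoding specification or HTML5; the function names `ascii_lower` and `ascii_case_insensitive_equal` are made up for the example, and folding towards lowercase is an arbitrary choice.

```python
def ascii_lower(s: str) -> str:
    """Map U+0041..U+005A (A-Z) to U+0061..U+007A (a-z); leave every other code point untouched."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare code point for code point, treating A-Z as equivalent to a-z (hypothetical helper)."""
    return ascii_lower(a) == ascii_lower(b)


print(ascii_case_insensitive_equal("UTF-8", "utf-8"))  # True
print(ascii_case_insensitive_equal("\u212a", "k"))     # False: KELVIN SIGN is outside the ASCII range
```

Unlike a general Unicode case fold, nothing outside the two ASCII ranges is ever treated as equivalent, which is exactly what the rewritten phrase says.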

Øistein E. Andersen

Received on Saturday, 12 May 2012 12:47:44 UTC