Archive for October 24, 2007

Wednesday, October 24, 2007

A Localization Horror Story

The documentation for Perl’s Locale::Maketext module mentions an interesting issue:

First off, your code for “I scanned %g directory.” or “I scanned %g directories.” assumes there’s only singular or plural. But, to use linguistic jargon again, Arabic has grammatical number, like English (but unlike Chinese), but it’s a three-term category: singular, dual, and plural. In other words, the way you say “directory” depends on whether there’s one directory, or two of them, or more than two of them. Your test of ($directory == 1) no longer does the job. And it means that where English’s grammatical category of number necessitates only the two permutations of the first sentence based on “directory [singular]” and “directories [plural],” Arabic has three—and, worse, in the second sentence (“Your query matched %g file in %g directory.”), where English has four, Arabic has nine. You sense an unwelcome, exponential trend taking shape.

I’ve run into simpler versions of this problem several times, in different programming languages, and never found a satisfying solution.
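
To make the arithmetic in that quote concrete, here is a minimal Python sketch. It is not how Maketext works; every name in it (english_category, arabic_category, variants_needed, scanned_message) is made up for illustration, and the Arabic rule is deliberately reduced to the three terms the quote describes (real Arabic plural rules have more cases than this).

```python
from itertools import product

# Hypothetical, simplified rules: they follow the quoted passage,
# not any real localization library.
def english_category(n: int) -> str:
    return "singular" if n == 1 else "plural"

def arabic_category(n: int) -> str:
    # Three-term category from the quote: singular, dual, plural.
    if n == 1:
        return "singular"
    if n == 2:
        return "dual"
    return "plural"

CATEGORIES = {
    "en": ("singular", "plural"),
    "ar": ("singular", "dual", "plural"),
}

def variants_needed(lang: str, counted_nouns: int) -> int:
    # One translated sentence per combination of categories,
    # so the total is len(categories) ** counted_nouns.
    return len(list(product(CATEGORIES[lang], repeat=counted_nouns)))

# The English one-count sentence, keyed by category instead of by an
# ($directory == 1) style test buried in the calling code.
SCANNED_EN = {
    "singular": "I scanned {n} directory.",
    "plural": "I scanned {n} directories.",
}

def scanned_message(n: int) -> str:
    return SCANNED_EN[english_category(n)].format(n=n)
```

With these definitions, variants_needed("en", 2) is 4 and variants_needed("ar", 2) is 9, matching the numbers in the quote, and a third counted noun would take Arabic to 27. Keying the templates by category at least moves the "which form?" decision out of the calling code and into per-language data, though it does nothing about the number of sentences a translator has to write.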