Recently I wrote about detecting the preferred language or locale of a web site visitor, using Zend Framework.
I have to start with one correction, though. In my last blog post on this topic I talked about the User Agent String, but I have since found out that the User Agent String doesn't play any role at all, neither in the Zend Framework variant that I blogged about earlier nor in today's code. That is a good thing, because both Internet Explorer and Mozilla Firefox will no longer add the browser language or locale to their User Agent Strings in the future. Other browsers are likely to follow, since there are efforts to make it harder to fingerprint users by their UA strings.
So here is my ZF-less variant:
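// Read the Accept-Language header; fall back to an empty string if it is missing.
$m = array();
$http_accept_language = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : "";

// $m[0] receives the full matches (e.g. "en-us"), $m[1] the two-letter language part (e.g. "en").
preg_match_all('/([a-z]{2})(?:-[a-zA-Z]{2})?/', $http_accept_language, $m);

// All locales, in the order in which the browser sent them.
$pref_locale = array();
foreach ($m[0] as $locale) {
    $pref_locale[] = $locale;
}

// Languages only, without duplicates, in the same order.
$pref_lang = array();
foreach ($m[1] as $lang) {
    if (! in_array($lang, $pref_lang)) {
        $pref_lang[] = $lang;
    }
}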
The array $pref_locale will contain all locales in order of the user's preference, and $pref_lang will contain only the languages in order of the user's preference. Unlike my ZF code from last time, this makes it possible to look up the user's secondary, tertiary etc. choices as well.
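To illustrate how the secondary and tertiary choices could be put to use, here is a minimal sketch. The function name pick_language, the list of supported languages and the default value are my own assumptions for illustration, not part of the original code:

```php
<?php
// Hypothetical helper: return the first preferred language that the site
// actually offers, or a default if none of the preferences is supported.
function pick_language(array $pref_lang, array $supported, $default)
{
    foreach ($pref_lang as $lang) {
        if (in_array($lang, $supported, true)) {
            return $lang;
        }
    }
    return $default;
}

// Example: the visitor prefers en, then de, then es, then ja;
// the (assumed) site only offers German and French.
$pref_lang = array('en', 'de', 'es', 'ja');
echo pick_language($pref_lang, array('de', 'fr'), 'fr'), "\n"; // prints "de"
```

With only the first choice available, such a visitor would have ended up on the default language even though the second choice is supported.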
Here is an example. Let's assume that, given the user's preference settings, $_SERVER['HTTP_ACCEPT_LANGUAGE'] would contain:
en-us,en;q=0.9,de-at;q=0.7,de;q=0.6,es-es;q=0.4,es;q=0.3,ja;q=0.1
After running my code, the $pref_locale array would contain this:
Array
(
    [0] => en-us
    [1] => en
    [2] => de-at
    [3] => de
    [4] => es-es
    [5] => es
    [6] => ja
)
and the $pref_lang array would contain:
Array
(
    [0] => en
    [1] => de
    [2] => es
    [3] => ja
)
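Note that the code above ignores the q values and simply relies on the order in which the browser lists the entries, which works for the header shown here. If a client were to send the entries out of order, a variant that sorts by q value might look like the following sketch. The function name parse_accept_language is hypothetical, and the parsing is deliberately simplified:

```php
<?php
// Hypothetical helper: return the language tags of an Accept-Language header,
// sorted by descending q value. Entries with equal q keep their header order.
function parse_accept_language($header)
{
    $weighted = array();
    foreach (explode(',', $header) as $i => $part) {
        $pieces = explode(';', trim($part));
        $tag = strtolower(trim($pieces[0]));
        if ($tag === '' || $tag === '*') {
            continue; // skip empty entries and the wildcard
        }
        $q = 1.0; // an entry without an explicit q value counts as q=1
        for ($j = 1; $j < count($pieces); $j++) {
            if (preg_match('/^\s*q\s*=\s*([0-9.]+)\s*$/i', $pieces[$j], $m)) {
                $q = (float) $m[1];
            }
        }
        if ($q <= 0) {
            continue; // q=0 marks a language as not acceptable
        }
        $weighted[] = array($tag, $q, $i);
    }
    usort($weighted, function ($a, $b) {
        if ($a[1] != $b[1]) {
            return ($a[1] > $b[1]) ? -1 : 1; // higher q first
        }
        return $a[2] - $b[2]; // otherwise original position
    });
    $result = array();
    foreach ($weighted as $entry) {
        $result[] = $entry[0];
    }
    return $result;
}

// Deliberately unsorted input; the result is en-us, en, de, ja.
print_r(parse_accept_language('de;q=0.6,en-us,ja;q=0.1,en;q=0.9'));
```

The anonymous function requires PHP 5.3 or later. The result corresponds to $pref_locale; the languages-only list can be derived from it in the same way as before.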
Still simple enough and, most importantly, still a better solution than what Google does.