Commerce 2.x Stories - Internationalization
Welcome to the first article in the “Commerce 2.x Stories” series. As the development heats up, we’ll be covering interesting developments, ideas, and contributors.
Our first topic of interest is internationalization and localization. This involves tasks from translating UIs and content to representing numbers, currencies, and dates in a locale-specific manner. It’s also a current pain point with Drupal 7 / Commerce 1.x, especially as it relates to currency management.
Commerce 1.x was always as translatable as the rest of Drupal 7.
Is it possible to fully translate a Drupal 7 site? Yes.
Is it painless and easy? No.
Most of a site’s interface gets translated using the built-in t() function. The t() function covers only strings defined in code. Any configuration defined in the UI requires a contrib module: i18n. Many users don’t realize this and get very angry about “having to install a whole set of modules just to translate a few additional strings”. And since i18n is a contrib module, other contrib modules don’t always integrate with it, increasing the chances of having non-translatable strings on a site.
Then there’s content. Out of the box, Drupal could only translate nodes. Anything else (taxonomy terms, products, etc) needed a different approach. Hence, Entity Translation was born, but it took a long time to mature. Contrib once again doesn’t fully support it (though Commerce 1.x does thanks to the heroic efforts of its maintainer, plach), and many tasks can now be accomplished in multiple ways. Entity Translation is usually the best solution, but it’s not immediately obvious.
Drupal 8 fixes all that. All configuration is stored using a common API with support for translations. Using the built-in configuration translation module, any configuration can be translated out of the box. The old core content translation module is out, and a vastly improved entity_translation module is now in core.
This means that out of the box, Commerce 2.x is fully translatable, in a quick, painless, and well understood manner. Thank you, Drupal 8!
Going beyond translations
Having the store and all of its content and products in multiple languages is a great first step.
Supporting different markets also means that a product has multiple prices in multiple currencies. A product can cost 75 EUR, 65 GBP, and 100 USD at the same time. The right price is selected based on the customer’s location, current language, and other factors.
Commerce 1.x has always supported this nicely, allowing you to place unlimited price fields on a product, and select the right one during the calculation of the product price. Commerce 2.x will continue to have the same flexibility. At the same time, we’re working on improving the currency handling and price formatting.
Currency handling in Commerce 1.x
Commerce 1.x defined its own set of currencies, including information for each currency such as the currency code (USD), name (US Dollar), numeric code (840), symbol ($), and the number of fraction digits (digits after the decimal point, 2 for dollars). However, the list was incomplete, requiring users to provide patches in order to add their local currencies. Currencies also change over time. Some get deprecated and replaced with other currencies. Some lose subunits due to inflation (no more cents for you!).
The currency names were translatable, but translating hundreds of currencies to different languages is a big chore for the translators.
The formatting of currency amounts was tied to currencies. Each currency had data on what the decimal separator was, what the grouping separator was, and whether the currency symbol goes before or after the amount. This allowed the currency amount to be formatted correctly in its home market, in most cases. But it turns out that even two countries using the same currency (France and Germany, for example) don’t show an amount the same way.
Take an amount of 12345.99 EUR:
- France expects to see it as 12 345,99 €.
- Germany expects to see it as 12.345,99 €.
- The UK expects to see it as €12,345.99.
This means that formatting of amounts needs to be done per locale, not per currency. It gets better: the symbol can change from locale to locale. Most locales will show a US dollar amount using $. But in Australia the same amount would use the US$ symbol, because Australia has its own dollar (AUD) that uses the $ symbol.
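The three euro examples above can be reproduced with a minimal sketch that keys the separators and symbol position on the locale rather than on the currency. The $localeRules table and the formatEuroAmount() helper are hypothetical and hand-written for illustration only; a real implementation would load this data from an external source instead of hardcoding it.

```php
<?php
// Hypothetical, hand-written rules for three locales (illustration only).
$localeRules = [
    'fr'    => ['decimal' => ',', 'group' => ' ', 'pattern' => '%s €'],
    'de'    => ['decimal' => ',', 'group' => '.', 'pattern' => '%s €'],
    'en_GB' => ['decimal' => '.', 'group' => ',', 'pattern' => '€%s'],
];

// Formats a euro amount using the separators and symbol placement of the locale.
function formatEuroAmount(float $amount, string $locale, array $rules): string
{
    $rule = $rules[$locale];
    $number = number_format($amount, 2, $rule['decimal'], $rule['group']);

    return sprintf($rule['pattern'], $number);
}

foreach (array_keys($localeRules) as $locale) {
    echo $locale, ': ', formatEuroAmount(12345.99, $locale, $localeRules), "\n";
}
// fr: 12 345,99 €
// de: 12.345,99 €
// en_GB: €12,345.99
```

The currency is the same in all three calls; only the locale key changes the output.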
And what about other numbering systems? Take an amount of 999.99 AED. In the United Arab Emirates, the locale ar_AE requires the amount to be formatted as د.إ. ٩٩٩٫٩٩. Yes, the system needs to know about Arabic digits. Same for Bengali or Devanagari (Hindi) digits.
Users also have a tendency to input prices using their locale’s rules. A French user would like to enter a 99.99 price as 99,99, because their decimal separator is a comma. The system doesn’t know how to parse such an amount.
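To make the parsing problem concrete, here is a rough sketch of locale-aware input normalization. The $separators table and the parseLocalizedNumber() function are assumptions made up for this example (they are not part of Commerce or any library), and the amount is kept as a string to avoid float rounding.

```php
<?php
// Hypothetical separator table (illustration only).
$separators = [
    'en'    => ['decimal' => '.', 'group' => ','],
    'fr'    => ['decimal' => ',', 'group' => ' '],
    'de'    => ['decimal' => ',', 'group' => '.'],
    'ar_AE' => ['decimal' => "\u{066B}", 'group' => "\u{066C}"],
];

// Converts a localized number string into a plain "1234.56" style string.
// Returns null when the result is not numeric.
function parseLocalizedNumber(string $input, string $locale, array $separators): ?string
{
    $rule = $separators[$locale];
    // Map Arabic-Indic digits (U+0660 to U+0669) to ASCII digits.
    $digits = [
        "\u{0660}" => '0', "\u{0661}" => '1', "\u{0662}" => '2', "\u{0663}" => '3',
        "\u{0664}" => '4', "\u{0665}" => '5', "\u{0666}" => '6', "\u{0667}" => '7',
        "\u{0668}" => '8', "\u{0669}" => '9',
    ];
    $value = strtr(trim($input), $digits);
    // Drop grouping separators and any kind of space, then normalize the decimal separator.
    $value = str_replace([$rule['group'], "\u{00A0}", "\u{202F}", ' '], '', $value);
    $value = str_replace($rule['decimal'], '.', $value);

    return is_numeric($value) ? $value : null;
}

echo parseLocalizedNumber('99,99', 'fr', $separators), "\n";     // 99.99
echo parseLocalizedNumber('12.345,99', 'de', $separators), "\n"; // 12345.99
```

The same input string can mean different things in different locales, which is why the locale has to be passed in explicitly.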
Summing up, here are our requirements for Commerce 2.x:
- Provide a currency list that is automatically generated from a reliable, external source.
- Provide a way to get the currency name and symbol per locale.
- Provide a way of formatting currency amounts per locale.
- Provide a way of parsing a currency amount according to locale.
What about the intl PHP extension?
The php-intl extension (which has existed since PHP 5.3) provides a NumberFormatter class. This class allows us to format and parse currency amounts per locale. 2/4 requirements resolved? Well…
It turns out the extension isn’t present by default on PHP installations. The Ubuntu VPS you installed? Your fancy cloud hosting? The MAMP/XAMPP installation you have on your laptop? It’s most likely missing from all of them. Requiring Commerce 2.x users to install a PHP extension (and its supporting library) would impact adoption significantly. Even Symfony, which is much more developer-oriented, had to move away from the same intl requirement.
Intl uses the ICU library to do the actual work. But different distributions ship with different ICU versions. Installed Ubuntu 12.04 instead of 14.04? Your currency formatting rules are now years out of date. But wait, where does the ICU library get the data from? The per-locale formatting rules and currency lists need to be defined somewhere. The answer is CLDR:
The Unicode CLDR provides key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data available. This data is used by a wide spectrum of companies for their software internationalization and localization
The same data that everyone from Apple to Google and Microsoft uses, easily downloadable in JSON form. What’s not to like? (Side note: Drupal 8 has replaced its ISO country list with the CLDR country list).
- Currency information that’s more precise than the official ISO data.
(For example, ISO tells me that my country’s currency, the Serbian Dinar (RSD), has two fraction digits (100 subunits). But CLDR knows that the real number of fraction digits is 0: nobody uses the subunits because inflation made them worthless.)
- Currency names & symbols for all locales.
- Number formatting patterns for all locales.
- Country information, language information, date formatting rules, and much more.
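To illustrate the fraction-digit point from the list above, here is a small sketch comparing the two data sources for RSD. The $fractionDigits table only encodes the values mentioned in this article, and formatWithFractionDigits() is a hypothetical helper, not an API from any library.

```php
<?php
// Fraction digits as described in the article: ISO lists 2 for RSD, CLDR lists 0.
$fractionDigits = [
    'iso'  => ['RSD' => 2, 'USD' => 2],
    'cldr' => ['RSD' => 0, 'USD' => 2],
];

// Renders an amount with the number of fraction digits defined for the currency.
function formatWithFractionDigits(float $amount, string $currencyCode, array $digits): string
{
    return number_format($amount, $digits[$currencyCode], '.', '') . ' ' . $currencyCode;
}

echo formatWithFractionDigits(1500.00, 'RSD', $fractionDigits['iso']), "\n";  // 1500.00 RSD
echo formatWithFractionDigits(1500.00, 'RSD', $fractionDigits['cldr']), "\n"; // 1500 RSD
```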
One of our primary goals in Commerce 2.x is to isolate critical eCommerce requirements such as this one and publish abstract solutions to them in single purpose PHP libraries that everyone can use. This is why we’ve created commerceguys/intl, an internationalization library.
It parses CLDR definitions into our own more compact YAML definitions and uses them to re-implement intl’s NumberFormatter and provide currency, country, and language data.
// Namespaces assumed from the commerceguys/intl package layout; adjust to the installed version.
use CommerceGuys\Intl\Currency\CurrencyRepository;
use CommerceGuys\Intl\Formatter\NumberFormatter;
use CommerceGuys\Intl\NumberFormat\NumberFormatRepository;

$currencyRepository = new CurrencyRepository();
$numberFormatRepository = new NumberFormatRepository();
$currency = $currencyRepository->get('USD', 'fr');
echo $currency->getName(); // dollar des États-Unis
// Format a currency amount for France.
$numberFormat = $numberFormatRepository->get('fr');
$currencyFormatter = new NumberFormatter($numberFormat, NumberFormatter::CURRENCY);
echo $currencyFormatter->formatCurrency('12345.99', $currency); // 12 345,99 $US
// Parse formatted values into numeric values.
echo $currencyFormatter->parseCurrency('12 345,99 $US', $currency); // 12345.99
Integrating it into Commerce 2.x
Commerce 2.x will pull in the commerceguys/intl library via Composer and import the currencies into configuration entities. This will allow users to modify the default currencies or create their own. Translated currency names and symbols will also be created for each language in the system. Adding a new language will lead to currency translations automatically being registered as well.
The price field will reference currency entities and use the NumberFormatter under the hood of the price formatter.
Commerce 2.x has dramatically improved its currency handling. By identifying the common problems and solving them in an independent library, we’ve allowed every Composer-enabled PHP project to take advantage of our efforts!
Join us every Wednesday at 3PM GMT+2 on IRC (#drupal-commerce) for the Commerce 2.x office hours if you’re interested in contributing with the help of our pleasant community.
Brilliantly summarized. Thanks for all your hard work, Bojan. : )
Another PHP library doing the same
concrete5 CMS had almost exactly the same problem, which is why there's this library:
http://punic.github.io/ (doc - work in progress)
It uses the JSON data without converting it and supports date formats. I thought about merging a few things together, but the approach is a bit different. Punic uses more static methods, for example.
Who knows, maybe we can still benefit from each other's work!
I had no idea punic existed, I guess my GitHub search wasn't as good as I thought.
I'd be very happy to compare notes, especially around date formatting which I haven't implemented yet.
The reason why we parse & convert JSON data is that it allows us to do a lot of deduplicating (don't store en and en_US if they're the same, for example), thus reducing the size of the bundled data drastically.
It also allows us to have a clear 1-1 mapping between the data file and the matching class, which improves readability and allows us to skip parsing and adjusting the data each time the JSON file is loaded (look at the shipped scripts/ for the operations that are done on the dataset).
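As a rough illustration of that deduplication idea (not the library's actual scripts), the sketch below drops a regional locale when its data matches the parent locale and falls back to the parent at lookup time. All names and the sample values in $definitions are hypothetical.

```php
<?php
// Hypothetical per-locale data with illustrative values.
$definitions = [
    'en'    => ['decimal' => '.', 'group' => ','],
    'en_US' => ['decimal' => '.', 'group' => ','],
    'fr'    => ['decimal' => ',', 'group' => ' '],
    'fr_XX' => ['decimal' => '.', 'group' => ' '], // made-up variant that differs from fr
];

// Returns the parent of a locale ("en_US" gives "en"), or null for a root locale.
function parentLocale(string $locale): ?string
{
    $position = strrpos($locale, '_');

    return $position === false ? null : substr($locale, 0, $position);
}

// Removes locales whose data is identical to their parent's data.
function deduplicate(array $definitions): array
{
    $kept = [];
    foreach ($definitions as $locale => $data) {
        $parent = parentLocale($locale);
        if ($parent !== null && isset($definitions[$parent]) && $definitions[$parent] == $data) {
            continue;
        }
        $kept[$locale] = $data;
    }

    return $kept;
}

// Looks up a locale, walking up the parent chain until data is found.
function resolve(string $locale, array $definitions): ?array
{
    for ($candidate = $locale; $candidate !== null; $candidate = parentLocale($candidate)) {
        if (isset($definitions[$candidate])) {
            return $definitions[$candidate];
        }
    }

    return null;
}

$compact = deduplicate($definitions);
echo implode(', ', array_keys($compact)), "\n"; // en, fr, fr_XX
```

A lookup for en_US against the compacted set still returns the en data, so nothing is lost by removing the duplicate.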
I think both libraries started on the same day, no chance for either of us to know about it ;-)
Removing unused data would be an easy task in punic. There's a build script which pulls the CLDR data where we could remove some parts. AFAIK we never had the desire to do that, but it should be possible (https://github.com/punic/punic/blob/master/bin/build.php).
I'll talk to Michele (the one who did most of the work on punic) about whether there's a way to improve things for both libraries.
I love this
It's so great that you are doing this