Hybrid i18n with Next and Astro (part 1)

In this series, I relate my quest for the perfect internationalization (i18n) system for hybrid frameworks, namely Next.js and Astro. This story involves nasty technical concepts, but also teapots 🫖, so it's probably worth the read.

The end goal is to improve the Devographics translation system, which powers the State of JavaScript, CSS, HTML, React and GraphQL surveys. It is used both by the Next.js survey app and by the new Astro results app, which is not released yet.

Part 1: tokens are all you need

Let's dive into the most infuriating issue with internationalization: loading translations that are not actually used in the app.

I want to design a system that will only pick the translation tokens required by the current page. Many libraries do that, but I am not satisfied with them and I'll explain why as we go. But first let's try to better frame the problem.

Splitting dictionaries is not optimal

To translate content, you need to build dictionaries that map tokens to translated expressions. The official Next.js documentation shows how to load the right dictionary in your application.

{
  "en": {
    "users.count": "There are {count} users",
    "home.title": "Hello World"
  },
  "fr": {
    "users.count": "Il y a {count} utilisateurs",
    "home.title": "Allo le monde"
  }
}
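
To make the token-to-expression mapping concrete, here is a minimal sketch of a lookup with placeholder interpolation. The `translate` helper, its fallback behavior and the type names are assumptions made up for this illustration; they are not the Next.js API nor the Devographics implementation.

```typescript
// Hypothetical types, for illustration only.
type Dictionary = Record<string, string>;

const dictionaries: Record<string, Dictionary> = {
  en: {
    "users.count": "There are {count} users",
    "home.title": "Hello World",
  },
  fr: {
    "users.count": "Il y a {count} utilisateurs",
    "home.title": "Allo le monde",
  },
};

/**
 * Looks up a token in the dictionary of the given locale and replaces
 * {placeholders} with the provided values.
 * Assumption: a missing token falls back to the token id itself.
 */
function translate(
  locale: string,
  tokenId: string,
  values: Record<string, string | number> = {}
): string {
  const template = dictionaries[locale]?.[tokenId];
  if (template === undefined) return tokenId;
  return template.replace(/\{(\w+)\}/g, (match: string, name: string) =>
    // Leave unknown placeholders untouched rather than printing "undefined".
    Object.prototype.hasOwnProperty.call(values, name)
      ? String(values[name])
      : match
  );
}

console.log(translate("fr", "users.count", { count: 3 }));
// prints "Il y a 3 utilisateurs"
```

Note that the whole dictionary of a locale has to be available for this lookup to work, which is exactly where the size problem starts.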

When the dictionaries become too big, we split them by context, meaning business context or usage: tokens for forms, tokens for the homepage, and so on.

/**
 * Will load the tokens out of three subdictionaries,
 * that we call "contexts"
 */
const tokens = await fetchLocale({
  localeId: "fr-FR",
  contexts: ["common", "homepage", "contact_form"],
});

This split is a mix of technical and organizational concerns. It's fine but far from optimal in terms of how many tokens we load for a given page.
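
To see why context-based splitting still overfetches, here is a rough sketch of what merging contexts amounts to. The in-memory `store`, the token ids and the `mergeContexts` helper are all hypothetical; the real `fetchLocale` is asynchronous and presumably fetches its data from an API.

```typescript
// Hypothetical type, for illustration only.
type ContextTokens = Record<string, string>;

// Made-up in-memory store standing in for the real translation source:
// locale id -> context name -> tokens.
const store: Record<string, Record<string, ContextTokens>> = {
  "fr-FR": {
    common: { "common.ok": "OK" },
    homepage: { "home.title": "Allo le monde" },
    contact_form: { "contact.submit": "Envoyer" },
    results: { "results.title": "Résultats" },
  },
};

/**
 * Merges every token of the requested contexts into one flat dictionary.
 * Unknown locales or contexts simply contribute nothing.
 */
function mergeContexts(localeId: string, contexts: string[]): ContextTokens {
  const localeContexts = store[localeId] ?? {};
  const tokens: ContextTokens = {};
  for (const context of contexts) {
    Object.assign(tokens, localeContexts[context] ?? {});
  }
  return tokens;
}

const tokens = mergeContexts("fr-FR", ["common", "homepage", "contact_form"]);
console.log(Object.keys(tokens));
```

The granularity is the context, not the page: asking for "contact_form" brings in every token of that context, even if the current page renders only one of them.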

Performance impact in the Devographics codebase

My concerns are about performance and optimization, so let's take a closer look at the actual situation in the Devographics apps.

By looking at the network tab of the browser devtools, we see the list of 1659 tokens loaded by the homepage of the 2022 State of JS results app. It contains tokens from generic contexts ("homepage", "common", "results") and from specific contexts ("state_of_js", "state_of_js_2022").

JSON data loaded by Gatsby, containing approximately 1600 tokens

The loading time is around 75ms. That's far from negligible. The lack of interactivity while the file is loaded and then the app is hydrated may end up being noticeable.

Loading time of the request that gets the dictionary in Gatsby

In the Next.js survey app, tokens end up in the main page HTML (see some explanations in this discussion). That's roughly 30 kB of translations out of a total of 35 kB. It's not huge, but it is still about 85% of the total page size! For comparison, 30 kB is the size of a properly optimized image.

Dictionary in the page HTML in Next

The page is not even that complex: it contains ~30 pieces of text, so there is clearly some overloading happening.
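
This kind of overloading can be quantified. Below is a small sketch, with made-up token ids and hypothetical helper names (`pickTokens`, `unusedShare`), that keeps only the tokens a page actually uses and computes the share of loaded tokens that were useless. It is only a measurement aid, not the solution developed later in this series.

```typescript
// Hypothetical type, for illustration only.
type TokenMap = Record<string, string>;

/** Keeps only the tokens whose id appears in usedIds. */
function pickTokens(loaded: TokenMap, usedIds: string[]): TokenMap {
  const picked: TokenMap = {};
  for (const id of usedIds) {
    if (Object.prototype.hasOwnProperty.call(loaded, id)) {
      picked[id] = loaded[id];
    }
  }
  return picked;
}

/** Share of loaded tokens that the page never uses, between 0 and 1. */
function unusedShare(loaded: TokenMap, usedIds: string[]): number {
  const total = Object.keys(loaded).length;
  if (total === 0) return 0;
  const used = Object.keys(pickTokens(loaded, usedIds)).length;
  return (total - used) / total;
}

// Made-up example: four tokens are loaded, the page uses a single one.
const loaded: TokenMap = {
  "home.title": "Hello World",
  "users.count": "There are {count} users",
  "contact.submit": "Send",
  "results.title": "Results",
};

console.log(unusedShare(loaded, ["home.title"]));
// prints 0.75
```

The same arithmetic applies to bytes: 30 kB of translations in a 35 kB page is 30 / 35 ≈ 0.86, in line with the rough 85% figure above.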

The surveyform homepage

These applications are using modern JavaScript frameworks: Astro and Next.js. Both are complex beasts that blend multiple mechanisms to render content. We need to better understand the relation between internationalization and rendering.

Client vs server concerns

In an Astro or Next.js application (with the App Router), we have two kinds of components. Client components, such as React client components and Astro interactive islands, are rendered client-side. They can be prerendered server-side too, but that's not relevant here. Server-only components (RSCs, .astro files) are rendered only once, server-side.

With regard to i18n, here are the two problems we want to solve:

RSCs are ideal for text content because they are not associated with client-side JavaScript. But the tricky part is nesting: we cannot use a traditional React context in server components. We are going to describe an alternative pattern and our implementation in the Devographics codebase in part 2 of this series. Hint: it's going to involve React's cache function.
Client components are rendered in the browser, which means the end user has to download the dictionary of translation tokens. This often leads to overfetching, which is a relatively complex issue to solve. We will deal with it in part 3 of this series.

Next step: server components

Now that we have a better understanding of the concerns regarding i18n in hybrid frameworks like Next or Astro, we are ready to work on solutions! See you soon for the next article.

Part 2 "Server Components, we have a deeply nested problem" is out!

Part 3 "From tokens to token expressions" is out!

Part 4 "Optimized client-side translations" is out!

See you soon!