Cutting multilingual blog build time from 21 minutes to under 2 minutes
✨ GPT-5.5's Summary
A record of tracing a Jekyll build that exceeded 20 minutes after multilingual expansion through profile data, removing repeated rendering and full-site scans, and bringing it down to 1 minute 50 seconds.
In the previous multilingual setup post, I added the operating structure for a multilingual blog.
At first, I felt pretty proud of it.
ko, en, ja, zh-Hans, es, pt-BR, fr, id.
There were posts, menus, hreflang, and shared view counts. From the outside, it had become a fairly plausible multilingual blog.
Then another problem surfaced right away.
The build was taking far too long.
The first production build looked like this.
done in 1311.791 seconds
21 minutes and 52 seconds.
That is not the kind of number a simple blog should produce. If editing one post makes the build take more than 20 minutes, eventually I will spend more time waiting for builds than writing.
At first, it was tempting to think, "Well, there are eight languages now, so of course it got slower."
But that was too quick a conclusion.
The page count did increase. Still, reaching the 20-minute range meant something else was going on. So this time, instead of fixing by feel, I turned on jekyll build --profile.
The first culprit: calendar JSON was being embedded into every page
I first checked the size of _site.
_site: 1.2G
HTML total: about 840MB
840MB of HTML for a static blog.
That made no sense.
I opened one representative post, and the reason was immediately visible. The sidebar calendar was embedding the full post-list JSON for that language inline on every post page.
<script type="application/json" data-calendar-posts>
[
...
]
</script>
That was not the only issue.
When building the calendar fallback list, Liquid was scanning the full post list again for every date. Posts that did not match still produced a huge amount of Liquid whitespace. The visible UI was small, but inside the HTML there was a giant block of hidden whitespace and repeated lists.
This was not a feature bug. It was a structural bug.
The calendar did not need the server to render a complete HTML version on every page. JavaScript was already able to draw the calendar dynamically. In that case, the server only needed to provide the basic shell for the current month and a path to the data file.
So I moved the calendar data into separate JSON files per language.
/assets/data/calendar-posts-ko.json
/assets/data/calendar-posts-en.json
/assets/data/calendar-posts-ja.json
/assets/data/calendar-posts-zh-Hans.json
/assets/data/calendar-posts-es.json
/assets/data/calendar-posts-pt-BR.json
/assets/data/calendar-posts-fr.json
/assets/data/calendar-posts-id.json
And left only this much on the page.
data-calendar-posts-src="/assets/data/calendar-posts-en.json"
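As a rough sketch of that split, here is the same idea in Python. This is not the site's own implementation, which is built with Jekyll and Liquid. The function names and the post fields (`lang`, `date`, `title`, `url`) are assumptions made for illustration. The posts are grouped by language once, each group is written to its own file, and every page carries only a short attribute pointing at that file.

```python
import json
from collections import defaultdict
from pathlib import Path


def write_calendar_files(posts, out_dir):
    """Group posts by language once and write one JSON file per language.

    `posts` is a hypothetical list of dicts with "lang", "date", "title", "url".
    Returns a dict mapping each language to the file that was written.
    """
    by_lang = defaultdict(list)
    for post in posts:
        by_lang[post["lang"]].append(
            {"date": post["date"], "title": post["title"], "url": post["url"]}
        )

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for lang, entries in by_lang.items():
        # Newest first; ISO dates sort correctly as plain strings.
        entries.sort(key=lambda entry: entry["date"], reverse=True)
        path = out_dir / f"calendar-posts-{lang}.json"
        path.write_text(json.dumps(entries, ensure_ascii=False), encoding="utf-8")
        written[lang] = path
    return written


def calendar_attr(lang):
    """The only calendar data left on each page: a pointer to the shared file."""
    return f'data-calendar-posts-src="/assets/data/calendar-posts-{lang}.json"'
```

The point of the design is that the list is serialized once per language instead of once per page, so the HTML size no longer grows with page count times post count.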
The result dropped immediately.
1311.791 seconds -> 745.273 seconds
About 43% of the time disappeared, but 12 minutes 25 seconds was still long.
That is when I realized it.
The calendar was a big culprit, but it was not the only culprit.
The second culprit: menu stats were being calculated 80,000 times
The next profile was even more blatant.
_includes/sidebar-nav-stats.html 81120 calls 173.084s
_includes/masthead.html 2704 calls 319.860s
_includes/seo.html 2704 calls 119.985s
sitemap.xml 1 call 117.495s
The funniest part was sidebar-nav-stats.html.
This include is a tiny piece that attaches the post count and latest post time next to each sidebar category.
For example, something like this.
Daily Review (310) 1 days ago
Devlog (24) 2 days ago
But every time this tiny piece was called, it sorted the entire post list again and filtered it again.
For each menu item in each language.
On each page.
Again in the desktop sidebar and the mobile menu.
In the end, it was called 81,120 times.
This value does not need to be recalculated for every page. If the language and menu URL are the same, the result is the same. So I switched it to include_cached from jekyll-include-cache.
{% include_cached sidebar-nav-stats.html url=child.url lang=current_lang %}
Then the call count changed like this.
81120 calls -> 210 calls
173 seconds -> 0.4 seconds
That felt less like optimization and more like catching a bug.
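The effect of caching on (url, lang) can be illustrated with a small Python sketch. It uses `functools.lru_cache` as a stand-in for what `include_cached` does conceptually. It is not how jekyll-include-cache is implemented, and the post records and function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical post records; the real site gets these from Jekyll.
POSTS = (
    {"lang": "en", "category_url": "/daily-review/", "date": "2024-05-01"},
    {"lang": "en", "category_url": "/daily-review/", "date": "2024-05-02"},
    {"lang": "en", "category_url": "/devlog/", "date": "2024-04-30"},
)


@lru_cache(maxsize=None)
def nav_stats(url, lang):
    """Return (post count, latest date) for one menu entry in one language.

    The body scans the whole post list, but it only runs once per
    distinct (url, lang) pair; later calls are answered from the cache.
    """
    dates = [
        post["date"]
        for post in POSTS
        if post["lang"] == lang and post["category_url"] == url
    ]
    return len(dates), max(dates, default=None)


def render_page(menu_urls, lang):
    """Each page asks for every menu entry twice: desktop sidebar and mobile menu."""
    return [nav_stats(url, lang) for _ in ("desktop", "mobile") for url in menu_urls]


if __name__ == "__main__":
    for _ in range(100):  # 100 pages, 2 menu entries, 2 surfaces each
        render_page(("/daily-review/", "/devlog/"), "en")
    print(nav_stats.cache_info())  # misses = real scans, hits = reused results
```

This only works because the result depends on nothing except the arguments. If the include also read page-specific state, caching it would silently produce wrong output.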
The third culprit: the site kept scanning everything to build language links
While adding multilingual switching, I had put logic like this into masthead, seo, and sitemap.
Does this language URL actually exist?
The intent was right.
I should not put a nonexistent translated URL into hreflang or the language switcher. So at first, I checked URL existence by looping through all pages and all collection documents.
The problem was that this repeated on every page.
2704 pages * site.pages scan * translated collections scan
That structure only gets worse as a multilingual site grows.
So I changed the method.
This blog already has a rule for translated URLs.
/some/post/
/en/some/post/
/ja/some/post/
...
And exceptions can be managed separately as data.
This time, I put the session review post whose translation was still pending into _data/i18n_pending.yml.
entries:
  - source_url: /devlog/github-pages-blog/github-pages-blog-english-version-lessons/
    locales:
      - en
      - ja
      - zh-Hans
      - es
      - pt-BR
      - fr
      - id
With that, normal posts are connected by the prefix rule, and pending posts fall back to the other language's home page. There is no full-site scan.
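The rule-plus-exceptions lookup can be sketched in Python as follows. The real logic lives in Liquid includes, so treat this as an illustration only. The function names are hypothetical, and it assumes that the unprefixed URL is the Korean original and that each language's home page is `/<lang>/`.

```python
DEFAULT_LANG = "ko"  # assumption: unprefixed URLs are the Korean originals
LOCALES = ["ko", "en", "ja", "zh-Hans", "es", "pt-BR", "fr", "id"]


def build_pending_index(entries):
    """Flatten i18n_pending-style entries into a set of (source_url, locale) pairs."""
    return {
        (entry["source_url"], locale)
        for entry in entries
        for locale in entry["locales"]
    }


def lang_home(lang):
    """Home page of a language (assumed layout: / for the default, /<lang>/ otherwise)."""
    return "/" if lang == DEFAULT_LANG else f"/{lang}/"


def switch_url(source_url, lang, pending):
    """Target of the language switcher: one set lookup, no site-wide scan."""
    if (source_url, lang) in pending:
        return lang_home(lang)  # translation pending: fall back to that language's home
    if lang == DEFAULT_LANG:
        return source_url
    return f"/{lang}{source_url}"


def hreflang_alternates(source_url, pending):
    """Advertise only languages whose translation is not marked as pending."""
    return {
        lang: switch_url(source_url, lang, pending)
        for lang in LOCALES
        if (source_url, lang) not in pending
    }
```

For example, `switch_url("/some/post/", "ja", pending)` gives `/ja/some/post/` when the post is not in the pending set. The cost per page is now proportional to the number of languages, not to the number of pages and documents in the site.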
The impact was large.
masthead: 319.860 seconds -> 9.882 seconds
seo: 119.985 seconds -> 7.181 seconds
sitemap: 117.495 seconds -> 4.633 seconds
And the final build ended like this.
done in 110.344 seconds
From 21 minutes 52 seconds to 1 minute 50 seconds.
That is still not enough to call the build fast, but at least I am no longer too scared of the build to write.
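To put the profile numbers side by side, a few lines of Python are enough. The timings are the ones quoted in this post (the sidebar "after" value is the rounded 0.4 seconds), and the helper names are only for illustration.

```python
# (before, after) in seconds, copied from the profile output quoted in the post.
TIMINGS = {
    "full build": (1311.791, 110.344),
    "sidebar-nav-stats": (173.084, 0.4),
    "masthead": (319.860, 9.882),
    "seo": (119.985, 7.181),
    "sitemap": (117.495, 4.633),
}


def speedup(before, after):
    """How many times faster the step became."""
    return before / after


def as_min_sec(seconds):
    """Format seconds as minutes and seconds, rounded to the nearest second."""
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}m {rest:02d}s"


if __name__ == "__main__":
    for name, (before, after) in TIMINGS.items():
        print(
            f"{name:18s} {as_min_sec(before):>8s} -> {as_min_sec(after):>7s} "
            f"({speedup(before, after):.1f}x)"
        )
```

The full build comes out a little under 12 times faster, while the individual includes improved by much larger factors, which suggests the remaining time is spread across ordinary page rendering.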
What I had to be careful about while fixing it
If you only look at the speed, this optimization looks easy.
But the part that really needed care was avoiding broken features.
In a multilingual site especially, it is easy to celebrate a faster build and then create problems like these.
language switch links go to 404s
hreflang points to nonexistent URLs
pending translations are exposed as alternates to search engines
About/language buttons disappear again on mobile
the calendar stays empty
So at the end, I used both browser checks and automated checks.
These are the things I checked.
production build succeeds
the travel post has working language switch links for 8 languages
hreflang works for 8 languages + x-default
pending translation posts fall back to the other language home
About, 🇺🇸 English, and the calendar display correctly on mobile
representative multilingual URLs return 200
the rendered structure of 2481 source posts is checked exhaustively
calendar JSON parses correctly
i18n post coverage errors: 0
I did not read all 2481 posts one by one with human eyes.
But at minimum, the automated check covered whether the output files exist, whether the post structure rendered, whether language links are not broken, and whether calendar data exists.
That was the core of this work.
Build optimization is not only about reducing time. It is also about re-verifying the existing feature contracts.
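The kind of automated check described above might look like the following Python sketch. It is not the script actually used for this blog. It assumes a built `_site` directory with pretty URLs (`/path/` maps to `path/index.html`), an empty baseurl, and hreflang tags written as `<link rel="alternate" hreflang="..." href="...">`. All function names are hypothetical.

```python
import json
import re
from pathlib import Path
from urllib.parse import urlparse

LINK_RE = re.compile(r'<link[^>]*rel="alternate"[^>]*>', re.IGNORECASE)
ATTR_RE = re.compile(r'\b(hreflang|href)="([^"]*)"')


def url_to_file(site_dir, url):
    """Map a site-relative URL such as /en/some/post/ to its built file."""
    path = Path(site_dir) / url.lstrip("/")
    return path / "index.html" if url.endswith("/") else path


def check_page(site_dir, url):
    """Return a list of problems for one built page; an empty list means OK."""
    page = url_to_file(site_dir, url)
    if not page.is_file():
        return [f"missing page: {url}"]
    problems = []
    html = page.read_text(encoding="utf-8")
    for tag in LINK_RE.findall(html):
        attrs = dict(ATTR_RE.findall(tag))
        if "hreflang" not in attrs:
            continue  # some other kind of alternate link, e.g. a feed
        # Accept absolute or relative hrefs; only the path part is checked.
        target = urlparse(attrs.get("href", "")).path
        if not target or not url_to_file(site_dir, target).is_file():
            problems.append(
                f"{url}: hreflang={attrs['hreflang']} points to missing {target!r}"
            )
    return problems


def check_calendar(site_dir, lang):
    """Check that the per-language calendar file exists and parses as a JSON list."""
    path = Path(site_dir) / "assets" / "data" / f"calendar-posts-{lang}.json"
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        return [f"calendar {lang}: {exc}"]
    return [] if isinstance(data, list) else [f"calendar {lang}: not a list"]
```

Running `check_page` over every source post URL and `check_calendar` over every language, then failing the build when any list is non-empty, turns the checklist into something that can be rerun after each optimization.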
What I learned this time
Jekyll looks simple because it is a static site generator.
But once Liquid starts scanning the whole site again and again, even a static site can get plenty heavy.
In a multilingual structure especially, small inefficiencies immediately become multipliers.
page count * language count * menu count * total post count
If that kind of multiplication is hiding somewhere, the build will suddenly blow up later.
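A back-of-the-envelope calculation shows how quickly this multiplies. The call counts and page count below come from the profile output in this post. The assumption that every call walks roughly all 2,481 source posts is mine and is only meant to give an order of magnitude.

```python
PAGES = 2704            # calls reported for masthead.html, one per page
STATS_CALLS = 81120     # sidebar-nav-stats.html calls before caching
CACHED_CALLS = 210      # sidebar-nav-stats.html calls after include_cached
POSTS_PER_SCAN = 2481   # assumption: each call touches about every source post


def post_visits(calls, posts_per_scan=POSTS_PER_SCAN):
    """Rough number of post records touched across the whole build."""
    return calls * posts_per_scan


calls_per_page = STATS_CALLS // PAGES
before = post_visits(STATS_CALLS)
after = post_visits(CACHED_CALLS)

if __name__ == "__main__":
    print(f"calls per page:        {calls_per_page}")
    print(f"post visits (before):  {before:,}")
    print(f"post visits (after):   {after:,}")
```

Under that assumption, a tiny sidebar label was responsible for roughly two hundred million post visits per build, which is why it showed up so prominently in the profile.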
What I learned this time is simple.
First, do not inline the same data repeatedly on every page.
Second, cache includes when the same input produces the same output.
Third, do not scan the whole site from every page just to check whether URLs exist.
Fourth, keep exceptions as data instead of handling them by feel.
Fifth, after optimization, verify the feature contracts with automated checks.
This blog is slowly becoming less of a simple personal blog and more of a system.
That is both good and tiring.
But at least this time, the tiring part meant something.
Pulling a build that took more than 21 minutes down into the 1-minute range felt like a pretty big turning point.
Now the multilingual blog has at least enough breathing room to keep growing.