Moving Naver Blog PDF Backups to GitHub Pages
✨ GPT-5.5 Summary
A record of extracting 173 posts and 1,521 images from 18 Naver Blog PDF backups and transplanting them back into the existing GitHub Pages blog structure.
I wanted to bring the posts I had accumulated on Naver Blog back into my GitHub Pages blog.
More precisely, this was not merely about storing backup files somewhere. I already had written posts. They had dates, images, categories, and the thoughts I had during those periods. But those records were sitting separately in another house called Naver Blog.
In the end, I wanted to rebuild this blog as the center of my records. A GitHub Pages blog is simple, but I can stack records in the structure I want.
This time, though, it was not a matter of writing one new post.
I had to take 18 Naver Blog PDF backups and move the posts and images inside them back into the existing Jekyll blog structure.
I Started by Setting the Conditions
From the beginning, the goal was simple.
I wanted to bring in the Naver Blog backups, but make them read as if they had originally belonged inside this blog.
I set a few conditions.
- Extract all posts from the 18 PDFs without omissions.
- Preserve each post's date and original link.
- Place images under assets/images/YYYY-MM/YYYY-MM-DD/ according to the existing blog convention.
- Continue the Today Was series numbering from the existing posts.
- Do not mix restaurant, travel, AI, development, and similar posts into the Today Was numbering.
- Do not force posts into existing categories; create new categories if needed.
- Do not import broken PDF sentences as-is.
- The result must be buildable Jekyll posts.
Written like this, it sounds ordinary. But once I actually did it, it was not just file copying.
It was moving a whole body of records from one system into another.
I Could Not Trust PDF Text Alone
At first, I thought extracting text and images from the PDFs would be enough.
The posts were extracted. The images were extracted too. But the problem was the body text. Sentences brought in from the PDFs were broken in strange places.
For example, they looked like this.
Trying to control that enormous storm all by myself as quickly as possible, that excessive drive itself was the biggest cause making me feel helpless
because of that.
A single sentence was split as if it were several paragraphs, words were broken apart, and the reading rhythm was ruined.
If I migrated them in that state, they might be backed up, but the posts themselves would be damaged. They would feel less like writing that a person reads and more like traces torn out of a PDF.
So I changed direction.
I used the PDFs as the starting point for post lists and image extraction, then reread the original Naver HTML to recover the body text. I followed the Naver editor's paragraph, list, and quote flow and rebuilt the body as Markdown.
Only then did the posts become posts again.
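As a rough illustration of that rebuild step, here is a minimal Python sketch. It assumes the body HTML uses plain p, li, and blockquote tags, which is a simplification; the real Naver editor markup is not shown in this post, and the names BodyToMarkdown and html_to_markdown are hypothetical.

```python
from html.parser import HTMLParser


class BodyToMarkdown(HTMLParser):
    """Collect paragraphs, list items, and quotes as Markdown blocks.

    Assumption: the body uses plain <p>, <li>, and <blockquote> tags.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # finished (tag, markdown_text) pairs
        self.buffer = []   # text pieces of the block being read
        self.in_quote = False

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.in_quote = True
        elif tag in ("p", "li"):
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "blockquote":
            self.in_quote = False
        elif tag in ("p", "li"):
            # Collapse whitespace so one sentence stays on one line.
            text = " ".join("".join(self.buffer).split())
            self.buffer = []
            if not text:
                return
            if tag == "li":
                text = "- " + text
            if self.in_quote:
                text = "> " + text
            self.blocks.append((tag, text))

    def handle_data(self, data):
        self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = BodyToMarkdown()
    parser.feed(html)
    parser.close()
    lines = []
    previous_tag = None
    for tag, text in parser.blocks:
        # Keep list items tight; separate every other block with a blank line.
        if lines and not (tag == "li" and previous_tag == "li"):
            lines.append("")
        lines.append(text)
        previous_tag = tag
    return "\n".join(lines)
```

The key point is that paragraph boundaries come from the HTML structure, not from where the PDF happened to wrap a line.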
I Matched Images to the Existing Blog Style
Images mattered too.
Naver posts had many images. Especially for travel posts and restaurant posts, images were almost the body itself. If I moved only the text, the record would become half-empty.
In the end, I imported 1,521 images.
Image paths followed the existing blog convention.
assets/images/2025-09/2025-09-09/naver-004-001.jpg
I organized filenames with year-month, date, and Naver import number. That way, even later, I can trace which date and which import an image came from.
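The naming rule can be sketched as a small Python helper. This is only an illustration: the function name is hypothetical, and reading the trailing 001 as a per-post image sequence number is my assumption based on the example path.

```python
from datetime import date


def naver_image_path(post_date: date, import_no: int, image_no: int, ext: str = "jpg") -> str:
    """Build an image path following the assets/images/YYYY-MM/YYYY-MM-DD/ convention.

    import_no is the Naver import number; image_no is assumed to be the
    image's sequence number within that post.
    """
    month_dir = post_date.strftime("%Y-%m")
    day_dir = post_date.isoformat()  # YYYY-MM-DD
    filename = f"naver-{import_no:03d}-{image_no:03d}.{ext}"
    return f"assets/images/{month_dir}/{day_dir}/{filename}"


print(naver_image_path(date(2025, 9, 9), 4, 1))
# assets/images/2025-09/2025-09-09/naver-004-001.jpg
```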
In the body, I used normal Markdown image syntax.

This kind of simplicity matters in a static blog. After the build, it is just files. I do not need to depend on a separate image server or external links.
I Split Categories Again
The most delicate part was categories.
At first, I wondered if I could roughly put the Naver posts under diary. But if I did that, it would be hard to find posts later, and the blog structure would blur.
So I created new categories.
diary life
diary thought
diary relationship
diary restaurant
diary travel
I also used existing categories such as diary ai, diary dev, and diary religion. Reading/mindset posts went under reading mindset, app introduction posts under tip app, and blog-building records under devlog github-pages-blog.
Creating categories does not end with moving one file.
Category pages are needed. Sidebar navigation is needed. Category labels and links shown in archives need to match. The icon in front of each title also needs to fit the existing blog convention.
Restaurant posts were organized with [🍽️], AI posts with [🤖], development posts with [🧑‍💻], travel posts with [🧳], and so on.
These details may look small, but if they are messy, imported posts continue to feel like foreign objects that came from outside.
I Kept Today Was Numbering Separate
The easiest part to get confused about was Today Was numbering.
The Today's Verification posts from Naver were essentially Daily Reviews. So they had to continue from the existing blog's Today Was series.
By contrast, restaurant, travel, AI, and reading posts are not part of Today Was, no matter how close their dates are. If those posts are mixed into the numbering, the series itself breaks.
The final result was aligned like this.
Today Was #1 ~ #200
The numbers continued from 1 to 200 with no omissions or duplicates. I also checked that non-Daily Review posts did not contain Today Was # numbering.
This was not simply organizing numbers.
It was preserving the identity of the series.
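A check like the one described above could look roughly like this in Python. It assumes each Daily Review title contains the text "Today Was #N"; the function name and the idea of passing titles in as a plain list are hypothetical.

```python
import re
from collections import Counter

NUMBER_RE = re.compile(r"Today Was #(\d+)")


def check_series(titles, expected_last):
    """Return (missing, duplicates) for Today Was numbers 1..expected_last."""
    numbers = []
    for title in titles:
        match = NUMBER_RE.search(title)
        if match:
            numbers.append(int(match.group(1)))
    counts = Counter(numbers)
    missing = [n for n in range(1, expected_last + 1) if n not in counts]
    duplicates = sorted(n for n, count in counts.items() if count > 1)
    return missing, duplicates
```

For a clean series, both lists come back empty. Titles without a number, such as restaurant or travel posts, are simply ignored by this check.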
Verification Was Half the Work
What is scary about this kind of migration is that it can look plausible on the outside while one thing after another is subtly wrong.
An image file might be missing while the Markdown reference remains. Category front matter and the actual folder could diverge. Title icons could differ from the existing convention. Broken ? icons from the PDF could remain in the body.
So I ran separate verification.
The checks were roughly these:
Imported posts: 173
Image references: 1,521
Missing images: 0
Visible standalone ? remaining: 0
Today Was numbering: #1 ~ #200
Non-Daily Review numbering mixed in: 0
Category folder mismatches: 0
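The image check in that list can be sketched in Python as follows. This is a minimal version under stated assumptions: posts live under a _posts directory as .md files, and image references are site-relative Markdown image links. The function name is hypothetical.

```python
import re
from pathlib import Path

# Captures the target of a Markdown image: ![alt](target ...)
IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")


def missing_images(site_root: Path, posts_dir: str = "_posts"):
    """Count local image references and list those whose file does not exist."""
    total = 0
    missing = []
    for post in sorted((site_root / posts_dir).rglob("*.md")):
        for ref in IMAGE_RE.findall(post.read_text(encoding="utf-8")):
            if ref.startswith(("http://", "https://")):
                continue  # external images are out of scope here
            total += 1
            if not (site_root / ref.lstrip("/")).is_file():
                missing.append((post.name, ref))
    return total, missing
```

Calling missing_images(Path(".")) from the site root returns the number of local image references and a list of (post file, reference) pairs that point at nothing.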
At the end, I also ran a Jekyll build.
bundle exec jekyll build
With a static blog, passing the build is when I can finally feel relieved. If even one Markdown file contains a Liquid syntax error, the entire site build can fail.
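One way to catch such files before building is a crude scan for Liquid delimiters. The Python sketch below is an assumption-laden illustration, not part of the workflow I actually ran: it flags every line containing {{ or {% outside a raw block, including intentional Liquid, so it only makes sense for imported posts that should contain none. The function name is hypothetical.

```python
import re

LIQUID_OPEN = re.compile(r"\{\{|\{%")
RAW_START = re.compile(r"\{%-?\s*raw\s*-?%\}")
RAW_END = re.compile(r"\{%-?\s*endraw\s*-?%\}")


def suspicious_liquid_lines(markdown: str):
    """Return (line_number, text) for lines with Liquid delimiters outside raw blocks."""
    hits = []
    in_raw = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        if RAW_START.search(line):
            in_raw = True
            continue
        if RAW_END.search(line):
            in_raw = False
            continue
        if not in_raw and LIQUID_OPEN.search(line):
            hits.append((number, line.strip()))
    return hits
```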
Result
In the end, I moved 173 posts and 1,521 images from 18 Naver Blog PDF backups into this blog.
But the numbers are not the most important part.
This work was not a simple backup. It was restoring scattered records into one system.
PDFs, Naver HTML, Jekyll front matter, category pages, sidebar navigation, image paths, and series numbering all had to align. If just one thing was wrong, the context of the records would break.
To others, it may look like I simply moved posts. But for me, it was work to reorganize the record system.
I did not just bring in many posts. I decided again how to structure the records I had built, how to recover broken data, and how to settle them into the conventions of the existing system.
Writing records matters, but holding onto them so they are not lost matters too.
This work was closer to that.