Website Content Migration Checklist for Pages, Images, Cases, and Old URLs
The short version
The most underestimated cost in a website rebuild isn't development. It's content. Every page needs a Keep / Merge / Rewrite / Delete decision before anything moves. Images, PDFs, datasheets, and case studies all get renamed against one rule, or three months later nobody can find a single asset. Old URLs need a one-to-one mapping table: 301 to the new or merged page when there's a successor, 410 when the page is genuinely gone, no orphan 404s. Chinese and English versions have to line up page by page, with every pair tagged "aligned", "needs native review", "needs rewrite", or "not translated". And a real QA pass at launch means clicking links, submitting forms, and spot-checking redirects, not just looking at the homepage and calling it done. A small team can do all of this. The discipline is writing decisions into a table instead of moving files on instinct.
In eight out of ten rebuild projects we inherit, the team underestimated this stage. The client expects design and engineering to be the bottleneck. The actual blocker turns out to be questions nobody wants to ask out loud: what year was this factory photo taken, is this PDF still accurate, who translated this English page and is it usable. The old site behaves like a five-year-old warehouse no one wants to walk into.
This checklist is for that situation. It's not an SEO primer or a WordPress tutorial. It's the working document for a small team whose CEO wants the new site live next month, whose old site has pages nobody can vouch for, and whose budget assumes the rebuild is mostly about templates.
1. Inventory
Always start by listing what's actually on the old site. Many teams skip this and start migrating directly, then halfway through realize there are forty abandoned campaign pages and five versions of the same datasheet that nobody can identify as current.
The cheapest inventory method is to export a sitemap, drop it into a spreadsheet, and tag every page with one label:
- Keep: page has traffic, has backlinks, content is still accurate. URL and copy stay roughly the same.
- Merge: several similar pages (e.g. three near-duplicate product family pages) collapse into one new page.
- Rewrite: the topic still matters, but the content is outdated or poorly written and needs a real rewrite.
- Delete: expired campaigns, discontinued products, an internal newsletter from 2018. None of it travels to the new site.
Pull data from three sources to make these calls: page views from Google Analytics for the last 12 months, impressions and clicks from Search Console, and referring domains from Ahrefs or Semrush. A page with no signal in any of the three, and no specific business reason to keep it, is a delete.
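The "no signal in any of the three" rule is easy to apply mechanically once the numbers sit in one sheet. The sketch below is a minimal illustration, not a finished tool: the `PageSignals` fields, the `suggest_label` helper, and the sample figures are all hypothetical, and the output is only a suggestion for the team to confirm.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    """One row of the inventory sheet (field names are illustrative)."""
    url: str
    pageviews_12m: int = 0      # Google Analytics, last 12 months
    impressions: int = 0        # Search Console
    clicks: int = 0             # Search Console
    referring_domains: int = 0  # Ahrefs or Semrush
    business_reason: str = ""   # filled in by sales, marketing, or product


def suggest_label(page: PageSignals) -> str:
    """Flag a page as a delete candidate only when every signal is empty."""
    has_signal = any(
        [page.pageviews_12m, page.impressions, page.clicks, page.referring_domains]
    )
    if not has_signal and not page.business_reason.strip():
        return "Delete?"
    return "Review"  # Keep / Merge / Rewrite still needs a human decision


# Sample rows with made-up numbers.
pages = [
    PageSignals("/products/led-panels/", pageviews_12m=4200, impressions=31000,
                clicks=900, referring_domains=12),
    PageSignals("/news/2019-trade-show.html"),
    PageSignals("/landing/vip-visit.html",
                business_reason="Built for a strategic customer visit"),
]

suggestions = {page.url: suggest_label(page) for page in pages}
```

The third row shows why the `business_reason` column matters: the page has no measurable signal, yet it is not flagged for deletion because someone wrote down why it exists.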
One warning: don't let one person do the inventory alone. Sales, marketing, and product each need a representative in the room, because Analytics can't tell you "this page was built specifically for the visit from a strategic customer last quarter." Business context lives outside the data.
2. File naming
Old asset folders usually look like uploads/2019/03/IMG_2837_final_v2_used.jpg, docs/产品手册-终版-修改版.pdf ("product manual, final version, revised"), case/案例3.docx ("case 3"). Carrying that into the new site solves nothing.
Define a naming rule before migration starts. For example:
- Images: context-product-or-case-year.webp, e.g. factory-line-led-2024.webp, case-bmw-warehouse-2023.webp.
- PDFs and brochures: product-doctype-version.pdf, e.g. solar-inverter-datasheet-v3.pdf.
- Case studies: industry-client-year.md, e.g. automotive-bmw-2023.md.
This step looks cosmetic but is not. Once names follow one rule, the new site's content editors, blog authors, and salespeople all know where to find an asset. Image SEO and case-page schema markup later both depend on the same naming convention.
The working order:
- Copy every old asset into a read-only archive/ folder as a fallback.
- Filter to only the files marked Keep or Rewrite.
- Batch-rename via a script, and save an old-name → new-name CSV for traceability.
- Re-encode images to WebP or AVIF, and run OCR on PDFs so they become searchable.
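The rename step with its traceability CSV might look like the sketch below. It assumes the rename plan already exists as an old-path to new-name mapping taken from the inventory sheet; `batch_rename`, the folder names, and the sample file are hypothetical. It copies rather than moves, so the archive stays untouched, and it leaves re-encoding and OCR to separate tools.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def batch_rename(plan: dict[str, str], source_dir: Path, target_dir: Path,
                 log_path: Path) -> tuple[list[tuple[str, str]], list[str]]:
    """Copy each planned file to its new name and log old -> new in a CSV.

    Returns the pairs that were copied and the old paths that were missing.
    """
    if len(set(plan.values())) != len(plan):
        raise ValueError("two old files map to the same new name")
    target_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = [], []
    for old_name, new_name in plan.items():
        src = source_dir / old_name
        if not src.is_file():
            missing.append(old_name)  # report it, do not guess
            continue
        shutil.copy2(src, target_dir / new_name)  # copy: archive stays intact
        copied.append((old_name, new_name))
    with log_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["old_name", "new_name"])
        writer.writerows(copied)
    return copied, missing


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old_folder = root / "archive" / "uploads" / "2019" / "03"
    old_folder.mkdir(parents=True)
    (old_folder / "IMG_2837_final_v2_used.jpg").write_bytes(b"fake image bytes")
    plan = {
        "uploads/2019/03/IMG_2837_final_v2_used.jpg": "factory-line-led-2024.jpg",
        "uploads/2019/03/missing.jpg": "never-created-2024.jpg",
    }
    renamed, missing = batch_rename(
        plan, root / "archive", root / "site-assets", root / "rename-log.csv"
    )
    log_lines = (root / "rename-log.csv").read_text(encoding="utf-8").splitlines()
```

The new name keeps the .jpg extension on purpose: changing the extension without re-encoding would produce a mislabeled file, so the WebP or AVIF conversion belongs in its own pass after the rename.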
3. URL mapping
URL migration is the highest-risk step in the whole project. Get it wrong and search traffic collapses, backlinks break, and the link in your CEO's email signature now returns a 404. There has to be an explicit mapping table.
The table needs at minimum four columns: old URL, new URL, redirect type, owner. Only three values are legitimate in the redirect-type column:
- 301 permanent: there's a successor page, even if the URL changed. This is the default for almost every kept or merged page.
- 410 Gone: the content really is retired and has no replacement. A 410 tells search engines explicitly that the page is gone, which is more cooperative than letting them hit a 404.
- Keep the URL: when the new architecture allows it, leaving high-value URLs unchanged removes a layer of redirect risk entirely.
A few common calls:
- /news/2019-trade-show.html has no value and no merge target → 410.
- /products/led-panel-old.html collapses into /products/led-panels/ → 301.
- /about-us.html becomes /about/ (path-only change), has traffic and links → 301.
- /blog/2018-internal-newsletter.html was an internal weekly accidentally indexed → 410.
When implementing, test the chain. No redirect should hop more than once (avoid A→B→C). Status code must be 301, not 302. Mobile and HTTPS variants need to redirect too. This matches what Google Search Central recommends for site moves with URL changes. The SEO mechanics are covered in more detail in How to Preserve SEO During a Website Rebuild, and full-domain migrations are in SEO Migration Checklist for Old Domains and Websites.
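Several of these rules can be checked on the mapping table itself, before any server configuration exists. The sketch below is one possible check under stated assumptions: the table has been loaded into `Mapping` rows, and the redirect-type column uses the strings "301", "410", and "keep". Both the row type and the `find_problems` helper are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_TYPES = {"301", "410", "keep"}  # assumed spellings for the type column


@dataclass(frozen=True)
class Mapping:
    old_url: str
    new_url: str        # empty for 410 rows
    redirect_type: str
    owner: str


def find_problems(rows: list[Mapping]) -> list[str]:
    """Flag unknown types, missing targets, loops, chains, and dead targets."""
    redirect_sources = {r.old_url for r in rows if r.redirect_type == "301"}
    gone = {r.old_url for r in rows if r.redirect_type == "410"}
    problems = []
    for r in rows:
        if r.redirect_type not in ALLOWED_TYPES:
            problems.append(f"{r.old_url}: unknown redirect type {r.redirect_type!r}")
        elif r.redirect_type == "301":
            if not r.new_url:
                problems.append(f"{r.old_url}: 301 without a target")
            elif r.new_url == r.old_url:
                problems.append(f"{r.old_url}: redirects to itself")
            elif r.new_url in redirect_sources:
                problems.append(f"{r.old_url}: chain via {r.new_url}")
            elif r.new_url in gone:
                problems.append(f"{r.old_url}: target {r.new_url} is marked 410")
        elif r.redirect_type == "410" and r.new_url:
            problems.append(f"{r.old_url}: 410 row should not have a target")
    return problems


rows = [
    Mapping("/news/2019-trade-show.html", "", "410", "SEO"),
    Mapping("/products/led-panel-old.html", "/products/led-panels/", "301", "SEO"),
    Mapping("/about-us.html", "/about.html", "301", "Tech"),  # A -> B
    Mapping("/about.html", "/about/", "301", "Tech"),         # B -> C
]
problems = find_problems(rows)
```

Here the only finding is the /about-us.html row, which should point straight at /about/ instead of hopping through /about.html.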
4. Bilingual alignment
Many clients say their old site has an English version. Open it and the picture is messier: half the English pages are stubs, some link back to non-existent Chinese parents, and the translated copy is so machine-translated that the sales team refuses to send it to clients.
Migration is the moment to build a bilingual alignment table: every Chinese page paired with its English counterpart and tagged with one of these statuses:
- Aligned: URL, structure, and CTA already match across both versions, ready to migrate as-is.
- Needs native review: an English draft exists but a native writer needs to rework it.
- Needs rewrite: the English page can't be salvaged and must be rewritten (not translated).
- Not translated: this page only serves the domestic audience and won't appear on the international site.
A common mistake here is treating "completeness" as "1:1 between languages". The overseas site doesn't need every "company news" item from the Chinese site. It does need clean, well-written service pages, case studies, and white papers, because those are what buyers actually read. The difference between localization and translation is unpacked in Localized SEO vs Direct Translation.
If the new site uses hreflang, the alignment table should also note the hreflang pairing for every page (e.g. zh-CN ↔ en) so configuration ships with the migration.
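If the alignment table stores the paths for each language, the hreflang tags can be generated from it instead of typed by hand. A minimal sketch, assuming each aligned pair is a mapping from language code to path and that tags are emitted as `<link>` elements in the page head; `hreflang_links` and the example.com base URL are placeholders.

```python
from html import escape


def hreflang_links(pair: dict[str, str], base_url: str) -> list[str]:
    """Build the <link rel="alternate"> tags for one aligned page pair.

    The same full set goes on every language version of the page, so each
    version references both itself and its counterpart.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(lang)}" '
        f'href="{escape(base_url + path)}" />'
        for lang, path in sorted(pair.items())
    ]


pair = {"zh-CN": "/zh/services/seo/", "en": "/en/services/seo/"}
tags = hreflang_links(pair, "https://www.example.com")
```

Only pairs with the Aligned status (or reviewed pages once they are approved) should be fed into this step; a page marked Not translated has no counterpart to reference.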
5. Post-launch QA
The most dangerous moment is right after migration, because everyone exhales. The real work starts there. A serious post-launch QA list includes at minimum:
- Links: run Screaming Frog over the whole site, confirm zero 404s and no redirect chain longer than one hop. Scan external links too. Replace or remove the dead ones.
- Images: every image has alt text, the file path resolves, and mobile loads correctly. Watch for old images embedded by mistake into rewritten pages.
- Forms: actually submit each inquiry form. Confirm the inbox receives it, the auto-reply fires, and the CRM has a record.
- SEO fields: title, meta description, Open Graph, canonical, schema all populated on every page. Service pages get Service schema, articles get Article schema.
- Redirects: pull 20–30 entries from the mapping table and test them by hand: popular pages, merged pages, 410 pages. Test mobile, HTTPS, www and non-www variants.
- Search Console: submit the new sitemap, file the change of address if the domain moved, watch the indexing and 404 reports for seven days.
- Analytics: on launch day, compare GA4 real-time against expected pages. Every priority page should be reporting.
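The redirect spot-check in the list above can be partly scripted. The sketch below sends a single HEAD request per URL without following redirects, then compares status and target against the mapping table. `head_no_follow` and `check_entry` are hypothetical helpers, some servers answer HEAD differently from GET, and a script like this supplements the manual pass rather than replacing it.

```python
import http.client
from typing import Callable, Optional
from urllib.parse import urljoin, urlsplit

Fetch = Callable[[str], tuple[int, Optional[str]]]


def head_no_follow(url: str) -> tuple[int, Optional[str]]:
    """Send one HEAD request and return (status, Location) without following."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_entry(old_url: str, expected_status: int,
                expected_target: Optional[str],
                fetch: Fetch = head_no_follow) -> Optional[str]:
    """Return a problem description, or None when the entry behaves as mapped."""
    status, location = fetch(old_url)
    if status != expected_status:
        return f"{old_url}: expected {expected_status}, got {status}"
    if expected_status == 301:
        resolved = urljoin(old_url, location or "")
        if resolved != expected_target:
            return f"{old_url}: redirects to {resolved}, expected {expected_target}"
        final_status, _ = fetch(resolved)
        if final_status != 200:  # a second redirect here means a chain
            return f"{old_url}: target answered {final_status}, expected 200"
    return None


# Offline demonstration with canned responses instead of real requests.
canned = {
    "https://example.com/about-us.html": (301, "/about/"),
    "https://example.com/about/": (200, None),
    "https://example.com/news/2019-trade-show.html": (404, None),
}
ok_result = check_entry("https://example.com/about-us.html", 301,
                        "https://example.com/about/", fetch=canned.__getitem__)
bad_result = check_entry("https://example.com/news/2019-trade-show.html", 410,
                         None, fetch=canned.__getitem__)
```

In a real run, leave `fetch` at its default and repeat each check for the http, https, www, and non-www forms of the old URL.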
A more systematic "what should be checked before launch" list is in Complete Website Renovation Audit Checklist.
6. Small-team tradeoffs
This list looks heavy for a small team. The realistic answer to "what can we cut?":
- Don't cut inventory, but make it fast. Three people, two days. List pages from the sitemap, vote on Keep / Merge / Rewrite / Delete, debate only the ties.
- File naming can be phased. At launch, only the Keep and Rewrite assets must follow the new convention. The Merge-related orphans can wait a week.
- Don't cut URL mapping, and start it the moment the new site architecture is decided, not after development is done.
- Bilingual alignment can be phased. Make sure service pages, case studies, and the homepage are native-reviewed in English at launch. Blog posts can ship with their existing translations and get replaced in batches.
- Don't cut QA. It's better to push launch by two days than to discover next week that a major form silently fails.
If terms like hreflang, canonical, or 301 are unfamiliar to anyone on the team, walk through the Overseas Website Glossary once before splitting up the work.
Migration ownership
| Area | Owner | Key output |
|---|---|---|
| Inventory | Content + sales + product | Sitemap tagging sheet (Keep / Merge / Rewrite / Delete) |
| File naming | Content | Naming convention doc, archive/ copy, rename CSV |
| URL mapping | SEO + tech | Mapping table (old → new → type → owner) |
| Bilingual alignment | Content | Page pair table with status |
| Launch QA | Tech + SEO | QA report (links, images, forms, SEO, redirects) |
FAQ
The old site has no sitemap. How do we inventory?
Run a crawler (Screaming Frog's free tier covers 500 URLs, enough for most SMB sites) and export the URL list. Layer in GA4's pages-with-traffic for the past 12 months and Search Console's pages-with-impressions. Merge the three lists, deduplicate, and you have a working inventory in a few hours.
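The merge-and-deduplicate step could be sketched as follows. It assumes all three exports have already been converted to absolute URLs (analytics exports often contain paths only) and that query strings and fragments can be dropped, which is not true for sites that serve distinct pages through query parameters. `normalize` and `merge_inventories` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop query string and fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", "", "")
    )


def merge_inventories(**sources: list[str]) -> dict[str, list[str]]:
    """Map each unique URL to the sorted list of sources that reported it."""
    inventory: dict[str, set[str]] = {}
    for source_name, urls in sources.items():
        for url in urls:
            inventory.setdefault(normalize(url), set()).add(source_name)
    return {url: sorted(names) for url, names in sorted(inventory.items())}


crawl = [
    "https://Example.com/about-us.html",
    "https://example.com/products/led-panels/",
]
ga4 = ["https://example.com/about-us.html?utm_source=mail"]
search_console = ["https://example.com/blog/2018-internal-newsletter.html"]

inventory = merge_inventories(crawl=crawl, ga4=ga4, search_console=search_console)
```

Keeping the source names per URL is deliberate: a page that appears only in Search Console and not in the crawl is often an orphan that nothing links to anymore.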
A retiring page has good backlinks. We're afraid to lose authority. What should we do?
Decide whether the content is genuinely obsolete. If only the URL is changing and the content is still useful, 301 to the new page and most of the authority transfers. If the content is dead but the backlinks matter, consider keeping the URL with refreshed content, or 301 to the closest relevant page. The worst option is a 404, because you lose users and authority at the same time.
Do Chinese and English URLs have to match exactly?
Not strictly required, but match them at the path level. If Chinese is /zh/services/seo/, English should be /en/services/seo/. The pairing is obvious, hreflang configuration stays simple, and operations are sustainable. If Chinese and English URL patterns drift, the team eventually can't keep them in sync.
How long until SEO recovers after migration?
In our experience, a small migration with proper 301s, an updated sitemap, and consistent internal links shows traffic recovery in 2–4 weeks. A larger migration or domain change typically returns to pre-migration levels in 6–8 weeks, with fluctuation in between. If traffic still hasn't recovered by the end of that window (a month for a small migration, two for a larger one), the usual culprits are missing 301 entries, an unrefreshed sitemap, or an indexing issue specific to the new site.
Get a diagnosis
If you're preparing a rebuild, or you're halfway through and the old content is bogging the project down, bring your old URLs and read access to Search Console and Analytics. We'll run this checklist with you in a free initial review under our website rebuild service and tell you which pages must be kept, which can be merged, and which URLs must have 301s configured before launch day.