Have you ever been involved in a website migration that could have gone better?
Perhaps you’ve got some form of migration coming up, and you want to make sure that everything goes as smoothly as possible.
Today we’re talking about surviving and thriving in website migrations with a former senior member of the famed Google Search Quality team, and one of the select few former Googlers with extensive experience in policy driving, webspam hunting, and webmaster outreach. Nowadays, he applies his skill set to recovering websites from Google penalties and helping clients maximize the potential of their websites in search engines. A warm welcome to the In Search SEO podcast to the director at Search Brothers, Kaspar Szymanski.
In this episode, Kaspar shares five steps to survive a website migration, including:
- Record server logs
- Conduct an SEO audit
- Address legacy issues
- Plan ahead and choose the right moment
- Monitor your progress
5 Website Migration Tips
Kaspar: Thank you for having me. It’s always a pleasure.
D: Thank you so much for joining me. You can find Kaspar over at searchbrothers.com. Today, you’re sharing five essential steps to surviving and thriving in website migrations, starting with number one, record server logs.
1. Record server logs
K: Absolutely. One of my pet topics, if I may say so. Server logs are very much underutilized in many circumstances, and in the context of migrations in particular. In a nutshell, server logs provide insight into which landing pages search engines actually care about. When we record server logs, we can tell how much two groups of landing pages overlap: the landing pages that are being crawled regularly and the landing pages that we want to be indexed and ranked. This is just one application, the most basic one if you like.
However, server logs can’t be retrieved retroactively. Unless we start recording them, we do not have that insight, and there is no way to regain it, and the recording has to happen over an extended period of time. So when we’re talking about preparing for a migration of any sort, whether that’s a content migration or a domain migration, we need to have those server logs recorded for quite a while in order to understand what it is that Google is crawling. It is also a very important preparatory step for the subsequent second step of conducting a successful migration. But I don’t want to steal the thunder, so I’ll let you talk about that for a moment first.
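To make the overlap idea concrete, here is a minimal Python sketch of the kind of check Kaspar describes: it reads an access log in the common combined format, keeps the URLs requested by Googlebot, and compares them against a list of priority landing pages. The file names, the priority list, and the simple user-agent check are illustrative assumptions, not details from the episode; in practice you would also verify Googlebot via reverse DNS.

```python
# Minimal sketch: overlap between Googlebot-crawled URLs and priority landing pages.
import re

LOG_FILE = "access.log"              # hypothetical path to a combined-format access log
PRIORITY_FILE = "priority_urls.txt"  # one URL path per line, e.g. /products/widget

# Combined log format: IP - - [date] "METHOD /path HTTP/1.1" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]+" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

crawled_by_googlebot = set()
with open(LOG_FILE, encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = LINE_RE.search(line)
        # Note: verifying Googlebot via reverse DNS is more robust than the UA string alone.
        if m and "Googlebot" in m.group("agent"):
            crawled_by_googlebot.add(m.group("path").split("?")[0])

with open(PRIORITY_FILE, encoding="utf-8") as fh:
    priority = {line.strip() for line in fh if line.strip()}

overlap = priority & crawled_by_googlebot
print(f"Priority pages: {len(priority)}")
print(f"Crawled by Googlebot: {len(crawled_by_googlebot)}")
print(f"Overlap: {len(overlap)} ({len(overlap) / max(len(priority), 1):.0%} of priority pages)")
print(f"Priority pages Googlebot never touched: {len(priority - crawled_by_googlebot)}")
```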
D: Exactly. A brief follow-up question on the first point. I think many SEOs are probably a little bit guilty of just relying on a single analytics package. So what’s an example of a piece of data that is available in server logs that SEOs won’t have access to just by looking at standard analytics?
K: I think these are two rather complex questions, one of them being what kind of tools we can utilize. I always say to tap into everything that you can; it really depends on the budget, manpower, level of expertise, and technical acumen in your team. At the most basic level, there is, of course, Google Search Console. This is the one tool that every SEO should utilize in order to gain a basic understanding of how Google’s algorithms see their website.
There is, of course, Bing Webmaster Tools. I’m a huge fan, as Bing is another major search engine and it’s another opportunity to either verify the findings from Google or gain different insights and compare them against each other. There are some amazing commercial tools out there. I’m a big fan of Ryte and of DeepCrawl, which has been rebranded to Lumar. Botify is a phenomenal tool as well.
All of these provide insights that can be compared against each other to some extent, sometimes more, sometimes less. There is the crawling component, there is on-page, technical, off-page, all of these insights. But none of them can give us the basic insights that server logs provide. To begin with, tapping into that reservoir of data tells us what our server responses actually are, and whether they are in fact 200 OK for those landing pages that we deem valuable and desirable. Or are we returning something else altogether, such as 404 error pages? There may be expired documents, expired products, or unavailable products, which is a huge problem for commercial or retail websites.
On the most basic level, this tells us how our site responds to search engines. And what is it that those search engine bots prioritize? How much does that overlap with what we regard as our top-priority landing pages? Now, when the website is rather small, a couple of hundred or a thousand documents, that doesn’t sound like such a huge problem. However, if we happen to be talking about large websites, with tens of thousands of documents and landing pages, or even millions, this is a problem of a completely different magnitude. In that instance, preserving server logs in perpetuity and utilizing them year over year is an absolute must in order to gain an advantage in organic Google search.
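As a rough illustration of the response-code angle, the sketch below tallies the status codes served to Googlebot per URL, so that pages answering 404 or 410 stand out as candidates for redirects or removal. The log path and format are again assumptions; adapt the parsing to your own server’s log format.

```python
# Minimal sketch: which status codes does Googlebot actually receive?
import re
from collections import Counter

LINE_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]+" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"'
)

status_by_path = {}  # path -> Counter of status codes
with open("access.log", encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = LINE_RE.search(line)
        if not m or "Googlebot" not in m.group("agent"):
            continue
        status_by_path.setdefault(m.group("path"), Counter())[m.group("status")] += 1

# Overall distribution of responses served to Googlebot
totals = Counter()
for counts in status_by_path.values():
    totals.update(counts)
print("Responses served to Googlebot:", dict(totals))

# Crawled URLs that answered 404/410 at least once deserve a closer look
problem_pages = [p for p, c in status_by_path.items() if c["404"] or c["410"]]
print(f"{len(problem_pages)} crawled URLs returned 404/410")
```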
D: I get the feeling that you could give an hour presentation on each step in the sequence here. But let’s move on to number two, in terms of website migrations, and that’s to conduct an SEO audit.
2. Conduct an SEO audit
K: Yes, and that is best conducted in conjunction with utilizing those server logs. It is not a must, and I also want to say that the vast majority of commercial websites out there do not save and preserve server logs, or they don’t do it completely. It is still possible for these websites to conduct a technical audit. And the reason for doing it is that when we conduct a migration, we’re looking for a positive outcome: not only a better user experience and a faster, smoother website, but also growing rankings. And this is something that we do not want to approach with any legacy issues. Those legacy issues can be manifold. There can be off-page backlinks that are holding the website back.
But there are other factors. There are technical solutions that may have been great back in the day when they were implemented, but maybe they aren’t working for us anymore. There is content that doesn’t stand the test of time, content that was considered evergreen at some point but isn’t anymore. Google penalties are, of course, a big issue if any have been applied and haven’t been resolved. This is something that would be part of the considerations when we’re conducting an audit. All of these factors have to be looked at, reviewed, prioritized, and tackled before we conduct the migration, because we do not want to start that step towards a brighter future with the handbrake on, hampered by all the things that haven’t been addressed yet.
And that brings us, of course, to the next step in the process. But I’ll let you do the talking first.
D: Yes, step number two was conducting an SEO audit, but you did mention, and give a couple of examples of, addressing legacy issues, which is number three. Are there other areas of legacy issues that you’d like to talk about as well?
3. Address legacy issues
K: It depends. Every website is different. But frequently it is websites that have grown organically and historically through previous migrations, or maybe two different websites that have been combined and merged into one. There may be blogs, forums, wiki pages, or FAQ pages that came into existence in the past. Back in the day, they used to be useful, but nowadays they’re legacy, historical landing pages that aren’t useful anymore.
Another big issue is the hreflang application for international websites. And then there are canonicals. Canonicals are a phenomenal way to tell the major search engines: among many documents that are rather similar, this is the one that I care about. However, if they aren’t applied properly and thoroughly, they can actually backfire in a major way. Just naming this handful of technical factors, content factors, and factors that relate to how up to date the content is shows how complex this topic can be. Which is why such an audit needs to be prepared thoroughly; it takes time. And that brings us to another important aspect of migrations: migrations do take time, and they need to be planned and prepared well in advance so that all these steps can be factored in.
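Since misapplied canonicals and hreflang annotations can quietly undo a migration, here is a small, hypothetical sketch of a sanity check: it fetches a handful of pages, flags canonicals that point somewhere other than the page itself, and checks that hreflang alternates reference each other back. The URLs are placeholders, and a real audit would cover far more cases.

```python
# Minimal sketch: canonical and hreflang sanity check over a short list of pages.
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

URLS = ["https://www.example.com/en/widget", "https://www.example.com/de/widget"]

hreflang_map = {}  # page URL -> {lang: alternate URL}
for url in URLS:
    resp = requests.get(url, timeout=10)
    soup = BeautifulSoup(resp.text, "html.parser")

    canonical = soup.find("link", rel="canonical")
    if canonical and canonical.get("href") != url:
        # Self-referencing canonicals are the safe default; anything else deserves a review.
        print(f"{url}: canonical points elsewhere -> {canonical.get('href')}")

    hreflang_map[url] = {
        link.get("hreflang"): link.get("href")
        for link in soup.find_all("link", rel="alternate")
        if link.get("hreflang")
    }

# Reciprocity check: if page A names page B as an alternate, B must name A back.
for url, alternates in hreflang_map.items():
    for lang, alt_url in alternates.items():
        if alt_url in hreflang_map and url not in hreflang_map[alt_url].values():
            print(f"hreflang not reciprocal: {url} -> {alt_url} ({lang})")
```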
Please allow me to continue by sharing a few words about the third step, which is rather important, and that is addressing those legacy issues. From my experience working with large organizations and websites, it is critical not only to conduct that audit and figure out what’s currently holding the website back; it is also important to address those issues. For instance, we may have a lot of products that are expired or unavailable for whatever reason, which return a 200 OK status even though the actual landing page says that we don’t have the product. That is going to trickle down to user signals, because users come from search expecting to find the product they’ve been looking for. They end up on the landing page, they see that the product isn’t available, and they of course return to Google search and look for an alternative or rerun the query. This user behavior indicates they haven’t found what they were looking for. That isn’t great. It is a very negative signal.
Now, if we merely move those landing pages to the new platform, we will be transferring those negative signals with them. This is a big issue if the site is sizable, and the larger the website is, the bigger an issue it becomes. The reason for that is crawl budget, a topic we haven’t talked about much yet today. It is critically important to understand how much time Google needs to recrawl my website, or at least the bits and pieces that I care about. In some instances, it’s going to happen almost overnight; a website of a couple of thousand pages is not going to be an issue. But if we happen to be looking at a website with 100,000 desirable landing pages and several million landing pages that are not, and the crawl budget distribution indicates that it will take Google a month, two months, six months, or even longer to recrawl a significant part of the website, we won’t be able to benefit from the migration because it will take too long. We first have to address the crawl budget issue. We have to make sure that the crawl budget is distributed in a way that is more favorable for us, so the website can be crawled faster. For that reason, it is critically important not only to understand that those legacy issues are there, but also to address them.
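The crawl budget arithmetic Kaspar alludes to can be sketched in a few lines. The numbers below are purely illustrative assumptions; the point is that when millions of low-value URLs absorb the daily crawl, the recrawl time for the pages that matter stretches from days into months.

```python
# Back-of-the-envelope sketch: how long until the important pages are recrawled?
important_urls = 100_000           # landing pages we want recrawled after migration
total_crawlable_urls = 3_000_000   # everything Googlebot can currently reach
daily_googlebot_requests = 50_000  # distinct URLs fetched per day, measured from logs

# If crawl budget is spread evenly across everything crawlable, only a fraction
# of each day's budget lands on the pages that matter.
share_on_important = important_urls / total_crawlable_urls
effective_daily_crawl = daily_googlebot_requests * share_on_important
days_to_recrawl = important_urls / effective_daily_crawl

print(f"Share of crawl budget on important pages: {share_on_important:.1%}")
print(f"Estimated days to recrawl the important set: {days_to_recrawl:.0f}")
# Pruning or noindexing the low-value URLs shifts that share, and the estimate, dramatically.
```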
Backlinks are a very good example in this context as well, because we may be in a position where we move towards a brighter future with a migration to, let’s say, a new CMS. However, the backlink profile is rather problematic, say in a very competitive environment: travel, finance products, insurance, all these things. There might be legacy backlinks that are perceived as unnatural by Google. That is something that may trigger a manual spam action, or something that may hold us back on an algorithmic level. So we have to address those, utilizing the disavow file or even filing a reconsideration request if there is a penalty in place. All of these things have to be done before we do the actual migration. Here’s another example where legacy issues are best addressed before taking the ultimate step that we’re talking about here.
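For reference, a disavow file is plain text with one entry per line, either a full URL or a domain: prefix, with # for comments. The snippet below assembles such a file from a reviewed list; the domains and URLs are placeholders, and disavowing should only follow a careful manual review of the backlink profile.

```python
# Minimal sketch: write a disavow file from a reviewed list of problematic links.
flagged_domains = ["spammy-directory.example", "paid-links.example"]
flagged_urls = ["https://old-partner.example/sponsored-post"]

lines = ["# Disavow file prepared ahead of the migration"]
lines += [f"domain:{d}" for d in sorted(flagged_domains)]
lines += sorted(flagged_urls)

with open("disavow.txt", "w", encoding="utf-8") as fh:
    fh.write("\n".join(lines) + "\n")

print(f"Wrote {len(lines) - 1} disavow entries; upload via Google's disavow links tool.")
```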
D: And step number four, plan ahead and choose the right moment.
4. Plan ahead and choose the right moment
K: Yes, and this is very important, because many website operators do not factor in seasonality. Depending on the vertical we work in, and depending on the crawl budget, we can anticipate when the peak is going to hit us. Let’s say that Q4 is where 50% of the revenue is generated. We want to avoid the turbulence that is naturally associated with any kind of migration in the run-up to or during that period, because that’s going to impact revenue generation. So long-term planning is really important.
It also ties into our crawl budget situation, because we need to understand how long it is going to take Google and other major search engines to recrawl the website if it’s really large, and that is something we need to factor in yet again. Q4 is a very good example. In retail, a lot of business is generated in Q4, which for many businesses makes the beginning of Q1 a very good time to go ahead, conduct the migration, and see what the new data looks like. This is an SEO decision, but it does trickle down to the business side of things. And this is very important, particularly in challenging economic times, when business may not be quite as flourishing as we have been used to in previous years.
So planning ahead, long-term planning, is very important, and so is factoring in and including all stakeholders. Such a migration is primarily carried out by the technical team, but the content team, the link-building team, the marketing team, and many others may be involved at some level. So it’s important to loop them in, and to make sure this is a team effort led by one person, one decision maker, but where everyone pulls in the same direction.
D: That takes us up to step number five, monitor the progress.
5. Monitor your progress
K: Yes, and that involves server logs, so we have come full circle. Server logs give us insight into how we fared beforehand, if we have them recorded. And of course, we want to keep recording fresh server logs so we have insight into how well we are doing now, especially if we happen to be moving bits and pieces of the website successively. We don’t have to migrate the entire website in one go; if it’s a large website, we can start with subdomains or directories, bits and pieces. Then we can see how Google and other search engines embrace those new URL patterns and how they prioritize them. Monitoring the progress over time is critically important. It isn’t done by merely pushing the button, conducting the migration, and leaning back. The most fretful and teeth-grinding moment is when it happens, and we need to keep an eye on how things are progressing. Most of the time there are fluctuations, which is completely normal and something to be anticipated. But we do not want to see a situation where those fluctuations are excessive, and where they persist over an extended period of time.
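A simple way to monitor that progress from the logs is to watch two numbers: how much of Googlebot’s attention has shifted to the new URL pattern, and what status codes the legacy URLs are returning (ideally 301s, not 404s or stray 200s). The sketch below assumes a hypothetical /new/ path prefix and the same combined log format as in the earlier sketches.

```python
# Minimal sketch: post-migration monitoring from the access log.
import re
from collections import Counter

LINE_RE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]+" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"'
)

old_status = Counter()   # status codes served on legacy URLs
new_vs_old = Counter()   # how Googlebot's attention is split

with open("access.log", encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = LINE_RE.search(line)
        if not m or "Googlebot" not in m.group("agent"):
            continue
        path, status = m.group("path"), m.group("status")
        if path.startswith("/new/"):      # hypothetical prefix for migrated URLs
            new_vs_old["new"] += 1
        else:
            new_vs_old["old"] += 1
            old_status[status] += 1

print("Googlebot hits, new vs old URLs:", dict(new_vs_old))
print("Status codes on legacy URLs:", dict(old_status))
# Healthy picture over time: the "new" share grows and legacy URLs answer 301, not 404 or 200.
```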
The Pareto Pickle – Record Server Logs
D: Superb. Let’s finish off with the Pareto Pickle. Pareto says you can get 80% of your results from 20% of your efforts. What’s one SEO activity that you would recommend that provides incredible results for modest levels of effort?
K: That is very challenging, David. I have to say it depends on the individual website, the kind of resources that can be applied, and the manpower behind it. A modest effort is rather relative. For large websites, I would still say it is recording server logs. The initial effort is minimal, though for small websites it would be a challenge and probably overkill. But for a large commercial website with a lot of landing pages, saving and preserving server logs is a rather modest effort. The cost involved is negligible, we’re talking about a couple of hard drives, and the files themselves can be gzipped. There are no legal challenges to be expected; these are your own server logs. And the data can simply sit there. There is not much else that needs to be done unless you want to tap into that vast reservoir of information.
So in my opinion, the ratio of effort to return on investment is quite favorable when we start saving and preserving server logs. As you can tell, server logs are one of my favorite topics, and one of the most promising for SEOs. This is where the big gains really lie.
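For completeness, the "save, compress, keep" routine Kaspar recommends can be as small as the sketch below, which gzips the previous day’s rotated access log into a dated archive. The paths are assumptions; most teams would wire this into logrotate or a scheduled job instead.

```python
# Minimal sketch: compress yesterday's rotated access log into a dated archive.
import gzip
import shutil
from datetime import date, timedelta
from pathlib import Path

log_dir = Path("/var/log/nginx")            # assumed log location
archive_dir = Path("/data/seo-log-archive")  # assumed archive location
archive_dir.mkdir(parents=True, exist_ok=True)

yesterday = date.today() - timedelta(days=1)
source = log_dir / "access.log.1"            # the rotated log from the previous day
target = archive_dir / f"access-{yesterday:%Y-%m-%d}.log.gz"

with open(source, "rb") as f_in, gzip.open(target, "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)          # stream-compress; raw text logs shrink substantially

print(f"Archived {source} -> {target}")
```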
D: I’m your host, David Bain. You can find Kaspar Szymanski over at searchbrothers.com. Kaspar, thank you so much for being on the In Search SEO podcast.
K: David, it’s been a real pleasure. Thank you for having me. And I look forward to the next opportunity for us to have a discussion about SEO.
And thank you for listening.