
Can search engines follow your site navigation?


If you have a page you want search engines to find but it isn't linked to from any other page, it's practically invisible. Many websites make the critical mistake of structuring their navigation in ways that search engines can't access, which hinders their ability to get listed in search results.

Common navigation mistakes that can keep crawlers from seeing all of your site:

  • Having a mobile navigation that shows different results than your desktop navigation
  • Any type of navigation where the menu items are not in the HTML, such as JavaScript-enabled navigations (see the sketch after this list). Google has gotten much better at crawling and understanding JavaScript, but it still isn't a perfect process. The more surefire way to ensure something gets found, understood, and indexed by Google is to put it in the HTML.
  • Personalization, or showing unique navigation to a specific type of visitor versus other visitors, can look like cloaking to a search engine crawler.
  • Forgetting to link to a primary page of your website through your navigation menu. Remember, links are the paths crawlers follow to reach new pages!
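
To see whether your menu items are actually discoverable, you can fetch a page the way a crawler first receives it, before any JavaScript runs, and check whether the navigation links appear in the raw HTML. The following is a minimal sketch, assuming the requests library is installed; the homepage URL and the nav paths are hypothetical placeholders for your own site.

    # Minimal sketch: check whether nav links exist in the raw (pre-JavaScript) HTML.
    # "https://www.example.com/" and the paths below are placeholders, not real values.
    import requests

    homepage = "https://www.example.com/"
    nav_paths = ["/about", "/blog", "/contact"]  # hypothetical menu items

    html = requests.get(homepage, timeout=10).text  # raw HTML, no JavaScript executed

    for path in nav_paths:
        if f'href="{path}"' in html:
            print(f"{path}: present in the raw HTML, so crawlers can follow it")
        else:
            print(f"{path}: missing from the raw HTML, it may only appear after JavaScript runs")

If a link only shows up after JavaScript runs, consider also including a plain HTML anchor for it so crawlers don't have to render the page to find it.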

This is why it's essential that your website has clear navigation and helpful URL folder structures.

Do you have a clear information architecture?

Information architecture is the practice of organizing and labeling the content on a website to improve efficiency and findability for its users. The best information architecture is intuitive, meaning that visitors shouldn't have to think very hard to move through your site or to find what they need.

Are you utilizing sitemaps?

A sitemap is just what it sounds like: a list of URLs on your site that crawlers can use to discover and index your content. One of the easiest ways to ensure Google is finding your highest-priority pages is to create a file that meets Google's standards and submit it through Google Search Console. While submitting a sitemap doesn't replace the need for good site navigation, it can certainly help crawlers follow a path to all of your important pages.
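
If you don't have a plugin or crawler that builds the file for you, a sitemap can be generated with a small script. Below is a minimal sketch in Python; the URLs are hypothetical placeholders, and you would swap in the canonical pages you actually want indexed.

    # Minimal sketch: write a basic XML sitemap for a handful of hypothetical URLs.
    from xml.sax.saxutils import escape

    urls = [
        "https://www.example.com/",
        "https://www.example.com/blog/",
        "https://www.example.com/contact/",
    ]

    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)

    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )

    with open("sitemap.xml", "w", encoding="utf-8") as f:
        f.write(sitemap)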

Make sure you only include URLs that you want indexed by search engines, and be sure to give crawlers consistent directions. For example, don't include a URL in your sitemap if you've blocked that URL via robots.txt, and don't include URLs that are duplicates rather than the preferred, canonical version (we'll provide more information on canonicalization in Chapter 5!).
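
One simple consistency check is to run your sitemap URLs through your robots.txt rules before submitting. Here's a minimal sketch using Python's standard-library robots.txt parser; the domain and URLs are hypothetical placeholders.

    # Minimal sketch: flag sitemap URLs that robots.txt would block for Googlebot.
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("https://www.example.com/robots.txt")  # placeholder domain
    rp.read()

    sitemap_urls = [
        "https://www.example.com/",
        "https://www.example.com/blog/",
        "https://www.example.com/private/report",  # hypothetical blocked path
    ]

    for url in sitemap_urls:
        if rp.can_fetch("Googlebot", url):
            print(f"OK to include: {url}")
        else:
            print(f"Blocked by robots.txt, leave this out of the sitemap: {url}")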

If your site doesn't have any other sites linking to it, you still might be able to get it indexed by submitting your XML sitemap in Google Search Console. There's no guarantee they'll include a submitted URL in their index, but it's worth a try!

Are crawlers getting errors when they try to access your URLs?

In the process of crawling the URLs on your site, a crawler may encounter errors. You can go to Google Search Console's "Crawl Errors" report to detect URLs on which this might be happening; this report will show you server errors and not-found errors. Server log files can also show you this, as well as a treasure trove of other information such as crawl frequency, but because accessing and dissecting server log files is a more advanced tactic, we won't discuss it at length in this Beginner's Guide, although you can learn more about it here.

Before you can do anything meaningful with the crawl error report, it's important to understand server errors and "not found" errors.

4xx Codes: When search engine crawlers can't access your content due to a client error

4xx errors are client errors, meaning the requested URL contains bad syntax or cannot be fulfilled. One of the most common 4xx errors is the "404 (not found)" error. These might occur because of a URL typo, a deleted page, or a broken redirect, just to name a few examples. When search engines hit a 404, they can't access the URL. When users hit a 404, they can get frustrated and leave.
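
You can spot-check a handful of URLs for these errors without waiting on a report. The sketch below, assuming the requests library is installed, requests each URL and reports whether it returned a client (4xx) or server (5xx) error; the URLs are hypothetical placeholders.

    # Minimal sketch: flag 4xx (client) and 5xx (server) responses for placeholder URLs.
    import requests

    urls = [
        "https://www.example.com/",
        "https://www.example.com/old-page",
        "https://www.example.com/broken-link",
    ]

    for url in urls:
        try:
            status = requests.get(url, allow_redirects=True, timeout=10).status_code
        except requests.RequestException as exc:
            print(f"{url}: request failed ({exc})")
            continue
        if 400 <= status < 500:
            print(f"{url}: {status} client error (for example, 404 not found)")
        elif status >= 500:
            print(f"{url}: {status} server error")
        else:
            print(f"{url}: {status} looks fine")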

5xx Codes: When search engine crawlers can't access your content due to a server error

5xx errors are server errors, meaning the server the web page is hosted on failed to fulfill the searcher's or search engine's request to access the page. Google Search Console's "Crawl Error" report has a tab dedicated to these errors. They typically happen because the request for the URL timed out, so Googlebot abandoned the request. See Google's documentation to learn more about fixing server connectivity issues.

Thankfully, there's a way to tell both searchers and search engines that your page has moved: the 301 (permanent) redirect.

The 301 status code means that the page has permanently moved to a new location, so avoid redirecting URLs to irrelevant pages, i.e. URLs where the old URL's content doesn't actually live. If a page is ranking for a query and you 301 it to a URL with different content, it can drop in rank position because the content that made it relevant to that particular query isn't there anymore. 301s are powerful; move URLs responsibly!

You also have the option of redirecting a page with a 302, but this should be reserved for temporary moves and for cases where passing link equity isn't as big of a concern. 302s are kind of like a road detour: you're temporarily siphoning traffic through a certain route, but it won't be like that forever.
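
When you're not sure which type of redirect a URL is actually serving, you can follow the chain and print the status code of each hop. This is a minimal sketch assuming the requests library is installed; the old-page URL is a hypothetical placeholder.

    # Minimal sketch: inspect a redirect chain and label each hop as 301 or other.
    import requests

    resp = requests.get("https://www.example.com/old-page",  # placeholder URL
                        allow_redirects=True, timeout=10)

    for hop in resp.history:
        kind = "permanent (301)" if hop.status_code == 301 else f"temporary/other ({hop.status_code})"
        print(f"{hop.url} -> {kind} -> {hop.headers.get('Location')}")

    print(f"Final URL: {resp.url} ({resp.status_code})")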

Once you've ensured your site has been optimized for crawlability, the next order of business is to make sure it can be indexed.

Indexing: How do search engines interpret and store your pages?

Once you've ensured your site has been crawled, the next order of business is to make sure it can be indexed. That's right: just because your site can be discovered and crawled by a search engine doesn't necessarily mean it will be stored in their index. In the previous section on crawling, we discussed how search engines discover your web pages. The index is where those discovered pages are stored. After a crawler finds a page, the search engine renders it just like a browser would. In the process of doing so, it analyzes the page's contents, and all of that information is stored in its index.

Read on to learn about how indexing works and how you can make sure your site makes it into this all-important database.

Can I see how the Googlebot crawler views my pages?

Yes, the cached version of your page will reflect a snapshot of the last time Googlebot crawled it.

Google crawls and caches web pages at different frequencies. More established, well-known sites that post frequently, like https://www.nytimes.com, will be crawled more often than the much-less-famous website for Roger the Mozbot's side hustle, http://www.rogerlovescupcakes…. (if only it were real…)

You can also view the text-only version of your site to determine whether your important content is being crawled and cached effectively.

Are pages ever removed from the index?

Yes, pages can be removed from the index! Some of the main reasons a URL might be removed include:

  • The URL is returning a "not found" (4xx) or server error (5xx). This could be accidental (the page was moved and a 301 redirect was not set up) or intentional (the page was deleted and 404ed in order to get it removed from the search results).
  • The URL had a noindex meta tag added. This tag can be added by site owners to instruct the search engine to omit the page from its index.
  • The URL was manually penalized for violating Google's Webmaster Guidelines and, as a result, was removed from the index.
  • The URL has been blocked from crawling by the addition of a password required to access the page.

If you believe that a page on your website that was previously in Google's index is no longer showing up, you can use the URL Inspection tool to learn the status of the page, or use Fetch as Google, which has a "Request Indexing" feature to submit individual URLs to the index. (Bonus: GSC's "fetch" tool also has a "render" option that lets you see if there are any issues with how Google is interpreting your page.)

Tell search engines how to index your site

Robots meta directives

Meta directives (or "meta tags") are instructions you can give to search engines regarding how you want your web page to be treated.

You can tell search engine crawlers things like "do not index this page in search results" or "don't pass any link equity to any on-page links." These instructions are executed via robots meta tags in the <head> of your HTML pages (the most commonly used method) or via the X-Robots-Tag in the HTTP header.
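
To confirm what directives a given page is actually sending, you can check both places at once. Below is a minimal sketch assuming the requests library is installed; the URL is a hypothetical placeholder, and the pattern match is a simplistic check rather than a full HTML parser.

    # Minimal sketch: report robots directives found in the HTTP header or the HTML.
    import re
    import requests

    resp = requests.get("https://www.example.com/some-page", timeout=10)  # placeholder URL

    # Directive sent via the HTTP header, if any
    print("X-Robots-Tag header:", resp.headers.get("X-Robots-Tag", "(none)"))

    # Directive embedded in the HTML, if any (simple pattern, not a full parser)
    match = re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]*content=["\']([^"\']+)["\']',
        resp.text,
        re.IGNORECASE,
    )
    print("robots meta tag:", match.group(1) if match else "(none)")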
