Content Crawl Priority

Daniel
I created 9 more pages this week, but only 3 got indexed.
The first was submitted 8 days ago.
The pages are well linked, I believe.
I've never had it take this long before.
Any ideas of what kind of issues to look for or a platform I can run a test on?


Ammon
Crawl priority is not a fixed value, but rather is relative to everything else that gets uploaded, news that breaks, products that launch, trends, fads, even memes.
This is why Googlers I've spoken with don't even like to discuss crawl 'budget' as a concept. There's no such thing. Instead, each and every minute, the crawlers are working their way through an ever-updating priority queue.
The exact topic you write about, the keywords in links pointing to the URL, the strength of links pointing to the URL, the search volume of terms relating to the URL, the time of year, and all the other priority-affecting events of the day, hour, and minute all affect how soon your page will be crawled.
The longer it waits, the higher its priority, though at the same time, the longer it waits, the more higher-priority things are likely to jump the queue. But crawling is not a simple or basic thing. Crawl prioritization has its own complex mathematical algorithms, and once you actually study how it works, it is an incredible system, every bit as complex as any other algorithms Google use.
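To make the "waiting raises priority, but important things still jump the queue" idea concrete, here is a minimal toy sketch of a priority queue with aging. It is not Google's scheduler: the `CrawlQueue` class, the `importance` scores, the `aging_rate` parameter, and the example URLs are all hypothetical, made up purely for illustration.

```python
import heapq
import itertools


class CrawlQueue:
    """Toy priority queue with aging (illustrative only)."""

    def __init__(self, aging_rate=1.0):
        # Hypothetical knob: priority gained per unit of time spent waiting.
        self.aging_rate = aging_rate
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal keys

    def push(self, url, importance, now):
        # Effective priority at a later time t would be:
        #   importance + aging_rate * (t - now)
        # The aging_rate * t part is identical for every entry, so sorting by
        # importance - aging_rate * now gives the same order at any time t.
        key = importance - self.aging_rate * now
        # heapq is a min-heap, so store the negated key to pop the largest.
        heapq.heappush(self._heap, (-key, next(self._counter), url))

    def pop(self):
        return heapq.heappop(self._heap)[2]


queue = CrawlQueue(aging_rate=1.0)
queue.push("example.com/new-blog-post", importance=5, now=0)
queue.push("example.com/evergreen-guide", importance=6, now=3)
queue.push("news.example/breaking-story", importance=50, now=10)

order = [queue.pop() for _ in range(3)]
print(order)
```

In this sketch the blog post, having waited longest, overtakes the slightly more important guide that arrived later, yet the breaking story still goes first despite arriving last.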
What Google don't care about is how many pages you uploaded today, or this week, or this year. What they do care about is how many of their customers, the people doing searches, *NEED* any given piece of content quickly.
Crawls are directed in large part by importance, and in large part by customer (searcher) demand. Content that is just another version of widely available alternatives is lower priority than content about some brand new term that never turned up in searches before and now has hundreds of people searching for it (some new actor's name, a new pop group, a movie name, a new product, breaking news, etc.).
There are more than 55 million websites online. To grab just one page from each in a single 30-day month requires 1,833,333 of them to be crawled each day. That means crawling 76,389 of them per hour every hour, 24/7. That in turn means grabbing 1,273 pages per minute. Given shared hosting and virtual hosting, trying to crawl 1,273 pages per minute every minute of every hour of every day is likely to knock out a few servers. So in reality, Google have to strategically attempt to map out the IPs and make sure they don't take servers down too.
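The arithmetic above is easy to check. This short sketch simply reproduces it from the post's own figure of 55 million websites and a 30-day month; the variable names are mine.

```python
WEBSITES = 55_000_000  # figure quoted in the post
DAYS_IN_MONTH = 30

# One page per website per month, spread evenly across the month.
per_day = WEBSITES / DAYS_IN_MONTH
per_hour = per_day / 24
per_minute = per_hour / 60

print(f"{per_day:,.0f} pages per day")
print(f"{per_hour:,.0f} pages per hour")
print(f"{per_minute:,.0f} pages per minute")
```

Rounded to the nearest page, this gives 1,833,333 per day, 76,389 per hour, and 1,273 per minute, matching the numbers in the post.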
But of course, one page from every website per month would be useless. We expect Google to have grabbed the latest headlines page of all major news sites, all around the world, in every language, at an absolute minimum every hour, and to have crawled all the stories too.
Several thousand products will be uploaded to Amazon, and to eBay, every single hour. Several thousand videos will be uploaded to YouTube. Hundreds of thousands of Instagram pictures per hour, each with their own URL. Hundreds of thousands of tweets.
The logistics of crawling are genuinely mind-shattering. And the volume of new content just keeps increasing.
Then, on top of all that, you have to recrawl older content somewhat regularly to see if anything has changed, or even if it is still there. Billions more already indexed URLs.
To get crawled at all, especially within a few days, is damn near miraculous when you really understand what has to be going on to make that happen. Just a few years ago, you expected to wait maybe a month or more unless you had some pretty high importance signals, or were writing about stuff that was trending right at that moment.
Keep building your importance signals. That's stuff that obviously includes having links from other important sites, but can also include signals like widespread use of brand searches.
But above all else, for your own sanity, learn patience. It is a really freaking big world out there, and a lot happens in it every single day. Expect it to take a few weeks for content to get crawled, and try never, ever to submit it. Submitting your content to be crawled means you miss out on seeing how important Google thinks your site is by how often it decides to crawl content on its own.
