The Quick, Down-n-dirty Guide to PageRank
So, in a few discussions lately it has become clear that some people still really struggle to get their heads around how PageRank works.
Imagine you have a website. For simplicity, the website has ten overall categories, and each consists of ten pages. One of those categories is our 'top level pages' and includes the Homepage, the About page, and a bunch of other top-level content that is linked to on every page of the whole site.
Each of those pages in the top level category also links to a category homepage or index page.
The Category pages each link to 9 other pages about the specific product or service or whatever that the category is about. These are your second-tier content pages, and are the more specific pages for the product or service that the category focuses on.
So, that's 10 top-level pages, each linking to each other, plus to 9 other category level index pages. Only the category homepages link to the other 9 pages within each specific category.
Okay, so with that really simple structure in mind, imagine that about 50% of all the external links to the site point to the homepage, and the other 50% in total point to various specific content pages or to category index pages.
Those exact ratios are not too important. The main thing is that we are accounting in some way for 100% of all the links pointing to your site, and thus 100% of all the link 'juice' that flows into the site.
You have 100 pages, and 100% of your link 'juice' value, pictured in your head, right?
Now, remember the structure. Your site Homepage is linking to 9 other top level pages, and also to 9 category level index pages. That's 18 links out from that page. So the 100% of passable value is divided between the 18 links, meaning each gets 5.56% of the total passable link value from the homepage going to those linked pages.
(There is actually a damping factor, meaning that not all the value of links that comes to a page gets passed out again, but that a percentage is lost. The original patent had this at 15%, but the fact is it could be more, less, or even a variable amount. For now, we won't worry about the damping factor as it'll only confuse anyone already struggling to understand the basics of PageRank. For now just remember it exists, and is why direct links are still a bit more valuable than internal links.)
So each of the top level pages is getting roughly 5% of the total link power that the homepage has based on the link structure. Each of those other top-level pages is also linking back to the homepage, and to each other, and to those 9 category index pages, so each of those top-level pages also has 18 links in total, and is passing just over 5% of the value they have (5.6%) through each of those links. So the homepage is getting some added power from the fact that what it links to is also linking back, but it is only about 5% of 5% (roughly 0.25% to 0.3% of the total value that we said was coming into the site as a whole).
The Category level index pages were also linked from every page, so they got the same 5.56% as the others, but each of those pages has 10 links to the 10 top-level pages, 9 links to the other category pages, and 9 links to the deep content pages in the category itself. So this time we have 28 links that the total value of links has to be split between.
That means each of those category index pages is passing 1/28th of the 5.56% that was flowing in, meaning roughly 0.2%, to each page it links to.
Remember, for the deep pages that are inside this category, not linked to directly from the rest of the site, that 0.2% of all the passable value is the ONLY real link value they are getting. That's all the 'juice' they have to work with when Google is working out the 'authority' style metrics.
If I create 10 new pages in one of those categories, so that it now has 20 pages in total in the category, then the category homepage that links to them now has 38 links instead of 28, and the link juice passed to each is now 1/38th of the 5.56% instead of 1/28th. Sure, you added new content, hopefully more keywords, and that is all great for creating *relevant* pages, but at the same time you dropped the PageRank score of ALL the pages in the category too.
Instead of each getting 0.2%, now each is getting 0.15%, or 25% less 'juice' than before.
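The arithmetic in the example above can be checked with a short sketch. It assumes, as the post does, that the damping factor is ignored and that a page splits its passable value evenly across its outbound links; the function name `share_per_link` is made up for illustration.

```python
from fractions import Fraction

def share_per_link(incoming, outbound_links):
    """Split a page's passable value evenly across its outbound links (damping ignored)."""
    return incoming / outbound_links

homepage_value = Fraction(1)  # treat the homepage's passable value as 100%

# Homepage: 9 other top-level pages + 9 category index pages = 18 links
per_homepage_link = share_per_link(homepage_value, 18)

# Category index page: 28 links before the new content, 38 after adding 10 pages
deep_before = share_per_link(per_homepage_link, 28)
deep_after = share_per_link(per_homepage_link, 38)

# Relative loss for every deep page in that category
drop = 1 - deep_after / deep_before

def pct(value):
    return round(float(value) * 100, 2)

print(pct(per_homepage_link))  # 5.56
print(pct(deep_before))        # 0.2
print(pct(deep_after))         # 0.15
print(pct(drop))               # 26.32
```

The exact drop is 10/38, a little over a quarter, which matches the "25% less" figure as a round number.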
And THAT is why when you add new content, you absolutely need to make sure that you'll be getting more links too. That's why a percentage of all content you create should be dedicated link-attracting content. Stuff that draws new, genuine links, giving the whole site more power so that even as it gets diluted across more pages, and is a smaller percentage, it is a bigger total amount.
Anytime you add content without getting more total link value coming to the site as a result, you dilute the PageRank of all other pages on the site, and need to compensate for all pages having less 'power'.
If it dilutes so much why do 10000 local pages rank on a relatively new site with 10-50 links pointing to it and the only thing linking to the 10000 local pages is a html sitemap in the footer – both easy and competitive niches we see this pattern – not saying what you said is bs not at all but the maths don't make sense for those scenarios
Because Google Local doesn't use PageRank. Many reasons, but basically if they did, nobody living in a small suburb could ever find their local butcher, as the one in the city would always have more PageRank.
Local is an extension of maps, NOT an extension of Google Search Algorithms. It uses an entirely different set of criteria, where the main ones are radius, and footfall (android devices all tell Google where they go, what shops they enter, for how long, and more importantly, which rival/competitor shops they walked past to get to the one they used).
User » Ammon
Possibly but if local relies on radius and footfall why do we all make sites rank 400 miles away and at various places 400 miles apart using the same site where there is zero footfall- my head hurts 😅
Sabraw » User
Is that for organic though, or maps?
Ammon ✍️ 🎓
Partly lack of competition. Most other small businesses don't do any kind of Search Engine Optimization (SEO) at all, and so just about anything you do is going to put you ahead – at least until you meet one with strong signals. If they all have about the same level of footfall, and all have about the same level of business presence generally, whichever one adds a feather's worth of weight is still suddenly the heaviest.
User » Sabraw
The topic was page rank i assumed we were talking about organic web pages and links
Ammon ✍️ 🎓 » User
You were the one who brought up Local pages, by which I assumed you meant local search, the one main search type that doesn't use PageRank (named for the inventor Larry Page, not for web pages) at all.
In regular Google search, PageRank applies, and all in my post holds true.
User » Ammon
To me local pages are "domain/plumber-michigan" etc – as i said 10000 local pages on a website linked to via html sitemap in footer, not Google Business Profile (GBP)
User » Ammon
Therefore by the same maths mentioned above with tiny amounts of page rank split between 10s of 1000s of pages would have next to zero page rank yet outrank sites decades old with links on many occasions
Ammon ✍️ 🎓 » User
If Google shows maps in the Search Engine Result Page (SERP), it is at least a partially local search. For many (most) kinds of local services, Google defaults to local search and uses the local search algorithm.
Do a search for the one word 'Dentist' and you'll see local results and a map pack. Google's algorithms determined that this has local intent long ago, and so uses the local search style algorithms for it. Does this for most service types from plumbers to roofing contractors.
User » Ammon
Cool so then page rank is nothing to worry about for local or national because local doesnt need many links and national just needs a few more 😁
My last reply sounded sarcastic, i have a crap way of wording things 😅
Ammon ✍️ 🎓
PageRank is always worth worrying about because it doesn't just affect your ability to rank. The crawl priority system also uses it (along with topic trends, query demand, etc) to determine when or if to crawl a page, and then to render it, index it, etc.
All the people worrying about the recent "discovered not crawled" and "crawled, not indexed" messages could probably fix it just with PageRank (not quite all, some would have to worry about quality scores too, but you get the gist).
User » Ammon
With regards to the dentist thing though pretty much anything brings up a maps listing these days even national terms like "dressing tables" "furniture" so when is it local algorithm and when is it the main? Seems the same to me – i hate the way questioning things in a debate creates the view of disagreement
Ammon ✍️ 🎓
Never be afraid to ask an honest question. Debate, when respectful can often be a lot like argument, but is always a good thing. It's only when respect is lacking that it becomes problematic.
The answer to that is the same reason that I'm always amazed that every town still seems to have that one furniture store, where everything seems about 3 times what you'd pay online, and you always scratch your head as you pass it wondering how they stay in business. Old people 😃
Well, technophobes too, or those who still prefer to see a thing physically before they lay out hundreds of dollars for it. So long as Google's analysis of how people interact with their Search Engine Result Pages (SERPs), and what things lead the most people to the shortest possible session completion (without rage-quitting), shows that some people want local suppliers, Google will continue to make sure it shows local suppliers. And because of PageRank, the only way to be *sure* to include some local, in all situations and locations, regardless of how many big national or international sites there might be, is to add a map pack or local pack.
Gerencser » Ammon
Now why did you tell people that there is a difference between search and maps. It's the #1 reason bad SEO users fail so hard at local 😉 😉
Ammon ✍️ 🎓
You think they fail hard? You should see the guys who do well with one local site and then think they can use the same thing to crack Google search 😃
Marvin » User
Pagerank is still major IMO for local, you know this. I've seen very competitive local terms and I've ranked #1 for a lot of them, and I could clearly see the progression in rankings as I threw more links at it, the higher it ranked.
The local organic rankings are using a hybrid algorithm I'm sure, the local and the organic. This is obvious because you can sometimes see local services ranking top in the Maps pack, but well below in the top 10 for organic, or vice versa. If they were just using one algo, their rankings (organic and local/maps) would mimic each other.
In the original PR document, each page had a PageRank (PR) of 1; the sum of all pages' PR still converges to the total number of web pages so that the average PR of a page is 1. This was open to abuse as you could create 10,000 pages and link them all to the target page, passing on 10,000 PR, in theory (not accounting for the damping factor).
Bill covered a patent addressing this:
"In practice, this vulnerability of PageRank is being exploited by web sites that contain a very large set of pages whose only purpose is to "endorse" a main home page.
All the endorsing pages are created on the fly.
This large number of endorsing pages, all of them endorsing a single page, artificially inflates the PageRank score of the page that is being endorsed."
"The patent filings provide a couple of potential solutions. The first is to split a minimum PageRank value amongst all of the indexed pages of a domain or an IP address rather than add more PageRank as new pages are created. The second is to assign a domain-based trust rank, based upon the web server it is hosted upon."
This was back in 2007 so I'm sure there have been many iterations to address the issue. However, my theory as to why mass-page sites work is because pages can still have inherent PR value generated by itself (without internal or external links) although how much and whether they do or not is probably a more complex calculation, and if you have 10,000 pages with self-generated PR, passing PR to 500 pages, those 500 pages might just rank if the competition is low enough.
Ammon ✍️ 🎓 » Marvin
Creating pages still meant you had to have links *to* the pages you created that had passable PageRank. Early abuse of the kind you mentioned literally stopped working that way in the noughties, although modern Private Blog Networks (PBNs) essentially use a similar exploit with just about enough genuine PageRank somewhere in the network to carry the rest.
If you look even at the original papers, the very earliest form of PageRank, they specifically point out that orphan pages are removed before the calculation of PageRank begins.
So, if you create hundreds of pages, you have to link to those hundreds of pages, and you diffuse the PageRank being passed around the site by the same amount.
Marvin » Ammon
Yes they did address it although we really don't know what iteration of PR they have right now. The original definitely allowed for some abuse.
You did need to link to the page you create, which diluted the PageRank (PR), but if you have a website of 1 page with a starting PR of 1, you have 1 PR. You create 3 pages, A links to B, B links to A & C, and C links to A, using the original PR formula what you would get is:
A – 1.3
B – 1.2
C – 0.7
We still created more PR out of thin air (no external links) for A than what it originally had. Provided we're not trying to rank ALL pages we created and instead trying to funnel that 'juice' to a few target pages, it's a good way to abuse the system.
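For anyone who wants to reproduce the three-page example, here is a minimal sketch. It assumes the un-normalised form of the original formula, PR(A) = (1-d) + d × (sum of PR(T)/C(T) over pages T linking to A), with d = 0.85 and every page starting at 1; the function name and iteration count are illustrative choices, not anything Google has published.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterate PR(p) = (1 - d) + d * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    pr = {page: 1.0 for page in links}  # assumed starting value of 1 per page
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            inbound = sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr

# A links to B, B links to A and C, C links to A
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}
scores = pagerank(links)
total = sum(scores.values())

for page, score in scores.items():
    print(page, round(score, 2))
print("total", round(total, 2))
```

Under these assumptions the values settle at roughly 1.19 for A, 1.16 for B and 0.64 for C, close to (though not identical with) the rounded figures quoted above. The total stays at 3, the number of pages, so A ends up above 1 only because C ends up below it.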
They've fixed it since then but I wonder if Google has a modified version of minimum PR value for pages that would allow a site owner to increase the PR value of a single page on their site past PR1 without additional external links.
Ammon ✍️ 🎓 » Marvin
It didn't start with 1 exactly. The default value of a page is the exact same number as the damping factor, so that they cancel out to allow convergence. So in the originals that would be 0.15. It converged at one, meaning the sum total value of all links in existence would be equal to the number of pages in existence – another reason that removing both orphan pages and dangling links was a necessity.
Marvin » Ammon
The above calculation already factors in the damping: d is 0.85, so (1-d) is 0.15. But it does assume a starting value of PR1 for every page which is as you said; "meaning the sum total value of all links in existence would be equal to the number of pages in existence"
So a 1-page site would have a total PR of 1, whereas a 5-page site would have 5 PR collectively. Now distributed and with the damping factor, you could end up with the 5th page barely having any PR but the main homepage with the lion's share of that PR.
I'm referring to this formula:
PR(A) = (1-d) / N + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
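A quick sketch of the formula exactly as written above, with the division by N, for anyone checking the numbers. Read literally, this version makes the scores of all pages total 1 rather than N; multiplying each score by N puts it back on the "average page is 1" scale used elsewhere in this thread. The function name, the three-page graph and d = 0.85 are assumptions for illustration.

```python
def pagerank_normalised(links, d=0.85, iterations=100):
    """PR(A) = (1 - d) / N + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))."""
    n = len(links)
    pr = {page: 1.0 / n for page in links}  # assumed uniform starting values
    for _ in range(iterations):
        pr = {
            page: (1 - d) / n
            + d * sum(
                pr[source] / len(targets)
                for source, targets in links.items()
                if page in targets
            )
            for page in links
        }
    return pr

# Same toy site: A links to B, B links to A and C, C links to A
links = {"A": ["B"], "B": ["A", "C"], "C": ["A"]}
scores = pagerank_normalised(links)

total = sum(scores.values())                                    # stays at 1
rescaled = {page: score * len(links) for page, score in scores.items()}  # average of 1

print(round(total, 6))
print({page: round(score, 2) for page, score in rescaled.items()})
```

Either scale describes the same relative distribution; only the units differ.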
Ammon ✍️ 🎓 » Marvin
Except that the average page out there has an actual PageRank score that looks more like 0.00106732
Because the average is 1, and that average has to include every major site, portal, resource, etc. Think of the score of a url like amazon's homepage, and how low your pages have to be for the average to still be 1.
The UK has this famous statistic – 2.4 children. That was the average number of children in a household in the UK when the study was conducted. Obviously it is a 'mean' average. Most people, not especially familiar with statistics imagine that this means the mode average is probably 2 kids per household. But in fact, the mode average, the most common number of children in any actual household is 0, zero, nil. It is simply that there are enough households with 4 or more children that the average comes out to more than 2, even though the most common number of children is zero.
From 2.4 to zero isn't what most people expect. It's a pretty significant difference. But you can expect the exact same sort of difference in the PageRank scoring. Remember the Pareto Principle, and that it repeats.
80% of all the links will go to just 20% of the sites. But 80% of those links to just those 20% will also just go to the top 20% of the 20%, and again, and again. Which is why the top 1% are always so incredibly, mind-bogglingly rich (in whatever resource) compared to the majority of us.
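Both points can be illustrated with a few lines of Python. The household list below is invented purely so that the mean comes out at 2.4 while the mode is 0; it is not the real UK data. The Pareto part simply repeats the 80/20 split three times.

```python
from fractions import Fraction
from statistics import mean, mode

# Made-up sample of ten households, chosen only to show mean vs mode
children_per_household = [0, 0, 0, 0, 1, 2, 4, 5, 6, 6]
print(mean(children_per_household))  # 2.4
print(mode(children_per_household))  # 0

def pareto_levels(levels, top=Fraction(1, 5), share=Fraction(4, 5)):
    """Apply the 80/20 rule repeatedly: (fraction of sites, fraction of links) per level."""
    return [(top ** k, share ** k) for k in range(1, levels + 1)]

for sites, links in pareto_levels(3):
    print(f"top {float(sites * 100)}% of sites hold {float(links * 100)}% of links")
# top 20.0% of sites hold 80.0% of links
# top 4.0% of sites hold 64.0% of links
# top 0.8% of sites hold 51.2% of links
```

After only three rounds, under one percent of sites hold more than half of all links, which is why a mean of 1 says very little about the score of a typical page.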
What I love about PageRank, even though we don't get to see it, is that it is a relative amount of all the links in existence that link to any one specific URL compared to every other known URL. That's literally one way of using the score. Another is of course what it was originally designed for, which is about what proportion of all journeys may go through, from, or to, a specific page, such as Amazon.
It's an incredibly useful metric, with the link graph being able to be used in about as many different ways as Network Theory itself can be.
Marvin » Ammon
That's true, but I guess the exact number is somewhat irrelevant as it's all relative to how much you actually need? The point would stand that every time you created a new page, you are creating PR out of thin air which could then be abused by leveraging it towards a target page.
If I created just the homepage and got 0.00106732 PR from that, I'd still get a lot more PR to my site as a whole if I created 1000 of them (diluting the rest of the web in the process).
Again, they already addressed this issue so it's a moot point but we have seen mass-page sites rank a lot of pages by utilising this strategy so perhaps some version of this algo that is less susceptible to abuse is around.
Ammon ✍️ 🎓 » Marvin
One thing to always look out for is when you find yourself saying "I've seen X cause Y", but cannot say "I have extensively tested and been able to do X and get Y result". I mean, if you saw it working so much, you'd have done it yourself, at least to test, right? And so that you are only saying you saw it, not that you've done it and here are some working examples kinda makes me think you couldn't replicate it.
What very often happens is we get correlation but not causation. You've seen sites that rank that do this. What you won't have seen is the thousands of sites that DON'T rank that do this, because they don't rank. So you always, always need to correct for that natural and completely unavoidable observational bias.
For many, many years, probably around 2 whole decades, you'd pretty much always find that every high-ranked website included a keywords META tag. Because of that, many of the tools that examine the features of high ranked pages to 'backward engineer' the ranking factors all reported that the keywords meta was important. But in fact, the search engines had been completely ignoring it for years. They strip it out and act like it isn't there at all, so all having it does for SEO itself is add a few extra bits of filesize for no good reason, which is a slight negative.
SEO myths naturally cause a higher than normal level of correlation. That's because more of the highly ranked pages have employed some level of SEO knowledge to get there than lower ranked pages. So any widespread myth in SEO will occur more often in highly ranked pages than in low-ranked pages, not because it works, but because both things correlate to the presence of someone who reads SEO tips.
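A toy calculation makes the mechanism concrete. Every number below is invented for illustration: two groups of site owners, one that reads SEO tips and one that doesn't, with the keywords meta tag having no effect at all on ranking inside either group.

```python
from fractions import Fraction as F

# Hypothetical population; all figures are made up for the example.
groups = {
    "reads_seo_tips": {"pages": 200, "uses_tag": F(9, 10), "ranks_high": F(4, 10)},
    "does_not":       {"pages": 800, "uses_tag": F(1, 10), "ranks_high": F(1, 20)},
}

def tag_rate(ranked_high):
    """Share of pages carrying the tag among high-ranked (or low-ranked) pages."""
    with_tag = F(0)
    total = F(0)
    for group in groups.values():
        p_rank = group["ranks_high"] if ranked_high else 1 - group["ranks_high"]
        pages = group["pages"] * p_rank
        total += pages
        # Tag use is independent of ranking within a group: zero causal effect
        with_tag += pages * group["uses_tag"]
    return with_tag / total

high = tag_rate(ranked_high=True)
low = tag_rate(ranked_high=False)
print(round(float(high) * 100, 1))  # 63.3
print(round(float(low) * 100, 1))   # 20.9
```

In this made-up population the tag shows up on about 63% of high-ranked pages and about 21% of low-ranked ones, even though it does nothing. A tool that only compares features of ranking pages would report it as important.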
I can tell you a lot of stuff that I have personally extensively tested, not just seen on other sites. Stuff you can test and do yourself and get the same results. One of those things is that pruning pages that don't actually perform in their own right can have a major positive effect on your SEO. If you were to search this very group for case studies of others reporting the exact same thing you'd find plenty. It is replicable.
It proves that what you think you are seeing isn't what is causing the ranking.
There are scores of clients in the past decade where the absolute first move in fixing their woes was to remove a ton of blog posts. That's because one of those myths of SEO is that adding lots of fresh content is a ranking factor, and helps with SEO. That myth was going around for years (not surprisingly since most of those pushing it the hardest were SEO agencies that would provide content for you, and thus profited from you believing you needed to keep buying long after everything needful had already been created).
The myth came about partly through "SEO content creation" agents and agencies spreading what they wanted to be true, and partly through SEO users so bad that the ONLY time they ever ranked was during that very short, temporary period when fresh content gets the freshness boost (so Google don't completely omit news and so stop it from ever gaining links). Regardless of how it came into being, it was (and still is) a nonsense that can seem to work by correlation sometimes, but is actually not a thing.
So, about 10 years back I started more and more often seeing companies that had bought into the myth, or had their previous SEO agency insist on it, who'd been adding hundreds or thousands of posts over the previous few years, none of which were particularly good, very rarely added any links, and, as I've been explaining, were diluting the PageRank that the site had across far more pages than was useful, meaning that better pages were weaker than otherwise.
The cure that is 100% replicable is to go through the content with a full audit, and consider removing anything that either didn't get any third-party links (never earned a single citation) or could not be shown to be directly contributing to conversions. Even pages that in some way helped with a few conversions would also be examined to see if any could be consolidated – same content over fewer pages – without losing their effect on helping conversions.
Try it yourself. It works, 100% of the time where a site has become bloated with pages that genuinely are not adding to either links or conversions. Completely replicable. And that 100% replicable thing is how you know a thing is really a thing.
Marvin » Ammon
It's obviously very difficult to test for something like this and isolate the variable unless you have a lot of resources to spare. Of course, you could create a mass-page site and build absolutely no links to it, not even social, but that seems like an impractical approach as that requires a ton of resources. Moreover, the ranking or not ranking of the pages can't be attributed entirely to this either.
I've probably run over 100 tests over the past month alone, so we do test everything, and try to isolate the variables as much as we can, but in SEO, unless you're doing SVT which is not truly representative of the real search environment, it's impossible to say conclusively whether the test confirmed your hypothesis or not. At best, we can make very educated guesses.
I have both done content pruning to success for a few clients, at the same time, I have created mass-page sites in competitive spaces and ranked them successfully, with traffic value near half a mil $ per month. Matter of fact, I've just launched 3 "mass-page" sites recently. Of course, we're not building 0 links to them, that's an impractical approach.
But I have certainly seen with my own mass-page sites, outrank established companies that have far fewer pages and 5x more genuine links. I also see it with other sites (which is why others make the same observation). I'm not saying mass pages were the only, or even the main contributing factor, but I wouldn't completely discount what Dean said either. And I'm well aware that mass-page sites could be ranking for 100 other reasons despite their diluted PR
As the calculation above based on the original PR formula shows, you can increase the overall accumulated PR of the site by having more pages (diluting the rest of the web in the process), and if you mess up your internal linking, you could dilute the PR of your target pages, but you could also boost them.
Ammon ✍️ 🎓 » Marvin
You're creating an either/or argument that nobody else is having here. A larger site has more pages, thus more content, thus will rank for a wider range of keywords and get more searches bringing traffic. Nobody is disputing that, or questioning it. Hopefully the content, because there is more of it, and it gets more traffic on more keywords will also get more links. If so, it can cancel out the absolute fact that more pages diffuse and dilute the PageRank of a site as a whole, by earning more to begin with.
EVERYTHING I have said to this point is in the context of that. Not in contrast to it. The two facts do not contradict because they are looking at different things, even though both things may apply to the same situation or scenario and may contradict in some ways in that specific instance.
It is a little like looking at diet advice where they say to lose weight you should burn fat. But at the exact same time, muscle mass is more dense, and weighs more, than fat. If you put on muscle to lose fat, you may actually end up weighing more than before, not less. That's simply how the two different facts interact.
Adding pages that do not also attract more links will dilute PageRank to all those pages because of your navigation links. Other factors, such as increased relevancy for more keywords, more long tail search, and a long list that apply to the OTHER few hundred signals Google use aside from PageRank may partially negate this, or greatly outweigh it.
I'm not saying that PageRank is the only factor to consider. I'm not saying that just worrying about PageRank will instantly get high rankings and solve all other SEO issues. I am ONLY talking about PageRank, as PageRank, for those who want to understand the specific factor that is PageRank. Not the entire set of algorithms.
There are many cases where adding more, relevant, meaningful content, will be the right thing to do. And there are cases aplenty where adding weak, low quality content, merely because someone told you that more pages will give you PageRank will be one of the worst things you could possibly do other than blocking Google entirely via the robots exclusion protocols.
If there were a causal link between adding pages (of any quality) and gaining PageRank (which has neither a quality scoring mechanism, nor a relevancy one) then I am absolutely 2,000% certain that tools that explore the link graph to build their own, such as Ahrefs, Majestic, Moz and Semrush, would have spotted it and we'd all know. More importantly, those cases we BOTH report of gaining rankings by pruning content shouldn't have worked.
Marvin » Ammon
Perhaps this was not the point you're making, or you're misunderstanding my point.
My point was that a website with 100 pages will have more TOTAL PR than a site with fewer pages, per the algo formula above. In creating 100 pages, YES, we are diluting the PR of the 100 pages, BUT I'm not saying we need to rank 100 pages. I'm saying we create 100 pages to rank 1 page. So you have all of that additional PR, diluted or not, and then it's funnelled to the single target page.
That is obviously abusing the PR system. And I'm not even arguing that it still works, nor advising anyone to do it. I'm simply saying that is how the original PR formula would work.
There's a great paper on this here:
In the screenshot example, we see the 3 child pages with diluted PR but the homepage has more PR than what it would have had, had it existed on its own without more pages. This was how it could be abused to create a ton of crap pages to funnel the link juice to a few target pages. Obviously, this is not practical as there are plenty of drawbacks to having a ton of useless content, for both User Experience (UX) and SEO, but that's not my point.
Ammon ✍️ 🎓 » Marvin
There is another possibility that you are ignoring: that you could either be wrong, or that the misunderstanding could be yours.
The page you cite there has some errors. Most particularly and obviously you can see a major one with ease where the author states:
"Observation: every page has at least a PR of 0.15 to share out. But this may only be in theory \- there are rumours that Google undergoes a post-spidering phase whereby any pages that have no incoming links at all are completely deleted from the index…"
The fact is that it is clearly stated in several of the papers and patents of PageRank that both orphan pages, and dangling links, are completely removed from the calculation process, before the calculations begin.
Not an 'observation', nor a 'rumor', nor a 'theory' but an absolute statement in the documents he's actually citing. And somehow he missed it.
For anyone following along but unsure of the terminology, an 'orphan' page is a page that has no parents, meaning a page that has no backlinks. If nothing links *to* a page it is an orphan, and none of the links *on* that orphan page can or will be calculated. A 'dangling link' is any link to a document that Google was unable to find or retrieve – this can be a broken link, a link blocked by the robots.txt, a link blocked by some other server response, or originally a link that had the 'nofollow' attribute might also be treated this way in the PageRank calculations.
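As a rough sketch of what that clean-up step might look like on a small link graph: the code below drops links to pages that could not be retrieved, then repeatedly removes pages that nothing links to. The function name, the data layout and the choice to repeat the orphan check until nothing changes are all assumptions made for illustration, not a description of Google's actual process.

```python
def prune_graph(links, externally_linked):
    """Remove dangling links and orphan pages before any PageRank calculation.

    links: page -> list of link targets. A target that is not itself a key is
           treated as unretrievable, so the link to it is dangling.
    externally_linked: pages that have backlinks from outside this graph.
    """
    # Drop dangling links
    graph = {page: {t for t in targets if t in links} for page, targets in links.items()}
    while True:
        linked_to = set(externally_linked)
        for page, targets in graph.items():
            linked_to |= targets - {page}  # assumption: a self-link is not a backlink
        orphans = set(graph) - linked_to
        if not orphans:
            return graph
        # Remove orphans, along with every link on them and every link to them
        graph = {p: t - orphans for p, t in graph.items() if p not in orphans}

site = {
    "home": ["about", "category", "missing-page"],  # "missing-page" was never retrieved
    "about": ["home"],
    "category": ["home", "product"],
    "product": ["category"],
    "old-landing": ["home", "stray"],  # nothing links to this page
    "stray": ["home"],                 # only linked from the orphan above
}
pruned = prune_graph(site, externally_linked={"home"})
print(sorted(pruned))  # ['about', 'category', 'home', 'product']
```

In this toy site, "old-landing" is dropped as an orphan, and "stray" follows because its only backlink came from that orphan.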
The other main mistake in the page is that it assumes that Google have told us everything, and that it all works exactly as the example. The documents actually specifically state that the damping factor given is ONLY an example, and immediately go on to talk of others that were used and how useful other values were – without disclosing the specifics.
I don't know if you have ever filed a patent, dealt with the process, or simply had long chats with a patent lawyer. It's a field all of its own, with its own practices and considerations, and patents are generally very, very carefully thought out. You want to protect your intellectual property by registering it, making your application as broad as possible to prevent as many loopholes or slip-bys as possible (i.e. you don't want someone changing just one tiny thing and thus not being covered), but also specific enough to actually be granted (patents can't be too broad, there has to be a specific patentable idea or process there that is original).
There are also 'Defensive Patents' which are where you patent things you may not even be using, because they were steps that led you to work out something else you don't want to talk about, or that otherwise defend things NOT actually in the patent, or that were tangential but important. It's like laying a patent minefield to slow anyone else's progress generally as they have to pick their way through.
The core thing to understand about patents is that they prevent others from using the same idea, even if they come across it by themselves, because you have claimed ownership of that idea first. The tradeoff is that you've given that secret up, and after around 20 years (exact durations vary by jurisdiction and by type of intellectual property) it passes into the public domain and EVERYONE owns it and can use it freely.
A patent guarantees that others can't use your patented process or idea for a couple of decades, but also guarantees that it has an expiration date, and your 'secret' will expire, not being your unique advantage forever. The process itself, the whole idea of patents, was to prevent secrets being lost forever once people died without ever passing them on. Some other types of intellectual property, such as copyright, have terms tied to human lifespans instead (commonly the life of the creator plus 70 years, which echoes the biblical lifespan of "three-score years and ten").
Anyway, the point is that a lot of patent applications cover just enough of the idea or process to get the protection, but also leave just enough actual secrets or non-specifics that you may get a random few years on top, after it expires, before any rivals actually get it to work quite the same way. A few more years of exclusivity.
It would be very unusual and rather irregular in patent terms if the Google patent covered everything and gave up all the specifics. Instead it will have just enough to protect the core ideas (and as many uses of them as possible) while still hopefully leaving just a few 'trade secrets' left out.
Ian (Rogers) always came across as a nice guy, but certainly as more of a math/dev guy than a law guy (or even marketing/psychology guy). However, I didn't know him as well as I knew others who wrote definitive guides to PageRank, such as Chris Ridings, or even Phil Craven, whom I'd originally introduced to the PageRank documents and whose PageRank calculator I helped beta-test (long forgotten now, but very popular at the time). Because of that, I never corrected him about his mistake. He was a kind of rival or competitor to guys I knew better, for keywords relating to explaining PageRank. 🙂
But you don't need my long intimate history, nor my experiences with patent owning companies, nor my long chats with Google engineers, or extensive studies of PageRank to get to the core understanding. Just a little bit of logical thinking will get you to the same place.
All of those PageRank papers, documents, and calculators were out and well known in the early to mid noughties. Nearly 20 years ago. If it ever actually worked that you could create PageRank from nothing by simply creating thousands of pages, you can be absolutely certain that at least 1 of the hundreds of thousands of people who have read those documents over the years would have tried it, found it worked, done it to death, been seen doing it and copied by others, and we'd all have been doing it for well over a decade.
It wasn't just 'fixed' and never mentioned. It never worked in practice. PBNs are the closest, and those are still no magic bullet, and actually depend on what external links they can get for their power.
How do you go about non-linkworthy pages then? Direct internal links towards them? Purchase external links?
Make a link-worthy page that can link to it.
Viktor » Ammon
Is there a minimum and maximum number of internal links from that link-worthy page you would point towards others? Or as much/little as it makes sense?
Ammon ✍️ 🎓 » Viktor
In most of SEO, the maximum is all about what you can justify as the best use of time and resources for the expected return. If you have enough time and labour, all you can get.
PageRank is a BS concept we should move past. How much PageRank your pages have at best only matters relative to the pages you'll compete against in the Search Engine Result Pages (SERPs) for whatever your keyword target is. Not only that, PageRank assumes all links are equal when it's super clear how much more powerful contextually relevant links have become in the last 10 years. It's a handy conceptual teaching tool, but no one should be seriously worrying about this.
PageRank has always, ALWAYS been a system for weighting links – to specifically measure how all links are NOT equal. Given that fundamental misattribution, the rest is basically invalidated before you wrote it.
PageRank was entirely mathematical, mate. Simplified, it was there to measure the volume of links to a page and look at the volume of links that each linking page has. It did not look at the topical relevance of a page or the topical relevance of the links that it has. It was massively simplistic compared to where we are now.
Page A, B, and C have 10 links each and link once each to Page Z. Page D has 5 links and links once to Page Y, as does Page A.
So 3 highly linked pages link to page Z
And 2 highly linked pages link to page Y
In the PageRank example, then, Page Z is a more valuable page than Page Y.
Where we are now in many verticals, and are moving more towards, is placing context on those links.
That context might tell you that Page B and C are coupon sites with no value, and that Page A is the MayoClinic linking to Page Z and Y as sources of information on Asthma, and Page D is a resource but they ONLY link to Page Y.
In the current example, Page Y is more valuable as it's got less shit coming to it and more contextually relevant links.
Because PageRank assumes all links are contextually equal.
But go off! Let's all SEO like it's 1999
Home of the Office of Disease Prevention and Health Promotion – health.gov
Ammon ✍️ 🎓 » Patric
Oh dear… Where to start?
No, PageRank was never, ever, just about the volume or number of links. That is why it is a link WEIGHT system. This is the second time telling you an undisputed fact. Don't make the exact same mistake a third time please or you'll really embarrass yourself, not to mention cause untold injuries as everyone headdesks yet again.
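To make the weight-versus-count distinction concrete, here is a minimal sketch in Python. It assumes the simplified model from the top of this guide (no damping factor, each page splitting its value equally among its outgoing links). The function name, the pages, and the starting values are all made up for illustration and are not taken from any Google document.

```python
def passed_value(page_value, outlink_count):
    """Value one link passes when a page splits its value equally among its links."""
    return page_value / outlink_count

# Hypothetical linking pages: (value the page holds, number of links on that page)
links_to_z = [(1.0, 50), (1.0, 50), (1.0, 50)]  # three links from weak, link-heavy pages
links_to_y = [(8.0, 10), (4.0, 5)]              # two links from stronger pages with fewer links

value_z = sum(passed_value(v, n) for v, n in links_to_z)
value_y = sum(passed_value(v, n) for v, n in links_to_y)

print(f"Z: {len(links_to_z)} links, receives {value_z:.2f}")
print(f"Y: {len(links_to_y)} links, receives {value_y:.2f}")
```

Under these assumed numbers, Page Z has more links but receives far less value than Page Y, which is the sense in which links are weighted rather than counted.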
You are however right that it was topic insensitive, even though you didn't know the label. That's where the idea for 'Hilltop' came in, and TSPR (Topic Sensitive PageRank) and a dozen other suggested (and often patented) refinements. Google actually used their own version of a DMOZ clone for a while as a means to categorizing broad topics, but that was many years ago, and predates the little green PageRank toolbar (which actually reused the iconography from the directory to provide the bar).
Are you starting to get the idea that maybe I have studied PageRank a fair bit over the years? You'd be right to do so.
Google still use PageRank, and in fact, use it in more ways than ever. As I mentioned in a comment above, it is one of the core metrics in assigning crawl priority, though the others tend to be about topic (specifically whether search demand is increasing or decreasing, whether it is trending, how many existing results they have, etc).
Surely you have heard people refer to Google using hundreds of signals? Well, PageRank is one of them, and still a highly significant one. Others apply to relevance (that's the keywords bit), and in fact, when Google run a search, they process relevance first, pulling all pages that are remotely about the thing, and THEN apply ranking. Truly irrelevant results are not even in the ranking process.
When you hear people talking about EAT, what metric do you think Google use to base 'Authority' on? In fact, when you want to calculate expertise, one prime way is to first calculate all the relevant sites, and then only look at how THOSE link, which ones the pages that rank for the topic link to.
Authority is like your general 'importance' to everyman and the known Internet, where Expertise is more about your specific credentials and respect within the field, with additional lookups to Trust, which could be about the verifiable fact (Knowledge Graph) among other signals.
But the important part there is not whether PageRank is the *only* thing that matters to ranking. It isn't. You have to be relevant first, and then hundreds of different signals may be used or not, and in varying amounts, depending on the exact type of query. But make no mistake, a lot of different specific metrics use PageRank as one of their ingredients, mostly because even now it is one of the most robust things, harder to influence (without detection) than most other signals. Others are only robust through either being specific resources inaccessible to public interference (which can also make them slightly biased), or through being unknown.
People have been successfully feeding wrongful data into the Knowledge Graph for a while to show that it is not as secure or hard to manipulate (at least yet, they'll keep working on it). The link graph, because it is all relative values, relative to the entire known internet, is harder to significantly corrupt. Not impossible, at least on the small scale, but a lot harder.
Very insightful, it does require some deeper level of analysis, but it definitely is valuable.
There were plenty of people saying that link sculpting does not work, but maybe it was just wishful thinking on their part.
Distance (or hops or steps) from 'seed' sites as a vector also seems to matter.
And then, of course, there is the question of whether individual pages are ranked or SERPs that solve a query, as a whole and how often they are tested/switched.
And last, but not least, there seem to be different algorithms that are differently weighted for different queries, localities and SERP features.
Yes, there are many different 'flavours' of algorithm, where some signals are dialled up as more important to the intent than usual, and others may be dialled down. For example, where a search query is deemed to be 'news' related, or simply a keyword that is suddenly trending (even if the algos don't know why) this can trigger a QDF signal (Query Demands Freshness) so that pages that are older, have more links they acquired over time, etc. won't block out the latest news or updates for that query.
In the reverse scenario, there are certain types of query that are actually best served by older, trusted, established pages, and not necessarily by whatever the latest blog post trying to rank for the words wants, and freshness might be dialled down in favour of stronger authority signals – e.g. if you were looking for an 'official' site for a brand, a govt agency, etc.
These don't change how each signal is scored, but rather change how much of that score might be factored into the final overall ranking. That's why some people think CTR changes the ranking of sites, but in fact it is changing the signal of search intent to favour freshness (which usually happens to be their test pages that are among the newest).
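As a rough illustration of scores staying fixed while the blend changes, here is a toy Python sketch. Everything in it is hypothetical: the two signals, the page names, the scores, and the weights are invented for the example and say nothing about Google's real signals or values.

```python
# Hypothetical signal scores for two pages (0 to 1). These stay fixed across query types.
pages = {
    "established-official-page": {"authority": 0.9, "freshness": 0.2},
    "breaking-news-post": {"authority": 0.3, "freshness": 0.95},
}

# Hypothetical 'flavours': the same signals, dialled up or down per query type.
flavours = {
    "news_or_trending": {"authority": 0.3, "freshness": 0.7},
    "official_site": {"authority": 0.8, "freshness": 0.2},
}

def blended_score(signals, weights):
    """Weighted sum of signal scores; only the weights differ between flavours."""
    return sum(weights[name] * signals[name] for name in weights)

def rank(flavour):
    """Order the pages from best to worst under the given flavour's weights."""
    weights = flavours[flavour]
    return sorted(pages, key=lambda p: blended_score(pages[p], weights), reverse=True)

for flavour in flavours:
    print(flavour, "->", rank(flavour))
```

With these made-up numbers the news post leads under the freshness-heavy blend and the official page leads under the authority-heavy blend, even though neither page's individual scores changed.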
Finding which 'flavour' of algorithm, which precise blend of signals best suits the query is what Google do actually use click data for, but it tends to apply to the entire batch of query types that use that flavour, not an individual SERP, and certainly not any individual site. Basically, Google wants to find the 'blend' that results in the fastest satisfaction (no further clicks or refined searches). Sometimes that can mean that a site with a high bounce rate may actually score well on that 'flavour' choice, if it provided psychological 'anchoring' that made a later result seem better, more certain, than it otherwise had. That's just how that can work out, just like showing a customer a product they don't buy can help get more sales of a cheaper or alternate product that they do.
Also, absolutely edge distances are important. As far back as the noughties I was explaining it as being like Six Degrees of Kevin Bacon, where the lower that number (the fewer links between your site and one of the Internet's mega-trusted landmarks) the better.
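The 'degrees of separation' idea can be sketched as a breadth-first search over a link graph. This is only an illustration in Python: the graph, the page names, and the choice of seed are hypothetical, and how Google actually measures or uses such distances is not public.

```python
from collections import deque

def hops_from_seeds(links, seeds):
    """Breadth-first search: fewest link hops from any seed page to each reachable page."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in distance:
                distance[target] = distance[page] + 1
                queue.append(target)
    return distance

# Hypothetical link graph: page -> pages it links to.
links = {
    "trusted-landmark": ["news-site", "university"],
    "news-site": ["your-site"],
    "university": ["blog"],
    "blog": ["your-site", "obscure-page"],
    "your-site": [],
}
print(hops_from_seeds(links, ["trusted-landmark"]))
```

In this made-up graph "your-site" sits two hops from the seed, and "obscure-page" sits three hops away.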
None of which changes the fact that PageRank is one of the signals used in rankings, and in crawling/rendering/indexing, and that adding more pages without getting more links to match absolutely will and does slowly dilute PageRank for all other pages on the site. That's the only point I'm trying to make, not give a full breakdown of all the thousands of ranking signals Google use in various flavours of algo. 🙂
What we probably missed out on is the fact that each page you create comes with a residual PageRank as well. In other words, while the dilution concept exists, creating more and more pages creates and retains a higher amount of PageRank juice on your site. That probably answers User's question as well.
The 'default value' of any page is exactly and precisely identical to the damping factor, and is why the damping factor exists, precisely so that you cannot just create pages to create PageRank from nothing. Since I already explained that the damping factor was largely an unnecessary complication to add to a topic that is already complex enough to confuse people, I did explain that I'd skip it. Basically it is the damping factor (and the corresponding 'default value' allowing Google to start calculating from the first page it finds, before knowing *its* backlinks) that makes PageRank scale to all the known links on all the known pages. Looking at the papers where they explain when and where the reiteration stops really helps show what the damping factor is about. Without it you wouldn't ever get the convergence point to have a final calculated score.
The formula is (if I remember correctly): 0.15 + 0.85(total page rank from other pages). So, it is not getting canceled out if I understand it correctly. The damping factor is based on the Random Surfer Model, and it would make sense that if you create a page, there's a minuscule probability that a random surfer will come across that page somehow, and that is exactly how page rank is explained in the papers you mentioned I think. I am not talking about the reiteration either, because it is a limit to infinity equation, I believe, that converges to one score as you said. And it is seen in practice as User noted. The internal links have a certain page rank value, and that is not merely from what they received from the home/category pages but also that tiny residual amount they are born with.
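As a rough sketch of the formula as quoted above (0.15 plus 0.85 times the PageRank passed in from linking pages, with each linking page splitting its score equally across its outgoing links), the iteration can be written in a few lines of Python. The three-page graph and the function name are made up for illustration, and 0.85 is only the example value from the papers, not a claim about what Google runs.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iterate PR(p) = (1 - damping) + damping * sum(PR(q) / outlinks(q)) over pages q linking to p."""
    scores = {page: 1.0 for page in links}
    for _ in range(iterations):
        new_scores = {}
        for page in links:
            # Sum the share each linking page passes along its link to this page.
            incoming = sum(
                scores[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_scores[page] = (1 - damping) + damping * incoming
        scores = new_scores
    return scores

# Hypothetical three-page site: home links to both pages, each links back to home.
graph = {
    "home": ["a", "b"],
    "a": ["home"],
    "b": ["home"],
}
scores = pagerank(graph)
for page, score in scores.items():
    print(page, round(score, 3))
```

With no dead-end pages in the graph, the scores in this form add up to the number of pages, which is the 'residual' amount being discussed here. Whether Google's production formula behaves this way is not something the example can show.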
Ammon ✍️ 🎓 » Ronny
Yup, the original papers and patents had 15% as their example, but clearly and deliberately stated this was not the only value Google had used, and was merely illustrative. In fact, they gave considerable mention to the option of, instead of using a score or percentage at all, using a specific page *as* the damping factor, which would be far more like the idea of a seed page.
If Google found a way to use a value that represented several trusted, core, 'seed pages' as their damping factor, that might have been really effective for them, and not something they'd necessarily want to give away to all rivals who read the patent. All I say on that score is that back in the day there were only a relative handful of TBPR 10 pages out there, and that most of them looked to have been hand-picked. We often talk about edge distances from those kinds of pages, so perhaps the damping factor is a part of how that works. The only thing we can know for sure is that it is very unlikely that Google will tell us, or that they gave away their own exact formula in the patent when an example value that was completely different would be just as good for getting the patent approved, and a lot better for preserving their edge.
Yes, similar to the model Majestic uses for their tool. I agree it could be totally different from what is included in the patent. I am sure it is drastically more complicated than this, at least in <year>.
Ammon ✍️ 🎓
To give just a little further illustration, I noticed that you mentioned the 'Random Surfer' model, which we all know has been replaced with better models in later patent filings, including the 'Reasonable Surfer' model that supposedly weights links that are in prime spots, or more visible as 'heavier' in weight than links in the footer, or links in the templated areas (which means the navigation menus).
Well, the guys at Majestic were the only mass link-crawler that was ever really willing to get into a discussion with me about which crawl model they use, and they told me about how they'd tested the values. The absolute difference between the weighting of the most heavily bolded, right-in-your-face, flashing neon in-copy link and the faint, tiny link in the footer could be no more than a single-digit percentage before it started to create a set of weightings that were clearly massively divergent from Google's.
But the way Google had phrased all their talk about it made it sound like the weight difference would be far more significant than a couple of percentage points, right? Google don't necessarily want us having all the math, and some parts are most definitely going to be the wrong values, as rival engines can work it out for themselves.
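To picture how small a difference of a few percent is, here is a minimal Python sketch that shares a page's passable value across its links in proportion to a per-link weight. The 5% figure, the link labels, and the function name are assumptions chosen purely for illustration; they are not Majestic's or Google's actual values.

```python
def split_by_weight(page_value, link_weights):
    """Share a page's passable value across its links in proportion to each link's weight."""
    total = sum(link_weights.values())
    return {link: page_value * weight / total for link, weight in link_weights.items()}

# Hypothetical weights: a prominent in-copy link weighted 5% above a footer link.
weights = {"in-copy link": 1.05, "menu link": 1.00, "footer link": 1.00}
for link, value in split_by_weight(100.0, weights).items():
    print(f"{link}: {value:.2f}%")
```

Under that assumption the prominent link ends up with only slightly more than an equal third of the page's value.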
So the fewer internal links there are in a post, the stronger the link juice?
There's still only the 100% of value. The whole pie, if you will. But fewer links on that page means each get a larger slice of that same-sized pie than otherwise.
Also, not just the internal links, and certainly not just the links in the main content area, but each and every link in the code of the page counts toward the sharing out.
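The pie analogy is just a division. A minimal Python sketch, assuming the simplified model used throughout this guide (damping ignored, every link in the page's code counted equally); the function name is made up for illustration:

```python
def share_per_link(total_links_on_page, passable_value=100.0):
    """Percentage of a page's passable value that each link receives when split equally."""
    return passable_value / total_links_on_page

for link_count in (9, 18, 50, 100):
    print(f"{link_count:>3} links: {share_per_link(link_count):.2f}% each")
```

The 18-link row reproduces the 5.56% figure from the homepage example earlier in the guide.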
Marlo » Ammon
So we'd better keep the number of links per page as low as possible. Both internally and externally.
Ammon ✍️ 🎓 » Marlo
Yes and no. After all, if every page has fewer links, you start to need more and more pages just to cover internal links to all of your content, and those more pages means you need more links – or you have to reduce the amount of content pages you have.
It's fine to have lots of links on pages, as long as you are allowing for it in your strategy, by ensuring you have plenty of links giving the site that juice. Creating link-earning content, of almost any kind, works for that.
Food for thought…we agree that the more internal links on a page, the more diluted the PageRank passed to each, primarily because each successive link is deemed less likely to be clicked on. However, you could argue that the deeper a link is into the site, the more relevance (to the subject matter) it acquires along the journey. Could this potentially become a more important factor now that relevance is gaining prominence in the algorithm?
There's a couple of assumptions there that I'm not entirely sure I agree with. Deeper content is deeper, often thereby further into a niche, but that only makes it more obscure or specialized, not necessarily more relevant.
For example, if you had a site from an insurance provider, their homepage is all about being an insurance provider and is right there on the surface. Deeper pages would tend to be about being a provider of a specific type of policy, more niche, but not more relevant, except to that very specific niche.
The second assumption is that Google are making relevance more of a ranking factor. Can you point out to me any place Google ever said they were not doing their absolute best for relevance? Do you think Google engineers would say "Oh yeah, we were never worried about relevancy before, we just wanted to put random stuff in the Search Engine Result Pages (SERPs)"? 😃
Relevancy isn't more or less, just different. In a way, they are *less* relevant to some specific queries in terms of words today than 5 years ago, and are instead trying to understand the intent, the things that were not necessarily said in the query, but were meant.
Better understanding of intent, yes. More 'relevant' depends on whether to the actual wording, or to the intent.
A few years back, I could be confident that if I targeted the keywords "bespoke tailoring" that I was aiming for the particular demographic that would know and use the word 'bespoke'. Now, with query rewriting and synonyms I have lost some of that, and may be getting downmarket people who would only search for "custom fit" (who don't tend to be the same at all).
So, respectfully, I'm not sure I can agree with either of those assumptions. Not without a lot of reservations and conditional statements, at least.
Ah, but as you say, there are many degrees of relevance. And while I agree that deeper content CAN be more obscure, there is also the situation where someone keeps clicking because they are digging deeper into a subject and thereby become more invested in the subject matter. Suppose the algorithm could learn to associate the contextual relevance of each 'click' to the initial search query and either reward or demote the page accordingly, rather than simply diluting the pagerank based on hierarchy (and other known factors).
Just food for thought, because I love the subject matter and I'm enjoying the discussion.
I think I'll call it the 'Serious Surfer' update. 🙂
Ammon ✍️ 🎓 » Lanyon
Deeper is pretty much always either more obscure, or is pagination, and either way, it is the page level above/before that gives it any context.
Think of any real-world situation where you hold off on making your biggest point until page 2, hoping that a 'less relevant' page will encourage them to even click through to page 2. Now, broader relevance on page 1, and more focused on the second, sure. That's what I mean by more specialized, more niche, and more obscure.
Imagine a fairly typical content structure like we might see on almost any product or service website:
Insurance Company > Insurance products > Car Insurance > Third-Party, Fire & Theft Car Insurance
The first or top-level page is like the homepage level, and mostly relevant to the brand, and to any query like "Insurance Company" or "Insurance Provider" or "Insurance Near Me" (I'm teasing a bit with the last, given that "near me" is almost always completely unnecessary, but the general idea is the same).
If I were a salesman looking to connect with Insurance companies because I sell something of interest to them, then that top level is the most relevant and every deeper page is less relevant to my needs.
If I want to focus on the actual insurance services themselves, but haven't specified a particular type of insurance then the second level in my example, the 'Insurance Products' page is more relevant. But did you notice the instinctive and correct use of the word 'focus'? The idea of narrowing the field of view, shutting out the peripheral stuff not directly relating to my focus? Think about the word for a second. Focus hasn't changed my eyes. I haven't suddenly gotten better vision (physical or mental), I've simply removed some of the irrelevant distractions and noise.
The third-level page is even more specific, more focused and is now ONLY relevant to those with cars, or looking to insure a car. It is far less relevant for anyone interested in home insurance, or life insurance, or travel insurance. It is only more relevant to a much smaller target market – more niche – those interested only or specifically in car insurance and not any other type.
The fourth-level, where we are right down to a specific type of car insurance cover, only covering third-party actions, fire and theft, is obviously less relevant to someone who wanted fully comprehensive car insurance. It's that focus thing. More relevant to just one specific thing, but commensurately less relevant to all the other, related things.
Can you think of any real-world, recognizable situation where a deeper page is relevant to more things, not less? I'm drawing a blank.
The part about "Suppose the algorithm could learn to associate the contextual relevance of each 'click' to the initial search query and either reward or demote the page accordingly" is a whole other matter, and a complex one. I've touched on parts of it in the past, and some is obvious.
i.e. Google can only universally track clicks on their own SERPs, not on anyone else's websites. Android and Chrome give them some access to more, but only for the proportion of the market that uses those (so they'd have no signals for searches relating to Apple products, Linux, Unix, etc.), and they are always under intense scrutiny for how they collect and use that data. By law they are only allowed to even save the data at all by taking away everything personally identifiable, meaning never ever a particular user or a particular session, because those would sometimes be personally identifiable (like every time someone checks Facebook or LinkedIn or Twitter, etc).
They have to bundle thousands of sessions and millions of clicks into amorphous, completely unidentifiable aggregate data, by law, with the penalty for not doing so being billions of dollars in fines, just from the EU alone (which has successfully won against them on more than enough cases, as per 'the right to be forgotten' etc.).
So point one is they don't have the data, and are not allowed to have the data. They can't afford to 'cheat' as just one single whistle-blower, one single disgruntled ex-employee would have the power to ruin them. And Google actually have had a LOT of disgruntled ex-employees leak all kinds of stuff, so that's a very real risk and situation.
Google only use click data and behaviour data that either is directly about interaction with their own pages (SERP usage, further query refinements, follow-up searches even days or weeks later), or aggregate data that has had all specific user/specific session related stuff stripped from it very, very carefully.
And that's just point one of a whole list. Point 2 is that most click data is noisy, has many possible meanings, and is something Google cannot use simply because it makes results worse to even try.
Did I go to a second site because the first result didn't satisfy me, or because the answer was so good I needed to confirm it was true? Was I just getting a second opinion that actually only reinforced the value of the first? That's a very real meaning and context most overlook. Pogosticking is absolutely useless, has no value at all, to telling you anything. It can be a positive signal, a very negative signal, or any point in between, and there's no way to tell which from the data itself.
What I can tell you is that the #1 SERP result almost always has the absolute highest bounce rate of all results. And it has that because it is the #1, the first you'll see, with nothing to compare it to *until* you look at a second, third, etc. result. Every time Google rank one result higher than before, they increase its bounce rate. Think of that awful noise you get from feedback over a microphone, and that is what they'd be doing with data to themselves if they even tracked bounce rate on the individual results.
Ammon, I love that you're taking the time to share in my musings. I'm drinking coffee now ☕️
(Firstly, just to clarify, I'm talking about internal links, not external, and assuming that Google could recognise the difference).
The difference in my perspective is that I'm proposing that specificity be rewarded by Google, not penalized by pagerank dilution.
I need a new car and I've seen an ad for the new Audi. I'm cashed up and ready to buy (the first of my hypothetical scenarios).
I don't know much about it, other than it looks good and I liked the ad.
I visit the website. First page grabs me with visuals and general sales pitch.
I want to know more.
I click on the link to the various models and price comparison.
I then click through to each model for more detailed information.
Each click takes me to more specific information, that is progressively more relevant to my original intent (to find out more about the Audi and decide if it's the right choice for me).
Yes, the specificity makes the page less relevant to some, but in my case, quite the opposite. It takes me closer to a purchase decision.
In this scenario, should the PageRank be diluted as someone takes this journey through a website? Or should it be rewarded for extending the conversation with even more specific information that takes the searcher closer to their objective (and the objective of the website, which is to sell cars).
Yes, the page above provides the context, but the deeper page provides more relevance to my particular search/intent.
Is it more obscure? That's relative to the point of entry. If my search was quite specific, – e.g. "audi models <year>", then I would likely be taken straight to that page.
In my hypothetical Google world, the algorithm could potentially 'learn' the most popular sequence of page visits performed on any particular site and reward subsequent patterns by increasing the rank of each visited page.
Or it could recognise that the journey taken by most searchers reflects the logic of the sitemap that has been submitted.
Or, Google could recognise that the subject matter of a linked page is still directly relevant to the initial search term. E.g. the original query might be "audi near me" (ha ha) which takes me to the homepage, and the linked page title might be "audi <year> range", adding value to my search. It's not random clicking, it's based on a considered page/content structure.
This in turn sends signals of quality and credibility that add further value to the page.
Overall, this would encourage site owners to create better, more considered websites and content. And SEO would be an unquestionable element of the design and build because it would correlate directly with the overall intent of the website.
Back to the real world…I do appreciate that PageRank is just one form of measurement and note your explanations in comments above about relevance (and other factors) being considered by Google before PageRank even enters the equation. It's such a complex subject. 🙂
Ammon ✍️ 🎓 » Lanyon
I love talking about Search Engine Optimization (SEO), and online marketing in general, as it truly is a passion of mine, so chatting about it with someone actively asking is always a lot more fun than trying to talk about it with people not that interested. 🙂
I'll try to address all the major points, and even in order, but whether I manage that as I type we'll have to wait and see.
>>talking about internal links, not external, and assuming that Google could recognise the difference
Google certainly *can* recognize the difference. The more pressing question is of course whether they choose to, and that comes down to whether or not it significantly improves results to do so at the cost of extra processing cycles. I generally see a lot of signs that they don't bother that much except in bolt-on processing, stuff like Panda and Penguin. But of course, Google do have patents about counting it somewhat in the link weighting, just like 'Reasonable Surfer' applies weighting to links by location. The evidence however is that the weighting difference is tiny, like at most a couple of percent difference plus or minus, if that much.
>>proposing that specificity be rewarded by Google, not penalized by pagerank dilution
It is. We call this rewarding 'Long Tail Search' and we call the way to gain that reward 'Long Tail Keywords'. i.e. more specific content ranks for more specific queries. In my previous reply I gave the example of car insurance, and third-party car insurance, being more specific pages deeper than the broader insurance pages. The car insurance page would be a better match for those searching for 'car insurance' or 'insurance for my car', but a much worse result for "insurance policies" where they hadn't said (either specifically or via their recent search history) that they wanted it for a car, and not a home insurance, life insurance, medical insurance, key-man insurance, or other 'insurance' interest.
This already happens, without any additional special measures, because the broader, higher-level content will tend to have a higher share of the PageRank through the site, but less matching to the keywords, synonyms, and more content talking about other less specifically-relevant stuff. While those deeper, more topically focused pages tend to be ALL about the specifics, thus have more relevancy signals matching the query – more synonyms because more content on the specific topic, instead of just one or two paragraphs on a page talking about other topics too.
To take your Audi example, if I search for just 'Audi', Google have very little context to work with, so they have to play kind of safe. The most obvious top result, considering that Audi is a famous brand, and an entity, is to have that big link to their official homepage for the manufacturer (brand). If they have a YouTube channel, an official twitter account, and other such stuff, that will probably be in the first page results too.
Assuming Google have any level of location data on you, they'd probably want to include your local official Audi dealership too. Again, this is a safe result since there will likely be help and links there to all the more specific things, and going from your unspecific query, broad results like that have the best chance to help the most people.
What would be terrible as a result is to link to a specific product page for, say, an Audi R8. No matter how cool the car, the chances that that's exactly what the searcher wants are minimal. Nor do I want a blog page from someone just talking about what a great car it is, unless that blog is on a well-respected car reviews site, and the post is an in-depth review of Audi as a whole, their ethos, etc.
We cannot know whether the person searching for 'Audi' wanted to buy a car, find out where the nearest dealer was, just learn about the brand, or find spare parts. Thus a page about the top models of <year> is a poor result.
The person who wants that result will almost certainly either be able to browse there quite quickly and happily from the official Audi homepage (or local dealership), all from that first click, or else will soon refine their search, adding words like 'latest models' or 'family saloon' or whatever refinements (long tail keywords) fit the specifics they want.
Kolodgie » Ammon
Do internal links have the same dampening factor as external links?
Yes. PageRank itself makes no mention of same site, same domain, etc. There is simply one URL linking to another. All the stuff about ownership or whatever tends to be in after-filters, or in weighting the links somehow in the link crawl. PageRank is the calculation part, and doesn't even understand the concept of 'sites', only link structures across the entire graph.
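To make the "no concept of sites" point concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is an illustration under assumptions, not Google's implementation: the function name pagerank, the three-page link structure, the example.com URLs, and the damping value of 0.85 (a commonly cited textbook figure) are all made up for the example. The same link structure is scored twice, once with every page on one host and once with every page on its own subdomain.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {url: [urls it links to]}.

    Assumption: damping=0.85 is a commonly cited textbook value, not a
    figure taken from this discussion.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Passable value is split evenly between the outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its value evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page link structure, described by page name only.
structure = {"home": ["a", "b"], "a": ["home"], "b": ["home", "a"]}


def build(url_for):
    """Turn the named structure into a URL graph using a naming scheme."""
    return {url_for(p): [url_for(t) for t in targets] for p, targets in structure.items()}


one_host = build(lambda name: f"https://example.com/{name}")
subdomains = build(lambda name: f"https://{name}.example.com/")

scores_one = pagerank(one_host)
scores_sub = pagerank(subdomains)

# The calculation never looks at hostnames, only at who links to whom.
same_scores = all(
    abs(scores_one[f"https://example.com/{name}"] - scores_sub[f"https://{name}.example.com/"]) < 1e-9
    for name in structure
)
print(same_scores)
```

Because the calculation only ever sees which URL links to which, the two sets of scores match. That says nothing about any after-filters or link weighting, which sit outside this calculation.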
Fun experiment for every SEO to try once. Take a domain you can create subdomains on as you like. You have all already heard that Google treat subdomains as separate sites, right?
Okay, now create a site of ten pages, but instead of giving those pages different filenames in a single directory, make each the index page of its own subdomain. Create a navigation menu that works, but where each URL is on its own subdomain. The only thing making this a site, from the user's Point Of View (POV), is the link structure. In URLs this is not a site. Will Google treat this as close enough to being a site, despite the fact that no two pages are technically on the same site? It's fun, and very educational.
Kolodgie » Ammon
Have you done this subdomain test? If so, what were the results?
Even though PageRank doesn't specifically mention a difference between internal and external links, in your experience, have you seen a difference in dampening factor?
I agree with everything you're saying and love this topic.
Ammon ✍️ 🎓 » Kolodgie
In my tests (multiple, I ran this a few times), there was very little difference in how Google reacted to this 'site' as opposed to how it treats one all on the same domain. The link structures are the majority of how Google determines sites, and inter-page relationships/relevancy.
The fun thing is that this is such an easy experiment to run that everyone should try it sometime and see for themselves how it responds and feels. You can also do the same thing with entirely different domains, having each page in the 'site's navigation on a completely different domain, but thanks to the navigational links, it looks and behaves like a regular site.
(Obviously the real reason we'd never do this outside of experimentation is that you can't easily manage the analytics and tracking, because of the cross-domain permissions and security stuff).
Haider » Ammon
People have previously posted in the group that linking back to the same page, e.g. from a service page to the same service page, also increased their authority and hence improved rank.
Well, it does seem logical, and according to the above math, if 8 external links are to other service pages and we add one to the same service page, that would be nine. Whatever the division is, will some portion of the link juice flow back to the page?
The passable link value of a page is divided between all the links on a page. If there are ten links then each gets one-tenth of the passable value, if 20 links then each gets one-twentieth, etc. Linking to the same URL twice doesn't pass double the value, though it may still split the value by one more link than needed.
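The division described above can be sketched in a few lines of Python. This is only a toy model of that paragraph, not a known Google rule: the function name split_link_value, the duplicates_dilute flag, and the example paths are assumptions made for illustration. The flag exists because a repeated link "may" still count in the divisor, so both readings are shown.

```python
def split_link_value(passable_value, link_targets, duplicates_dilute=True):
    """Split a page's passable value between the links on it.

    Each distinct URL is credited once, however often it is linked.
    If duplicates_dilute is True, a repeated link still counts in the
    divisor, so its share is simply lost (an assumed reading of
    "split the value by one more link than needed").
    """
    unique_targets = list(dict.fromkeys(link_targets))  # keep order, drop repeats
    if not unique_targets:
        return {}
    divisor = len(link_targets) if duplicates_dilute else len(unique_targets)
    share = passable_value / divisor
    return {target: share for target in unique_targets}


# Ten distinct links: each gets one-tenth of the passable value.
ten_links = split_link_value(1.0, [f"/page-{i}" for i in range(10)])

# Four links, one of them a repeat of "/a".
with_repeat = split_link_value(1.0, ["/a", "/b", "/c", "/a"])
without_dilution = split_link_value(1.0, ["/a", "/b", "/c", "/a"], duplicates_dilute=False)

print(with_repeat)
print(sum(with_repeat.values()), sum(without_dilution.values()))
```

In the diluting case "/a" receives a single share, not a double one, and the total passed on falls below the full passable value, which is the "one more link than needed" cost.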
There are several papers and patents that talk about passing more or less value according to where the link is, how big it is, how likely it is to be clicked according to modelling, etc. But for those who tested this style of weighting and segmentation against proportional link values, as far as could be determined on Google itself, the difference in weight had to be fairly small: single-digit percentage amounts between the most up-weighted and the most down-weighted.
Incidentally, it sounds like you are talking about what is termed 'self referencing links'. Honestly, there's no reason for these to be counted or helpful, but since a huge number of sites have a default header across the entire site, meaning their homepage carries a link to the homepage, there's not going to be a penalty for it either.
If you want to believe that a self-referencing link is going to help, despite any logic an engineer would have wanted to program, what I'd suggest is actually finding a way to use in-page anchors, i.e. linking to a named anchor on the same page, so that it is technically both a different URL, and yet also counting toward that page (or at least a section of it).
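For anyone unsure what "technically a different URL, yet still the same page" means, here is a small Python sketch using the standard library's urllib.parse. The page URL and the anchor name are hypothetical, and the snippet only shows how the URL strings relate; it makes no claim about how a search engine counts such a link.

```python
from urllib.parse import urldefrag, urljoin

page_url = "https://example.com/services"  # hypothetical page
anchor_href = "#pricing"                   # hypothetical in-page anchor link

# Resolve the anchor link the way a browser would, relative to the page.
resolved = urljoin(page_url, anchor_href)

# Strip the fragment again to see which page the link lands on.
base, fragment = urldefrag(resolved)

print(resolved)
print(resolved != page_url, base == page_url, fragment)
```

The resolved link differs from the page URL as a string, but once the fragment is removed it is the same page again.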
Haider » Ammon
I got you, and yes, I know self-referencing links like table-of-contents links are recommended by some SEO guys. I'm not sure about the ranking, but to me a table of contents that links to specific sections of a page just improves the user experience, and hence bounce rate and session duration.