"Behind the Scenes" Newsletter

April 2012

The monthly newsletter by Felgall Pty Ltd

My Word

Google and the Web

Does Google have too much control over the web?

With such a large percentage of people using the Google search engine when they want to find something on the web, it is Google that gets to decide which web pages people can find and which pages get buried so deep in the results that no one apart from the person who created them ever knows that they are there.

A web page that is at the top of Google's results for a particular search one day may drop out of sight the next, and its owner may suddenly find that hardly anyone visits the site any more. This can easily happen with no action on the part of the actual site owner.

Yes, there are other ways that people find web pages. There are other search engines that some people use. There are also links between related sites, so that if you have found a page on one subject you will often be able to simply follow links from that page to find related pages. Of course, some people are now so used to using Google to find things that no matter how many useful links there are on a web page, they'll return to Google to search for that information rather than following a link.

The whole concept of the web is that pages link to other related pages and that someone ought to be able to find out whatever it is they want to know about a topic by simply following links from page to page. The only difficult bit was supposed to be finding a page about it to start from and web directories were originally created to fulfil this need. Directories differ from search engines in that they are manually created with real people adding all the links in the most appropriate categories with descriptions that tell you what each page provides.

Unfortunately as the web grew it became impossible to manually add all the new pages into directories. There just aren't enough people to do the work. Directories today tend to be on specific narrow topics where they provide a central resource for people interested in that specific topic to find lots of pages dealing with different aspects of the topic. Search engines were developed to automate the process.

The web grows at such a fast rate that even search engines are unable to keep up with all the new pages being added. One of the reasons why Google is so popular is that they have more than twice as many web pages in their search results as any other search engine does. Google is the only search engine to include over 10% of web pages in its database where there is at least a possibility of those pages showing up in the results.

The problem with using an automated process is that there needs to be a way to work out what a web page is about and what its relevance is with respect to what some particular person is looking for. Another reason why Google is so popular is that their automated processes do a reasonably good job of providing relevant pages toward the top of the results more often. The secret 'algorithm' they use for the matching is reasonably effective.

The problem is that where a computer is working out what pages are about and how relevant they are, it becomes possible to manipulate the results by making changes to the page content. Minor changes to the wording can make a page easier for search engines to understand without adversely affecting it for real readers, and so make it more likely that those readers will actually find the page. Unfortunately there are people who take this to extremes and who misuse parts of the page that not everyone sees in an attempt to manipulate the search engines into giving them a higher position in the results than the page really deserves based on its genuine content. This is called Search Engine Optimisation - although it really ought to be called Search Engine Manipulation or even just simply Cheating.
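
To see why a purely automated relevance score is open to this kind of manipulation, here is a minimal sketch in Python. It is not how Google or any real search engine ranks pages, since those algorithms are secret. The function name naive_relevance, the scoring rule (the share of words on the page that match the search) and both sample pages are assumptions made up purely for illustration.

```python
import re
from collections import Counter


def naive_relevance(page_text: str, query: str) -> float:
    """Hypothetical score: the share of words on the page that match a query word."""
    words = re.findall(r"[a-z0-9']+", page_text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    query_words = set(re.findall(r"[a-z0-9']+", query.lower()))
    # Count every occurrence on the page of any word used in the search.
    matches = sum(counts[word] for word in query_words)
    return matches / len(words)


# A genuine page, and the same page with the search words stuffed in
# (for example into a part of the page that not everyone sees).
honest_page = "How to prune apple trees in winter so they fruit well next season"
stuffed_page = honest_page + " apple trees" * 20

query = "apple trees"
print(naive_relevance(honest_page, query))
print(naive_relevance(stuffed_page, query))
```

The stuffed page scores far higher even though it offers the reader nothing extra. Any fixed test like this one can be gamed once people know what it measures, which is one reason a search engine would want to keep its real tests secret and change them.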

As a result of this unwanted manipulation (which makes the pages at the top of the results less useful and so may result in people switching to a different search engine), the search engine providers try to keep the 'algorithms' that determine topics and relevance a secret. They also change them regularly in an attempt to make sure that anyone trying to manipulate the system has to keep changing their page if they want to maintain that undeserved position in the results. This has side effects, though, on the web pages that are not trying to cheat. Any change that a search engine makes to the way it works out how relevant a given web page is to a particular search will affect all of the pages that it lists in the results for that search, not just those trying to cheat.

The search engines change their algorithms frequently in order to try to block the cheats. If no one cheated then they would still change their algorithms, but less frequently and only when the new ones were going to improve the way that web pages are matched to searches. The changes made just to stop people from so easily misusing a given test would not be required. Genuine pages would no longer fall significantly in the results because a change made to dump some of the cheats also happened to hurt non-cheating pages. They would only drop when the search engines genuinely found other more relevant pages.

While there were a number of different search engines that were all popular and all using different algorithms to decide where to display pages in their results, these changes didn't affect genuine web pages as much. If a page was dropped down the results by one search engine it would still be listed in the same spot in the other search engines and so would still get a large percentage of the visitors it deserved. Some of those visiting the page would find it useful enough to link to it, and eventually the extra links would assist in its recovery on the search engine that had dropped it. Unfortunately, with Google now having such a large percentage of the search engine market, a web page that undeservedly drops down the results there may not get many new visitors at all and so may not gain any of the extra links it needs to assist in its recovery. In many ways Google now controls the life and death of web pages. This is a power that no search engine should have.

It is people who have given Google that control over the web. It is those same people who will get to decide whether to continue to give Google the power to decide which web pages live and which die, or to take that power away by starting to use other ways to search the web. Will the web eventually end up split between a living part that has listings on the first page of search results in Google and a dead part consisting of everything else, with Google deciding which is which? Or will people decide to restore some neutrality to the web by at least sometimes using other ways to find web pages?
 

On Site

A few new pages (marked * below) and quite a few more updated pages this month as I continue to work my way through all of the previously written pages looking to see what I can update and expand. Suggestions for additional new pages are always welcome.
 

What's New

The following links will take you to all of the various pages that have been added to the site or undergone major changes in the last month.

http://www.felgall.com/