Anatomy of a Web Address

Web addresses (or URLs, as they are more commonly known) consist of a whole series of different parts. Some of these parts appear in every web address that you see, while others appear so rarely that, if I weren't about to go through them all here, you'd probably never realise that they even existed.

So let's start by looking at a web address that contains every possible component so that we can see what each part is.

http://fred:passkey@www.sub.somesite.com/members/mypage.htm?x=y&j=k#s5

You will probably never see an address like this one in actual use, but each of the parts it contains does appear in real addresses somewhere on the web.

  1. The first part of a web address is the http:// on the front. This stands for HyperText Transfer Protocol and identifies what follows as a web address (secure sites use https:// instead). The only place where this part of the address is optional is in a web browser's address/location bar, where whatever is entered is assumed to be a web address unless it is identified as something else.
  2. The next part of the web address is fred:passkey. This part of a web site address is only used if the site or sub-directory is password protected. In that instance the userid of fred and the password of passkey would be passed in order to gain access. For sites without password protection this part of the address is completely ignored, and so hackers have been known to place a legitimate looking web address in front of the @ to try to convince unsuspecting people that they are actually visiting that site when the real site being visited is somewhere completely different. Because of this abuse many modern browsers no longer allow userids and passwords to be entered this way.
  3. The next part of a web address is www and you will often see this included in addresses. This is actually a computer identifier indicating that the site is hosted on a computer that is serving pages to the World Wide Web. In most cases web addresses are configured so as to make the machine identifier optional, and some may even be configured so that it must be left off.
  4. sub indicates that this is a sub-site of the main somesite domain. (As an example, I have rail.felgall.com as a sub-site of this site at felgall.com.) This part will be omitted if you are visiting the main site.
  5. The most obvious part of the above address that you will recognise is the part that appears in every web address that you will ever see. somesite.com actually consists of two parts that must appear in every address.
    • com is the top level domain to which the site belongs. There are a number of generic top level domains (net, info, org, com, biz, aero, name, coop, pro, museum, int) that are for sites with a world wide audience, as well as a few US specific ones (edu, gov, mil) and hundreds of other top level domains that are allocated one to a country (e.g. au - Australia, uk - United Kingdom, tv - Tuvalu, la - Laos, us - United States). These country specific top level domains may also be used for sites with a world wide audience, although the top level domain usually tells you that the intended audience is a specific country.
    • somesite is the name of the domain within the top level domain that belongs to this site. In some cases such as in this example this part of the name is specific to the site while in other cases (particularly for country specific top domains) the domain name is further subdivided into a site name and sub-domain name. For example an Australian specific site may have the somesite.com.au domain name where somesite is the name of the site, com is the sub-domain and au is the top level domain.
  6. members/ identifies a directory or folder within the site. Many sites will organise their pages into a directory structure in order to make maintaining the site easier.
  7. mypage.htm identifies the actual web page to be displayed. If a page is not specified then the default page for the directory will be displayed (the server has a predefined list of default names to search for - index.html and index.htm are usually but not always at the top of that list).
  8. x=y&j=k is the query string being passed to that page. A query string is one of the ways that one web page can pass information to another web page (in this instance that 'x' should have a value of 'y' and that 'j' should have a value of 'k').
  9. Finally, s5 identifies an anchor point within the page being loaded. Instead of the top of the page being displayed, the page will be scrolled to the specified anchor point immediately after the page has loaded.
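As a rough illustration of the breakdown above, here is a minimal sketch that splits the example address into its parts using Python's standard urllib.parse module. The choice of Python and the variable names (components, labels, directory, page, query_values) are assumptions made for this example and are not part of the original article.

```python
from urllib.parse import urlsplit, parse_qs

# The example address from the article.
url = "http://fred:passkey@www.sub.somesite.com/members/mypage.htm?x=y&j=k#s5"
parts = urlsplit(url)

# Map the standard library's names onto the parts described in the list.
components = {
    "scheme": parts.scheme,      # part 1: the protocol
    "username": parts.username,  # part 2: userid
    "password": parts.password,  # part 2: password
    "hostname": parts.hostname,  # parts 3 to 5: machine, sub-site, domain
    "path": parts.path,          # parts 6 and 7: directory and page
    "query": parts.query,        # part 8: query string
    "fragment": parts.fragment,  # part 9: anchor point
}

# The host name is a dot-separated list; the last label is the top level domain.
labels = parts.hostname.split(".")

# Separate the directory from the page name at the last slash.
directory, _, page = parts.path.rpartition("/")

# Turn the query string into a dictionary of names and lists of values.
query_values = parse_qs(parts.query)

for name, value in components.items():
    print(f"{name}: {value}")
print("labels:", labels)
print("directory:", directory, "page:", page)
print("query values:", query_values)
```

Note that the library treats www.sub.somesite.com as a single host name; telling the machine identifier, sub-site and domain apart is a matter of convention rather than something the address itself marks.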

I don't think you will ever see a web address with all nine parts. In some cases you will only see part five. You will almost never see part two in a legitimate address, since the idea is that you type your userid and password yourself when asked rather than having them included in the address.
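To show why anything in front of the @ deserves suspicion, the sketch below checks which site an address really points to. The function names (real_host, has_userinfo) and both addresses are hypothetical, made up for this example; the .example domains are reserved names that belong to no real site.

```python
from urllib.parse import urlsplit


def real_host(url):
    """Return the host that would actually be contacted for this address."""
    return urlsplit(url).hostname


def has_userinfo(url):
    """Return True if the address carries a userid (and perhaps a password) before the @."""
    return urlsplit(url).username is not None


# Hypothetical deceptive address: the text before the @ looks like a bank's
# site, but it is only the userid part and is ignored by unprotected sites.
deceptive = "http://www.mybank.example@evil.example/login.htm"
ordinary = "http://www.mybank.example/login.htm"

print(real_host(deceptive), has_userinfo(deceptive))
print(real_host(ordinary), has_userinfo(ordinary))
```

A check along these lines is one simple way to flag addresses where the visible name and the real destination differ; it is not a complete defence against misleading links.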

So next time that you see a URL (Uniform Resource Locator), otherwise known as a web address, you will know what the various parts of that address actually mean.

 

This article written by Stephen Chapman, Felgall Pty Ltd.
