The first step in securing any PHP script is to ensure that all of the data fed into the script is valid and therefore cannot contain anything that presents a security issue.
To do this effectively you need to be able to distinguish between fields that contain data known to be valid and fields that can contain invalid data. To achieve this separation between valid and tainted fields you need full control over which fields can have data inserted into them by the person who runs the script. That is why "register globals" was deprecated in PHP 5.3 and removed in PHP 5.4: with it turned on, any field in your script that you have not explicitly initialised is exposed to potential threats.
The fields that you can consider valid right from the start are the fields that you create and assign known values to within the script itself. Since you have complete control of the value each of these fields is set to, you know that it is valid.
The next source of data, which you may or may not be able to consider valid, is data that you read from a database or from files. This applies where you control the process that created those files or loaded the data into the database, so that the data is presumed to have already been validated. The safest thing to do with this data is to validate it again in case someone has somehow managed to insert bad data into one of those sources. Validating the same data over and over is inefficient, though, so sanitising the data you read from those sources instead of validating it may be an appropriate compromise. Sanitising data doesn't check that the content is valid, but it does remove any characters known to be invalid. Since the data has supposedly already been validated there shouldn't be any invalid characters for sanitising to remove, but if the data has become invalid then sanitising it at least reduces the harm that it can do in your script.
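As a rough illustration of sanitising previously validated data, the sketch below strips disallowed characters from values read back from storage. The helper names, the `$row` array standing in for a database record, and the choice of allowed characters are all assumptions made for this example.

```php
<?php
// Hypothetical helper: keep only the characters a username may contain
// (ASCII letters, digits and underscores are an assumed rule).
function sanitise_username(string $value): string
{
    // preg_replace() returns null on error, so fall back to an empty string.
    return preg_replace('/[^A-Za-z0-9_]/', '', $value) ?? '';
}

// Hypothetical helper: FILTER_SANITIZE_NUMBER_INT removes every character
// except digits and the plus and minus signs.
function sanitise_integer(string $value): string
{
    return (string) filter_var($value, FILTER_SANITIZE_NUMBER_INT);
}

// $row stands in for a record fetched from a database table.
$row = ['username' => 'bob<script>', 'age' => '42abc'];

$username = sanitise_username($row['username']);
$age      = sanitise_integer($row['age']);
```

Note that the sanitised result is not guaranteed to be meaningful (an age of "42abc" becoming "42" may or may not be what was originally stored); it is only guaranteed not to contain the characters that were stripped.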
The fields that you absolutely must validate before you do anything else with them are those in the $_REQUEST array. Firstly, you should never access that array directly, since you have no way to tell which of its data sources (GET, POST or cookie) any particular value came from. Instead you should access the three sources separately using $_GET, $_POST, and $_COOKIE so that you only look for data coming from the source you expect it to come from.
In each case you will want to copy the values from these arrays either into a new array or into individual fields, and then reference only that new array or those fields throughout the rest of your code. You should not just copy the value across from one place to the other, though, as that defeats the purpose of the separate fields. Instead you should validate each field and copy it only once you know its contents are valid. That way you know that everything the code uses after that validation step is valid data.
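One way to arrange the validate-then-copy step is sketched here. The field names "age" and "email", the age range, and the function name are assumptions for illustration; the `$input` parameter stands in for $_POST so that the sketch can run on its own.

```php
<?php
// Hypothetical function: returns [validated values, names of rejected fields].
function collect_valid_fields(array $input): array
{
    $clean  = [];  // only values that passed validation are stored here
    $errors = [];  // names of fields that failed validation

    // Accept "age" only if it is an integer in an assumed range of 0 to 150.
    $age = filter_var(
        $input['age'] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 0, 'max_range' => 150]]
    );
    if ($age === false) {
        $errors[] = 'age';
    } else {
        $clean['age'] = $age;
    }

    // Accept "email" only if it passes PHP's built-in email filter.
    $email = filter_var($input['email'] ?? null, FILTER_VALIDATE_EMAIL);
    if ($email === false) {
        $errors[] = 'email';
    } else {
        $clean['email'] = $email;
    }

    return [$clean, $errors];
}

// In a real script the argument would be $_POST.
[$clean, $errors] = collect_valid_fields(['age' => '42', 'email' => 'not-an-email']);
```

After this step the rest of the script would read only from `$clean` and never from $_POST, so any value it sees has already been validated.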
So just how should you validate fields? Consider what each individual field is for and what content makes sense for that specific field, and then validate as far as is sensible to confirm that the content is valid. This provides benefits beyond security, because it also prevents obvious garbage that is harmless but unreasonable from being processed.
Where possible, validations should be performed using functions built into PHP itself, such as is_numeric or ctype_alpha. Where there isn't a direct function that can do the validation for you, the next option to consider is filter_var, which provides built-in validation filters for a number of common field types, including email addresses. Using one of these options reduces the chance that an error in your validation processing lets through something it is supposed to be blocking.
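PHP's built-in functions for this include is_numeric, the ctype family (for example ctype_alpha and ctype_digit) and filter_var. The sketch below wraps a few of them; the wrapper names and the fields they are imagined to check are assumptions, not part of PHP.

```php
<?php
// A quantity must be a non-empty string made up entirely of digits.
function is_valid_quantity($value): bool
{
    return is_string($value) && ctype_digit($value);
}

// A price may be any numeric string. Note that is_numeric() also accepts
// signs, decimals and exponents such as "-1.5" or "1e5".
function is_valid_price($value): bool
{
    return is_string($value) && is_numeric($value);
}

// Letters only. This rejects apostrophes, hyphens and spaces, so it is
// only suitable for fields where that restriction makes sense.
function is_letters_only($value): bool
{
    return is_string($value) && ctype_alpha($value);
}

// filter_var() returns the value when it is valid and false otherwise.
function is_valid_email($value): bool
{
    return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
}
```

The is_string checks are there because request arrays can also contain nested arrays, which the ctype functions are not designed to handle.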
Where none of these validation options is suitable, the next best alternative is a regular expression that validates the content of the field. If you use this alternative you will want to test the code more thoroughly, since a mistake in your regular expression could allow invalid data to get through. This is one of the places in code that sometimes needs to be patched when an error is discovered that introduces a security hole.
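A minimal sketch of regular expression validation follows, assuming a made-up "product code" format of three uppercase letters, a hyphen and four digits. The function name and the format are hypothetical. It also shows one common mistake worth testing for: anchoring with `$`, which matches before a trailing newline, rather than with `\z`, which matches only at the very end of the string.

```php
<?php
// Hypothetical validator for codes such as "ABC-1234".
function is_valid_product_code($value): bool
{
    // \A and \z anchor the pattern to the very start and very end of the
    // string, so nothing extra can appear before or after the code.
    return is_string($value)
        && preg_match('/\A[A-Z]{3}-[0-9]{4}\z/', $value) === 1;
}

// The same pattern anchored with ^ and $ is shown only for comparison:
// it wrongly accepts a value that ends with a newline character.
function is_valid_product_code_loose($value): bool
{
    return is_string($value)
        && preg_match('/^[A-Z]{3}-[0-9]{4}$/', $value) === 1;
}
```

Testing both obviously bad values and near misses such as "ABC-1234\n" is the kind of extra checking that helps catch this sort of error before it reaches production.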
Ensuring that all of the data your script uses is valid before you try to use any of it, and making sure that you can easily distinguish fields that have been validated from those that haven't, is the most important step that you can take in starting to secure your script.
This article was written by Stephen Chapman, Felgall Pty Ltd.