Sanitising Data

PHP 5.2 introduced a new tool for processing the data passed into your script: the filter extension. You can now sanitise your data using a PHP filter. These filters don't fully validate your data fields. What they do is strip out characters that are never valid for the particular data type.

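The sanitising filters that PHP 5 introduces include FILTER_SANITIZE_NUMBER_INT, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_SANITIZE_EMAIL, FILTER_SANITIZE_URL, FILTER_SANITIZE_STRING, FILTER_SANITIZE_SPECIAL_CHARS, FILTER_SANITIZE_ENCODED and FILTER_SANITIZE_MAGIC_QUOTES.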

By passing your data through one of these filters you remove the characters that could cause the processing of the field to fail at some later point. For example, sanitising a field as an int removes every entered character except digits, plus signs and minus signs. That way you know that nothing obviously non-numeric is left in the data to trip up the code that goes on to process the field as a number.
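As a minimal sketch of the int example above, the following passes a made-up value through filter_var() with FILTER_SANITIZE_NUMBER_INT. The variable names and the sample input are assumptions chosen for illustration; in a real script the value would come from the submitted form.

```php
<?php
// Hypothetical raw value, standing in for something typed into a form field.
$raw = " 12abc3 ";

// FILTER_SANITIZE_NUMBER_INT strips every character except digits, + and -.
$clean = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);

echo $clean, "\n"; // prints "123"

// The filter only strips characters. It does not check that what is left
// forms a well-formed integer, so stray signs survive.
$odd = filter_var("1-2+3", FILTER_SANITIZE_NUMBER_INT);

echo $odd, "\n"; // prints "1-2+3"
```

The second call shows why the result still needs validating: every remaining character is allowed in an integer, but the value as a whole is not one.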

Sanitising a field is not the same as validating it. You may have sanitised a numeric field to remove all of the non-numeric characters from it, but that doesn't necessarily mean that the field is valid, because the particular field may not be allowed to contain just any number. The valid numbers may be only those between two specific values, or they might be a specific list of numbers. Sanitising the data first simplifies the validation, because by the time you test whether the number is in the set of valid numbers you already know that what you have contains only numeric characters.
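One way to picture the two steps is the sketch below, which sanitises first and then validates against a range. The helper name clean_int_in_range and the choice to return null for an invalid value are assumptions made for this example. FILTER_VALIDATE_INT and its min_range and max_range options are part of the same PHP filter extension.

```php
<?php
// Hypothetical helper: returns the integer when it lies between $min and
// $max (inclusive), or null when the field does not hold a valid value.
function clean_int_in_range($raw, $min, $max)
{
    // Step 1: sanitise. Strip characters that can never appear in an integer.
    $clean = filter_var($raw, FILTER_SANITIZE_NUMBER_INT);

    // Step 2: validate. FILTER_VALIDATE_INT returns false unless the whole
    // string is an integer that falls inside the given range.
    $value = filter_var($clean, FILTER_VALIDATE_INT, array(
        'options' => array('min_range' => $min, 'max_range' => $max),
    ));

    return $value === false ? null : $value;
}

var_dump(clean_int_in_range(" 4x2 ", 1, 100)); // int(42)
var_dump(clean_int_in_range("250", 1, 100));   // NULL, a number but out of range
var_dump(clean_int_in_range("abc", 1, 100));   // NULL, nothing left after sanitising
```

A field restricted to a specific list of numbers could replace step 2 with a lookup such as in_array(), once the sanitised value has been converted to an integer.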

The same applies to the other filters. Sanitising an email address doesn't mean that you have a valid email address, but it does mean that every character remaining in the field may validly appear in an email address, and that the characters which can never appear in one have been removed.
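The email case might look something like the following sketch. The sample addresses are invented for illustration. FILTER_SANITIZE_EMAIL strips the characters that cannot appear in an address (spaces, parentheses and angle brackets among them), and FILTER_VALIDATE_EMAIL then checks whether what remains has the form of an address.

```php
<?php
// Hypothetical input containing characters that are never valid in an address.
$raw = " some(one)@example.com ";

// Sanitise: the spaces and parentheses are removed.
$clean = filter_var($raw, FILTER_SANITIZE_EMAIL);

echo $clean, "\n"; // prints "someone@example.com"

// Validate: returns the address on success, false on failure.
$valid = filter_var($clean, FILTER_VALIDATE_EMAIL);

// A second hypothetical input: sanitising leaves only allowed characters,
// yet the result is still not an email address because it has no @.
$cleanBad = filter_var("no<at>sign.example.com", FILTER_SANITIZE_EMAIL);
$validBad = filter_var($cleanBad, FILTER_VALIDATE_EMAIL);

var_dump($validBad); // bool(false)
```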

There are no equivalent sanitising filters built into JavaScript, but you could always create your own JavaScript equivalents of the PHP filters and process your data through those first, so as to obtain on the client side the same processing that the filters provide on the server. The pattern attribute proposed for form fields in HTML 5 also lets you apply a regular expression to your input fields, although it tests the entered value against the expression rather than stripping characters from it. Regular expressions can easily sanitise data by stripping all of the always-invalid characters, so a regular expression can always be used to sanitise a field where some other way of sanitising it, such as the PHP filters, is not available. Regular expressions cannot always fully validate fields, though, because there can be conditions that apply which cannot be tested with a regular expression.
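As a rough sketch of sanitising with a regular expression where no built-in filter fits, the following strips everything except letters and digits from a field. The function name sanitise_alphanumeric and the idea of a letters-and-digits-only field are assumptions for this example. It is written in PHP with preg_replace() to match the rest of the article; the same character class could be used with JavaScript's string replace method on the client side.

```php
<?php
// Hypothetical sanitiser for a field that may only contain letters and digits.
function sanitise_alphanumeric($raw)
{
    // [^A-Za-z0-9] matches any single character that is NOT a letter or a
    // digit; replacing each match with an empty string strips it out.
    return preg_replace('/[^A-Za-z0-9]/', '', $raw);
}

echo sanitise_alphanumeric("AB-12 #cd"), "\n"; // prints "AB12cd"
```

As with the built-in filters, this only guarantees which characters remain. Whether "AB12cd" is an acceptable value for the field is still a separate validation question.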


This article was written by Stephen Chapman, Felgall Pty Ltd.
