mySQL enum Fields

One non-standard data type that is supported only by mySQL is the enum data type. An enum field basically has a list of valid string values that the field can contain and whenever a record is inserted or updated it must be given one of these values.

This may sound like it is a very useful field type to have but in fact it has a number of issues with it that mean that it isn't suitable for many of the situations that you might think to use it in.

The first problem with this field type is that the values are stored in the table definition itself and so any reference to the list of values from elsewhere in your code will need to be hard coded somewhere else as well. These two places will need to be kept synchronised if there are any changes to the possible values. This field type therefore does not work well if there are ever likely to be changes to the possible values that the fueld can have.

There are further problems when it comes to changing the values in the list of possible values if you ever need to make any changes to the existing values that the field can have. This is because the field doesn't actually store the values as specified in the enum in the database itself. Instead of the string values the database actually just stores a number corresponding to the position that the particular string is in the list. This means that if you ever delete a value from the list that all of the entries for that or subsequent values in the list will change their meanings unless you actually rebuild all the data. A second effect of the way enums are stored is that you cannot sort properly on an enum field unless you define the values in alphabetical order since the field will be sorted on the actual numerical values that the actual data contains and not on the corresponding string values. This makes changing the possible values even more difficult if you also need the values in alphabetical order.

In practice the usefulness of this field type is directly related to the frequency at which the values in the field might change. Where the values are guaranteed to remain completely static and never change then there is no problem in using the field type since you can set it up once at the start and then forget about it.

Any change to this field type will involve complicated changes to your code and so the usefulness of the field where the values might change is greaty reduced unless any changes to the values are going to involve significant changes to your code anyway. For example I have a notes table that I set up that uses an enum field to indicate the type of note. The only time this enum field ever needs changing therefore is when I need to add another type of note. Since each type of note is referenced from completely different places in the code the new note type will need significant modifications to the code in order to support the new note type anyway and so the changes required when updating the values for the enum in that instance are just an insignificant portion of the changes required to implement a new note type.

When a field can have its list of possible values updated at least occassionally without significant changes to the code being required apart from those specific references to the field values then a better alternative to using an enum field type is to set up a separate lookup table in the database. Instead of the numbers that are stored in the database being defined by the corresponding entry in the enum definition of the field itself, the number is the value of the key of the lookup table entry that contains the text description.

Using a lookup table rather than an enum field has several advantages. The first is that by joining the tables it becomes possible to perform database processing on the text string value itself instead of just on the number equivalent. This means that you can sort the data alphabetically on the values without needing the fields to have been defined first in alphabetical order.

Deleting a possible value no longer upsets the integrity of the data when you use a lookup table since unlike an enum where all the numbers must be continuous a lookup table can have gaps in the numebring.

A lookup table also avoids the need to hard code the possible values into the code since a list of all the possible values can be read directly from the lookup table.

There is also an additional flexibility possible when you have a lookup table. Where the data in the lookup table is going to change very infrequently you can just set up the table within the database itself and manually run any database updates if the values need to be changed. This is simpler to do than updating the actual table definition itself as is required if you need to change enum values as unless there is an error in the code there shouldn't be any additional action required to add, delete or change the allowable values that simply updating the table content does not handle. Also this update avoids the downtime that any changes to the actual table definitions require.

If the content is expected to change more frequently you can make it eve easier to do the updates by actually adding an update option for the lookup table into your application itself so that those using the application (or an authorised subset of those people) can make the changes to the lookup table themselves.

Lookup tables have one further advantage over using an enum field. While enum is a proprietary construct only supported by mySQL, a lookup table uses standard SQL commands for all its definition and access commands and so can be easily converted to work on a different database if required.


This article written by Stephen Chapman, Felgall Pty Ltd.

go to top

FaceBook Follow
Twitter Follow