Handling enumerated values

From time to time we need to define class properties that take their values from a limited set. This set rarely (or never) changes, and when it does, it is changed manually by the developer and not as a result of the running application. Usually, these values are strings because it is easier to express their meaning through a real word or phrase. For example, an Invoice class may have a status property taking values from {“pending”, “paid”, “cancelled”}, or a User class may have a status property taking values from {“active”, “blocked”, “banned”, “new”}. For the sake of this post, we will call such a property an “enumerated property”.

A quick look at the database level

Managing enumerated properties should, generally, happen at two levels: the code level and the database level (a database here represents your persistence layer and, though it doesn’t have to be a database, it usually is). We are more interested in how to handle them at the code level, but we will also look at how persistence affects our choices. In fact, we will first take a quick look at the most straightforward solutions for storing such properties and then move on to what happens at the code level. We will assume the use of MySQL, but you can easily adapt the ideas to the functionality offered by your database.

So, what are the major ways to store enumerated properties in a MySQL database?

(a) Use MySQL’s ENUM data type

CREATE TABLE `users2` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `username` VARCHAR(30) NOT NULL DEFAULT '0',
    `password` VARCHAR(80) NOT NULL DEFAULT '0',
    `status` ENUM('single','married','divorced') NOT NULL,
    PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
AUTO_INCREMENT=7;

It may seem the easiest option, but there are several reasons why you may not want to go with it. A nice overview can be found here: http://komlenic.com/244/8-reasons-why-mysqls-enum-data-type-is-evil/ . Even though you define an ENUM type as a set of acceptable string values, MySQL internally represents and stores each value as an integer index, not as a string. Keep in mind, however, that ENUM is not bad. It is just tricky. If it offers considerable advantages in your specific case, you should take it into consideration.

(b) Use a reference table with numeric key

I would say this is the most popular solution. This is how a snapshot of the user_status and users2 tables would look:

(c) Use a reference table with string key

This is a bit simpler than the previous case, and it makes it easier to understand what is going on when looking into the database. The main disadvantages are the lookup speed and the size of the indexes. But keep in mind that if no side information is needed from the reference table, then you don’t need to JOIN it in your queries. The only lookups that will be required are the ones performed by the referential constraint when you add or modify rows of the users2 table. What is more, indexes are not always needed, since an enumerated column has low selectivity.

Handling enumerated values at the code level (Part A: without ORM)

Now, let’s move to the code level, the more interesting part. Dealing directly with string values can be error-prone. A small typo can lead to wrong behavior, and such mistakes are difficult to detect because the typo may change how the application works without breaking it. This is where enumerated types come on stage. Enumerated types prevent invalid values from getting saved to a repository. PHP had no built-in support for enumerated types before version 8.1, so on older versions we have to work around this limitation. We will take advantage of PHP’s OOP features and try to imitate the workings of an enumerated type. Three examples follow, going from simpler to more elaborate implementations.

(a) Enumerated values declared as class constants

class JobStatus {
    const UNEMPLOYED = 'Unemployed';
    const EMPLOYED   = 'Currently working';
    const RESTING    = 'Taking a break from work';
}

class Person {
    private $jobStatus;

    public function setJobStatus($jobStatus){
        $this->jobStatus = $jobStatus;
    }

    public function getJobStatus(){
        return $this->jobStatus;
    }
}

$person = new Person();
$person->setJobStatus(JobStatus::RESTING);
echo $person->getJobStatus();

Of course, it is better than using just strings, but there are two problems with such a naive implementation. First, no type hinting is possible: the setter accepts any value, not only the declared constants. Second, hidden dependencies are created. You cannot tell that class Person depends on class JobStatus even if you go through Person’s implementation. Developers have to know by heart that they should pass a constant and not a string.
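To make the first problem concrete, here is a minimal sketch that reuses the two classes above. The misspelled string is our own illustration; the point is only that nothing in the setter rejects it.

```php
<?php

class JobStatus {
    const UNEMPLOYED = 'Unemployed';
    const EMPLOYED   = 'Currently working';
    const RESTING    = 'Taking a break from work';
}

class Person {
    private $jobStatus;

    // No type hint is possible here: the constants are plain strings.
    public function setJobStatus($jobStatus){
        $this->jobStatus = $jobStatus;
    }

    public function getJobStatus(){
        return $this->jobStatus;
    }
}

$person = new Person();

// A hand-typed string with a typo is accepted without any complaint.
$person->setJobStatus('Curently working');

// The bug only shows up later, when the value is compared to the constant.
var_dump($person->getJobStatus() === JobStatus::EMPLOYED); // bool(false)
```

The application keeps running, which is exactly why this kind of mistake is hard to spot.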

(b) Enumerated values declared as interface constants

interface JobStatus {
    const UNEMPLOYED = 'Unemployed';
    const EMPLOYED   = 'Currently working';
    const RESTING    = 'Taking a break from work';
}

class Person implements JobStatus {

    private $jobStatus;

    public function setJobStatus($jobStatus){
        $this->jobStatus = $jobStatus;
    }

    public function getJobStatus(){
        return $this->jobStatus;
    }
}

$person = new Person();
$person->setJobStatus(Person::RESTING);
echo $person->getJobStatus();

This is similar to example (a), but now we can reveal the dependency on these constants by making the Person class implement the interface. Again, no type hinting is possible. What is more, in my opinion, this is a kind of abuse of the OOP interface concept. The abuse is obvious even from the name that is used for the interface. A “Person” implements a role of “JobStatus”? Hmmm! Maybe a better name could take away these concerns. A “Person” could implement the role of “WithJobStatus” or something like that. And what if the Person class (or any other class) uses two, three or more enumerated properties? Classes will start to become bloated. Ugly, in my opinion.

(c) Class that implements an enumerated type

I don’t want to focus here on how to implement such a class. I will use a popular one that you can find here: https://github.com/myclabs/php-enum It is a very small class and you can easily go through its implementation. The important thing is how such a class can change our code.

include 'Enum.php';

use MyCLabs\Enum\Enum;

class JobStatus extends Enum {
    const UNEMPLOYED = 'Unemployed';
    const EMPLOYED   = 'Currently working';
    const RESTING    = 'Taking a break from work';
}

class Person {

    private $jobStatus;

    public function setJobStatus(JobStatus $jobStatus){
        $this->jobStatus = $jobStatus;
    }

    public function getJobStatus(){
        return $this->jobStatus;
    }
}

$person = new Person();
$person->setJobStatus(JobStatus::RESTING());
echo $person->getJobStatus();

As you can see, no hidden dependencies exist and type hinting is now possible. The creation of an enumerated type is as simple as extending the Enum class and defining as constants the set of values that are part of the enumerated type.
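For readers who wonder what happens behind a call like JobStatus::RESTING(), here is a rough sketch of how such a base class could work. SketchEnum is a hypothetical, simplified class written for this post; it is not the code of the myclabs/php-enum library, which you should read directly if you need the details.

```php
<?php

// Hypothetical, simplified base class; NOT the code of myclabs/php-enum.
abstract class SketchEnum {
    private $value;

    public function __construct($value) {
        // Reject anything that is not one of the declared constants.
        if (!in_array($value, static::values(), true)) {
            throw new UnexpectedValueException('Invalid value for ' . static::class);
        }
        $this->value = $value;
    }

    // All constants declared on the concrete subclass, as name => value.
    public static function values() {
        return (new ReflectionClass(static::class))->getConstants();
    }

    // Lets JobStatus::RESTING() build an instance from the constant's name.
    public static function __callStatic($name, $arguments) {
        $values = static::values();
        if (!array_key_exists($name, $values)) {
            throw new BadMethodCallException("Unknown constant $name");
        }
        return new static($values[$name]);
    }

    public function getValue() {
        return $this->value;
    }

    // Two instances are equal when they share both class and value.
    public function equals(SketchEnum $other) {
        return static::class === get_class($other)
            && $this->value === $other->getValue();
    }

    public function __toString() {
        return (string) $this->value;
    }
}

class JobStatus extends SketchEnum {
    const UNEMPLOYED = 'Unemployed';
    const RESTING    = 'Taking a break from work';
}

$status = JobStatus::RESTING();
echo $status; // prints the underlying string
```

Because every instance is validated in the constructor, a method that type-hints JobStatus can never receive a value outside the declared set.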

Handling enumerated values at the code level (Part B: with ORM)

Everything is nice up to here. But I know what you are thinking. What if I am using an ORM? Do things get complicated, and by how much? Though I am not a fan of ORMs, I would like to give some examples for this case, too. There are two reasons for this:

– Writing custom code always provides more flexibility, especially in handling persistence. So, enumerations are more likely to cause trouble for people who use an ORM.

– I can’t deny reality! ORMs are quite popular, especially in small or medium-size applications.

The ORM that I will use for the examples will be Eloquent, but this is not so important. The idea illustrated by the examples is what counts.

(a) ORM without Enumerated Type

The ORM model:

namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class SimpleUser extends Model {

    protected $table      = 'users';
    protected $fillable   = ['username','password','status'];
    public    $timestamps = false;

    public function updateStatusTo($newStatus)
    {
        $this->status = $newStatus;
    }

    public function setStatusAttribute($statusValue)
    {
        $this->attributes['status'] = strtolower($statusValue);
    }
}

First, let’s say a few words about this model for those who are not familiar with Eloquent. The setStatusAttribute is a “magic” setter for the “status” property. In Eloquent models most properties are publicly accessible and someone can directly assign them values like this:

$this->status = $newStatus;

This setter steps in when such an assignment takes place in order to validate or modify the assigned value.
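If this kind of “magic” is new to you, the following stand-alone sketch imitates the idea with PHP’s __set and __get. SketchModel and SketchUser are hypothetical classes written for this post, and the dispatch is heavily simplified; this is not Eloquent’s actual implementation.

```php
<?php

// Hypothetical stand-in for an ORM base class; this is NOT Eloquent's code.
class SketchModel {
    protected $attributes = [];

    // PHP calls __set when code assigns to a property that is not accessible.
    public function __set($key, $value) {
        $setter = 'set' . ucfirst($key) . 'Attribute';
        if (method_exists($this, $setter)) {
            // Let the model's own setter decide what gets stored.
            $this->$setter($value);
        } else {
            $this->attributes[$key] = $value;
        }
    }

    // Reads fall back to the attributes array in the same way.
    public function __get($key) {
        return isset($this->attributes[$key]) ? $this->attributes[$key] : null;
    }
}

class SketchUser extends SketchModel {
    public function setStatusAttribute($statusValue) {
        $this->attributes['status'] = strtolower($statusValue);
    }
}

$user = new SketchUser();
$user->status = 'MARRIED';   // routed through setStatusAttribute()
echo $user->status;          // married
```

The assignment looks like a plain property write, but the value that ends up stored is whatever the setter chose to keep.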

Now, what about the updateStatusTo method? Since the status property is publicly accessible, the updateStatusTo method is not needed. We have added it only to demonstrate the use of the enumerated type in case we need to pass a parameter to a method.

Using the model:

use App\Models\SimpleUser as User;
...
// Create
$user = new User([
    'username' => 'gooffy1',
    'password' => '45t049t7n4cny48y',
    'status'   => 'single'
]);
// Save
$user->save();
// Query
$user = User::where('status', 'single')->first();
// Check equality
if ($user->status === 'single') {
    echo "Yes, this user is single.";
}
// assign
$user->status = 'married';
$user->save();

// pass parameter
$user->updateStatusTo('divorced');
$user->save();

(b) Use validation logic in the model

A middle ground between using strings and defining an enumerated type is to make the model watch out for invalid values.

The new model becomes:

namespace App\Models;

use Exception;
use Illuminate\Database\Eloquent\Model;

class ConstUser extends Model {

    const MARITAL_STATUS = [
        'single',
        'married',
        'divorced'
    ];

    protected $table = 'users';
    protected $fillable = ['username','password','status'];
    public $timestamps = false;

    public function updateStatusTo($newStatus)
    {
        if (!in_array($newStatus, self::MARITAL_STATUS)) {
            throw new Exception('Invalid user marital status.');
        }

        $this->status = $newStatus;
    }

    public function setStatusAttribute($statusValue) {

        // Normalize first, so that 'Single' and 'single' are treated alike.
        $statusValue = strtolower($statusValue);

        if (!in_array($statusValue, self::MARITAL_STATUS, true)) {
            throw new Exception('Invalid user marital status.');
        }

        $this->attributes['status'] = $statusValue;
    }
}

Using the model:

This model can be used in the same way as the previous one. Just remember to change the use declaration:

use App\Models\ConstUser as User;

Of course, it is an improvement, but there are a few problems here. First, typos can still happen when checking for equality or building queries through the ORM’s query builder. Second, no type hinting is possible. So, in every place where we need a valid value, we have to check whether it is included in the ConstUser::MARITAL_STATUS array. It would be nicer to have the method signature make this check automatically. Finally, the IDE cannot help us remember the set of valid values for a user status. We need to go back to our model and look it up.
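The first problem is easy to overlook, so here is a small stand-alone sketch of it. SketchConstUser is a hypothetical plain class (not an Eloquent model) that only keeps the validation logic discussed above.

```php
<?php

// Stand-alone sketch: a plain class instead of an Eloquent model.
class SketchConstUser {
    const MARITAL_STATUS = ['single', 'married', 'divorced'];

    private $status;

    public function setStatus($statusValue) {
        if (!in_array($statusValue, self::MARITAL_STATUS, true)) {
            throw new InvalidArgumentException('Invalid user marital status.');
        }
        $this->status = $statusValue;
    }

    public function getStatus() {
        return $this->status;
    }
}

$user = new SketchConstUser();
$user->setStatus('married');

// Writing is protected: a misspelled value is rejected.
try {
    $user->setStatus('maried');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ' . $e->getMessage() . PHP_EOL;
}

// Reading is not: this misspelled comparison is simply false, with no error.
var_dump($user->getStatus() === 'maried'); // bool(false)
```

Validation guards the value that gets stored, but every comparison and query condition elsewhere in the code is still a bare string.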

(c) Using an enumerated type

We will use the same MyCLabs\Enum\Enum class that we introduced earlier in this post. The definition of the enumerated type is:

namespace App\Enumerations;

use MyCLabs\Enum\Enum;

class UserStatus extends Enum
{
    const SINGLE   = 'single';
    const MARRIED  = 'married';
    const DIVORCED = 'divorced';
}

The ORM model will use the enumerated type class as follows:

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use App\Enumerations\UserStatus;

class EnumUser extends Model {

    protected $table = 'users';
    protected $fillable = ['username','password','status'];
    public $timestamps = false;

    public function updateStatusTo(UserStatus $newStatus)
    {
        $this->status = $newStatus;
    }

    public function getStatusAttribute(string $statusValue)
    {
        return new UserStatus($statusValue);
    }

    public function setStatusAttribute(UserStatus $statusValue)
    {
        $this->attributes['status'] = $statusValue->getValue();
    }
}

As you see, setting the status property requires a UserStatus instance, which will be converted to a string before it is stored. And every time we access this property, the string retrieved from the database will be converted to a UserStatus instance before being returned.

How to use the model:

use App\Models\EnumUser as User;
use App\Enumerations\UserStatus;
...
// Create
$user = new User([
    'username' => 'gooffy1',
    'password' => '45t049t7n4cny48y',
    'status'   => UserStatus::SINGLE()
]);

// Save
$user->save();

// Query
$singleUser = User::query()->where('status', UserStatus::SINGLE())->first();

// Check equality
if ($singleUser->status->equals(UserStatus::SINGLE())) {
    echo "It is equal!";
}

// assign
$singleUser->status = UserStatus::MARRIED();
$singleUser->save();

// Pass parameter
$singleUser->updateStatusTo(UserStatus::DIVORCED());
$singleUser->save();