Pippin Williamson is doing a great series on when to use custom database tables in WordPress. If you haven’t read at least his primer yet, do it now – seriously, I’ll wait. This tutorial is going to assume that you already know the general structure of your data, but if you do have any questions about this feel free to ask. Lastly, for those of you who like to read the last page of a book before starting, all of the code is publicly available on GitHub. P.S. The code for this tutorial heavily relies upon language features available in PHP 5.3.

When I use custom tables in a WordPress project there are a couple of requirements I have:

  1. Clear and well defined models. I don’t like passing around stdClass objects throughout my code. It’s too easy to make mistakes – is that property customer or customer_id – and can be a pain for other developers who have to use your code later on.
  2. Performance. If I have to retrieve an instance of a particular model multiple times in one page-load, I don’t want to have to make a query to the database every time.
  3. Easily queryable. I don’t want to have to write raw sql throughout my application, and I doubt you do either. It should be easy to find the data I’m looking for, without referencing the mysql handbook every five minutes.

Throughout this tutorial I’ll be using simplified examples from my upcoming Software Licensing add-on from iThemes Exchange. On that note, let’s get started!

1000 Foot Overview

This project makes use of a number of classes stretched across multiple directories. So you don’t get lost I’ll give you a quick overview here.

  • Cache and Cacheable are very basic wrappers for WordPress’ Object Cache. These are used in lieu of those functions so we can easily pass a Cacheable model to the Cache manager.
  • Table is the interface that defines the contract between custom tables and their managers. For example class Customers implements Table.
  • Simple_Query and Complex_Query are two classes used for retrieving models. Simple_Query provides the CRUD functionality, while Complex_Query is an abstract class that can be used to provide WP_Query like functionality to your database model. Along with this is the Builder class and an assortment of sql tag classes such as SelectFromWhere, etc…
  • Model is the heart and soul of this project. It is what represents objects that come from a certain Table. It is an abstract class that is meant to be extended for each Table interface. For example class Customer extends Table.
  • Manager is a utility class that handles registering and installing custom tables with WordPress. It also provides convenience methods for retrieving Simple_Query and Complex_Query objects for a particular Table.

The Table Interface

For every custom table you plan to use, you should create a separate class that implements the Table interface. If you like, you can see the entire Table interface on GitHub, but here we are going to go over a concrete example.

The following class Releases is used for storing information about releases – who’d’a’ thunk it. Each new release of the product gets its own entry in the Releases table.

The first method get_table_name( \wpdb $wpdb ) is used to retrieve the full name of the table as it should appear in mysql. It is important to not only use the prefix provided by wpdb, but to also provide another prefix that is unique to your project or plugin. get_slug() is mainly a convenience method to retrieve a human readable representation of what this table stores.

The next three functions get_columns() and get_column_defaults(), and get_primary_key() define what data this table stores. The first method returns an array of column names to their associated sprintf type. These are used by WordPress when we either insert data or update data in the database. We are limited here to %s%d, and %f. The second method returns an array of column names to their default values, and the third method defines what column is to be used as the primary key for this table. In most cases the primary key is an auto-incrementing integer ID, but it doesn’t have to be.

Lastly we have get_creation_sql( \wpdb $wpdb ) and get_version(). The first method returns the sql statement used to create the table during installation. The second method defines the version of the table as it exists in code. The Manager class worries about keeping track of the currently installed version, and if needed will run the get_creation_sql() method again using dbDelta() to upgrade to the newest version of the table.

Once your table class is complete, you should register it with WordPress and install it during upgrades or activations.

Manager::register( new Releases() );

$fn = function install_tables() {
    Manager::maybe_install_table( new Releases() );
}

add_action( 'itelic_activate', $fn );
add_action( 'itelic_upgrade', $fn );

Defining your Models

The next class to explore is the abstract Model class. This class is critically important, so we will be examining both the base abstract class and a concrete example using our Releases table from earlier.

If you weren’t able to process that all on the first glance, don’t worry we’ll take it step by step.

get()

The first method to look at is get( $pk ). This method is a static method that is used to retrieve objects by their primary key (pk). It is a static method that utilizes the get_called_class() function provided in PHP 5.3. Essentially, this allows us to determine what class get() was called from. With that knowledge we can determine which database table the record is located in.

First we call another static method get_data_from_pk() to try and retrieve the record’s data from the database. If we successfully retrieve the data, we then instantiate a new model with the correct class. Next, we call the method is_data_cached() to determine if this model exists in the class. If it doesn’t we store it using the Cache manager I talked about earlier.

get_data_from_pk()

The next important method is get_data_from_pk( $pk ). This is how we actually retrieve the data from the database. First we attempt to retrieve the data from the cache, saving us a costly trip to the database. If the data doesn’t exist in the cache we have to go ahead and get it from the database. To do this we call the method get_cache_group(). This method uses a prefix of static. This keyword functions similarly to get_called_class(), it calls the static class get_table() from the class that called this function. For example, in a class Release with a definition of Release extends Model, when we call Release::get_data_from_pk( $pk ) the method Release::get_table() is called, instead of Model::get_table().

get_table()

This method returns an object implementing the Table interface. This is how we are able to intelligently determine the correct database table to use depending on which model class is being used. Unfortunately PHP doesn’t allow us to define abstract static methods – for good reason, though – so if that method is not overridden in a child class of Model we throw an UnexpectedValueException.

update() and delete()

These methods aren’t particularly interesting, but they contain two very important bits of code. The update( $key, $value ) method updates the data in the database, and it also updates the value in the cache. This second step is critical to ensure that the cache and the database aren’t out of sync. The delete() method does the same thing; it deletes the data from the database and from the cache.

get_data_to_cache()

This is an interesting method. It tries to be as smart as it can, but it has its limits.

public function get_data_to_cache() {

	$data = array();

	$defaults = static::get_table()->get_column_defaults();

	foreach ( $defaults as $col => $default ) {

		if ( method_exists( $this, 'get_' . $col ) ) {
			$method = "get_$col";

			$val = $this->$method();

			if ( is_object( $val ) ) {

				if ( $val instanceof Model ) {
					$val = $val->get_pk();
				} else if ( $val instanceof \DateTime ) {
					$val = $val->format( 'Y-m-d H:i:s' );
				} else if ( isset( $val->ID ) ) {
					$val = $val->ID;
				} else if ( isset( $val->id ) ) {
					$val = $val->id;
				} else if ( isset( $val->term_id ) ) {
					$val = $val->term_id;
				} else if ( isset( $val->comment_ID ) ) {
					$val = $val->comment_ID;
				}
			}

			$data[ $col ] = $val;
		} else {
			$data[ $col ] = $default;
		}
	}

	return $data;
}

This method returns the data as it should be stored in the cache. To try and make this as seamless as possible, the method makes an important assumption, that for each column in the database a method exists in the child model class that follows the format of get_$column(). If that method doesn’t exist, than the default value is saved. Don’t worry though, this method is meant to be overridden.

If a method following that format does exist, it retrieves the value and attempts to convert it to something that can be stored in the database. First it checks if the value is an object, because objects by themselves cannot be stored in the database. Next it performs a variety of checks to try and determine a scalar value to save in the database. First it checks if the value is another Model if it is, then all we have to do is get the primary key of that object. Next it checks if it is a DateTime object, and if so, formats the date to the format required by mysql. The next four if statements try to determine if it is a WordPress object such as a post, a comment, or a term.

Finally, all of this data is returned in an array of column names to their respective values. The function get_cache_group() is then used to determine in what WordPress Object Cache group to store the values. By default, this is the slug specified by the Table class, but could be overwritten if need be.

An Example

This is a highly simplified example of what a subclass of Model looks like. Not all of the getters and setters are present in the gist, and most of the edge cases the model has to handle aren’t there as well. Also, in the real class I use class constants to represent the status, instead of passing around a string.

There are a couple of things I’d like to point out here. First is the init( \stdClass $data ) method. This method is what initializes the object with data. It is defined as a separate method, and not in the constructor so that the serialize() and unserialize() functions can make use of it. Next is the create() method. This is how new objects are created and inserted into the database. The two important things to notice is using the Simple_Query object to insert the data, and then caching the data using the Cache manager.

In this example I’ve also included some examples of what getters and setters look like. Setters make it easy to update an object’s properties while enforcing a restricted set of inputs. It also lets you intelligently call methods when a value changes. In this example the pause() and activate() methods are called automatically when the release’s status is changed.

You can also take a look at the get_table() method to see how we are returning the correct table object for this model.

Using the Query Objects

We have two ways to retrieve our custom data from the database. We can either use the Simple_Query object or the Complex_Query object. The Simple_Query class is primarily meant for retrieving and editing information about one particular record, but it also provides a couple of other utility methods. Simple_Query is very similar to how Pippin handles custom database tables in his plugins Easy Digital Downloads, and Affiliate WP. But it is modified to make use of an injected Table class, and uses the sql builder classes I briefly referenced earlier. Complex_Query objects, on the other hand, are used to retrieve multiple records based on a variety of where conditions.

Simple_Query

Retrieving a Simple_Query is easy by using the provided factory method in the Manager class. Simply call Manager::make_simple_query_object( $table_slug );. Once the object is available to you, there are a variety of methods you can perform on it. This is a long, but not very complex class, so I’ll only highlight a couple of methods.

What not to do

Simple_Query provides insert()update()delete(), and delete_many() methods. These are used by Model classes to perform CRUD operations. Since these methods themselves don’t handle caching and cache invalidation, you should likely stick to their Model class equivalents.

What to do

Simple_Query is handy tool for retrieving a single record by a particular column. In most cases this would be by primary key, and you should use Model::get() to do that. But if your table has multiple unique columns you can use Simple_Query::get_by() to retrieve the raw data that matches the column you specify. Additionally, if you only need to retrieve a particular column, perhaps retrieve the ID of a customer that matches a certain email, you can call Simple_Query::get_column_by().

count() method is also provided to quickly determine how many records match a certain where clause. Simple_Query::count() expects an array of column to value conditions. All conditions are looking for an = match, and are ANDed together. For your more complex counting needs we have the Complex_Query class.

Complex_Query

Complex_Query is what we use when we need to make WP_Query like queries to find data. It handles pagination, distinct queries, order by clauses, and retrieving full models, or partial column lists. Again, this is a long class, so I’m not going to cover every method, but if you have any questions let me know. Complex_Query is an abstract class that requires a Table object and a set of $args used to build the query. In most cases when you extend Complex_Query you’ll provide the necessary Table object within your parent constructor call, so your users only need to pass their desired $args array.

Pagination

Pagination is handled with the items_per_page and page arguments. By default all records are returned, but you can turn on pagination by specifying the desired number of items you’d like to have return on each page, and by specifying a page number. page is 1-index based, so for the first page pass 1 not 0. When a value other than -1 is provided to items_per_pagesql_calc_found_rows is automatically turned on to retrieve the total number of records that match your where conditions. If this isn’t necessary you can pass false to prevent that query from being run. If you are only retrieving the first 5 records that match a condition, and you don’t need to provide pagination, you should turn this off for performance reasons.

Select

By default, Model objects will be returned from the query. However, you can specify a variety of other return types if you’d like as well in the return_value argument. You can pass an array of column names, or a single column name to only retrieve data from those columns. Or you can pass count to simply retrieve the total number of records that match your where condition, without actually retrieving the records from the database. Finally, you can optionally pass true in the distinct argument to apply a sql distinct filter on the results returned.

Order

Records can be ordered by passing an array in the order argument. The array should follow the format of $column_name => $direction, where direction is either ASC or DESC. The passed array can have one value, or multiple. If multiple orders are provided they are evaluated left to right. If you’d like to order by rand simple pass rand to the order argument instead of an array of column values.  The parse_order methods also makes use of translate_order_by_to_column_name(). This method can be used if you want to use a more human readable order by value in the order argument. For example customer could translate to the column cid in your table.

Where

I’ve saved the best for last. To really see how this works I’ll show you an example. But I wanted to go over one of the building block methods first Complex_Query::parse_in_or_not_in_query(). This allows you to do things like the following to retrieve all releases with a type of major for all products with an ID of either 5, 12, or 32, that aren’t the v1.0 release.

$args = array(
    'product__in'     => [ 5, 12, 32 ],
    'version__not_in' => [ '1.0' ],
    'type'            => 'major'
);

Now I’m going to go over the Complex_Query object for releases. This is the simpler of Complex_Query subclasses. It is also possible to do joins or between conditions using the packaged Where and Join sql tag classes. If you’d like to see some example of this let me know.

Let’s get the easy stuff out of the way first. In the constructor, we only accept the $args array, and pass the Table object to the parent constructor directly. Next is make_object() this method is called with the full data of an individual record and is expected to return a Model subclass. This data should not be cached. The caching is handled by Complex_Query internally.

Now for the big one, build_sql(). This method is responsible for returning a sql query that will then be executed by wpdb. You can build this however you like, but it is easiest using the provided Builder class and its associated sql tag classes. This method is a fairly straightforward assembly of sql parts. First we instantiate a new Builder object. Then we ask Complex_Query to build the SELECT tag for us based on the arguments provided by the user. Next we construct a FROM tag based on the current table we’re querying. We also pass an alias of q. This is used for differentiating between records when using JOIN statements or subqueries.

The next section is for building the various where clauses. Each individual where clause is built by a method in the class, and either returns null or a Where object. If it isn’t null we AND it with the previous where statement. The Where object also supports OR and XOR expressions as well. It can also keep track of nesting if you need a more complex where statement.

After we have our full Where object constructed, we build an ORDER tag and a LIMIT tag based on the pagination parameters. Finally we assemble all of the pieces together and return the full SQL statement.

parse_product() and parse_download() are fairly straightforward wrappers for the parse_in_or_not_in_query() method we talked about earlier. parse_status() allows us to build a Where statement similar to how WP_Query handles the post_type argument. This could also be a in or not in query, but is implemented this way for convenience sake.

parse_start_date() is pretty cool. WordPress provides an awesome class called WP_Date_Query that allows us to perform date queries like before, between certain months, days and a whole lot more. We use that class to power our Where_Date class. This allows us to only retrieve releases that occur on certain days of the week, or happened in the previous year, etc…

Conclusion

I hope this was an informative tutorial of one of the ways you can manage and implement custom tables in WordPress. This is by no means the only way to do it, or the correct way to do it. I would love to hear your feedback, ideas, and questions.

The DB package is available on GitHub under the MIT license. I hope to add composer support soon.

Published by Timothy Jacobs

1 Comment

  1. Awesome tutorial. Thanks Timothy. Love the clear details and explanations. Now it’s time to get coding!

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *