If you don’t have to deal with large groups of data, this post is not for you. But if you do, and aren’t familiar with the modulo operator, read on.
The modulo operation essentially calculates the remainder of a division. For example:
5 % 4 = 1
What’s very useful is that you can take any number and easily assign it to one of a fixed number of groups. For example, if you needed to separate 6 items into 3 groups (OK, obviously you would do this differently, but it’s just for demonstration), you can do the following:
1 % 3 = 1
2 % 3 = 2
3 % 3 = 0
4 % 3 = 1
5 % 3 = 2
6 % 3 = 0
Now your numbers are in group 0, 1, or 2. This gets much more useful when you think about how almost all RDBMS tables have a numeric primary key value. Need to split all the users in your system into 10 batches? Just calculate the user ID % 10 and you have all your users evenly (or, evenly enough) split.
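To make the batching idea concrete, here is a minimal sketch in PHP. The function name batchByModulo and the sample IDs are made up for illustration; in a real system the IDs would come from your table's primary key, and the sketch assumes they are positive integers.

```php
<?php
// Hypothetical helper: group integer IDs into $batchCount batches using ID % $batchCount.
// Assumes positive IDs; in PHP a negative ID would produce a negative remainder.
function batchByModulo(array $ids, int $batchCount): array
{
    // Start with one empty bucket per batch so every key 0..$batchCount-1 exists.
    $batches = array_fill(0, $batchCount, []);
    foreach ($ids as $id) {
        $batches[$id % $batchCount][] = $id;
    }
    return $batches;
}

// Sample IDs 1..25 stand in for real primary keys.
$batches = batchByModulo(range(1, 25), 10);
foreach ($batches as $batch => $ids) {
    echo "batch $batch: " . implode(', ', $ids) . PHP_EOL;
}
```

With 25 IDs and 10 batches, some batches hold three IDs and others two, which is the "evenly enough" part: the split is only perfectly even when the number of IDs is a multiple of the batch count and the IDs have no gaps.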
This is a great way to implement sharding. For example, WordPress 3.0 and higher supports multiple sites running on the same codebase. It does this by creating a separate set of tables for each site, named in the format PREFIX + ID + TABLE NAME (so a table might be called wp_322_posts). Well, when you have a thousand sites, each with 10 or more tables, you start running into trouble keeping them all in one database. That’s where the modulo operator comes into play:
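function db_callback($query, &$wpdb) {
    // $wpdb->base_prefix = 'wp_';
    // $wpdb->table = 'wp_322_posts';
    // Escape the prefix so it is matched literally inside the pattern
    $prefix = preg_quote($wpdb->base_prefix, '/');
    if (preg_match("/^{$prefix}(\d+)_/i", $wpdb->table, $matches)) {
        // Pretend we have 3 databases
        // Will return 'database0', 'database1' or 'database2'
        return 'database' . ((int) $matches[1] % 3);
    }
    // Tables without a site ID (e.g. wp_users) fall through and return null
}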
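To see how site IDs spread across the databases, the same lookup can be pulled out into a standalone function and run against a few table names. This is only a sketch: databaseForTable, its default arguments, and the sample table names are hypothetical and not part of WordPress.

```php
<?php
// Hypothetical standalone version of the lookup: table name in, database name out.
function databaseForTable(string $table, string $basePrefix = 'wp_', int $databaseCount = 3): ?string
{
    // Match PREFIX + numeric site ID + underscore at the start of the table name.
    $pattern = '/^' . preg_quote($basePrefix, '/') . '(\d+)_/i';
    if (preg_match($pattern, $table, $matches)) {
        return 'database' . ((int) $matches[1] % $databaseCount);
    }
    return null; // no site ID in the name, e.g. wp_users
}

foreach (['wp_322_posts', 'wp_323_posts', 'wp_324_options', 'wp_users'] as $table) {
    echo $table . ' => ' . var_export(databaseForTable($table), true) . PHP_EOL;
}
```

One caveat worth keeping in mind with this scheme: the result depends on the database count, so going from 3 to 4 databases changes the answer for most site IDs and the existing tables would have to be moved to match.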
Another recent way I’ve used the modulo operator is sending out a weekly batch email to a large number of users. In this case, I had hundreds of thousands of records to read, parse, and then send out an email based on the results. Rather than process all 600,000 records at once, I split them up into 7 batches, one for each day of the week. Here’s a simplified version of that query:
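// date('w') returns the day of the week, 0 (Sunday) through 6 (Saturday)
// `table` is a placeholder; it is backtick-quoted because TABLE is a reserved word in MySQL
$sql = 'SELECT * FROM `table`
        WHERE user_id MOD 7 = ' . (int) date('w');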
Then I just set up the script to run daily through my crontab, and each user gets their email once a week, on the same day each week. Alternatively, you could run the script every hour to break the work into 168 separate blocks (7 days × 24 hours) throughout the week:
// date('G') returns the hour, 0 through 23, so the slot runs from 0 to 167
$sql = 'SELECT * FROM `table`
        WHERE user_id MOD 168 = ' . ((int) date('w') * 24 + (int) date('G'));
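The slot arithmetic is easy to check in isolation. The sketch below computes the hour-of-week slot for a fixed timestamp and tests whether a given user ID falls into it; hourOfWeekSlot and isDueThisHour are hypothetical helper names, and the timezone is pinned to UTC only so the example behaves the same everywhere.

```php
<?php
// Hypothetical helper: which of the 168 hourly slots does a timestamp fall in?
function hourOfWeekSlot(int $timestamp): int
{
    // date('w') is 0 (Sunday) through 6; date('G') is 0 through 23.
    return (int) date('w', $timestamp) * 24 + (int) date('G', $timestamp);
}

// Hypothetical helper: does this user belong to the slot for the given time?
function isDueThisHour(int $userId, int $timestamp): bool
{
    return $userId % 168 === hourOfWeekSlot($timestamp);
}

date_default_timezone_set('UTC'); // assumption: fixed timezone for a repeatable example
$monday9am = mktime(9, 0, 0, 1, 1, 2024); // 1 January 2024 was a Monday

echo hourOfWeekSlot($monday9am) . PHP_EOL; // 1 * 24 + 9 = 33
var_dump(isDueThisHour(201, $monday9am));  // 201 % 168 = 33, so true
var_dump(isDueThisHour(34, $monday9am));   // false
```

Computing the slot once in PHP and passing it into the query also avoids calling date() twice, which matters only in the rare case where the two calls straddle an hour boundary.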
It’s something that you don’t necessarily use often, but it’s very handy to have in your repertoire.