Escaping Problems with Slashes in WordPress

If you’re a web developer who uses WordPress, you’ve probably been surprised to find extra backslashes magically added to the $_GET and $_POST request data, right? It’s a confusing situation, with information dispersed sparsely across the internet. While working on Event Espresso, we’ve had our share of troubles with it, and wanted to share what we’ve learned.

If you get “slashes wrong” in your code, you can end up with really big security problems (specifically, SQL injection), bugs, or user-created content littered with extra slashes, like I\\\\\'m \\\\\'drowning\\\\\" in slashes!

So, do you know how to work with slashes? Here’s a quiz:

  1. Consider this line of PHP: echo 'it\'s';. The \ is an “escape character”, and \' is an “escape sequence”. True or False?
  2. The HTML <input value="\ \\ \r\n \' \""> will be displayed as a text box containing \ \\ \r\n \' \ (with no double-quote at the end). True or False?
  3. The following 3 lines of PHP all show the exact same thing: echo '\ \\ \r\n \' \"';, echo "\ \\ \r\n \' \"";, and ?>\ \\ \r\n \' \"<?php . True or False?
  4. The PHP code echo addslashes('billy \"the nose\"') will echo billy \\\"the nose\\\". True or False?
  5. Adding backslashes to request data is the recommended way to prevent SQL injection. True or False?
  6. Your WordPress code can rely on $_GET and $_POST request data always having backslashes added to it, regardless of whether PHP’s “Magic Quotes” is turned on. True or False?
  7. $wpdb->prepare, $wpdb->insert and $wpdb->update all take care of adding slashes, so data provided to them should not have slashes added to it. True or False?
  8. Calling $_POST = stripslashes_deep($_POST); is the best way to handle backslashes added to request data. True or False?

Each of the following sections will contain an answer to one of the questions.

Slashes Explained

Escape Characters and Escape Sequences

Consider this line of PHP: echo 'it\'s';. The \ is an “escape character”, and \' is an “escape sequence”. True

Wikipedia says:

An escape sequence is a sequence of characters that does not represent itself when used inside a character or string literal, but is translated into another character or a sequence of characters that may be difficult or impossible to represent directly.

For example, in the PHP statement echo 'it\'s'; the \' is an escape sequence. Because the string started with a single-quote, we can’t add a regular single-quote to the string without abruptly ending it. echo 'it's'; would be invalid PHP, because there’s a valid string 'it', followed by an unexpected s and a stray single-quote. So if you want to put a single-quote in your string, you need to use the escape sequence \' to represent it.

And by the way, in the string \', the backslash is acting as an escape character, meaning the character(s) after it have a special meaning. Backslashes are often used as escape characters in PHP strings and MySQL queries, but not in HTML.

HTML and Backslashes

The HTML <input value="\ \\ \r\n \' \""> will be displayed as a text box containing \ \\ \r\n \' \ (with no double-quote at the end). True

When displaying your page’s HTML, Web browsers leave backslashes alone and don’t treat them as escape characters.

So if there is \ \\ \r\n \' \" in the page’s source code, that’s also exactly how the browser will display it to the user.

That goes for inside HTML input attributes too: backslashes are not escape characters. So

<input value="\ \\ \r\n \' \""> will show a text box containing \ \\ \r\n \' \

Notice the double-quotation mark disappeared. That’s because \" wasn’t considered an escape sequence, so the " was interpreted as a normal double-quote that ends the attribute value. (And so that’s actually bad HTML, because there is now an extra double-quote in that tag.)

So clearly, adding slashes in front of quotes inside HTML attributes doesn’t escape them. If you want to display a double-quotation mark inside an HTML input’s value, you should use the HTML entity &quot;
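As a small sketch of that advice in PHP: htmlspecialchars() converts quotes into HTML entities, so the whole value survives inside the attribute. The variable names here are only illustrative; in WordPress code, esc_attr() serves the same purpose.

```php
<?php
// The raw value we want to show in the input. A nowdoc is used so that
// PHP does not interpret any of the backslashes.
$value = <<<'TEXT'
\ \\ \r\n \' \"
TEXT;

// Convert quotes (and &, <, >) into HTML entities. Backslashes are left alone,
// because they have no special meaning in HTML.
$escaped = htmlspecialchars( $value, ENT_QUOTES, 'UTF-8' );

echo '<input value="' . $escaped . '">', PHP_EOL;
// <input value="\ \\ \r\n \&#039; \&quot;">
```

The browser turns the entities back into quotes when it displays the field, so the user sees the original value, double-quote included.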

PHP and Backslashes

The following 3 lines of PHP all show the exact same thing: echo '\ \\ \r\n \' \"';, echo "\ \\ \r\n \' \"";, and ?>\ \\ \r\n \' \"<?php . False

Backslashes can mean different things depending on whether they appear in a single-quoted string, in a double-quoted string, or are fetched from the database.
In a single-quoted string, they’re only treated as an escape character if they appear in front of a single-quote or another backslash.

So echo '\ \\ \r\n \' \"'; will show \ \ \r\n ' \". Notice the backslashes are all interpreted as literal backslashes EXCEPT

  1. \\ became \
  2. \' became '

In a double-quoted string, they are treated as escape characters when placed in front of many other characters (see the table under “Double quoted” on the PHP page), including a double-quote, but not a single-quote.
So, echo "\ \\ \r\n \' \""; will show \ \ followed by a carriage return and a new line, and then \' ". This time, many more of the backslashes were considered escape characters, but not all, and not all the same ones. Specifically,

  1. \\ became \ (same as with a single quoted string)
  2. \r became a carriage return
  3. \n became a new line
  4. \" became "

Notice the \' was NOT considered an escape sequence, like it was with a single quote string.
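A quick way to check the difference yourself is to put the same characters in both kinds of string and compare the results. This is only a minimal sketch; the variable names are made up for illustration.

```php
<?php
// Single-quoted: only \\ and \' are escape sequences.
$single = '\ \\ \r\n \' \"';

// Double-quoted: \\, \r, \n and \" are escape sequences, but \' is not.
$double = "\ \\ \r\n \' \"";

echo strlen( $single ), PHP_EOL; // 13
echo strlen( $double ), PHP_EOL; // 11

// Does the string contain a real carriage return + new line?
var_dump( strpos( $single, "\r\n" ) !== false ); // bool(false): still the literal characters \r\n
var_dump( strpos( $double, "\r\n" ) !== false ); // bool(true)
```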

Lastly, backslashes are also treated differently if they’re not from a string in your PHP code, e.g. if they’re from a database column or HTML code entered outside of PHP tags. In this case, they are never treated as escape characters. So, ?>\ \\ \r\n \' \"<?php , and global $wpdb; echo $wpdb->get_var('SELECT option_value FROM wp_options WHERE option_name="my_option"');, where the option_value is \ \\ \r\n \' \", will both show \ \\ \r\n \' \". I.e., none of the backslashes were considered an escape character.

Magic Quotes and Backslash functions

The PHP code echo addslashes('billy \"the nose\"') will echo billy \\\"the nose\\\". True

From php.net, addslashes:

Returns a string with backslashes before characters that need to be escaped. These characters are single quote ('), double quote ("), backslash (\) and NUL (the NULL byte).

So the code echo addslashes('billy \"the nose\"') will echo billy \\\"the nose\\\", because each backslash in the original string gets another one added, and each double-quote gets a backslash added in front of it. Each time addslashes is called, more slashes get added. So calling it again on that result, i.e. echo addslashes(addslashes('billy \"the nose\"')), will echo billy \\\\\\\"the nose\\\\\\\", etc.
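Here is a minimal sketch of that pile-up, and of how stripslashes reverses it one layer at a time. The variable names are only for illustration.

```php
<?php
$original = 'billy \"the nose\"';  // the value is: billy \"the nose\"

$once  = addslashes( $original );  // billy \\\"the nose\\\"
$twice = addslashes( $once );      // billy \\\\\\\"the nose\\\\\\\"

echo $once, PHP_EOL;
echo $twice, PHP_EOL;

// Each stripslashes() call undoes exactly one addslashes() call.
var_dump( stripslashes( $twice ) === $once );                     // bool(true)
var_dump( stripslashes( stripslashes( $twice ) ) === $original ); // bool(true)
```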

(Side note: because we’re talking about slashes and WordPress, you may like to know about WordPress’ wp_slash and wp_unslash functions. They’re mostly the same as addslashes and stripslashes, except they work with either a string or array; and it’s possible they may diverge more in the future.)

SQL Injection and Extra Backslashes

Adding backslashes to request data is the recommended way to prevent SQL injection. False

Adding slashes to a string before using it in MySQL code can save you from SQL injection, so a long time ago some PHP developers came up with the idea of always adding slashes onto request data received from the user, e.g. $_GET and $_POST data. This feature was known as “Magic Quotes”.

PHP’s Magic quotes was an attempt to make GET and POST data safe, by default, for use in database queries. Consider the next line of code

If $_GET['title'] were ";DROP TABLE wp_posts;-- this would be SQL injection because the generated SQL would be

Which would delete the entire posts table! So that would be a big problem if $_GET didn’t get slashes added onto it. But if we called wp_slash on $_GET['title'] the SQL generated would have instead been:

Which would instead look for posts with the title ";DROP TABLE wp_posts;-- which is harmless.
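To make the scenario concrete, here is a rough sketch in plain PHP. It assumes the query was built by concatenating the request value straight into the SQL string; my_build_title_query is a hypothetical helper, and addslashes stands in for what Magic Quotes (or wp_slash) would have done to the value.

```php
<?php
// Assumption: the SQL is built by gluing request data into the query string.
function my_build_title_query( $title ) {
    return 'SELECT * FROM wp_posts WHERE post_title="' . $title . '"';
}

$malicious = '";DROP TABLE wp_posts;--';

// Without slashes, the double-quote closes the string early
// and the rest is read as SQL.
echo my_build_title_query( $malicious ), PHP_EOL;
// SELECT * FROM wp_posts WHERE post_title="";DROP TABLE wp_posts;--"

// With slashes added, the double-quote stays part of the title.
echo my_build_title_query( addslashes( $malicious ) ), PHP_EOL;
// SELECT * FROM wp_posts WHERE post_title="\";DROP TABLE wp_posts;--"
```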

So, the original purpose of magic quotes was good. But it turns out that adding slashes is not the only thing you should do to input before using it in your database: what you do to it depends on what type of database you’re using, because MySQL escape sequences are different from PostgreSQL escape sequences, etc. So in reality, you usually need to remove the extra slashes and then use a database-specific escaping function. Needing to do this was obviously a pain, and so PHP 5.4 and higher officially no longer support magic quotes.

WordPress, Superglobals, and Backslashes

If you incorrectly think a string has backslashes added onto it, and then directly use it in an SQL query, you can have SQL injection problems. But conversely, if a string does have slashes added onto it, and you add backslashes to it again, users will see extra slashes added all over. And because not all WordPress users had Magic Quotes enabled, it was difficult to know if request data had slashes already added or not. So WordPress core developers tried to at least make things consistent by always adding backslashes onto request data regardless of whether Magic Quotes were enabled or not, even for versions of PHP that don’t even support Magic Quotes. (Documentation on this is pretty sparse, but read the notes on stripslashes_deep if you want more info.)

Your WordPress code can rely on $_GET and $_POST request data always having backslashes added to it, regardless of whether PHP’s “Magic Quotes” is turned on. False

So because WordPress always adds backslashes to the request data, you can rely on there being backslashes, right? Wrong. Despite it being a bad practice, many plugins and themes have $_POST = stripslashes_deep($_POST); which removes backslashes again, for all other code too, but only if that particular plugin is active. So watch out!

What’s more, even if there are no plugins or themes interfering with slashes, the request data might not have slashes added onto it yet. WordPress only adds the slashes after it has done the plugins_loaded action (inside wp-settings.php it calls wp_magic_quotes). So if you have code that’s using request data before or during plugins_loaded, its request data will NOT have slashes added onto it by WordPress; but any code after plugins_loaded, like code running during sanitize_comment_cookies or afterwards, will have slashes added onto it.

WPDB and Backslashes

$wpdb->prepare, $wpdb->insert and $wpdb->update all take care of adding slashes, so data provided to them should not have slashes added to it. True

When interacting with the database in WordPress, it’s best to use the global $wpdb object. It has many methods for interacting with the database, but it’s important to know which ones expect your input to be escaped and which ones do not.

When passing data into its get_results, get_var, get_col, and query methods, it expects that the string you’re passing in was already prepared for use in the database; otherwise it could introduce SQL injection problems. Providing slashed request data helps prevent SQL injection, but it’s not the best way to prepare request data for use in a database query, because there are other escape sequences it might miss.

WPDB’s prepare is the preferred way to prepare request data for use by one of those previously-mentioned functions. Here’s a link to its documentation. But an important aspect not mentioned in the documentation is that the arguments being passed into it, if from request data, should not have backslashes added onto them. That’s part of what prepare does.

E.g., if $_GET has had slashes added onto it, this is how you should use it with $wpdb->prepare
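A sketch of such a call might look like the following. The function name and the query are illustrative assumptions, not code from a real plugin; wp_unslash, $wpdb->prepare and $wpdb->get_results are the actual WordPress functions being demonstrated.

```php
<?php
/**
 * Hypothetical helper: find posts by the title sent in the request.
 *
 * @param wpdb  $wpdb        The WordPress database object.
 * @param array $slashed_get Request data that WordPress has already slashed, e.g. $_GET.
 */
function my_find_posts_by_title( $wpdb, array $slashed_get ) {
    // WordPress added slashes to the request data, so remove them first.
    $title = wp_unslash( $slashed_get['title'] );

    // prepare() takes care of escaping the value for the %s placeholder.
    $sql = $wpdb->prepare(
        "SELECT * FROM {$wpdb->posts} WHERE post_title = %s",
        $title
    );

    return $wpdb->get_results( $sql );
}
```

Inside WordPress you would call it as my_find_posts_by_title( $wpdb, $_GET ) after declaring global $wpdb.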

Because if $_GET['title'] had the value of What I think about \"Star Wars\" (notice the extra slashes, which would have been added by WordPress, and not present in the user’s original request data), we want to send What I think about "Star Wars" to the database, without those pesky extra slashes. That’s why I added wp_unslash to that snippet.

There are other methods on WPDB that also take care of preparing the data you send them. Specifically, insert, update, and delete. You don’t need to call prepare on input for them because they assume the data you’re providing hasn’t been prepared, nor had slashes added onto it.
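For instance, an insert could be sketched like this. The function name, the my_notes table and the note field are hypothetical; only wp_unslash and $wpdb->insert are real WordPress APIs.

```php
<?php
/**
 * Hypothetical helper: save a note submitted in the request.
 *
 * @param wpdb  $wpdb         The WordPress database object.
 * @param array $slashed_post Request data that WordPress has already slashed, e.g. $_POST.
 */
function my_save_note( $wpdb, array $slashed_post ) {
    // Remove the slashes WordPress added; insert() expects raw, unslashed data.
    $note = wp_unslash( $slashed_post['note'] );

    // insert() prepares the values itself, so there is no call to prepare() here.
    return $wpdb->insert(
        $wpdb->prefix . 'my_notes', // hypothetical custom table
        array( 'note' => $note ),
        array( '%s' )
    );
}
```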

So long as you use WPDB’s methods that prepare the data for use in the database (and avoid calling prepare repeatedly on the same string, which is another gotcha), you don’t have to make sure the request data has slashes added to it. And if, by mistake, you call wp_unslash too many times on a string, you will probably irritate users because backslashes will disappear from their submitted content, but SQL injection will not be a problem for you. (While calling neither wp_slash nor wp_unslash too many times is good, wp_unslash will probably be less annoying for users. If you call wp_unslash too many times, users’ backslashes will disappear, but how often do folks use backslashes in their submitted content, anyway? Whereas if you call wp_slash too many times, slashes will appear in front of single and double quotes, which are characters used far more often.)

Your Code and Backslashes

Calling $_POST = stripslashes_deep($_POST); is the best way to handle backslashes added to request data. False

Changing the $_POST globally, for all other plugins, can lead to plugin conflicts. If another plugin does that too, then you will be removing backslashes the user intended to add. It’s best to not interfere with how other plugins are handling the backslashes.

If you’re making a small, simple plugin or theme, here’s what I’d suggest: during the plugins_loaded action, before any of your other code runs, create a copy of $_GET, $_POST, $_REQUEST, and any other request superglobals you will want to use. Make those copies available globally, and use them instead of the PHP superglobals.
(Note: I originally suggested copying the superglobals right away, but I realize that isn’t very friendly to other plugins which may want to modify your plugin’s behaviour, and it’s bad form for a plugin to do any actions before the init action. See this video from WordPress.tv for more.)

E.g., here’s a plugin’s code that copies those global variables into its own global variables
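Something along these lines would do it. This is only a sketch: the plugin name, the my_ prefixed functions and globals, and the user lookup are all made up for illustration, while add_action, $wpdb->prepare and $wpdb->get_row are the real WordPress APIs.

```php
<?php
/*
 * Plugin Name: My Plugin (hypothetical example)
 */

// Copy the request data during plugins_loaded, before WordPress adds slashes.
function my_copy_request_data() {
    global $my_get, $my_post, $my_request;
    $my_get     = $_GET;
    $my_post    = $_POST;
    $my_request = $_REQUEST;
}

// Guarded only so this sketch can also be loaded outside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'plugins_loaded', 'my_copy_request_data' );
}

// Later on: look up a user by the username sent in the request.
function my_get_requested_user() {
    global $wpdb, $my_get;

    if ( ! isset( $my_get['username'] ) ) {
        return null;
    }

    // No wp_unslash() needed: $my_get was copied before slashes were added.
    // No wp_slash() needed either: prepare() escapes the value itself.
    return $wpdb->get_row(
        $wpdb->prepare(
            "SELECT * FROM {$wpdb->users} WHERE user_login = %s",
            $my_get['username']
        )
    );
}
```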

In the above code we know $my_get['username'] doesn’t have slashes added onto it because we created that global variable before WordPress added slashes onto $_GET, so we don’t need to call wp_unslash on it. Also, because WPDB’s prepare takes care of escaping the input, we don’t have to call wp_slash nor do anything else to prepare it.

Event Espresso and Backslashes

In Event Espresso, that’s more-or-less what we’re trying to do. During the plugins_loaded action, we create an object called EE_Request which stores the request data separately BEFORE WordPress has added slashes onto it. We make that object available from a singleton. And in the rest of our code, we can use that EE_Request object, instead of $_REQUEST, and know it reliably does not contain extra slashes. When we go to use request data in database queries, we make sure to prepare the data using WPDB’s prepare method so we can protect from SQL injection.

Although, in full disclosure, we are working to avoid using the singleton because both singletons and global variables are considered bad design in the programming world. We are working to rectify this sub-par code by instead using dependency injection, but it will take time.

Also, at the time of writing, admittedly we still use $_GET and $_POST directly, although we’re working to remove those. (Part of the purpose of writing this post was to think through the problems of that. We also plan to totally avoid using the request superglobals in the future.)

If you have created code that integrates with Event Espresso, we highly recommend you also start using EE_Request instead of $_GET and $_POST directly. E.g., $my_var = isset($_REQUEST['my-key']) ? $_REQUEST['my-key'] : null; can be replaced with $my_var = EE_Registry::instance()->REQ->get('my-key', null). Failure to do so means you may fall into the slashes trap!

Summary

In order to avoid the tangled mess of slashes in request data in WordPress, create your own copy of the request data before WordPress adds slashes to it, and use your copies instead. Also, be sure to always prepare that data using WPDB’s helper methods, like prepare, before using it in the database.

And don’t be fooled by slashes again!