If you’re a web developer who uses WordPress, you’ve probably been surprised to find extra backslashes magically added to the $_GET and $_POST request data, right? It’s a confusing situation, with information scattered sparsely across the internet. While working on Event Espresso, we’ve had our share of troubles with it, and wanted to share what we’ve learned.
If you get slashes wrong in your code, you can end up with serious security problems (specifically, SQL injection), bugs, or user-created content littered with extra slashes, like I\\\\\'m \\\\\'drowning\\\\\" in slashes!
So, do you know how to work with slashes? Here’s a quiz:
- Consider this line of PHP: echo 'it\'s'; The \ is an “escape character”, and \' is an “escape sequence”. True or False?
- The HTML <input value="\ \\ \r\n \' \""> will be displayed as a text box containing \ \\ \r\n \' \ (without the final double-quote). True or False?
- The following 3 lines of PHP all show the exact same thing: echo '\ \\ \r\n \' \"'; , echo "\ \\ \r\n \' \""; , and ?>\ \\ \r\n \' \"<?php . True or False?
- The PHP code echo addslashes('billy \"the nose\"'); will echo billy \\\"the nose\\\" . True or False?
- Adding backslashes to request data is the recommended way to prevent SQL injection. True or False?
- Your WordPress code can rely on $_GET and $_POST request data always having backslashes added to it, regardless of whether PHP’s “Magic Quotes” is turned on. True or False?
- $wpdb->prepare, $wpdb->insert and $wpdb->update all take care of adding slashes, so data provided to them should not have slashes added to it. True or False?
- Calling $_POST = stripslashes_deep($_POST); is the best way to handle backslashes added to request data. True or False?
Each of the following sections will contain an answer to one of the questions.
Slashes Explained
Escape Characters and Escape Sequences
Consider this line of PHP: echo 'it\'s'; The \ is an “escape character”, and \' is an “escape sequence”. True
Wikipedia says
An escape sequence is a sequence of characters that does not represent itself when used inside a character or string literal, but is translated into another character or a sequence of characters that may be difficult or impossible to represent directly.
For example, in the PHP echo 'it\'s'; the \' is an escape sequence. Because the string started with a single-quote, we can’t add a regular single-quote to the string without abruptly ending it. echo 'it's'; would be invalid PHP, because there’s a valid string 'it', but then there’s an unexpected s followed by a stray single-quote. So if you want to put a single-quote in your string, you need to use the escape sequence \' to represent it.
And by the way, in the string \' the backslash is acting as an escape character, meaning the character after it has a special meaning. Backslashes are often used as escape characters in PHP strings and MySQL queries, but not in HTML.
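As a quick check of the idea, here is a minimal PHP sketch (the variable names are only for illustration) showing that the escape sequence \' inside a single-quoted string produces one plain single-quote character:

```php
<?php
// The escape sequence \' lets a single-quoted string contain a single quote.
$escaped = 'it\'s';

// The same text in a double-quoted string needs no escaping at all.
$plain = "it's";

var_dump( $escaped === $plain ); // bool(true): both hold the same 4 characters
echo strlen( $escaped ), "\n";   // 4
```

The backslash never ends up in the string; it only tells PHP how to read the character that follows it.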
HTML and Backslashes
The HTML <input value="\ \\ \r\n \' \""> will be displayed as a text box containing \ \\ \r\n \' \ (without the final double-quote). True
When displaying your page’s HTML, web browsers leave backslashes alone and don’t treat them as escape characters. So if there is \ \\ \r\n \' \" in the page’s source code, that’s also exactly how the browser will display it to the user.
That goes for HTML input attributes too: backslashes are not escape characters. So <input value="\ \\ \r\n \' \""> will show a text box containing \ \\ \r\n \' \
Notice the double-quotation mark disappeared. That’s because \" wasn’t considered an escape sequence, so the " was interpreted as a normal double-quote that ends the attribute value. (And so that’s actually bad HTML, because there is now an extra double-quote in that tag.)
So clearly, adding slashes in front of quotes inside HTML attributes doesn’t escape them. If you want to display a double-quotation mark inside an HTML input’s value, you should use the HTML entity &quot; instead.
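To make the difference concrete, here is a small sketch in plain PHP. The $value string is a made-up example, and htmlspecialchars() is used only because it runs without WordPress; inside WordPress you would more typically reach for esc_attr().

```php
<?php
// Hypothetical user-supplied value containing double quotes.
$value = 'say "hi"';

// Wrong: a backslash does not escape the quote in HTML,
// so the attribute value ends at the first double quote.
$broken = '<input value="' . addslashes( $value ) . '">';

// Better: convert the quotes to HTML entities.
$safe = '<input value="' . htmlspecialchars( $value, ENT_QUOTES ) . '">';

echo $broken, "\n"; // <input value="say \"hi\"">
echo $safe, "\n";   // <input value="say &quot;hi&quot;">
```

Only the second input would show say "hi" in the browser; the first would show say \ and leave stray characters in the tag.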
PHP and Backslashes
The following 3 lines of PHP all show the exact same thing: echo '\ \\ \r\n \' \"'; , echo "\ \\ \r\n \' \""; , and ?>\ \\ \r\n \' \"<?php . False
Backslashes can mean different things depending on whether they appear in a single-quoted string, appear in a double-quoted string, or are fetched from the database.
In a single-quoted string, they’re only treated as an escape character if they appear in front of a single-quote or another backslash.
So echo '\ \\ \r\n \' \"'; will show \ \ \r\n ' \" . Notice the backslashes are all interpreted as literal backslashes EXCEPT
- \\ became \
- \' became '
In a double-quoted string, they are treated as escape characters when placed in front of many other characters (see the table under “Double quoted” in the PHP manual), including in front of a double-quote, but not in front of a single-quote.
So echo "\ \\ \r\n \' \""; will show \ \ followed by a carriage return and a new line, and then \' " . This time, many more of the backslashes were considered escape characters, but not all, and not all the same ones. Specifically,
- \\ became \ (same as with a single-quoted string)
- \r became a carriage return
- \n became a new line
- \" became "
Notice the \' was NOT considered an escape sequence, like it was with a single-quoted string.
Lastly, backslashes are also treated differently if they’re not from a string in your PHP code, e.g. if they’re from a database column or from HTML written outside of PHP tags. In this case, they are never treated as escape characters. So ?>\ \\ \r\n \' \"<?php , and global $wpdb; echo $wpdb->get_var('SELECT option_value FROM wp_options WHERE option_name="my_option"'); , where the option_value is \ \\ \r\n \' \" , will both show \ \\ \r\n \' \" . I.e., none of the backslashes were considered an escape character.
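One way to see the difference between the two kinds of string literal is to count the characters each one really contains. This is only a sketch, with illustrative variable names:

```php
<?php
// The same characters written as a single-quoted and a double-quoted literal.
$single = '\ \\ \r\n \' \"';
$double = "\ \\ \r\n \' \"";

// Single quotes: only \\ and \' are escape sequences.
echo strlen( $single ), "\n"; // 13

// Double quotes: \\, \r, \n and \" are escape sequences; \' is not.
echo strlen( $double ), "\n"; // 11

// bin2hex() makes the invisible carriage return (0d) and new line (0a) visible.
echo bin2hex( $double ), "\n"; // 5c205c200d0a205c272022
```

The two literals look identical in the source code but produce different strings, which is why the answer to the quiz question is False.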
Magic Quotes and Backslash functions
The PHP code echo addslashes('billy \"the nose\"'); will echo billy \\\"the nose\\\" . True
From php.net, addslashes:
Returns a string with backslashes before characters that need to be escaped. These characters are single quote ('), double quote ("), backslash (\) and NUL (the NULL byte).
So the code echo addslashes('billy \"the nose\"'); will echo billy \\\"the nose\\\" , because each backslash in the original string gets another one added, and each double-quote gets a backslash added in front of it. Each time addslashes is called, more slashes get added. So calling it again on that output, as in echo addslashes(addslashes('billy \"the nose\"')); , will echo billy \\\\\\\"the nose\\\\\\\" , etc.
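The pile-up is easy to reproduce. The sketch below uses only PHP’s built-in addslashes() and stripslashes(); the variable names are illustrative.

```php
<?php
// In a single-quoted literal, \" is a literal backslash followed by a quote.
$original = 'billy \"the nose\"';

$once  = addslashes( $original ); // billy \\\"the nose\\\"
$twice = addslashes( $once );     // billy \\\\\\\"the nose\\\\\\\"

echo $once, "\n";
echo $twice, "\n";

// stripslashes() undoes exactly one round of addslashes().
var_dump( stripslashes( $once ) === $original ); // bool(true)
var_dump( stripslashes( $twice ) === $once );    // bool(true)
```

Each round of slashing needs exactly one matching round of unslashing, which is why it matters so much to know how many times a string has already been slashed.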
(Side note: because we’re talking about slashes and WordPress, you may like to know about WordPress’ wp_slash and wp_unslash functions. They’re mostly the same as addslashes and stripslashes, except they work with either a string or an array, and it’s possible they may diverge more in the future.)
SQL Injection and Extra Backslashes
Adding backslashes to request data is the recommended way to prevent SQL injection. False
Adding slashes to a string before using it in MySQL code can save you from SQL injection, so some PHP developers a long time ago came up with the idea of always adding slashes onto request data received from the user, e.g. $_GET and $_POST data. This feature was known as “Magic Quotes”.
PHP’s Magic Quotes was an attempt to make GET and POST data safe, by default, for use in database queries. Consider the following code:
global $wpdb;
$sites_with_requested_title = $wpdb->query( 'SELECT * FROM ' . $wpdb->posts . ' WHERE post_title="' . $_GET['title'] . '";' );
If $_GET['title'] were ";DROP TABLE wp_posts;-- this would be SQL injection, because the generated SQL would be
SELECT * FROM wp_posts WHERE post_title="";DROP TABLE wp_posts;--";
Which could delete the entire posts table! So that would be a big problem if $_GET didn’t get slashes added onto it. But if we called wp_slash on $_GET['title'] the SQL generated would have instead been:
SELECT * FROM wp_posts WHERE post_title="\";DROP TABLE wp_posts;--";
Which would instead look for posts with the title ";DROP TABLE wp_posts;-- which is harmless.
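The example above can be reproduced without a database by looking only at the SQL string that gets built. In this sketch demo_build_query() is a hypothetical helper, and addslashes() stands in for wp_slash(), which does the same thing to a plain string:

```php
<?php
// Hypothetical malicious request value, as in the example above.
$title = '";DROP TABLE wp_posts;--';

// Builds the query by plain string concatenation (the unsafe pattern).
function demo_build_query( $title ) {
    return 'SELECT * FROM wp_posts WHERE post_title="' . $title . '";';
}

echo demo_build_query( $title ), "\n";
// SELECT * FROM wp_posts WHERE post_title="";DROP TABLE wp_posts;--";

echo demo_build_query( addslashes( $title ) ), "\n";
// SELECT * FROM wp_posts WHERE post_title="\";DROP TABLE wp_posts;--";
```

In the second query the double-quote is escaped, so the whole value stays inside the string being compared with post_title.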
So, the original purpose of magic quotes was good. But it turns out that adding slashes is not the only thing you should do to input before using it in your database. What you need to do depends on what type of database you’re using, because MySQL escape sequences are different from PostgreSQL escape sequences, etc. So in reality, you usually need to remove the extra slashes and then use a database-specific escaping function. Needing to do this was obviously a pain, and so PHP 5.4 and higher no longer support magic quotes.
WordPress, Superglobals, and Backslashes
If you incorrectly think a string has backslashes added onto it, and then directly use it in an SQL query, you can have SQL injection problems. But conversely, if a string does have slashes added onto it, and you add backslashes to it again, users will see extra slashes added all over. And because not all WordPress users had Magic Quotes enabled, it was difficult to know if request data had slashes already added or not. So WordPress core developers tried to at least make things consistent by always adding backslashes onto request data regardless of whether Magic Quotes were enabled or not, even for versions of PHP that don’t even support Magic Quotes. (Documentation on this is pretty sparse, but read the notes on stripslashes_deep if you want more info.)
Your WordPress code can rely on $_GET and $_POST request data always having backslashes added to it, regardless of whether PHP’s “Magic Quotes” is turned on. False
So because WordPress always adds backslashes to the request data, you can rely on there being backslashes, right? Wrong. Despite it being a bad practice, many plugins and themes have $_POST = stripslashes_deep($_POST); which removes the backslashes again, for all other code too, but only if that particular plugin is active. So watch out!
What’s more, even if there are no plugins or themes interfering with slashes, the request data might not have slashes added onto it yet. WordPress only adds the slashes after it has done the plugins_loaded action (inside wp-settings.php it calls wp_magic_quotes). So if you have code that’s using request data before or during plugins_loaded, the request data it sees will NOT have slashes added onto it by WordPress; but any code running after plugins_loaded, like code running during sanitize_comment_cookies or afterwards, will see request data with slashes added onto it.
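The timing issue can be simulated in plain PHP. Nothing below is WordPress code: demo_slash_deep() is a hypothetical stand-in for the slashing WordPress applies to the superglobals, and $request is a pretend copy of the request data.

```php
<?php
// Hypothetical stand-in for the slashing WordPress applies to request arrays:
// recursively add slashes to every string value.
function demo_slash_deep( $value ) {
    if ( is_array( $value ) ) {
        return array_map( 'demo_slash_deep', $value );
    }
    return is_string( $value ) ? addslashes( $value ) : $value;
}

// Pretend request data, exactly as the user submitted it.
$request = array( 'title' => 'What I think about "Star Wars"' );

// Code running before or during plugins_loaded sees the raw value.
$seen_early = $request['title'];

// Afterwards the request data gets slashed (simulated here).
$request   = demo_slash_deep( $request );
$seen_late = $request['title'];

echo $seen_early, "\n"; // What I think about "Star Wars"
echo $seen_late, "\n";  // What I think about \"Star Wars\"
```

The same key gives two different values depending on when it is read, which is exactly what makes this so easy to get wrong.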
WPDB and Backslashes
$wpdb->prepare, $wpdb->insert and $wpdb->update all take care of adding slashes, so data provided to them should not have slashes added to it. True
When interacting with the database in WordPress, you’re best off using the global $wpdb object. It has many methods for interacting with the database, but it’s important to know which ones expect your input to be escaped and which ones do not.
When passing data into its get_results, get_var, get_col, and query methods, it expects that the string you’re passing in was already prepared for use in the database, otherwise it could introduce SQL injection problems. Providing slashed request data helps prevent SQL injection, but it’s not the best way to prepare request data for use in a database query because there are other escape sequences it might miss.
WPDB’s prepare is the preferred way to prepare request data for use by one of those previously-mentioned methods. Its documentation covers the details, but an important aspect it does not mention is that the arguments being passed into it, if from request data, should not have backslashes added onto them. Adding the necessary escaping is part of what prepare does.
E.g., if $_GET has had slashes added onto it, this is how you should use it with $wpdb->prepare:
function fetch_posts_with_title() {
    global $wpdb;
    // Remove the slashes WordPress added; prepare() does its own escaping.
    $posts_with_title = $wpdb->get_results(
        $wpdb->prepare(
            'SELECT * FROM wp_posts WHERE post_title=%s',
            wp_unslash( $_GET['title'] )
        )
    );
    return $posts_with_title;
}
Because if $_GET['title'] had the value of What I think about \"Star Wars\" (notice the extra slashes, which would have been added by WordPress, and were not present in the user’s original request data), we want to send What I think about "Star Wars" to the database, without those pesky extra slashes. That’s why I added wp_unslash to that snippet.
There are other methods on WPDB that also take care of preparing the data you send them. Specifically, insert, update, and delete. You don’t need to call prepare on input for them because they assume the data you’re providing them hasn’t been prepared, nor had slashes added onto it.
So long as you use WPDB’s methods that prepare the data for use in the database (and avoid calling prepare repeatedly on the same string, which is another gotcha), you don’t have to make sure the request data has slashes added to it. And if, by mistake, you call wp_unslash too many times on a string, you will probably irritate users because the backslashes will disappear from their submitted content, but SQL injection will not be a problem for you. (While calling neither wp_slash nor wp_unslash too many times is good, wp_unslash will probably be less annoying for users. If you call wp_unslash too many times, users’ backslashes will disappear, but how often do folks use backslashes in their submitted content, anyway? Whereas if you call wp_slash too many times, slashes will appear in front of single and double quotes, which are characters used far more often.)
Your Code and Backslashes
Calling $_POST = stripslashes_deep($_POST); is the best way to handle backslashes added to request data. False
Changing $_POST globally, for all other plugins, can lead to plugin conflicts. If another plugin does that too, then you will be removing backslashes the user intended to add. It’s best to not interfere with how other plugins are handling the backslashes.
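Here is a sketch of that conflict using only PHP’s built-in functions. The two "plugins" are imaginary, and addslashes() and stripslashes() stand in for what WordPress and stripslashes_deep() do to each value:

```php
<?php
// What the user actually typed: a Windows-style path with one backslash.
$typed = 'C:\new';

// WordPress adds slashes to request data (simulated with addslashes()).
$in_post = addslashes( $typed );                   // C:\\new

// Plugin A removes the slashes globally...
$after_plugin_a = stripslashes( $in_post );        // C:\new

// ...and plugin B, unaware of plugin A, does the same again.
$after_plugin_b = stripslashes( $after_plugin_a ); // C:new

echo $after_plugin_a, "\n";
echo $after_plugin_b, "\n";
```

After the second round the user’s own backslash is gone, even though each plugin on its own did something that looked reasonable.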
If you’re making a small, simple plugin or theme, here’s what I’d suggest: during the plugins_loaded action, before any of your other code runs, create a copy of $_GET, $_POST, $_REQUEST, and any other request superglobals you will want to use. Make those copies available globally, and use them instead of the PHP superglobals.
(Note: I originally suggested copying the superglobals right away, but I realize that isn’t very friendly to other plugins which may want to modify your plugin’s behaviour, and it’s bad form for a plugin to do anything before the init action. See this video from WordPress.tv for more.)
E.g., here’s a plugin’s code that copies those global variables into its own global variables:
<?php
/*
Plugin Name: My Plugin
Description: My plugin is awesome, and always knows the request data isn't slashed
Version: 1.0.0
Author: Me
*/

function my_plugin_copy_request_globals() {
    global $my_get, $my_post, $my_request;
    // Double-check PHP didn't already add magic quotes, in which case remove them. props @jdgrimes
    // get_magic_quotes_gpc() was removed in PHP 8.0, so only call it if it exists.
    if ( function_exists( 'get_magic_quotes_gpc' ) && get_magic_quotes_gpc() ) {
        $my_get     = wp_unslash( $_GET );
        $my_post    = wp_unslash( $_POST );
        $my_request = wp_unslash( $_REQUEST );
    } else {
        $my_get     = $_GET;
        $my_post    = $_POST;
        $my_request = $_REQUEST;
    }
}
add_action( 'plugins_loaded', 'my_plugin_copy_request_globals' );

function my_plugin_fetch_posts_with_title() {
    global $my_get, $wpdb;
    $posts_with_title = $wpdb->get_results(
        $wpdb->prepare( 'SELECT * FROM wp_posts WHERE post_title=%s', $my_get['title'] )
    );
    // ...
}
add_action( 'init', 'my_plugin_fetch_posts_with_title' );
In the above code we know $my_get['title'] doesn’t have slashes added onto it because we created that global variable before WordPress added slashes onto $_GET, so we don’t need to call wp_unslash on it. Also, because WPDB’s prepare takes care of escaping the input, we don’t have to call wp_slash nor do anything else to prepare it.
Event Espresso and Backslashes
In Event Espresso, that’s more-or-less what we’re trying to do. During the plugins_loaded action, we create an object called EE_Request which stores the request data separately BEFORE WordPress has added slashes onto it. We make that object available from a singleton. And in the rest of our code, we can use that EE_Request object instead of $_REQUEST, and know it reliably does not contain extra slashes. When we go to use request data in database queries, we make sure to prepare the data using WPDB’s prepare method so we are protected from SQL injection.
Although, in full disclosure, we are working to avoid using the singleton because both singletons and global variables are considered bad design in the programming world. We are working to rectify this sub-par code by instead using dependency injection, but it will take time.
Also, at the time of writing, admittedly we still use $_GET and $_POST directly, although we’re working to remove those. (Part of the purpose of writing this post was to think through the problems of that. We also plan to totally avoid using the request superglobals in the future.)
If you have created code that integrates with Event Espresso, we highly recommend you also start using EE_Request instead of $_GET and $_POST directly. E.g., $my_var = isset($_REQUEST['my-key']) ? $_REQUEST['my-key'] : null; can be replaced with $my_var = EE_Registry::instance()->REQ->get('my-key', null); . Failure to do so means you may fall into the slashes trap!
Summary
In order to avoid the tangled mess of slashes in request data in WordPress, create your own copy of the request data before WordPress adds slashes to it, and use your copies instead. Also, be sure to always prepare that data using WPDB’s helper methods, like prepare, before using it in the database.
And don’t be fooled by slashes again!