The following is a fairly typical script that might be written by a well-meaning but naïve new coder. I used it as a brief technical test recently while interviewing candidates for a senior developer role, giving the scenario that a friend of theirs had just taken up PHP and wanted advice on the functionality, coding style and safety of the script.
What advice, I asked, would they provide this hypothetical, and thick-skinned, novice?
Some of them did quite well; two did really well and got themselves hired (after further interviews and tests).
Others just scared me with how long they'd be coding in oblivious bliss.
Read through the script and see what you can spot. Hovering over the text portion of each line should (works in FF anyway) bring up my annotations. They are not intended to be complete, or I'd be here all day.
Printing this page will show all annotations inline.
The source of this page is probably not very clean as I hacked it up from the .phps filter. The original file is here.
© Copyright Richard George (richard@phase.org) 2006. If you want to use it, contact me.
<?
Enabling and using SHORT_OPEN_TAGS will cause collisions with XML preambles and possibly other scripting engines. All modern scripts should use the full <?php syntax
#This might need some fixing...
#-style comments are the mark of scripters, not coders, and comments like this one invite liability issues. Not a problem for My First Script, but a habit to avoid anyway.
echo '<html><head><title>Guestbook 0.00000001(a)</title></head>';
To me, echo "" points to a coder who's come from a scripting rather than programming background. Personally, I prefer print(""), as I started as a software engineer in C.
echo '<body>';
echo '<h1>Guestbook</h1>';
if(!$mailto) {
Where does this user expect $mailto to be set? Is there a missing configuration block? As we see further down, REGISTER_GLOBALS must be set for this script to work, so $mailto can be overridden in the URL, making this a very handy spam-anyone tool.
Also, this wiill trigger an "undefined variable warning" if your error_reporting is suitably high for a development environment.
$mailto='webmaster@localhost';
For the purpose of this exercise, we'll assume this address is correct.
}
if($action=='store') {
We could very easily use a switch() statement for the values of $action. But where's $action coming from? Should it not be $_REQUEST['action']? (it could be POST or GET in this case, see note on form tag)
If this works for the user as is, REGISTER_GLOBALS must be on and a whole world of hurts awaits.
To add to the fun, since we're not checking for POST values, this is a gift for CSRF attacks.
mysql_connect('localhost','realuser','asdhjskdfnjsn');
Keeping the database connection variables inside the script is risky. If the PHP filter gets turned off (we've all done it) they'd be exposed to the viewer. At least it's not a root connection.
These variables can be set in a require()'d file, either as $x=y or as define's. This file should ideally live outside the web root, although the user may not always have that option.
mysql_query("insert into guestbook (name,email,subject,body) values (\"$name\",\"$email\",\"$subject\",\"$body\")");
Oops. Forgot to do a mysql_select_db() here. But that's the only bug in the script that'll actually stop it from working.
More critically, if this query were executed, it would be a major security hole, allowing completely arbitrary SQL injection. Of course, if MAGIC_QUOTES_GPC were enabled it might help, but that functionality is hideously broken and should be disabled (I think it's gone from PHP6 anyway).
Always escape values for SQL queries explicitly.
echo 'Thanks for your comment!';
Very friendly - it's always worth letting the user know that their operations complete sucessfully. But it might be worth checking the return value from mysql_query() before assuming success.
mail($mailto,'New guestbook entry: '.$subject,$body,"From: robomail@mydomain.com");
As mentioned above, the lack of tight checking on $mailto makes this a spam engine.
But far more subtly, this probably won't set the "From:" header correctly without setting "-frobomail@mydomain.com" as the 5th parameter to mail(). And the webserver may need to be configured as a trusted user in the mail server.
}
if($action=='add') {
echo "<form action=\"$_SERVER[PHP_SELF]\">";
Ideally, you'd set method="post" here. But the real nasty is an XSS attack: $_SERVER is rarely realised to be user-submitted data. A malicious person could construct a URL to this page that embedded arbitrary javascript here that would execute within the security context of this page, causing a redirect and/or stealing cookies.
echo "<input type=\"hidden\" value=\"store\" name=\"action\">";
As all this data is static, you can save a lot of echo's and \" by:
Closing the ?> tag and writing straight HTML,
Using a multi-line echo statement with single quotes
or using here-doc (>>>EOB) syntax
But don't tell a beginner to write a templating engine or install smarty for a dozen lines of output from a 40-odd line script. That's overkill, and you're not trying to scare them off.
echo "Name: <input name=\"name\" ><br />";
echo "Email: <input name=\"email\" ><br />";
echo "Subject: <input name=\"subject\" ><br />";
echo "<textarea name=\"body\" ></textarea>";
echo "<input type=\"submit\" >";
echo "</form>";
}
if(!$action) {
Testing for the absence of a value by checking if it's false is a legacy trick that causes a huge mess if you turn E_STRICT/E_NOTICE error checking on. Use isset() to check instead.
As noted above, however, this should really be the "default:" clause of a switch statement.
mysql_connect('localhost','realuser','asdhjskdfnjsn');
This code is replicated above; it should be put in a function for re-use. A database abstration layer however is not required for a script of this simplicity.
However, if you really want to do direct DB access, you should prefer the mysqli_ range of functions.
$res=mysql_query("select * from guestbook");
You might prefer to use fieldnames than "*" in the select query to reduce the SQL overhead. In this hypothetical case "select *" is valid as we really do want all fields. How do we know this? Ask the coder!
While making working assumptions is a useful skill for a developer, communication is even more so.
while($comment=mysql_fetch_array($res)) {
Using mysql_fetch_assoc() would avoid the use of "magic" numbers below and so make the code more readable.
Using numeric indexes is particularly risky with the "select *" query as they can cause the code to break if the table schema changes.
echo "<h2>$comment[3]</h2>";
No validation whatsoever was carried out on user data before we stored it, so the code we're printing out here can contain any javascript, flash, images or CSS the submitter desires. This allows XSS, redirects, cookie theft, page hijacking - you name it.
Oh and that email address is almost certainly going to be broken or invalid.
echo "<p>Posted by $comment[1] (<a href=\"mailto:$comment[2]\">$comment[2]</a>)</p>";
echo "<p>$comment[4]</p>";
$i++;
If you increment an undefined value it will act as though it was set to zero before you started, but will issue complaints with tighter error checking turned on.
}
echo("$i comments found. <a href=\"$_SERVER[PHP_SELF]?action=add\">add one</a>");
If there were no comments, $i will never have been assigned and therefore appear blank (rather than the desired '0') here.
}
echo'</body>';
echo'</html>';
?>
The comment for this line used to be "This line is probably safe". But closing the ?> tags at the end of a PHP script is now a deprecated behaviour.