seanmonstar

Aug 5 2010

PHP Error Suppression Performance

Sometimes, a function might cause a warning, like when you’re messing around with files. It might be tempting to just prepend an @ to the function, and live on. But being curious, I researched if this does much harm besides being sloppy/lazy. It turns out there are performance costs for doing so as well.

I first built a simple test that would loop a million times accessing a variable with and without the suppression operator prepended. The differences were small, yet noticeable. Using the suppression operator ended up taking 40% longer to execute. Interesting, but then an article by Vegard Andreas Larsen pointed out something I failed to test:

[The] assertion that it is the act of the @ operator that is very slow, is wrong. It is in fact the actual triggering of the error or warning by itself.

His tests show that while the suppression operator does add a little overhead, when an actual error[1] occurs, you see a bigger cost. When using the suppression operator, you’re writing in a style that lets you cause errors and not care, which decreases performance. The same thing applies to setting error_reporting to ignore notices or warnings. Just because they’re ignored doesn’t mean PHP doesn’t try to throw them first.

A common example is when checking for properties of an object. When ignoring notices, you might do something like this:

if($obj->prop) { 
    do_stuff($obj->prop); 
}

If the property is undefined, a notice will be thrown, and then ignored. Performance penalty. Turns out that isset is quite important, after all. Another instance could be trying to call file without first calling is_file. I used to think, “So what, it’s a dynamic language, it’ll be just fine.” Now, I’ll be littering my code with isset everywhere.
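As a rough sketch of the guarded style, assuming a hypothetical do_stuff helper and a made-up file path, the checks might look like this:

```php
<?php
// Hypothetical stand-in for whatever work the property feeds into.
function do_stuff($value) {
    return strtoupper($value);
}

$obj = new stdClass;
$results = array();

// Guard the property read so no notice is ever raised.
if (isset($obj->prop)) {
    $results[] = do_stuff($obj->prop);
}

// Same idea for files: check first instead of suppressing the warning.
$path = __DIR__ . '/does-not-exist.txt'; // hypothetical path
$lines = is_file($path) ? file($path) : array();
```

Keep in mind that isset also returns false for a property that exists but holds null, which is usually what you want in a check like this.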

Just to be sure, I altered my original test to check the differences between using a conditional test with isset versus just suppressing the notice. The difference was that suppressing the notice (a notice!) took 100% longer than just checking if it existed first. Try it yourself.

<?php

function no_suppress() {
    $a = 0;
    $b = new stdClass;
    $a = (isset($b->asdf) ? $b->asdf : null);
}

function suppress() {
    $a = 0;
    $b = new stdClass;
    $a = @$b->asdf ? $b->asdf : null;
}

function do_test($suppress = false, $loops = 1000000) {
    if($suppress) {
        echo "starting suppress...\n";
        $start = microtime(true);
        for($i = 0; $i < $loops; $i++) {
            suppress();
        }
        $end = microtime(true);
    } else {
        echo "starting no_suppress...\n";
        $start = microtime(true);
        for($i = 0; $i < $loops; $i++) {
            no_suppress();
        }
        $end = microtime(true);
    }
    echo "ended: " . ($end - $start) . "\n";
}

// Run both variants to compare the timings.
do_test(false);
do_test(true);

Now, this might not be the biggest thing in the world. But it’s enough for me to change my ways, since it affects me once to write it, and my users an infinite amount of times having to execute. Since I believe in optimizing for users instead of developers, that’s how it’s going to be.


  1. I use the term error to mean any message thrown. It includes errors, warnings, notices, etc. 


Jul 13 2010

Helpful Errors Messages Are Important

If you write software, you write bugs. And after those bugs, there are still errors that will happen in code you didn’t get to touch. Errors happen. That’s not new. And users know that things can go wrong sometimes. What pains me is that we know exactly what went wrong, and we don’t translate that for the user.

In Windows

It’s surprising how often you’ll find an error message that’s only helpful to the creator of the original system. So often, as a user, you can reliably cause a certain error to happen, and the only message you get back from the program is something like Error 0x2211108: Don't do that. If only we as software engineers would take the time to make sure that every error shown to a user lets them know what happened (if it’s useful, which it actually very often is), and any steps they can take to fix the error.

Some of the most terrible error message design is the infamous blue screen of death. It completely disrupts the computer (which could be acceptable, some fatal errors can’t be recovered from). But worse, it tells the user virtually nothing useful. Just a bunch of computer gibberish.

Disregarding how ugly the error looks, the Windows team could have at least intercepted the error and displayed a useful message. Here’s an example:

Original BSOD

[image: original BSOD]

Improved BSOD

[image: improved BSOD]

In Blazonco

A while ago, I went through and documented many of the errors that might bubble up in Blazonco. We’ve created a help page for each error, which depends on its unique error code number. I also tried to make sure that all errors are in a language the user understands. Error codes mean nothing to most people, so we don’t display them. They’re only in the link that the user can click to learn more[1]. They don’t need to see that I tried to reference a property of a null object, or that the file system threw an error about permissions. Instead, I tried to catch all those, and throw a much prettier error.

[images: cryptic SQL error; user friendly error]

We basically have a try/catch at the topmost function of our application, which will try to show the error message in a graceful way (inside that little error popover). But it’s not in that top function where we convert gibberish into English. It’s done wherever a proper error could happen.

$post = new BlogPost;
//...
try {
    $post->save();
} catch (PDOException $ex) {
    if($ex->getCode() == UNIQUE_CONFLICT_ERROR) {
        throw new BlogPostException('You already have a blog post with that URL. You will need to use a different one so people can view your post.', 40006023);
    }
    //anything we can't translate here keeps bubbling up
    throw $ex;
}

We still throw an exception, because an error did exist, but we can show much more meaning knowing that it’s a BlogPostException, and the message is much more meaningful to the user than what the SQLSTATE would have said.
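The two layers described above might be sketched roughly like this. It is only an illustration: UserFacingException, run_action, the sample functions, and the /help/errors/ link pattern are hypothetical names, not Blazonco's actual code.

```php
<?php
// Hypothetical base class for errors that are safe to show to users.
class UserFacingException extends Exception {}

// Hypothetical top-level wrapper: runs an action and turns any failure
// into text for the error popover.
function run_action($action) {
    try {
        return call_user_func($action);
    } catch (UserFacingException $ex) {
        // Already translated where it happened; the code only goes in the link.
        return $ex->getMessage()
            . ' <a href="/help/errors/' . $ex->getCode() . '">Learn more</a>';
    } catch (Exception $ex) {
        // Untranslated internals get logged, never displayed.
        error_log(get_class($ex) . ': ' . $ex->getMessage());
        return 'Something went wrong on our end. Please try again.';
    }
}

function save_duplicate_post() {
    throw new UserFacingException('You already have a blog post with that URL.', 40006023);
}

function hit_internal_bug() {
    throw new RuntimeException('SQLSTATE[42S02]: table not found');
}

echo run_action('save_duplicate_post'), "\n";
echo run_action('hit_internal_bug'), "\n";
```

The friendly exception is caught first, so the generic branch only ever sees errors nobody translated.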


  1. We also log all errors that occur. That way, each morning, I can open up our Superadmin and see if a user has run into an unexpected internal error, or if a lot of users are receiving an “expected” error too often, suggesting a flaw in the UI. 


Dec 21 2009

Hacking To Meet Deadlines

As a deadline approaches far faster than you can type, you’re required to write some quick-and-dirty code to fulfill those feature requests.

In case you don’t know what I’m talking about, this is when there happens to be a flaw in your program’s structure. It’s an architectural problem: you did not properly build the system to elegantly behave in an expected manner. Sometimes, it’s a problem from bad planning at the start. In other cases, it comes from scope creep, where features get slipped into a system that previously was not going to have such features. Gamasutra had a nice write-up about cases like this happening in the games industry. To put it as they did:

Programmers are often methodical and precise beasts who do their utmost to keep their code clean and pretty. But when the chips are down, the perfectly-planned schedule is shot, and the game needs to ship, “getting it done” can win out over elegance.

In a case like this, a frazzled and overworked programmer is far more likely to ignore best practices, and hack in a less desirable solution to get the [code] out the door.

My favorite example in that article is about a game where in a certain level, some game object needed to be hidden. Instead of doing things the proper way, code was written along the lines of:

if( level == 10 && object == 56 ){    
	HideObject();
}

I’m guilty of that

We know that happens far too often in our industry. Just the other night, I was guilty of doing exactly that.

As a developer for a commercial CMS, we provide basic e-commerce functionality. Recently, a client specifically needed coupon codes added to our getup. With a rather short deadline, and being busy as the year gets close to ending, I didn’t have much time to flesh out the design of this new functionality. Besides everything else we needed to do, this feature ended up with a 2 day implementation, with very little planning. Things ended up looking like this:

foreach($items as $item) {
    if($item instanceof CouponItem) {
        //get discount
    }
    //do stuff with CartItems
}

I just tied in where we were calculating the totals of all CartItems, and if one of the items was a CouponItem, do discount stuff instead. Another situation was in the view, when showing the cart. I had another loop through the cart items, in order to show them all. With CouponItems now a part of the Cart, I had to handle the different properties of a CouponItem there.

<? foreach($items as $item) { ?>
<td>
	<? if($item instanceof CouponItem) { ?>
	<!-- Coupon stuff -->
	<? } else { ?>
	<!-- normal stuff -->
	<? } ?>
</td>
<? } ?>

One can’t help but feel dirty doing this. Just hodge-podging rules into place, that magically make everything all better.

Is it really that bad?

Some may be wondering, what’s the big deal? It works, doesn’t it? Why, yes! It works! And depending on your situation, that may be all that’s really necessary. After all, the point of software is to ship it. And it’s got to work. In the end, how it works on the inside has little value to the end user. That’s the same in my case. In the end, what really matters is that users can add products to a shopping cart and then checkout.

However, at the same time, since our product is something that I have to work on every day, writing software like this makes things more confusing, and hard to maintain. Come a few months from now, when we find a bug in the Cart Items somewhere, or when I need to add a new feature in there, that hacked in behavior is unexpected behavior. Unexpected behavior means it’s easy to break something when I modify code elsewhere. It also means that it will take more time to learn what the heck I was doing when I read it again in the future. And I at least wrote that code. The other developers have it worse.

Now that the deadline has passed, and it just works, I can spend the next couple days planning out a better, more elegant way of handling these coupon codes. My users won’t see the difference. But I will certainly notice it in a month or more.


Sep 17 2009

Week Period of a Given Date

As I work on an internal time tracker for our company, I needed to show all the TimeEntries for a specified week. To specify which week, it made sense to simply select 1 day in the week, since that’s the easiest default control in Flex. This lets me get Sunday and Saturday, the start and end of the week, so I can build a query that grabs entries between those 2 dates.

In PHP

$dayOfWeek = date('w', $date);
$secondsInDay = 60 * 60 * 24;
$sunday = date('Y-m-d', $date - ($dayOfWeek * $secondsInDay));
$saturday = date('Y-m-d', $date + ((6 - $dayOfWeek) * $secondsInDay));
$entries = TimeEntry::find(array('timestamp <=' => $saturday,'timestamp >=' => $sunday));

In English

First off, PHP’s date function can be given the 'w' format character to return the numerical day of the week. So, if $date were today (a Thursday), we’d be storing 4 (Sunday is 0). Now, with just a little bit of math, we can subtract the number of days to get Sunday, and add the difference to get Saturday.

That last bit is a data model searching for all entries between those dates.
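Here is the same calculation as a self-contained sketch, wrapped in a hypothetical week_bounds helper. It swaps the fixed 86400-second arithmetic for strtotime's relative day offsets, on the assumption that you want results that stay on the right calendar day across daylight saving changes.

```php
<?php
// Hypothetical helper: returns array(sunday, saturday) as Y-m-d strings
// for the week containing the timestamp $date.
function week_bounds($date) {
    $dayOfWeek = (int) date('w', $date); // 0 = Sunday ... 6 = Saturday
    $sunday = strtotime('-' . $dayOfWeek . ' days', $date);
    $saturday = strtotime('+' . (6 - $dayOfWeek) . ' days', $date);
    return array(date('Y-m-d', $sunday), date('Y-m-d', $saturday));
}

// Sep 17 2009 was a Thursday.
list($sunday, $saturday) = week_bounds(mktime(12, 0, 0, 9, 17, 2009));
echo $sunday . ' to ' . $saturday . "\n"; // 2009-09-13 to 2009-09-19
```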


Sep 10 2009

You Don’t Always Need Identity Operators

In two languages that I use often (PHP and Javascript), there are 2 different equality operators when comparing values. It’s become quite common to see places expressly tell you that you should only ever use one of them. That the other is evil. People see this, and then point fingers whenever you use 2 equal signs instead of 3. Here are perfectly valid reasons to use the equality operator (==) instead of identity (===).

In Javascript, depending on the browser, getting the value from inputs, especially when I expect to get a number value, can often times be a number in string format. And in those situations, I don’t care. If I’m looking for someone to be 21, I don’t care if they claim to be 21, or ‘21’. They both work for me.

My options are to use a nice, simple equality operator, or do something more verbose, harder to read, and suckier to write, to make sure I use the identity operator.

I’ll do it this way:

if(elem.value == 21) {
	//enter bar.
}

So I don’t have to write this silliness:

if(elem.value === 21 || elem.value === '21') { 
}
//or
if(parseInt(elem.value, 10) === 21) {
}

Alternatively, I could write some method in PHP that manipulates a string of input. Before I do anything to it, I’m going to verify it’s something I can manipulate. If it’s null, or false, or an empty string, or zero, I don’t much care. All those things should be rejected. There’s no need to use some sort of identity operator when I can safely say that anything falsy is no good to me.

function echoProperName($str = null) {
	if($str != null) {
		echo ucwords($str);
	}
}

I recognize I don’t even need to use comparison operators at all in this example, yet this very usage I have seen flamed for not using the identity operator. Ridiculous. It doesn’t fit here.
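To make the behaviour of the loose check concrete, here is a small sketch. The properName function is a hypothetical variant of the one above that returns its result instead of echoing it, so it is easier to inspect.

```php
<?php
// Hypothetical variant of echoProperName that returns instead of echoing.
function properName($str = null) {
    // Loose comparison: null, false, '' and 0 all count as "nothing to do".
    if ($str != null) {
        return ucwords($str);
    }
    return '';
}

foreach (array(null, false, '', 0, 'sean mcarthur') as $input) {
    var_dump(properName($input));
}
```

One thing to keep in mind: the string '0' is not loosely equal to null, so it passes the check even though it is falsy in a plain if($str) test.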

Programming languages give us loads of expressive tools, but by some sort of “convention”, we get told not to use many of them. You can smash your thumb with any tool. No need to tell me not to use ‘==’ or ‘++’ or some such because some people have hurt themselves with it.


Jul 9 2009

Requiring Login to CakePHP Admin

For some of my freelance clients, I have provided a home-brewed CMS, built using CakePHP, since I’d already used Cake for the rest of their web-site. I wanted to create a couple users, and an interface to be able to create pages that made up the navigation and content.

Building an admin area for this functionality, the first thing I did was turn on admin routing.

Since we’re logging in Users, we build a User model and UsersController. In the User model, write whatever extra you need, but the basics are just a validation function, to match username with password when logging in.

function validateLogin($data) {    
	$user = $this->find(array(        
			'username' => $data['username'],        
			'password' => md5($data['password'])        
		), array('id', 'username'));        
	return !empty($user) ? $user['User'] : false;      
}

You can add any additional validation rules in here, and you might want to use a better hashing scheme than simply straight md5. Anyhow, this function will be called from the UsersController. So let’s dive into there.

UsersController

Throw together a simple login function. I’ll let your imagination put together the view, that’s just some simple html form stuff. Check for post data, validate with the above function, write the User object to the session and redirect to the admin area.

And conversely, logging out is simply destroying the User object from the session and redirecting out of the admin area. Simple stuff.

To help enforce logging in though, we pay attention to the filter event of controllers, and check the session first. The process of a controller responding to the router is beforeFilter -> action -> beforeRender -> render -> afterFilter. You can make sure things happen before it calls any action, by writing a beforeFilter method. You could use this method to check for things, and if it’s not up to snuff, redirect to a different page. That’s what we’re going to do.

So, in the UsersController, we want to make sure that a user can’t do anything pertaining to users until they login.

function beforeFilter() {
	if($this->action != 'login' && $this->action != 'logout') {
		if($this->Session->check('User') == false) {
			//set the flash first, since redirect() ends the request by default
			$this->Session->setFlash('The URL you\'ve followed requires you login.');
			$this->redirect('login');
		}
	}
}

Admin Requires Login

The last step is ensuring that any time a user wants to access the admin part of our app, they must be a logged in user, or they get the boot. This involves using the same event as above, but in the AppController. The AppController lets us write something that should happen in every controller, because by default we should be extending AppController.

However, we don’t want it to always be forcing logins. Visitors who access public areas shouldn’t have to login. But if they want to access the admin areas, that’s when we want to force them.

With our admin routing turned on, any time the Router wants to invoke a method because of the Routing.admin configuration, it will add to the params of the controller 'admin' = 1. Thus, we can easily make our filter only care if admin is set.

class AppController extends Controller {
	function beforeFilter() {
		//!empty() avoids a notice when 'admin' isn't set on public pages
		if(!empty($this->params['admin'])) {
			if($this->Session->check('User') == false) {
				$this->flash("The URL you've followed requires you login.",'/login',2);
			}
			$this->layout = 'admin';
		}
	}
}

Admin Galore

Now, for any page we want to set up as requiring admin access, like in my CMS, allowing admins to edit pages, we create functions prefixed with the value in your config for Routing.admin. By default, that’s admin.

This simple addition now requires you be logged in (with a user made either manually by myself or using a registration form you can build into the UsersController) in order to see the list of pages.

class PagesController extends AppController {    
	function admin_index() {        
		//setting the layout could be done in        
		//the AppController beforeFilter, so all        
		//admin pages use the admin layout instead.        
		$this->layout = 'admin';                
		$this->set('pages',$this->Page->getNav());
	}   
}

The default view for this follows the same kind of automagic rules for CakePHP: it’ll be app/views/pages/admin_index.ctp.


Jul 1 2009

A Basic Lesson in Password Hashing

In the world of the web, lots of sites are popping up requiring users to login. When you need to do so, there’s a bit more security than you might realize. You might be making a simple To-Do list, and might think:

Security? Pfft, I’m not too worried about people’s to-do lists being stolen.

But what you didn’t account for, is that all those username/password combinations a hacker just made off with? Yea, those are the same logins to important stuff, like e-mail, or bank accounts. Yikes!

Worst Case Scenario

Most users of the web don’t think this hard about their own security, and even for those that do, it’s too complicated to remember a unique username and password for every single web-site they visit. And most of those users that don’t know much about security also don’t realize the need for using complicated passwords. So if you’re collecting usernames and passwords, you need to design for the worst possible scenario:

Password: password1

Just Hash It All

So hopefully, the first thing you’re thinking is that you shouldn’t store your passwords in plain text. Good idea. You were thinking a hash, right? Because 2-way encryption has its own security problems. Once someone discovers the encryption key, they can decrypt every password you have. So let’s hash everything.

There’s plenty of debate about the best hashing algorithms, so for sake of simplicity, I’ll just use md5*. But we’re not going to use the straight output of the hash. That’s like storing plain text. Instead, we’ll make sure we use a nice salt.

Not any salt. Nothing like “NaCLS4lt”. No no, that also makes it too easy for hackers to precompute. Instead, we’re going to generate a random salt for every single password. So even if they manage to crack just one password, they haven’t gained the salt for the rest of the hashes.

$salt = substr(md5(uniqid(rand(), true)), 0, $saltLen);  

We grab a nice, random string that gets hashed, and this garbled text becomes our salt.

$hash = md5($salt . $plain);  

Optionally, here you could do some manipulation, like splitting the password in half, or adding some salt to the end. The same goes for this entire process. The more you can mix it up without trying to come up with your own hashing algorithm, the more non-standard your passwords become.

Now then, we need this salt for later use, or else we’ll never be able to regenerate this hash when a user logs in! We’re not gonna store it in a separate database field called salt. That gives it straight away to the hacker. Instead, we attach it to the hash and make it seem like the whole long thing is the password.

Many sites will suggest simply concatenating them together, $salt . $hash. However, I figure, with such a constant location, while the hacker does have to deal with random salts, he doesn’t have to worry about the location of it.

So we take something that is constant with the user, but different enough to allow variations for storing the salt. I’ll use simply the length of the password in plain text. If the password is 11 characters long, I insert the salt at character 11 of the hash. This way, it’s different than a password that is 8 characters in length.

return substr($hash,0,$saltStart).$salt.substr($hash,$saltStart);

Now, this function you’ve been building up, all you need to do is add an optional argument for a hash. When a user tries to login, you look up the hash from the database, and supply their password attempt and hash to this function. Now check if the hash is supplied, and if so, calculate the position of the salt in the hash, grab the salt, and use that for your md5($salt . $plain) part. If the function returns a hash that equals the hash in the database, you have the correct password.
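Put together, the function might look something like the sketch below. The name hash_password, the default salt length of 8, and the clamp that keeps the salt position inside the 32-character md5 output are my assumptions; md5 is kept only to match the rest of this post.

```php
<?php
// Hypothetical assembly of the steps above.
// $stored is the value from the database when checking a login attempt.
function hash_password($plain, $stored = null, $saltLen = 8) {
    // Salt position is the plain-text length, clamped to the md5 hex length.
    $saltStart = min(strlen($plain), 32);
    if ($stored === null) {
        // New password: generate a random salt.
        $salt = substr(md5(uniqid(rand(), true)), 0, $saltLen);
    } else {
        // Login attempt: recover the salt from where it was inserted.
        $salt = substr($stored, $saltStart, $saltLen);
    }
    $hash = md5($salt . $plain);
    return substr($hash, 0, $saltStart) . $salt . substr($hash, $saltStart);
}

$stored = hash_password('password1');                      // at signup
$ok = (hash_password('password1', $stored) === $stored);   // at login
$bad = (hash_password('password2', $stored) === $stored);  // wrong attempt
var_dump($ok, $bad);
```

A wrong attempt of a different length pulls the "salt" from the wrong position, so it fails just the same.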

*I don’t claim to be a cryptographer, so use at your own risk.


Jun 23 2009

Automagic Prefixes for Model Fields

Say we have a player model, and every field in the players table is prepended with player_. For example, player_username, player_email, etc.

I’m personally not used to this database design, but I know plenty of people use it. When I work on projects that have this, I’m not particularly fond of having to write:

$p = new Player;
echo $p->player_username;

I’d rather ditch the prepended part in all my PHP code.

echo $p->username;

Use Accessors

We can do this by writing some __get and __set functions:

public function __get($name) {
	$prepend = 'player_'.$name;
	if(isset($this->$prepend)) {
		return $this->$prepend;
	}
	return parent::__get($name);
}

public function __set($name, $value) {    
	$prepend = 'player_'.$name;    
	if(isset($this->$prepend)) {        
		$this->$prepend = $value;    
	} else {        
		parent::__set($name, $value);        
		//if no parent, you might want the default:        
		//$this->$name = $value    
	}
}

Basically, as stated before, these get called when you try to access a property that doesn’t exist on the object. So when we try to access username, we check if player_username exists, and if so, return that value.

MY_Model: Easily extendable

You could work this into a MY_Model class that extends Model, and then make all your models extend MY_Model. If you wanted to do this, I’d say make a property of MY_Model called ‘prefix’, and use prefix in the accessors. Then, in each sub-class, all you need to do is define the prefix.

class MY_Model extends Model {    
	protected $prefix;    
	public function __get($name) {        
		$prepend = $this->prefix.$name;        
		//...    
	}   
}
class Player_Model extends MY_Model {    
	protected $prefix = 'player_'; 
}
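
For a version you can run outside of a framework, here is a rough sketch. PrefixedModel is a hypothetical stand-in for MY_Model with no parent class, and it uses property_exists rather than isset so that fields holding null are still found.

```php
<?php
// Hypothetical framework-free stand-in for MY_Model.
#[AllowDynamicProperties] // only matters on PHP 8.2+, a plain comment before PHP 8
class PrefixedModel {
    protected $prefix = '';

    public function __get($name) {
        $prefixed = $this->prefix . $name;
        // Unknown names simply come back as null in this sketch.
        return property_exists($this, $prefixed) ? $this->$prefixed : null;
    }

    public function __set($name, $value) {
        $prefixed = $this->prefix . $name;
        if (property_exists($this, $prefixed)) {
            $this->$prefixed = $value;
        } else {
            // No prefixed field: fall back to the default behaviour.
            $this->$name = $value;
        }
    }
}

class Player extends PrefixedModel {
    protected $prefix = 'player_';
    public $player_username = 'sean';
}

$p = new Player;
echo $p->username, "\n";        // reads player_username
$p->username = 'monstar';       // writes player_username
echo $p->player_username, "\n";
```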

Dec 18 2008

Overloading Objects in PHP

In PHP, objects are all dynamic. If you declare a variable for an object after instantiation, it just throws it right in, no questions asked. Much friendlier than, say, Java, where you absolutely must define a variable prior to use or the JVM will smite you. PHP also lets you define extra or different instructions when using a previously unknown variable with magic functions.

__get and __set

Usually the native implementation of these functions is the desired result. But sometimes, adding some extra features into getting or setting a variable can really make things easier.

__get

Let’s say we have a model called Person:

class Person {    
	public function __get($name) {        
		$this->$name = Model::query('select * from people_table where name=? limit 1',array($name));        
		return $this->$name;    
	}   
}

…and we try to access a variable that doesn’t exist:

$obj = new Person;
echo $obj->Sean['last_name'];

Since Sean isn’t a predefined value in $obj, it queries the database, grabs the row with name Sean, and returns it. And the echo statement proceeds to print my last name to the screen. I’ve also stored the result in the variable requested, so subsequent requests will get the stored variable and leave my database alone.
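The caching effect can be seen in a small sketch that swaps the database for a counter. LazyPerson, fetchRow, and the canned row are hypothetical stand-ins for the Person class and Model::query above.

```php
<?php
// Hypothetical stand-in for the Person model, with no real database.
#[AllowDynamicProperties] // only matters on PHP 8.2+, a plain comment before PHP 8
class LazyPerson {
    public $queries = 0; // counts simulated database hits

    // Stand-in for Model::query(); returns a canned row.
    private function fetchRow($name) {
        $this->queries++;
        return array('name' => $name, 'last_name' => 'McArthur');
    }

    public function __get($name) {
        // Only runs while $this->$name is still undefined.
        $this->$name = $this->fetchRow($name);
        return $this->$name;
    }
}

$obj = new LazyPerson;
echo $obj->Sean['last_name'], "\n"; // triggers __get
echo $obj->Sean['last_name'], "\n"; // served from the stored property
echo $obj->queries, "\n";           // 1
```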

__set

We could also try to do something in the reverse, by setting an unknown property.

class Person {
	public function __set($name, $value) {
		//append the name so it fills the last placeholder
		$value []= $name;
		$this->$name = Model::query('insert into people_table set last_name=?, job=?, name=?',$value);
	}
}

Now this function doesn’t do a bunch of checking, and you’ll probably want to do some. But since this is a simple example, I’m not. I’m just assuming that any unknown variable that I try to set on the model should be part of the database:

$obj = new Person;
$obj->Sean = array('last_name' => 'McArthur', 'job' => 'Developer');

There are some actual real good uses for this overloading; the ones I’ve shown are simplistic and possibly a little too extreme. But now with the understanding of these magic functions, hopefully you can put them to good use.

__call (and __callStatic)

The __call function helps us when we try to call a function that doesn’t belong to an object. Overloading this function is quite often used in API implementations. Let’s look at a small example:

class PersonAPI {    
	//query function    
	//...        
	public function __call($name, $args) {
		//$args is the list of arguments passed; forward the first one
		return $this->query($name,$args[0]);
	}
}

Assume we have a query function which makes a connection and tries to make a function call on a foreign API. Imagine $person is an object of our PersonAPI, and the following two statements would then be identical:

$person->query('getLists',array('token'=>'test'));
$person->getLists(array('token'=>'test'));

This is a very simple, one-line solution. You could, of course, make it much more interesting than that.

__callStatic

The same can be done with static methods, so when we call a static method that doesn’t exist, instead of getting the error thrown in our faces, we could try to see if there’s something extra to do first.

class PersonAPI {
	//static query function
	//...
	//__callStatic must be declared static
	public static function __callStatic($name, $args) {
		return self::query($name,$args[0]);
	}
}
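
Both forwarding methods can be tried out with a sketch like the one below. EchoAPI is a hypothetical class whose query method just describes the call instead of contacting a remote API.

```php
<?php
// Hypothetical wrapper: query() only describes the call it was given.
class EchoAPI {
    public function query($method, $params) {
        return $method . ' ' . json_encode($params);
    }

    public function __call($name, $args) {
        // $args holds every argument of the call; the first is the parameter array.
        $params = isset($args[0]) ? $args[0] : array();
        return $this->query($name, $params);
    }

    public static function __callStatic($name, $args) {
        // A static call has no instance, so make one to do the work.
        $api = new self;
        return $api->query($name, isset($args[0]) ? $args[0] : array());
    }
}

$person = new EchoAPI;
$params = array('token' => 'test');
var_dump($person->getLists($params) === $person->query('getLists', $params));
var_dump(EchoAPI::getLists($params) === $person->query('getLists', $params));
```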

Get Overloading

After knowing this, it’s pretty easy to fill in these functions for several classes you have. You could consider throwing common functionality into them, therefore getting a certain procedure on a simple get or set command.

