seanmonstar

Aug 5 2010

PHP Error Suppression Performance

Sometimes, a function might cause a warning, like when you’re messing around with files. It might be tempting to just prepend an @ to the function, and live on. But being curious, I researched if this does much harm besides being sloppy/lazy. It turns out there are performance costs for doing so as well.

I first built a simple test that would loop a million times accessing a variable with and without the suppression operator prepended. The differences were small, yet noticeable. Using the suppression operator ended up taking 40% longer to execute. Interesting, but then an article by Vegard Andreas Larsen pointed out something I failed to test:

[The] assertion that it is the act of the @ operator that is very slow, is wrong. It is in fact the actual triggering of the error or warning by itself.

His tests show that while the suppression operator does add a little overhead, you see a bigger cost when an actual error[1] occurs. When using the suppression operator, you’re writing in a style that lets you cause errors and not care, which decreases performance. The same thing applies to setting error_reporting to ignore notices or warnings. Just because they’re ignored doesn’t mean PHP doesn’t try to throw them first.

A common example is when checking for properties of an object. When ignoring notices, you might do something like this:

if($obj->prop) { 
    do_stuff($obj->prop); 
}

If the property is undefined, a notice will be thrown, and then ignored. Performance penalty. Turns out that isset is quite important, after all. Another instance could be trying to call file without first calling is_file. I used to think, So what, it’s a dynamic language, it’ll be just fine. Now, I’ll be littering my code with isset everywhere.
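A minimal sketch of the guarded versions of both cases. The helper names readProp and readLines are hypothetical, made up for illustration; isset, is_file and file are the built-ins mentioned above.

```php
<?php
// Hypothetical helper: read a property only if it is set,
// so PHP never has to raise (and then discard) a notice.
function readProp($obj, $name, $default = null) {
    return isset($obj->$name) ? $obj->$name : $default;
}

// Hypothetical helper: only call file() when the path is a real file,
// so no warning is raised for a missing one.
function readLines($path) {
    return is_file($path) ? file($path) : array();
}

$obj = new stdClass;
var_dump(readProp($obj, 'prop'));          // falls back to the default
var_dump(readLines('/no/such/file.txt'));  // falls back to an empty array
```

Both helpers do the check first and only touch the property or file when it is there, which is the habit the paragraph above argues for.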

Just to be sure, I altered my original test to check the differences between using a conditional test with isset versus just suppressing the notice. The difference was that suppressing the notice (a notice!) took 100% longer than just checking if it existed first. Try it yourself.

<?php

function no_suppress() {
    $a = 0;
    $b = new stdClass;
    $a = (isset($b->asdf) ? $b->asdf : null);
}

function suppress() {
    $a = 0;
    $b = new stdClass;
    $a = @$b->asdf ? $b->asdf : null;
}

function do_test($suppress = false, $loops = 1000000) {
    if($suppress) {
        echo "starting suppress...\n";
        $start = microtime(true);
        for($i = 0; $i < $loops; $i++) {
            suppress();
        }
        $end = microtime(true);
    } else {
        echo "starting no_suppress...\n";
        $start = microtime(true);
        for($i = 0; $i < $loops; $i++) {
            no_suppress();
        }
        $end = microtime(true);
    }
    echo "ended: " . ($end - $start) . "\n";
}

// Run both variants so the timings can be compared.
do_test(false);
do_test(true);

Now, this might not be the biggest thing in the world. But it’s enough for me to change my ways, since it affects me once to write it, and my users an infinite amount of times having to execute. Since I believe in optimizing for users instead of developers, that’s how it’s going to be.


  1. I use the term error to mean any message thrown. It includes errors, warnings, notices, etc. 


Jul 13 2010

Helpful Error Messages Are Important

If you write software, you write bugs. And after those bugs, there are still errors that will happen in code you didn’t get to touch. Errors happen. That’s not new. And users know that things can go wrong sometimes. What pains me is that we know exactly what went wrong, and we don’t translate that for the user.

In Windows

It’s surprising how often you’ll find an error message that’s only helpful to the creator of the original system. So often, as a user, you can reliably cause a certain error to happen, and the only message you get back from the program is something like Error 0x2211108: Don't do that. If only we as software engineers would take the time to make sure that every error shown to users lets them know what happened (if it’s useful, which it actually very often is), and any steps they can take to fix the error.

Some of the most terrible error message design is the infamous blue screen of death. It completely disrupts the computer (which could be acceptable, some fatal errors can’t be recovered from). But worse, it tells the user virtually nothing useful. Just a bunch of computer gibberish.

Disregarding how ugly the error looks, the Windows team could have at least intercepted the error and displayed a useful message. Here’s an example:

[Image: original BSOD]

[Image: improved BSOD]

In Blazonco

A while ago, I went through and documented many of the errors that might bubble up in Blazonco. We’ve created a help page for each error, which depends on its unique error code number. I also tried to make sure that all errors are in a language the user understands. Error codes mean nothing to most people, so we don’t display them. They’re only in the link that the user can click to learn more[1]. They don’t need to see that I tried to reference a property of a null object, or that the file system threw an error about permissions. Instead, I tried to catch all those, and throw a much prettier error.

[Image: cryptic SQL error]

[Image: user friendly error]

We basically have a try/catch at the topmost function of our application, which will try to show the error message in a graceful way (inside that little error popover). But it’s not in that top function where we convert gibberish into English. It’s done wherever a proper error could happen.

$post = new BlogPost;
//...
try {
    $post->save();
} catch (PDOException $ex) {
    if($ex->getCode() == UNIQUE_CONFLICT_ERROR) {
        throw new BlogPostException('You already have a blog post with that URL. You will need to use a different one so people can view your post.', 40006023);
    }
    // anything else is not ours to translate here; let it bubble up
    throw $ex;
}

We still throw an exception, because an error did exist, but we can show much more meaning knowing that it’s a BlogPostException, and the message is much more meaningful to the user than what the SQLSTATE would have said.
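The whole flow, from low-level failure to friendly message, can be sketched in a self-contained way. Everything below is an assumption for illustration: saveRow, savePost and handleRequest are hypothetical names, RuntimeException stands in for PDOException so the sketch runs without a database, and the value of UNIQUE_CONFLICT_ERROR is made up.

```php
<?php
class BlogPostException extends Exception {}

// Assumed value, for illustration only.
const UNIQUE_CONFLICT_ERROR = 23000;

// Stand-in for the database layer: fails with a cryptic, driver-style error.
function saveRow(array $existingUrls, $url) {
    if (in_array($url, $existingUrls, true)) {
        throw new RuntimeException('SQLSTATE[23000]: Integrity constraint violation', UNIQUE_CONFLICT_ERROR);
    }
    return true;
}

// Where the error can happen: translate the gibberish into English.
function savePost(array $existingUrls, $url) {
    try {
        return saveRow($existingUrls, $url);
    } catch (RuntimeException $ex) {
        if ($ex->getCode() == UNIQUE_CONFLICT_ERROR) {
            throw new BlogPostException('You already have a blog post with that URL.', 40006023, $ex);
        }
        throw $ex; // not something we know how to explain here
    }
}

// Topmost function: only decides how to show the message.
function handleRequest(array $existingUrls, $url) {
    try {
        savePost($existingUrls, $url);
        return 'Saved.';
    } catch (BlogPostException $ex) {
        return $ex->getMessage();
    } catch (Exception $ex) {
        return 'Something went wrong on our end.';
    }
}

echo handleRequest(array('hello-world'), 'hello-world'), "\n";
```

Passing the original exception as the third constructor argument keeps the low-level details available for logging without showing them to the user.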


  1. We also log all errors that occur. That way, each morning, I can open up our Superadmin and see if a user has run into an unexpected internal error, or if a lot of users are receiving an “expected” error too often, suggesting a flaw in the UI. 


Dec 21 2009

Hacking To Meet Deadlines

As a deadline approaches far faster than you can type, you’re required to write some quick-and-dirty code to fulfill those feature requests.

In case you don’t know what I’m talking about, this is when there happens to be a flaw in your program’s structure. It’s an architectural problem: you did not properly build the system to elegantly behave in an expected manner. Sometimes, it’s a problem from bad planning at the start. In other cases, it comes from scope creep, where features get slipped into a system that previously was not going to have such features. Gamasutra had a nice write-up about cases like this happening in the games industry. To put it as they did:

Programmers are often methodical and precise beasts who do their utmost to keep their code clean and pretty. But when the chips are down, the perfectly-planned schedule is shot, and the game needs to ship, “getting it done” can win out over elegance.

In a case like this, a frazzled and overworked programmer is far more likely to ignore best practices, and hack in a less desirable solution to get the [code] out the door.

My favorite example in that article is about a game where in a certain level, some game object needed to be hidden. Instead of doing things the proper way, code was written along the lines of:

if( level == 10 && object == 56 ){    
	HideObject();
}

I’m guilty of that

We know that happens far too often in our industry. Just the other night, I was guilty of doing exactly that.

As a developer for a commercial CMS, we provide basic e-commerce functionality. Recently, a client specifically needed coupon codes added to our getup. With a rather short deadline, and being busy as the year gets close to ending, I didn’t have much time to flesh out the design of this new functionality. Besides everything else we needed to do, this feature ended up with a 2 day implementation, with very little planning. Things ended up looking like this:

foreach($items as $item) {
    if($item instanceof CouponItem) {
        //get discount
    }
    //do stuff with CartItems
}

I just tied in where we were calculating the totals of all CartItems, and if one of the items was a CouponItem, do discount stuff instead. Another situation was in the view, when showing the cart. I had another loop through the cart items, in order to show them all. With CouponItems now a part of the Cart, I had to handle the different properties of a CouponItem there.

<? foreach($items as $item) { ?>
<td>
	<? if($item instanceof CouponItem) { ?>
	<!-- Coupon stuff -->
	<? } else { ?>
	<!-- normal stuff -->
	<? } ?>
</td>
<? } ?>

One can’t help but feel dirty doing this. Just hodge-podging rules into place, that magically make everything all better.

Is it really that bad?

Some may be wondering, what’s the big deal? It works, doesn’t it? Why, yes! It works! And depending on your situation, that may be all that’s really necessary. After all, the point of software is to ship it. And it’s got to work. In the end, how it works on the inside has little value to the end user. That’s the same in my case. In the end, what really matters is that users can add products to a shopping cart and then checkout.

However, at the same time, since our product is something that I have to work on every day, writing software like this makes things more confusing, and hard to maintain. Come a few months from now, when we find a bug in the Cart Items somewhere, or when I need to add a new feature in there, that hacked in behavior is unexpected behavior. Unexpected behavior means it’s easy to break something when I modify code elsewhere. It also means that it will take more time to learn what the heck I was doing when I read it again in the future. And I at least wrote that code. The other developers have it worse.

Now that the deadline has passed, and it just works, I can spend the next couple days planning out a better, more elegant way of handling these coupon codes. My users won’t see the difference. But I will certainly notice it in a month or more.


Sep 17 2009

Week Period of a Given Date

As I work on an internal time tracker for our company, I needed to show all the TimeEntries for a specified week. To specify which week, it made sense to simply select 1 day in the week, since that’s the easiest default control in Flex. This lets me get Sunday and Saturday, the start and end of the week, so I can build a query that grabs entries between those 2 dates.

In PHP

$dayOfWeek = date('w', $date);
$secondsInDay = 60 * 60 * 24;
$sunday = date('Y-m-d', $date - ($dayOfWeek * $secondsInDay));
$saturday = date('Y-m-d', $date + ((6 - $dayOfWeek) * $secondsInDay));
$entries = TimeEntry::find(array('timestamp <=' => $saturday, 'timestamp >=' => $sunday));

In English

First off, PHP’s date function can be given a format character ('w') to return the numerical day of the week. So, if $date were today, we’d be storing 4 (Sunday is 0). Now, with just a little bit of math, we can subtract the number of days to get Sunday, and add the difference to get Saturday.

That last bit is a data model searching for all entries between those dates.
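The same calculation can be wrapped in a small function. This is a sketch with a hypothetical name (weekBounds), and it swaps in strtotime's relative "days" offsets for the fixed seconds-per-day arithmetic, on the assumption that you would rather not depend on every day being exactly 86400 seconds long (daylight saving changes break that).

```php
<?php
// Hypothetical helper: returns array($sunday, $saturday) as 'Y-m-d' strings
// for the week containing the given Unix timestamp.
function weekBounds($timestamp) {
    $dayOfWeek = (int) date('w', $timestamp);   // 0 = Sunday ... 6 = Saturday
    $sunday   = date('Y-m-d', strtotime('-' . $dayOfWeek . ' days', $timestamp));
    $saturday = date('Y-m-d', strtotime('+' . (6 - $dayOfWeek) . ' days', $timestamp));
    return array($sunday, $saturday);
}

// Thursday, Sep 17 2009 (day-of-week 4)
list($sunday, $saturday) = weekBounds(mktime(12, 0, 0, 9, 17, 2009));
echo $sunday, ' to ', $saturday, "\n";   // 2009-09-13 to 2009-09-19
```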


Sep 10 2009

You Don’t Always Need Identity Operators

In two languages that I use often (PHP and Javascript), there are 2 different equality operators when comparing values. It’s become quite common to see places expressly tell you that you should only ever use one of them. That the other is evil. People see this, and then point fingers whenever you use 2 equal signs instead of 3. Here are perfectly valid reasons to use the equality operator (==) instead of the identity operator (===).

In Javascript, depending on the browser, the value I get from an input, especially when I expect a number, is often a number in string format. And in those situations, I don’t care. If I’m looking for someone to be 21, I don’t care if they claim to be 21, or ‘21’. They both work for me.

My options are to use a nice, simple equal operator, or do something more verbose, harder to read, and suckier to write, to make sure I use the identity operator.

I’ll do it this way:

if(elem.value == 21) {
	//enter bar.
}

So I don’t have to write this silliness:

if(elem.value === 21 || elem.value === '21') { 
}
//or
if(parseInt(elem.value, 10) === 21) {
}

Alternatively, I could write some method in PHP that manipulates a string of input. Before I do anything to it, I’m going to verify it’s something I can manipulate. If it’s null, or false, or an empty string, or zero, I don’t much care. All those things should be rejected. There’s no need to use some sort of identity operator when I can safely say that anything falsy is no good to me.

function echoProperName($str = null) {
	if($str != null) {
		echo ucwords($str);
	}
}

I recognize I don’t even need to use comparison operators at all in this example, yet this very usage I have seen flamed for not using the identity operator. Ridiculous. It doesn’t fit here.

Programming languages give us loads of expressive tools, but by some sort of “convention”, we get told not to use many of them. You can smash your thumb with any tool. No need to tell me not to use ‘==’ or ‘++’ or some such because some people have hurt themselves with it.


Jul 16 2009

Random String Generation with Symbols

I’ve been playing with some random string generation, since I built a fairly simple one in a recent project for when users forget their password, and I reset it. It seemed decent enough: produced a string of strong size, alpha-numeric. It was good enough.

base_convert(uniqid(rand(),true),10,36)

It didn’t take long after I had committed that code before I started thinking it could be better. At least, if I need it to be better, then it could be. For instance, if it’s going to be used for security reasons, it should include symbols, and upper-case characters. Then I thought that this “improved” version would be far superior for providing a salt than previously seen examples.

substr(md5(uniqid(rand(),true)),0,8)

The above line will provide a random string, but it’s weak in a few ways. First, it’s only 8 characters long. We should be aiming in the high teens. But that’s easily fixed by changing the substr length. However, its bigger weakness is that it’s text coming from a hexadecimal output. Each character only has 16 possibilities, which is way smaller than 62, or more.

I Like “Or More”

The number of possible combinations depends on the length of the string, and the number of characters available to choose from. In math terms it would look something like this:

f(x,y) = x^y

Where x is the size of the character set, and y is the number of characters in the string. So, the number of possible variations of a string of 15 hexadecimal characters is: 16^15 = approx. 1.15 quintillion.

If we increase the character set to include all letters, upper and lower case, plus most of the symbols on the keyboard, we can get something like 92 characters to choose from. With the length staying the same, we get: 92^15 = approx. 286 octillion. Don’t worry, I did the math for you: That’s an increase of about 250 billion times.

Like a computer really wants to brute force that.
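The arithmetic is easy to check. A quick sketch using PHP's pow (variable names are arbitrary): 16^15 comes out around 1.15e18, 92^15 around 2.86e29, and the ratio between them is (92/16)^15, roughly 2.5e11, or about 250 billion.

```php
<?php
$hex   = pow(16, 15);    // 15 characters from a 16-character (hex) set
$wide  = pow(92, 15);    // 15 characters from a 92-character set
$ratio = $wide / $hex;   // how many times larger the search space is

printf("16^15 = %.3e\n", $hex);
printf("92^15 = %.3e\n", $wide);
printf("ratio = %.3e\n", $ratio);
```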

How I Got “More”

I didn’t want to simply create a huge list of possible characters, and then use a loop and a random number generator to eventually build the string. I wanted to try to keep it as much as possible in native functions. I recognize that using a loop might provide more randomness, as there would be no possible pattern, but I feel this is sufficient.

I start with my original method, which I just call randomAlphaNumeric, because that might be useful in other situations. Then, a whole mess of things.

I split the string based on a random number, then rebuild the string with spaces. I capitalize the first letter of every word, then remove all the spaces. Then I grab a couple of symbols based on a random length, attach them on the end, and shuffle the string. Last, I take the length of the original alphanumeric, simply to prevent hugely differing string lengths.

function randomAlphaNumeric() {    
	return base_convert(uniqid(rand(),true),10,36);
}

function randomPassword() {    
	$alphanum = randomAlphaNumeric();    
	$symbols = str_shuffle('~`!@#$%^&*()_-+={[]}|\\;:,<.>/?');
	return substr(           
		str_shuffle(             
			str_replace(' ','',              
				ucwords(               
					implode(' ',                
						str_split($alphanum,                
							rand(1,strlen($alphanum)-1)                 
						)                
					)               
				)              
			)             
			.substr($symbols,0,              
				rand(1, strlen($symbols) - 1)              
			)           
		),0,strlen($alphanum)-1);
}

Again, this isn’t the best way of doing things. It was more of an exercise on my part to find an interesting way to generate a string that contained all the characters I wanted. We can easily remove some of the symbols from the $symbols string if any are illegal in the usage of the generated string.
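For comparison, here is a sketch of the plain approach dismissed earlier: one list of allowed characters, a loop, and a random index each time. The function name randomString and its parameters are hypothetical, and it assumes PHP 7 or later for random_int.

```php
<?php
// Hypothetical helper: pick $length characters, one at a time, from $alphabet.
function randomString($length = 16, $alphabet = null) {
    if ($alphabet === null) {
        $alphabet = 'abcdefghijklmnopqrstuvwxyz'
                  . 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
                  . '0123456789'
                  . '~`!@#$%^&*()_-+={[]}|\\;:,<.>/?';
    }
    $max = strlen($alphabet) - 1;
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        // every position is drawn independently from the whole set
        $out .= $alphabet[random_int(0, $max)];
    }
    return $out;
}

echo randomString(18), "\n";
```

Passing a custom alphabet is the equivalent of removing symbols from $symbols when some are illegal where the string will be used.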


Jul 9 2009

Requiring Login to CakePHP Admin

For some of my freelance clients, I have provided a home-brewed CMS, built using CakePHP, since I’d already used Cake for the rest of their web-site. I wanted to create a couple users, and an interface to be able to create pages that made up the navigation and content.

Building an admin area for this functionality, the first thing I did was turn on admin routing.

Since we’re logging in Users, we build a User model and UsersController. In the User model, write whatever extra you need, but the basic piece is just a validation function that matches username with password when logging in.

function validateLogin($data) {    
	$user = $this->find(array(        
			'username' => $data['username'],        
			'password' => md5($data['password'])        
		), array('id', 'username'));        
	return !empty($user) ? $user['User'] : false;      
}

You can add any additional validation rules in here, and you might want to use a better hashing scheme than simply straight md5. Anyhow, this function will be called from the UsersController. So let’s dive into there.

UsersController

Throw together a simple login function. I’ll let your imagination put together the view, that’s just some simple html form stuff. Check for post data, validate with the above function, write the User object to the session and redirect to the admin area.

And conversely, logging out is simply destroying the User object from the session and redirecting out of the admin area. Simple stuff.
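The login and logout flow described above can be sketched without the framework. These are not CakePHP calls: $post stands in for the submitted form data, $session for the session store, $validate for the model's validateLogin, and the returned string for the place you would redirect to. All of those names and return values are assumptions for illustration.

```php
<?php
// Framework-free sketch of the login action.
function login(array $post, array &$session, $validate) {
    if (empty($post)) {
        return 'login';                 // nothing submitted yet: show the form
    }
    $user = call_user_func($validate, $post);
    if ($user === false) {
        return 'login';                 // bad credentials: show the form again
    }
    $session['User'] = $user;           // remember who is logged in
    return '/admin';                    // then head to the admin area
}

// Framework-free sketch of the logout action.
function logout(array &$session) {
    unset($session['User']);            // forget the user
    return '/';                         // and leave the admin area
}
```

In the controller, the same steps map onto checking the posted data, calling the model, writing to or deleting from the session, and redirecting.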

To help enforce logging in though, we pay attention to the filter event of controllers, and check the session first. The process of a controller responding to the router is beforeFilter -> action -> beforeRender -> render -> afterFilter. You can make sure things happen before it calls any action, by writing a beforeFilter method. You could use this method to check for things, and if it’s not up to snuff, redirect to a different page. That’s what we’re going to do.

So, in the UsersController, we want to make sure that a user can’t do anything pertaining to users until they login.

function beforeFilter() {    
	if($this->action != 'login' && $this->action != 'logout') {        
		if($this->Session->check('User') == false) {            
			// set the message first: redirect() stops execution by default
			$this->Session->setFlash('The URL you\'ve followed requires you login.');
			$this->redirect('login');
		}    
	}
}

Admin Requires Login

The last step is ensuring that any time a user wants to access the admin part of our app, they must be a logged in user, or they get the boot. This involves using the same event as above, but in the AppController. The AppController lets us write something that should happen in every controller, because by default we should be extending AppController.

However, we don’t want it to always be forcing logins. Visitors who access public areas shouldn’t have to login. But if they want to access the admin areas, that’s when we want to force them.

With our admin routing turned on, any time the Router wants to invoke a method because of the Routing.admin configuration, it will add 'admin' => 1 to the params of the controller. Thus, we can easily make our filter only care if admin is set.

class AppController extends Controller {     
	function beforeFilter() {
        if (!empty($this->params['admin'])) {
            if($this->Session->check('User') == false) {
                $this->flash("The URL you've followed requires you login.",'/login',2);
            }
            $this->layout = 'admin';
        }
    } 
}

Admin Galore

Now, for any page we want to set up as requiring admin access (like in my CMS, allowing admins to edit pages), we create functions prefixed with the value in your config for Routing.admin. By default, that’s admin.

This simple addition now requires you be logged in (with a user made either manually by myself or using a registration form you can build into the UsersController) in order to see the list of pages.

class PagesController extends AppController {    
	function admin_index() {        
		//setting the layout could be done in        
		//the AppController beforeFilter, so all        
		//admin pages use the admin layout instead.        
		$this->layout = 'admin';                
		$this->set('pages',$this->Page->getNav());
	}   
}

The default view for this follows the same kind of automagic rules for CakePHP: it’ll be app/views/pages/admin_index.ctp.


Jul 1 2009

A Basic Lesson in Password Hashing

In the world of the web, lots of sites are popping up requiring users to login. When you need to do so, there’s a bit more security than you might realize. You might be making a simple To-Do list, and might think:

Security? Pfft, I’m not too worried about people’s to-do lists being stolen.

But what you didn’t account for is that all those username/password combinations a hacker just made off with? Yeah, those are the same logins to important stuff, like e-mail, or bank accounts. Yikes!

Worst Case Scenario

Most users of the web don’t think this hard about their own security, and even for those that do, it’s too complicated to remember a unique username and password for every single web-site you visit. And most of those users that don’t know much about security also don’t realize the need for using complicated passwords. So if you’re collecting usernames and passwords, you need to design for the worst possible scenario:

Password: password1

Just Hash It All

So hopefully, the first thing you’re thinking is that you shouldn’t store your passwords in plain text. Good idea. You were thinking a hash, right? Because 2-way encryption has its own security problems. Once someone discovers the encryption key, they can decrypt every password you have. So let’s hash everything.

There’s plenty of debate about the best hashing algorithms, so for sake of simplicity, I’ll just use md5*. But we’re not going to use the straight output of the hash. That’s like storing plain text. Instead, we’ll make sure we use a nice salt.

Not any salt. Nothing like “NaCLS4lt”. No no, that also makes it too easy for hackers to precompute. Instead, we’re going to generate a random salt for every single password. So even if they manage to crack just one password, they haven’t gained the salt for the rest of the hashes.

$salt = substr(md5(uniqid(rand(), true)), 0, $saltLen);  

We grab a nice, random string that gets hashed, and this garbled text becomes our salt.

$hash = md5($salt . $plain);  

Optionally, here you could do some manipulation, like splitting the password in half, or adding some salt to the end. The same goes for this entire process. The more you can mix it up without trying to come up with your own hashing algorithm, the more non-standard your passwords become.

Now then, we need this salt for later use, or else we’ll never be able to regenerate this hash when a user logs in! We’re not gonna store it in a separate database field called salt. That gives it straight away to the hacker. Instead, we attach it to the hash and make it seem like the whole long thing is the password.

Many sites will suggest simply concatenating them together, $salt . $hash. However, I figure, with such a constant location, while the hacker does have to deal with random salts, he doesn’t have to worry about the location of it.

So we take something that is constant with the user, but different enough to allow variations for storing the salt. I’ll use simply the length of the password in plain text. If the password is 11 characters long, I insert the salt at character 11 of the hash. This way, it’s different than a password that is 8 characters in length.

return substr($hash,0,$saltStart).$salt.substr($hash,$saltStart);

Now, to this function you’ve been building up, all you need to do is add an optional argument for a hash. When a user tries to login, you look up the hash from the database, and supply their password attempt and hash to this function. Now check if the hash is supplied, and if so, calculate the position of the salt in the hash, grab the salt, and use that for your md5($salt . $plain) part. If the function returns a hash that equals the hash in the database, you have the correct password.
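Put together, the pieces above might look like the sketch below. The function name saltedHash, the $stored argument name, the default salt length of 8, and the choice to cap the insert position at 32 (the length of an md5 hex string, for passwords longer than that) are all assumptions, not something the steps above spell out.

```php
<?php
// Hypothetical helper: with only $plain, builds a new salted hash;
// with $stored, rebuilds the hash using the salt hidden inside $stored.
function saltedHash($plain, $stored = null, $saltLen = 8) {
    // the salt sits at an offset equal to the password length (capped at 32)
    $saltStart = min(strlen($plain), 32);

    if ($stored === null) {
        // new password: generate a random salt
        $salt = substr(md5(uniqid((string) rand(), true)), 0, $saltLen);
    } else {
        // login attempt: dig the salt back out of the stored value
        $salt = substr($stored, $saltStart, $saltLen);
    }

    $hash = md5($salt . $plain);

    // splice the salt into the hash so it looks like one long string
    return substr($hash, 0, $saltStart) . $salt . substr($hash, $saltStart);
}

$stored = saltedHash('password1');               // what goes in the database
var_dump(saltedHash('password1', $stored) === $stored);   // correct attempt
var_dump(saltedHash('password2', $stored) === $stored);   // wrong attempt
```

A wrong attempt of a different length pulls the "salt" from the wrong offset, so it fails to match just the same.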

*I don’t claim to be a cryptographer, so use at your own risk.


Jun 23 2009

Automagic Prefixes for Model Fields

Say we have a player model, and every field in the players table is prepended with player_. For example, player_username, player_email, etc.

I’m personally not used to this database design, but I know plenty of people use it. When I work on projects that have this, I’m not particularly fond of having to write:

$p = new Player;
echo $p->player_username;

I’d rather ditch the prepended part in all my PHP code.

echo $p->username;

Use Accessors

We can do this by writing some __get and __set functions:

public function __get($name) {    
	$prepend = 'player_'.$name;    
	if(isset($this->$prepend)) {        
		return $this->$prepend;    
	}    
	return parent::__get($name);
}

public function __set($name, $value) {    
	$prepend = 'player_'.$name;    
	if(isset($this->$prepend)) {        
		$this->$prepend = $value;    
	} else {        
		parent::__set($name, $value);        
		//if no parent, you might want the default:        
		//$this->$name = $value    
	}
}

Basically, these get called when you try to access a property that doesn’t exist on the object. So when we try to access username, we check if player_username exists, and if so, return that value.

MY_Model: Easily extendable

You could work this into a MY_Model class that extends Model, and then make all your models extend MY_Model. If you wanted to do this, I’d say make a property of MY_Model called ‘prefix’, and use prefix in the accessors. Then, in each sub-class, all you need to do is define the prefix.

class MY_Model extends Model {    
	protected $prefix;    
	public function __get($name) {        
		$prepend = $this->prefix.$name;        
		//...    
	}   
}
class Player_Model extends MY_Model {    
	protected $prefix = 'player_'; 
}
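A self-contained sketch of the same idea, runnable without a framework. PrefixedRecord is a hypothetical stand-in for MY_Model, and it keeps the row in an array called $fields rather than in real properties; both are assumptions made so the example runs on its own.

```php
<?php
// Hypothetical base class: maps $obj->name onto the stored field prefix_name.
class PrefixedRecord {
    protected $prefix = '';
    protected $fields = array();

    public function __construct(array $row) {
        $this->fields = $row;
    }

    public function __get($name) {
        $key = $this->prefix . $name;
        if (array_key_exists($key, $this->fields)) {
            return $this->fields[$key];          // e.g. username -> player_username
        }
        return array_key_exists($name, $this->fields) ? $this->fields[$name] : null;
    }

    public function __set($name, $value) {
        $key = $this->prefix . $name;
        if (array_key_exists($key, $this->fields)) {
            $this->fields[$key] = $value;        // write through to the prefixed field
        } else {
            $this->fields[$name] = $value;
        }
    }
}

// Each sub-class only has to define its prefix.
class Player extends PrefixedRecord {
    protected $prefix = 'player_';
}

$p = new Player(array('player_username' => 'sean', 'player_email' => 'sean@example.com'));
echo $p->username, "\n";          // reads player_username
$p->username = 'bob';             // writes player_username
echo $p->player_username, "\n";   // the prefixed name still works
```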

Mar 2 2009

Try end() next() time

Here’s a short tip today. I’ve been finding that when using foreach in PHP to check if there’s more in the array, the use of next() has failed me on multiple occasions. I just converted every instance of next() in my code to instead use end().

if($cat !== end($categories)) {
    echo ", ";
}

This way, I don’t need to worry about the internal pointer being pushed around before the loop, or during the loop. I really only care if the value I’m abusing is the last value, anyways.
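In context, the check sits inside the loop like this. A small sketch with made-up category names; it calls end() once before the loop rather than on every pass, which is an optional tweak.

```php
<?php
$categories = array('php', 'javascript', 'css');

$last = end($categories);   // value of the last element
$out  = '';
foreach ($categories as $cat) {
    $out .= $cat;
    if ($cat !== $last) {
        $out .= ', ';       // separator after everything but the last value
    }
}
echo $out, "\n";            // php, javascript, css
```

One caveat worth keeping in mind: this compares values, not positions, so if the last value also appears earlier in the array, that earlier copy will be treated as the last one too.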

