seanmonstar

Mar 3 2010

A Less-Random Generator

In game development, it’s very common to want a random number. Maybe you want to determine damage done, if there was a critical, or what slot on the board to insert your piece at. And surprisingly (or perhaps, not), programmers are often looking to make this random number a little less… random.

Less Random, you say?

Yeah, seriously. We really don’t want a real random number, because random is too random. Backgammon players rage on about how un-random virtual dice rolls are, and programmers can go to extreme lengths to provide the right kind of random.

Role playing games have to overcome this too-random problem as well. In an RPG, you might be getting a random number to determine if the player character hit the monster. And considering the situations you find yourself in when playing an RPG, this dramatic but randomly generated experience can really suck:

A blade spider is at your throat. It hits and you miss. It hits again and you miss again. And again and again, until there’s nothing left of you to hit. You’re dead and there’s a two-ton arachnid gloating over your corpse. Impossible? No. Improbable? Yes. But given enough players and given enough time, the improbable becomes almost certain. It wasn’t that the blade spider was hard, it was just bad luck. How frustrating. It’s enough to make a player want to quit.

—Randomness without Replacement

It’s really quite interesting how much we really don’t want real randomness. Because real randomness is not biased. Pure randomness doesn’t care if that’s your 20th “1” in a row. It can lead to frustration, and cause people to blame the game for sucking, when really it was just a bad sequence of random numbers.

How We Perceive Random

It turns out, when we say random, as a player, we actually mean controlled random. Compare these images of dots plotted on a grid:

Which picture looks like the random we want? Apparently, the first is too random for humans. What I mean is, it doesn’t look random. We quickly identify patterns in the truly random picture. We see groupings and think there must have been numbers that were favored to get that result. We want to believe that randomness will evenly distribute its results across the spectrum. No patterns, no missed numbers, no repeats. Even.

The truth behind these 2 images is that the left image is a true random plot. The image on the right was controlled.

The plot on the right is […] composed of 64 smaller squares, each of which has 4 points placed at random. People don’t like the leftmost plot because it has several clumps of points that seem non-random. In fact, true randomness consists of a mixture of clumps and non-clumps. Randomness is different from homogeneity.

—Warning Signs in Experimental Design and Interpretation
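
The construction described in that quote is easy to try out. Here is a minimal sketch, not the code behind the original plots: the function names, the seed, and the 8x8 grid on the unit square (64 cells, 4 points each) are assumptions picked for illustration. It counts how many points fall in each cell for both kinds of plot.

```python
import random

def uniform_points(count, rng):
    """Truly random plot: every point is placed independently."""
    return [(rng.random(), rng.random()) for _ in range(count)]

def stratified_points(cells_per_side, per_cell, rng):
    """Controlled plot: the same number of random points in every grid cell."""
    size = 1.0 / cells_per_side
    points = []
    for row in range(cells_per_side):
        for col in range(cells_per_side):
            for _ in range(per_cell):
                points.append(((col + rng.random()) * size,
                               (row + rng.random()) * size))
    return points

def cell_counts(points, cells_per_side):
    """Count how many points land in each grid cell."""
    counts = {}
    for x, y in points:
        cell = (int(x * cells_per_side), int(y * cells_per_side))
        counts[cell] = counts.get(cell, 0) + 1
    return counts

rng = random.Random(64)  # arbitrary seed so the run is repeatable
clumpy = cell_counts(uniform_points(256, rng), 8)
even = cell_counts(stratified_points(8, 4, rng), 8)

print("uniform:    fullest cell holds", max(clumpy.values()), "points")
print("stratified: fullest cell holds", max(even.values()), "points")
```

Both plots contain 256 points. The stratified one cannot clump beyond 4 points per cell, which is exactly the "evenness" people mistake for randomness.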

A programmer’s solution

The way to handle controlled randomness is actually pretty simple. It’s commonly called a shuffle bag. The principle is that you take a bunch of tokens and put them in a bag. Then when you need another value, you pull a token out of the bag, and use that. Once the bag is empty, you fill it back up again.

You can control the percentage of a positive or negative result by setting the ratio of tokens you insert into the bag. You can also control what sort of “sprees” you can get from your bag by inserting duplicate values.

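For example, with 1 hit value and 1 miss value, you have a 50% (1/2) chance of hitting. You also have the possibility of getting 2 misses or hits in a row, when one bag ends on the same value the next bag starts with. If you changed that to contain 5 hits and 5 misses in the bag, you could possibly end up with 10 in a row (5 at the end of one bag and 5 at the start of the next), but never more.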

import random
class ShuffleBag(object):
	def __init__(self, values):
		self.values = values
		self.list = None
	def next(self):
		if (self.list is None) or (len(self.list) == 0):
			self.shuffle()
		return self.list.pop()
	def shuffle(self):
		self.list = self.values[:]
		random.shuffle(self.list)

The usage would be pretty simple. If I want a 20% chance of getting a critical hit on a damage roll, I would implement that like so:

bag = ShuffleBag([1, 0, 0, 0, 0])
while attacking:
	is_critical = bag.next()
	if is_critical:
		dmg = MAX_DMG
		doDmg(dmg)
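
To check what the bag actually guarantees, here is a small self-contained sketch. It uses a generator instead of the class above purely to keep it short; shuffle_bag, longest_run, and the seed are names and values made up for this example. With one critical and four normal hits per bag, any 100 draws that start on a fresh bag contain exactly 20 criticals, and a dry spell can never exceed 8 draws.

```python
import random

def shuffle_bag(values, rng):
    """Yield values forever, refilling and reshuffling whenever the bag empties."""
    while True:
        bag = list(values)
        rng.shuffle(bag)
        while bag:
            yield bag.pop()

def longest_run(draws, value):
    """Length of the longest unbroken streak of `value` in `draws`."""
    best = current = 0
    for draw in draws:
        current = current + 1 if draw == value else 0
        best = max(best, current)
    return best

rng = random.Random(2010)  # arbitrary seed so the run is repeatable
bag = shuffle_bag([1, 0, 0, 0, 0], rng)
draws = [next(bag) for _ in range(100)]  # exactly 20 full bags

crit_count = sum(draws)
longest_dry_spell = longest_run(draws, 0)
longest_crit_streak = longest_run(draws, 1)
print(crit_count, longest_dry_spell, longest_crit_streak)
```

The worst dry spell happens when one bag starts with its critical and the next bag ends with it, leaving 4 + 4 misses in between. A plain 20% random roll has no such ceiling.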

Who’d have thought that you needed to do something special just to get “fun” random numbers? I think the root of it has to do with how statistics are all just a lie.


Jan 27 2010

Import * Considered Harmful

Something a Java programmer learns first is that there is this big, amazing library already built-in to Java, and you can easily use plenty of useful classes by using an import statement. Possibly the first thing you want to do is pop open a box to prompt your name, or say hello, and thus starts this terrible habit:

import javax.swing.*;

I’m guilty of it too. You don’t really know that what you’re doing is all that bad. You know what you want from Swing. You only need the JOptionPane. And sure, the compiler shouldn’t be stupid enough to pack the rest of the Swing package into your jar file. In Java, at least, it won’t. There’s talk about whether certain bulk imports in Python will cause things to be included multiple times.

Collisions, or which List did you want?

However, in Java, you can get namespace collisions. coobird on Stack Overflow gives an excellent example:

Now, if we were to use a wildcard in the package import, we’d have the following.

import java.awt.*;
import java.util.*;

However, now we will have a problem!

There is a java.awt.List class and a java.util.List, so referring to the List class would be ambiguous. One would have to refer to the List with a fully-qualified class name if we want to remove the ambiguity:

import java.awt.*;
import java.util.*;

// 'List' from java.awt -- need to use a fully-qualified class name.
java.awt.List listComponent = new java.awt.List();

This problem is exactly what I was trying to avoid when working with some Java, and it prompted my need to let people know never to do this again. I was trying to call a class from the YUI Compressor jar, and the constructor required several classes. Unfamiliar with a couple of the names, I didn’t simply want to copy their import statements, since I already had written my own File class that was far more basic than Java’s. No need for conflicts, please.

Your code doesn’t have this problem? You’re only importing from one package, you say? What about the future of your code? Your class is still auto-importing the rest of your class’ residing package. What happens when someone creates a class called List in it? Or something else? Conflicted: the name your wildcard import used to supply now points at a different class.
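
Python has the same trap. A minimal sketch using only the standard library: a star import from os brings in os.open, which takes over the name of the built-in open. The star import is run through exec into a scratch dictionary, which is just a choice made here so the example doesn’t clobber its own namespace.

```python
import builtins
import os

scratch = {}
exec("from os import *", scratch)  # same effect as a module-level star import

# After the star import, the bare name `open` is os.open, not the built-in.
print(scratch["open"] is os.open)
print(scratch["open"] is builtins.open)
```

Nothing warns you when this happens; the name simply starts meaning something else.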

This leads to another frustrating reason not to use import star.

It screws Discoverability

Specifically, I was unsure which ErrorReporter was needed for the JavaScriptCompressor. The import statements at the top list 3 packages it could come from, and the only way for me to find out is to search each package.

package com.yahoo.platform.yui.compressor; 
import org.mozilla.javascript.*;
import java.util.*;

ErrorReporter could be a class defined in this package (com.yahoo.platform.yui.compressor), or it could be in java.util, or org.mozilla.javascript. It turns out it’s in the last one, but discovering that took longer than it ever should have. Even the few minutes I had to spend to look up which package contained the class so I could import it into my class file were minutes wasted. It’s effortless to use a more specific import statement instead. Especially if you’re using an IDE like Eclipse (which you are if you’re doing Java development, just press Ctrl + Shift + O).

Flex Builder is an extension of Eclipse, so no excuses there either. I imagine Flash has a similar shortcut, though even if it didn’t, just like in Python, it’s really not that hard. Honestly, it takes no extra effort to write the name of a specific class instead of importing the whole dang package or module.

This is what I feel is the more important reason why you shouldn’t use import * ever again. The more time it takes another programmer (or even yourself) to understand what in the world was going on inside your head at the time of writing, the more time (and thus money) you’re costing your company.


Jan 14 2010

3 Tips When Switching to Python

If you write a lot of Javascript or PHP, there are a couple of habits you might be used to that need to change a bit when you switch over to Python.

  1. Accessing a property in a dictionary with a variable
  2. Setting properties on objects with a variable
  3. Using While with a function call

Check in first

When looping through a list or dictionary, it’s not uncommon to compare the current indexed value to a value in a different list or dictionary. However, doing that will quickly teach you that trying to access a key that doesn’t exist will raise a KeyError.

After my first KeyError, I first tried to wrap the comparison in a try block. But that can start to look unwieldy:

try:
	val = params[key]
except KeyError:
	val = None
if val:
	pass  # ...

Alternatively, Python dictionaries have a get method. Hence we do this, and it’s all pretty!

val = params.get(key, 'defaultVal')
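
A short sketch of both ways to “check in first”. The params dictionary and its keys are made up for the example; in and dict.get are standard Python.

```python
params = {"page": "2", "q": ""}  # hypothetical request parameters

# Option 1: check membership with `in` before indexing.
if "page" in params:
    page = params["page"]
else:
    page = "1"

# Option 2: let get() supply a fallback when the key is absent.
sort = params.get("sort", "newest")

# Without a second argument, get() falls back to None.
missing = params.get("sort")

# The default is only used for absent keys, not for falsy values.
query = params.get("q", "everything")

print(page, sort, missing, repr(query))
```

The last case is worth remembering: an empty string that was actually submitted is returned as-is, so a follow-up `if val:` check still has to decide what empty means.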

You can just setattr

A while ago, when doing some initial Django development, I naively tried to handle form submissions the same way I do in PHP: loop through each key-value pair in the POST dictionary and assign it to an instance of the model I want to insert. No worries if extra information has been submitted; the model will only send data that we have specifically set as fields in the class definition. However, objects in Python don’t allow item assignment like PHP or Javascript. Every object does have a personal __dict__ that I could access, but then I can get KeyErrors as the above example shows.

Denis Otkidach showed me Python’s setattr function, which lets me do exactly what I wanted.

for key, value in POST.iteritems():    
	setattr(my_model, key, value)
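
Here is the same idea as a runnable sketch without Django. The Ball class, the post_data dictionary, and the allowed tuple are stand-ins invented for this example. Filtering on an explicit list of names is an extra precaution I’m assuming you would want with user-submitted keys; it is not something setattr does for you.

```python
class Ball(object):
    """Plain stand-in for a model instance."""
    def __init__(self):
        self.size = 0
        self.color = None

post_data = {"size": 4, "color": "red", "is_admin": True}  # pretend POST data
allowed = ("size", "color")  # attributes the form is permitted to set

ball = Ball()
for key, value in post_data.items():
    if key in allowed:
        # Equivalent to ball.<key> = value, with the name chosen at runtime.
        setattr(ball, key, value)

print(ball.size, ball.color, hasattr(ball, "is_admin"))
```

Without the filter, setattr would happily create an is_admin attribute on the object, since plain Python objects accept new attributes.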

While True, loop foreverrrrr

Often times, when you don’t have a predetermined length of something, you’ll use while to do your looping. A common occurrence of this is when reading in a file. You call a function, and store its return value, and as long as that value is usable, do your loopity loop stuff. However, in Python you can’t do assignment inside a condition for a control structure like while, probably because Guido likes to prevent bad practices from being possible in his language, and that is usually a bad practice unless you know what you’re doing.

There may be a more elegant way of handling this, but I resorted to making an infinite loop that breaks on a bad condition:

while True:
	val = func()
	if not val:
		break
	# ... use val here
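
Two other spellings of this loop are worth knowing. The built-in iter() accepts a callable plus a sentinel and keeps calling until the sentinel comes back, and Python 3.8 and later also allow an assignment expression (:=) inside the condition. The sketch below reads lines from an in-memory io.StringIO standing in for a real file; the variable names are made up.

```python
import io

stream = io.StringIO("first\nsecond\nthird\n")

# iter(callable, sentinel): call stream.readline() until it returns "".
lines = []
for line in iter(stream.readline, ""):
    lines.append(line.rstrip("\n"))

# Python 3.8+: assign and test in one expression.
stream.seek(0)
walrus_lines = []
while (line := stream.readline()):
    walrus_lines.append(line.rstrip("\n"))

print(lines)
print(walrus_lines)
```

One difference from the while True version: the two-argument iter() stops only when the return value equals the sentinel, not on any falsy value.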


Nov 25 2009

Extending Django Models, Managers, And QuerySets

In a recent pet project, I’m exploring Django. As I’m used to in our PHP framework, I like to extend Models with methods that a model should keep contained, which I can then call multiple times elsewhere, in the Controller (the View, in Django; don’t get me started on the stupidity of the naming scheme). In PHP, it’s a bit more straightforward: you can simply write some new functions inside the class. In Django, it was a little more complicated. I explored several different parts that all affect writing methods that should be contained in the Model area of the application.

Models

First, Models. You can simply write some methods in the Model class definition, the same way you’d like to in PHP. A difference, though: Python has no static keyword like PHP does. In PHP, I would write instance methods that manipulate an object, or instance, of the Model, such as $ball->explode(). I would write static functions that manipulate the table of models, such as Ball::get_exploded().

In a Django Model, the methods we define are only there to manipulate instances of the Model (in most cases). For example:

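from datetime import datetime
from django.db import models

class Ball(models.Model):
	def explode(self):
		# Mark the ball as exploded and record how long it stayed inflated.
		self.exploded = True
		self.lifetime = datetime.now() - self.created_at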

We would use this elsewhere to make sure that when the Ball explodes, we also record how long the ball was inflated.

ball = Ball.objects.get(id=1)
ball.explode()
ball.save()

Manager

The Manager is how we access the table. It’s largely like the static methods we might use in PHP. The default property to access the manager is objects.

Managers provide a good set of methods to select and filter the objects you want to receive. However, I started to notice certain trends in the functions I would use to receive a certain group of objects. Naturally, moving those combinations of methods, plus complicated extra calls, into their own methods is good for DRY. I’ll use a simple example for now:

class BallManager(models.Manager):
	def get_exploded(self):
		return self.filter(exploded=True)

class Ball(models.Model):
	objects = BallManager()

Now, we get a new method to access all exploded Balls in the database.

Ball.objects.get_exploded()

This is supremely more useful when you start making complicated queries in several different views. Let me just show you a real example from my pet project:

def due_this_week(self):
	return self.extra(where=["due_date > now() - interval '1 day'", "due_date < now() + interval '7 days'", "not(due_date isnull)"])

This ensures that if I find my query to be slightly buggy, or if I want to pad it an extra day, I only have to change my method. The benefits should be obvious enough. Manager methods usually return QuerySets, so let’s see why extending the QuerySet is also useful.

QuerySet

By adding our method to the manager, you can call it from the Manager property, but we might want to use our methods later on in a query. Currently, the method is only available from Ball.objects.get_exploded().

By adding the methods to the QuerySet for Ball, we can use get_exploded() after a filter().filter().extra().

However, adding a method to the Manager and also the QuerySet the Manager uses would mean writing the method twice.

QuerySetManager

Doing some searching, I found the QuerySetManager, a snippet someone had put together that allows us to add methods to both the QuerySet and Manager at the same time. We define the QuerySetManager, and then tell the model to use that for its Manager. Then, we can define a QuerySet inside the Model declaration, since Python classes allow you to define inner classes.

Here we go.

class QuerySetManager(models.Manager):
	def get_query_set(self):
		return self.model.QuerySet(self.model)
	def __getattr__(self, attr, *args):
		return getattr(self.get_query_set(), attr, *args)

Manager.get_query_set (renamed get_queryset in Django 1.6) is the function that gets called internally whenever the Manager needs to retrieve a query set of its models. By overriding it, we can return a different QuerySet, one we extend to have new methods.

Defining __getattr__ is like defining magic functions in PHP: any attribute (read: method or property) that doesn’t exist, will try the __getattr__ method, before raising an AttributeError. This lets us write all the methods on the QuerySet, and then any method we call on the Manager, will try to get the method from the QuerySet instead.
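
The delegation trick can be tried without Django at all. In this sketch FakeQuerySet and FakeManager are invented stand-ins that filter a list of dictionaries; only the __getattr__ forwarding mirrors what the snippet above does.

```python
class FakeQuerySet(object):
    """Stand-in for a QuerySet: filters a list of dicts."""
    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        matches = [row for row in self.rows
                   if all(row.get(k) == v for k, v in conditions.items())]
        return FakeQuerySet(matches)

    def get_exploded(self):
        return self.filter(exploded=True)

class FakeManager(object):
    """Stand-in for the Manager: owns no query methods of its own."""
    def __init__(self, rows):
        self._rows = rows

    def get_query_set(self):
        return FakeQuerySet(self._rows)

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, so forward to the query set.
        return getattr(self.get_query_set(), attr)

rows = [
    {"size": 4, "exploded": True},
    {"size": 4, "exploded": False},
    {"size": 2, "exploded": True},
]
objects = FakeManager(rows)
from_manager = objects.get_exploded()             # forwarded by __getattr__
chained = objects.filter(size=4).get_exploded()   # called on a query set
print(len(from_manager.rows), len(chained.rows))
```

Because __getattr__ is consulted only after ordinary attribute lookup fails, get_query_set and _rows are found directly and never loop back through it. A name that exists on neither class still raises AttributeError.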

With the QuerySetManager, we can define a QuerySet to use in the Ball model.

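from django.db.models.query import QuerySet

class Ball(models.Model):
	objects = QuerySetManager()
	# The inner class extends Django's QuerySet imported above.
	class QuerySet(QuerySet):
		def get_exploded(self):
			return self.filter(exploded=True)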

Now we can use our custom method in whichever order we want.

Ball.objects.get_exploded() # called on the Manager
Ball.objects.filter(size=4).get_exploded().order_by('created_at') # called on a QuerySet