Quantile Regression

Quantile regression is a variant of regression that, instead of minimising squared error (which estimates the mean), aims to have a chosen percentage of observations fall below the prediction and the rest above it.  In its simplest form the target is 50%, so the algorithm finds the median.
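
For the curious, here is a minimal sketch of the idea in Python. It is not how any particular package does it: the spending figures are made up, the function names are my own, and it fits a single constant rather than a full regression. It uses the so-called pinball loss, which charges under-predictions at a rate of q and over-predictions at a rate of 1 - q, so the cheapest prediction ends up with roughly a fraction q of the observations below it.

```python
def pinball_loss(values, prediction, q):
    """Average pinball (quantile) loss of one constant prediction."""
    total = 0.0
    for value in values:
        diff = value - prediction
        # Under-predictions cost q per unit, over-predictions cost (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(values)


def best_constant(values, q):
    """Return the observed value with the lowest pinball loss for quantile q."""
    return min(sorted(set(values)), key=lambda c: pinball_loss(values, c, q))


# Made-up, skewed spending data: one big spender drags the mean upwards.
spend = [10, 12, 15, 18, 20, 25, 30, 45, 200]

mean_spend = sum(spend) / len(spend)      # about 41.7
median_spend = best_constant(spend, 0.5)  # 20
upper_spend = best_constant(spend, 0.75)  # 30

print(mean_spend, median_spend, upper_spend)
```

With q = 0.5 the answer is the median (20) rather than the mean (about 41.7), and changing q to 0.75 moves the answer up to 30. A real quantile regression does the same thing, except the prediction is a function of the inputs instead of a single number.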

That’s enough theory.  For people who want more, I’d suggest Wikipedia to start and this great paper for more details.  I’m more interested in applications, so I’ll concentrate on them.  If normal regression is about answering the question ‘what’s most likely to happen’, then quantile regression is about answering the question ‘how likely is this to happen’.  Some practical examples follow:

How confident can I be that this campaign will make a profit?

How confident can I be that this patient doesn’t have cancer?

What’s the most I can reasonably expect this person to spend?

What’s the least revenue I can reasonably expect this store to make?

Such questions can be attempted with regular regression if errors are assumed to be normally distributed: predict the expected value and add (or subtract) a standard deviation or two to increase confidence.  In practice I’ve found this to be a very poor approximation.  Say somebody is an ideal candidate for spending all their money: they will be predicted to spend quite a bit, but after adding a couple of standard deviations we’ll predict they will spend more than all their income!
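
To put some numbers on that, here is a small sketch with invented data: the share of income spent by ten hypothetical customers who all look like that ideal candidate. The figures and the simple empirical-quantile helper are my own assumptions, purely for illustration.

```python
import statistics

# Invented data: share of income spent by ten similar customers (0 to 1).
share_spent = [0.55, 0.70, 0.80, 0.85, 0.90, 0.92, 0.95, 0.97, 0.99, 1.00]


def empirical_quantile(values, percent):
    """Smallest observed value with at least `percent`% of the data at or below it."""
    ordered = sorted(values)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = -(-percent * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


mean = statistics.mean(share_spent)
sd = statistics.stdev(share_spent)

# "Add a couple of standard deviations" under the normality assumption.
normal_upper = mean + 2 * sd

# Quantile-style answer: the 90th percentile of what was actually observed.
quantile_upper = empirical_quantile(share_spent, 90)

print(f"mean + 2 sd:     {normal_upper:.2f}")
print(f"90th percentile: {quantile_upper:.2f}")
```

The normal approximation comes out at about 1.15, i.e. more than 100% of income, while the quantile-based figure of 0.99 stays inside the range of values that can actually occur.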

Quantile regression hasn’t made it into many tools yet; you’re pretty much limited to R and SAS if you want to give it a whirl.  Even then, it’s an optional add-in to R, and in SAS it’s marked experimental and effectively can’t be called from Enterprise Miner.

To all those statistical pedants out there: yes, I have oversimplified my explanation above.  But for the kinds of uses I make of it, the description is accurate enough.