Feb 20, 2011

Values of Different Degrees

As promised in my last post, here are some details about how much various types of degrees are worth, according to Canadian census data. As I said there, the numbers should be fairly similar in countries with economies similar to Canada's, like the US, UK, or Australia. I'll make my code available so that if you want to test against your own country, you shouldn't have too much trouble.

Let's take a look at what we have. Here is a ranking of the types of degrees based on a 95% confidence interval of median 2006 incomes (all the values are inflation-adjusted to 2010 Canadian dollars):

1) Engineering: $54.4k - $58.4k
2) Commerce: $51.1k - $55.2k
3) Sciences: $49.8k - $54.3k
4) Education: $47.4k - $48.6k
Median income for university graduates: $45.3k - $47.5k
5) Social Sciences: $43.2k - $46.2k
6) Health/Food Sciences: $38.5k - $42.3k
7) Humanities: $35.0k - $37.4k
8) Fine Arts: $24.5k - $28.4k
Median income for non-university graduates: $25.5k - $25.8k

So it looks like engineers are on top, although their confidence interval overlaps with that of commerce graduates. That means that if we want to be 95% sure we are right, we can't say that engineers make more than commerce grads. If we reduce our confidence in our results, the data says that once we get down to about 80% confidence we can say that engineers make more than commerce grads. In other words, given our data here, we can be roughly 80% confident that engineers make more.

In the last post I also looked at the data from 1986. Here is the same ranking for 1986, so we can see how things have changed over time:

1) Engineering: $67.4k - $70.8k
2) Sciences: $53.1k - $56.4k
3) Commerce: $51.7k - $54.7k
4) Education: $49.0k - $51.0k
Median income for university graduates: $47.9k - $48.9k
5) Social Sciences: $42.7k - $44.8k
6) Humanities: $38.6k - $40.7k
7) Health/Food Sciences: $35.3k - $38.7k
8) Fine Arts: $23.9k - $28.6k
Median income for non-university graduates: $23.0k - $23.2k

The most striking difference between these results and the ones from 2006 is the drop in wages for engineers. Engineering used to be, by far, the best-paid university degree, whereas now it pays only slightly more than a commerce degree (which hasn't changed all that much). This is somewhat disappointing for up-and-coming engineers (well, necessarily)!

The rest of the degrees haven't changed too much. A couple of them have statistically significant changes (such as education), but for the rest we can't really distinguish any change from statistical noise, at least not at a 95% confidence level. If we accept a higher probability of being wrong, then we can say things have changed.
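If you want to check which degrees changed between the two years, here is a minimal R sketch that types in the intervals from the two lists above (in thousands of 2010 dollars) and flags the degrees whose 1986 and 2006 intervals don't overlap. The data frame layout and the helper name `overlaps` are my own choices for illustration, and "no overlap" is only a rough, conservative screen for a significant change, not a proper test.

```r
# 95% intervals for median income, copied from the two rankings above
# (thousands of 2010 Canadian dollars).
intervals <- data.frame(
  degree = c("Engineering", "Commerce", "Sciences", "Education",
             "Social Sciences", "Health/Food Sciences", "Humanities",
             "Fine Arts"),
  lo_1986 = c(67.4, 51.7, 53.1, 49.0, 42.7, 35.3, 38.6, 23.9),
  hi_1986 = c(70.8, 54.7, 56.4, 51.0, 44.8, 38.7, 40.7, 28.6),
  lo_2006 = c(54.4, 51.1, 49.8, 47.4, 43.2, 38.5, 35.0, 24.5),
  hi_2006 = c(58.4, 55.2, 54.3, 48.6, 46.2, 42.3, 37.4, 28.4),
  stringsAsFactors = FALSE
)

# Two intervals overlap when each one starts before the other one ends.
overlaps <- function(lo_a, hi_a, lo_b, hi_b) {
  lo_a <= hi_b & lo_b <= hi_a
}

intervals$overlap <- with(intervals,
                          overlaps(lo_1986, hi_1986, lo_2006, hi_2006))

# Degrees whose intervals do not overlap between 1986 and 2006.
changed <- intervals$degree[!intervals$overlap]
print(changed)
```

With the numbers above, this flags engineering, education, and humanities; every other pair of intervals overlaps.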

This analysis focuses on the median. Why do I do that? Why not use the average/mean? Because incomes are very skewed, meaning that the average is influenced a lot by the outliers who make tons of cash. That can make it look like a certain degree is worth a lot when in fact most of the people with that degree are making a lot less than the average. The difference is most pronounced with commerce degrees: in 2006 the average income for a commerce degree holder was around $78k, which is the highest average of all the degree types. However, the median was roughly $53k, showing that there is a massive amount of skewness in the distribution for commerce grads: those MBAs who are making big 6-figure salaries are pulling up the average. To get a feel for how most people with a certain degree are doing, it is better to look at the median, which is much less affected by huge outliers than the mean is.
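To see the effect in miniature, here is a tiny R example. The ten incomes are made up for illustration (they are not census values): nine ordinary salaries and one very large one.

```r
# Made-up incomes: nine typical salaries plus one large outlier.
incomes <- c(38000, 42000, 45000, 48000, 50000,
             52000, 55000, 58000, 62000, 450000)

mean_income <- mean(incomes)      # 90000: pulled up by the single outlier
median_income <- median(incomes)  # 51000: close to what most people earn

# Drop the outlier and see how much each measure moves.
mean_without <- mean(incomes[incomes < 100000])      # 50000
median_without <- median(incomes[incomes < 100000])  # 50000

print(c(mean = mean_income, median = median_income))
```

One person out of ten nearly doubles the mean, while the median barely moves.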

Again I'll repeat that these are correlations, not causations. In fact, since the variance of the incomes for all of the degree types has increased dramatically, that could indicate that your degree is less important in determining your salary today than it was 25 years ago. In order to isolate the actual effect of the degree on wage you'd have to have a more sophisticated model that takes into account all sorts of other relevant factors like age, experience, ability, etc.

As always, the code is available here and the results of running the code are here.

Feb 17, 2011

Wages Over The Years

I recently read this article which makes a claim that real average wages (aka inflation-adjusted wages) have fallen over the years and supposedly "continue to fall." I looked at this and thought the idea was preposterous, so I rolled up my sleeves and did some real data analysis. Since I've been talking about statistics, I figured some of you might be interested in this knowledge.

tl;dr: Wages overall have actually increased, but not for university grads. University grads still make more than non-university grads. Income inequality has increased a lot, especially among university grads.

First off, my sources. I grabbed the data from the 1986 and 2006 Canadian Censuses (Censi?). Unfortunately I can't share this data with you since I am not legally allowed to distribute it, but if you have access to a university you should be able to dig it up somehow. This data is probably some of the best you can get, since it is much less likely to have selection biases compared to other surveys - people are legally required to fill this data out.
Second source was the Consumer Price Index (CPI) which can be used as a measure of inflation. That's how I adjust the raw figures in the census data for inflation. Statistics Canada is kind enough to list these figures here. All dollar values in this post will be inflation-adjusted to 2010 Canadian dollars.

Now I know that many readers here are not Canadian. However, these results should be similar for countries with an economy similar to Canada's, like the US, the UK, or Australia. For those of you so inclined, you can probably find the same data for your respective country and do the same analysis.

Let's get started. How would you go about figuring this stuff out? Well, you need the data. Once you get that, it's pretty straight-forward. I did my analysis using the following steps:
1) Filter the data. I want to look at people who are at least 15 years old, and have a regular old job. This excludes self-employment. This isn't a huge deal, just keep in mind that the averages here are for employed people.
2) Adjust wages for inflation. This is done by dividing by the CPI for the year of the data (1986 or 2006) and multiplying by the CPI for 2010.
3) Construct a confidence interval for the average wage. This lets us see if the wages between the two periods are actually statistically different. The formula for a 95% confidence interval in R-pseudocode is:
mean(wages) ± 1.96 * sd(wages) / sqrt(length(wages))
The 1.96 is the critical value of the normal distribution for a 95% confidence interval (with a sample this large, sample averages are approximately normally distributed).
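The three steps above can be sketched in R roughly as follows. This is not the actual analysis code, just a minimal illustration: the data frame, its column names (`age`, `self_employed`, `wage`), and the CPI values are all placeholders I made up, so substitute the real census fields and the real Statistics Canada CPI figures.

```r
# Made-up stand-in for the census data; the column names are hypothetical.
census <- data.frame(
  age = c(14, 22, 35, 41, 58, 63),
  self_employed = c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  wage = c(2000, 21000, 60000, 34000, 47000, 52000)
)

# Step 1: keep people who are at least 15 and not self-employed.
employed <- subset(census, age >= 15 & !self_employed)

# Step 2: divide by the CPI of the data year, multiply by the target CPI.
adjust_for_inflation <- function(wages, cpi_year, cpi_target) {
  wages / cpi_year * cpi_target
}

# Placeholder CPI values, NOT the real ones.
cpi_data_year <- 60
cpi_2010 <- 120
real_wages <- adjust_for_inflation(employed$wage, cpi_data_year, cpi_2010)

# Step 3: confidence interval for the mean (z = 1.96 gives 95%).
mean_ci <- function(wages, z = 1.96) {
  half_width <- z * sd(wages) / sqrt(length(wages))
  c(lower = mean(wages) - half_width, upper = mean(wages) + half_width)
}

ci <- mean_ci(real_wages)
print(sprintf("Confidence: %f to %f", ci[["lower"]], ci[["upper"]]))
```

With only four rows the normal approximation is not trustworthy; the formula only makes sense on a sample as large as the census.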

What are the results? Here's the R output:
[1] "Average wages for all employed individuals:"
[1] "Confidence for 1986: 30221.611097 to 30436.550551"
[1] "Confidence for 2006: 38479.736828 to 38790.160974"
[1] "Standard Deviation for 1986: 27382.937843"
[1] "Standard Deviation for 2006: 52381.845010"
[1] "Median Confidence for 1986: 25083.328237 to 25325.457532"
[1] "Median Confidence for 2006: 29704.694341 to 30054.387142"
What are we looking at here? Well, the average wage in 1986 was roughly $30k/year, whereas the average wage in 2006 was roughly $38.5k/year. Looks like wages in general are not falling.
The median is a better measure here though, since medians are a bit more robust to outliers (aka those few people who make hundreds of thousands of dollars a year). As we can see, the shift in the median is not quite as big as the shift in the mean, meaning that while the wages have gone up, they haven't gone up quite as much as the mean might indicate.
What does all this tell us? Well it is difficult to say for certain, but it would appear that in 2006 people are in fact making more money (as shown by the higher median), but there are also more people making giant salaries than before which will skew the average. Given the big jump in the standard deviation, we can see that there is an increase in income inequality.
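The formula in step 3 only covers the mean. This post doesn't spell out how the median intervals were computed, so treat the following as one common option rather than a description of the actual code: a distribution-free interval built from order statistics, where the ranks come from a binomial distribution with probability 0.5. The function name `median_ci` and the synthetic log-normal wages are my own assumptions for illustration.

```r
# Distribution-free confidence interval for the median.
# The number of observations below the true median is Binomial(n, 0.5),
# so we pick two ranks in the sorted sample that bracket the median
# with roughly the requested coverage.
median_ci <- function(x, conf = 0.95) {
  x <- sort(x)
  n <- length(x)
  alpha <- 1 - conf
  lower_rank <- max(qbinom(alpha / 2, n, 0.5), 1)
  upper_rank <- n - lower_rank + 1
  c(lower = x[lower_rank], upper = x[upper_rank])
}

# Synthetic, right-skewed "wages" (not census data).
set.seed(42)
wages <- rlnorm(1001, meanlog = 10.3, sdlog = 0.8)

ci <- median_ci(wages)
print(sprintf("Median Confidence: %f to %f", ci[["lower"]], ci[["upper"]]))
```

Unlike the mean formula, this never uses the standard deviation, so a handful of giant salaries has almost no effect on the interval.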

Now, here's the real interesting part. I decided to run this again, but with one more twist: I filtered for people who have a bachelor's degree or higher. Let's see the results:
[1] "Average wages for bachelor's degree or higher:"
[1] "Confidence for 1986: 49543.651801 to 50383.047856"
[1] "Confidence for 2006: 59994.274103 to 61070.608811"
[1] "Standard Deviation for 1986: 37181.295254"
[1] "Standard Deviation for 2006: 82837.561885"
[1] "Median Confidence for 1986: 47938.828738 to 48884.408393"
[1] "Median Confidence for 2006: 46310.094270 to 47522.585319"
The average and the median wages here are much higher in both periods than for the entire group. This gives us a pretty good indication that university graduates make more money than non-university graduates.
However, it would appear that while the average university graduate makes far more in 2006 than they did in 1986, this result is misleading. When we use the median we can see that university salaries have actually gone down slightly over this time period. One conclusion that we can guess from this data here is that there are some university graduates in 2006 that are making huge salaries, while the bulk of the university grads aren't doing quite as well.
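To put rough numbers on that, here is a small R calculation using the interval endpoints from the output above. It takes the midpoint of each interval as a stand-in for the point estimate (exact for the mean formula, only approximate for the median) and compares 1986 to 2006. The helper names `midpoint` and `pct_change` are just mine.

```r
midpoint <- function(lo, hi) (lo + hi) / 2
pct_change <- function(old, new) 100 * (new - old) / old

# Interval endpoints for bachelor's degree or higher, from the R output.
mean_1986 <- midpoint(49543.651801, 50383.047856)
mean_2006 <- midpoint(59994.274103, 61070.608811)
median_1986 <- midpoint(47938.828738, 48884.408393)
median_2006 <- midpoint(46310.094270, 47522.585319)

mean_change <- pct_change(mean_1986, mean_2006)        # roughly +21%
median_change <- pct_change(median_1986, median_2006)  # roughly -3%

print(sprintf("Mean change: %.1f%%, median change: %.1f%%",
              mean_change, median_change))
```

So the mean for university grads rose by about a fifth while the median slipped by a few percent, which is the gap the paragraph above describes.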

So there we have it, some statistical data. While the interpretations of the data and the methods for analysis here are up for debate, the numbers calculated are not. They're taken directly from census data, which is pretty darn good data (unfortunately it might not be quite as good for 2011, since the Conservatives have scrapped the long-form census and I'm not sure if this data will be on the short-form one).

Keep in mind that these are purely correlations, not causations. This data is not saying, "if you get a university degree then you will make more money." This data is saying, "people who currently have university degrees make more money on average."
It is also looking at the aggregate. I'm sure most of you can come up with examples of non-university graduates who are making good salaries, and of university graduates who are not making great salaries. These people are the exceptions, not the rule.

One thing that I could do in a later analysis is split the groups up into the different types of degrees. Both of these censuses record each person's level of education (Bachelor's, Master's, etc.) and discipline (sciences, engineering, arts, etc.). However, this post is getting long enough, so that can wait for another day.

The code is written in R and is available here as a gist on GitHub. Feel free to fiddle with it if you feel like it. You'll notice I generate some histograms of the data, but I found that they don't really reveal much info, so I didn't include them here.

Feb 9, 2011

Oops!

In case you read the post earlier today, just ignore it. I got lazy and didn't do enough research beforehand, which was pointed out. My apologies!