Analysing the Good University Guide

Have you ever wondered how Oxford and Cambridge get to the top of the good university guide? Is it an accident? Is it because they are the oldest universities in the UK? Is it because they really are the best? Well. these universities are at the top of the pile just about every year because of the way they assess what they mean by the best. Here's how the system works.

The following table shows the top ten universities as at the beginning of the summer 2009:

No
Uni
Student Satis
Res Qlty
SSR

Serv &

Facs £

Entry Stds
Completion
Good Honours
Grad Prospects
Total
1
Oxford
85.00
3.50
10.80
3396.00
524.00
97.70
91.10
82.30
1000.00
2
Cambridge
86.00
3.70
11.60
2385.00
539.00
99.00
87.00
85.50
968.00
3
Imperial College
75.00
2.70
10.30
3513.00
439.00
97.10
63.50
83.40
359.00
4
St Andrews
84.00
2.50
12.40
1423.00
468.00
94.20
85.10
77.80
792.00
5
University College London
77.00
2.70
8.90
1734.00
452.00
92.00
30.40
82.90
775.00
6
Warwick
77.00
2.40
13.10
2118.00
463.00
96.00
79.70
79.20
772.00
7
LSE
73.00
2.80
13.30
1699.00
483.00
96.50
76.00
90.60
763.00
8
Durham
79.00
2.50
14.80
1573.00
459.00
96.70
77.50
78.30
749.00
9
Exeter
85.00
2.50
17.50
1378.00
394.00
94.80
79.40
71.70
723.00
10
Bristol
74.00
2.60
13.10
1657.00
447.00
95.60
81.50
82.00
722.00

The full table has 114 universities in it and that's what we will be looking at. You can see the variables in this database from the above table, summarised below:

University name
Student Satisfaction
Research Quality
Student Staff Ratio
Services & Facilities £
Entry Standards
Completion
Good Honours
Graduate Prospects

In the book Excel 2007 with Excel Master you will learn how to carry out regression analysis on two or more variables. Here is a rather complicated case study of what you can learn from using regression analysis.

Installing the Excel 2007 Data Analysis Add In

In case it's already been installed, here's how to find it: click on the Data Tab and then on the right hand side of the ribbon you will see Data Analysis in the Analysis section ... this is where the installed Add Ins are kept in Excel 2007.

If the Excel 2007 Data Analysis Add In has not been installed on your system yet, this is how to do it: click to watch the video.

Excel 2007 Data Analysis Add In

What we are going to do is to tell Excel 2007 that a University's ranking is the dependent (Y) variable that depends on all of the other variables, the X variables. Regression analysis tells us whether there may be any linkages between the variables and whether we can use them to make predictions or forecasts.

Here are the abbreviated outputs of the regression analysis using the Data Analysis Add In:

SUMMARY OUTPUT
   
Regression Statistics  
Multiple R 0.9461
R Square 0.8952
Adjusted R Square 0.8872
Standard Error 53.6808
Observations 114
 
Coefficients
Standard Error
t Stat
Intercept -10.0535 88.5966 -0.1135
Student Satis 0.3851 0.3749 1.0271
Res Qlty 43.9892 15.5429 2.8302
SSR -7.1081 2.4591 -2.8905
Serv&Facilities £ -0.0174 0.0148 -1.1728
Entry Stds 0.6856 0.1898 3.6132
Completion 0.4896 0.3606 1.3577
Good Honours 2.2467 0.8190 2.7433
Grad Prospects 2.6057 0.9510 2.7398

The first thing we might notice is that the R2 value is very high at 0.8952 which means that overall the X variables are highly correlated with the Y variable.

Secondly, the regression equation is rather large:

Y = -10.053a + 0.385Student Satis + 43.989Res Qlty -7.108SSR -0.017Serv&Facilities £ + 0.686Entry Stds + 0.49Completion + 2.247Good Honours + 2.606Grad Prospects

The direction of the slope: positive or negative;

A positive value of a coefficient (the 'b' value) means that there is a positive relationship between an X variable and the Y variable. That is, as X increases so does Y and vice versa. Similarly, if the 'b' coefficient is negative it means that as X increases, Y decreases and vice versa.

Let's take a look at a few of the 'b' coefficients then to try to understand their true meaning and significance:

  • Student satisfaction is a positive variable: this tells us that the higher the level of satisfaction of the students with their university the higher up the rankings the university will be.
  • Student Staff Ratio (SSR) is a negative variable: can it be true that the lower the number of students to the number of members of staff, the higher up the rankings a university will go? Yes indeed. Everyone seems to know of the university tutorial system and that in universities such as Oxford and Cambridge, it is not entirely unusual for there to be 1: 1 and 2: 1 tutorials. The low number of students at tutorials is a sign of the quality of a university.
  • Services and Facilities: this is an odd one as it is negative which means that the fewer services and facilities there are, the higher up the rankings the university will go. Why would that be I wonder?

Size of the 'b' Coefficients

Is there anything to be said about the size of the 'b' coefficients? In brief, yes! We can see that Research Quality has a 'b' coefficient value of 43.9892, the highest coefficient by a factor of around 6. We can conclude from this that the higher the quality and possibly quantity of research, the higher up the rankings the university will go. We can conclude that the research base of a university is the biggest single key to the quality of a university.

 

The second largest 'b' coefficient is the SSR at -7.1081, again the conclusion being, as we said earlier, that class sizes are an important determinant of university rankings.

 

We did say that services and facilities was negative and therefore the fewer of them the better as far as rankings are concerned. However, the value of the services and facilities 'b' coefficient is very small, in fact virtually zero meaning that it is really an insignificant variable.

t statistics

With the number of observations being so high at 114, we expect significant t statistics to be above 2. The t statistic is above 2 in only five of the eight variables. This means that student satisfaction, services and facilities and whether students complete their courses are probably insignificant and can probably be deleted from the model.

Reformulate the Model

In view of the t statistics, we should reformulate model by dropping the three variables with the insignificant t statistics, this gives us:

 

Regression Statistics
Multiple R 0.9517
R Square 0.9057
Adjusted R Square 0.9004
Standard Error 10.4291
Observations 114

 

 
Coefficients
Standard Error
t Stat
Intercept 151.8017 17.0855 8.8848
Student Satis -0.0472 0.0717 -0.6577
Res Qlty -7.1715 2.9679 -2.4163
SSR 2.3277 0.4699 4.9537
Entry Stds -0.0979 0.0359 -2.7283
Good Honours -0.4391 0.1569 -2.7988
Grad Prospects -0.9557 0.1838 -5.2004

 

with the new model of:

 

Y = 151.802a -0.047Student Satis -7.172Res Qlty 2.328SSR -0.098Entry Stds -0.439Good Honours -0.956Grad Prospects

 

We have improved the R2 value as it is now higher than it was before and is now 0.9057.

 

The 'b' coefficients are radically different now: research quality is still the largest variable except that it is now negative! Is this rational? Have we really improved the model? Was our previous analysis flawed? Moreover, all but one of the 'b' coefficients are negative now. That needs thinking about, further analysis required.

What Other Variables Could we Include?

 

Is there anything else we should say here before we finish? Yes, we should ask whether there are any other variables that should be included in deciding whether a university should be ranked 1 or 2 or 114. One that immediately springs to mind is teaching: quality of, quantity of; how about the size of the university, that is, the number of students; how about the size of the library/library budget?

Conclusions

There you are, a taster of what we can do with Excel 2007's Data Analysis Add In: we have not finished this analysis by any means. The purpose of this page is to let you see that for quick and easy analyses, Excel 2007 can have a lot to offer. Even with much larger databases than we are using here, Excel 2007 can cope and provide us with a good initial, if not final, analysis.

© Duncan Williamson

20th August 2009

 

Download this page in PDF Format

Download the Excel 2007 file on which this page is based