Grouping schools to tackle disadvantage
8th November 2024 by Timo Hannay
Update 8th November 2024: See also this summary from The Gatsby Foundation and this further coverage from Schools Week.
This is the second of two posts about disadvantage in England's schools. If you haven't done so already, we suggest reading Part One first.
Dimensions and degrees of disadvantage
Our previous post outlined a wide variety of often underappreciated headwinds faced by schools serving more disadvantaged communities, including difficulties in recruiting, greater use of supply teachers, lower levels of staff experience, higher spend on training, higher levels of sickness leave and (perhaps unsurprisingly given all the above) different areas of focus in Ofsted reports. Thus disadvantage manifests itself in many different ways, not just in rates of eligibility for free school meals and academic outcomes.
As we explored in some detail last year, social deprivation is not fully captured by income indicators alone: aspects like health, crime and the environment also have their effects. Furthermore, the main income metrics used for schools – pupil eligibility for free school meals (FSM) and the Pupil Premium (PP) – use a binary threshold to divide pupils into two somewhat arbitrary groups deemed 'disadvantaged' and 'not disadvantaged', and therefore fail to reflect the fact that deprivation exists on a continuum.
The upshot is that even schools with identical PP measures often experience very different local conditions with respect to levels of income deprivation, as well as with other aspects of deprivation such as crime, education, the environment, health and housing. In important ways, each school is different when it comes to deprivation, a fact that tends to get lost in the FSM and PP statistics.
One solution might be to prioritise by place. Education Investment Areas (EIAs) were proposed by Britain's previous Conservative government in 2023 as a way to target educational support to parts of England that had fallen behind. But, as we have previously argued, these also do a bad job of grouping together truly similar schools – except in the trivial sense that they happen to be located in the same local authority areas.
Are there better ways of grouping schools that retain important nuances without having to treat each one as its own special case? We believe there are. The work presented here provides a preliminary segmentation of schools in England based on Index of Multiple Deprivation (IMD) characteristics and POLAR4 (higher-education participation rates) of their local areas. For now, we'll look only at mainstream state secondary schools, but similar approaches could be applied to primary schools, as well as to other phases and forms of education.
The tl;dr version is that we end up with six clusters, roughly characterised as follows:
- Cluster 1, Suburban: This represents 'middle England' outside the major cities. Socioeconomic and educational indicators are mostly unexceptional.
- Cluster 2, Affluent Suburban: Richer suburban and rural neighbourhoods. The incidence of income deprivation is very low, but educational outcomes are not as good as you might expect.
- Cluster 3, Affluent Urban: Richer city areas, especially in London. Much greater levels of income deprivation than Cluster 2, but also higher levels of educational engagement and better outcomes.
- Cluster 4, Poor Urban: Especially in the North and the Midlands, but also to the east of London and elsewhere. Lots of adverse socioeconomic indicators, coupled with relatively weak educational outcomes – though not as bad as you might think given the levels of poverty.
- Cluster 5, Poor Suburban: Again, mainly in the North and Midlands. IMD indicators are mixed, but income deprivation is high and educational outcomes are poor. These are the areas that have fallen furthest behind.
- Cluster 6, Urban: Middling city areas in London, Birmingham and Manchester, among other places. Moderately high levels of income deprivation, but relatively good educational outcomes.
Each of these has distinct characteristics, not only in terms of the socioeconomic factors by which they are defined, but also in terms of school characteristics and educational outcomes. The journey to creating these clusters is as informative as the destination, so we hope you'll join us and read on.
Our sincere thanks once again to the Gatsby Foundation for collaborating in, and generously supporting, this work.
Counting clusters
Our main approach here will be to apply a clustering algorithm to the IMD and POLAR4 metrics of each school's local area (defined as postcodes within a 4km radius). Specifically, we will use k-means, which is a relatively simple unsupervised machine-learning method that can divide entities (in this case schools) into arbitrary numbers of groups containing statistically similar members. In this case, it will cluster together schools in local areas with similar IMD and POLAR4 characteristics.
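To make the idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) using only the Python standard library. It is not the code behind this analysis: the `schools` points are synthetic, and the names `sq_dist` and `kmeans` are our own for illustration. In practice a library implementation would normally be used.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster where it was
        if updated == centroids:  # nothing moved, so the algorithm has converged
            break
        centroids = updated
    return labels, centroids

# Synthetic stand-in for schools described by two area-level scores.
data_rng = random.Random(1)
schools = [(data_rng.gauss(-3, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
schools += [(data_rng.gauss(3, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]

labels, centroids = kmeans(schools, k=2)
```

Each school ends up labelled with the index of its nearest cluster centre, which is all that "cluster membership" means in the rest of this post.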
IMD is composed of several different measures (Crime, Education, Employment, Environment, Health, Housing and Income[1]), to which we have added POLAR4. Some of these correlate strongly with each other, so it makes sense to examine whether these eight parameters can be reduced in order to eliminate those that provide little or no extra information. This process, known as principal component analysis (PCA), is a common prelude to conducting any kind of clustering analysis.
By applying PCA, we find that about 58% of the statistical variation between schools can be accounted for by a single component (Component 1) and a further 20% by a second component (Component 2). Adding a third component accounts for another 8% of the variation, but was ultimately found to have only minor effects on the clustering results, so for the sake of simplicity the rest of this analysis uses clustering based only on Components 1 and 2 – though we will briefly revisit below the implications of adding a third component.
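For readers curious about where such percentages come from, the sketch below computes the share of variance explained by each leading component. It is illustrative only: the five indicators are synthetic (three driven by a hypothetical 'poverty' factor, two by a hypothetical 'urban' factor), the function names are our own, and power iteration with deflation stands in for the eigen-decomposition that a statistics library would normally perform.

```python
import math
import random

def standardise(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(rows):
    """Correlation matrix of the columns (covariance of standardised data)."""
    z = standardise(rows)
    n, d = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]

def leading_eigenpair(matrix, n_iter=500):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[i] * matrix[i][j] * v[j] for i in range(d) for j in range(d))
    return value, v

def explained_variance_ratios(rows, n_components):
    """Share of total variance captured by each of the first n components."""
    matrix = correlation_matrix(rows)
    d = len(matrix)
    total = sum(matrix[i][i] for i in range(d))
    ratios = []
    for _ in range(n_components):
        value, v = leading_eigenpair(matrix)
        ratios.append(value / total)
        # Deflation: remove the component just found before looking for the next.
        matrix = [[matrix[i][j] - value * v[i] * v[j] for j in range(d)] for i in range(d)]
    return ratios

# Synthetic indicators driven by two made-up latent factors plus noise.
rng = random.Random(0)
rows = []
for _ in range(300):
    poverty, urban = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([poverty + 0.3 * rng.gauss(0, 1) for _ in range(3)]
                + [urban + 0.3 * rng.gauss(0, 1) for _ in range(2)])

ratios = explained_variance_ratios(rows, n_components=2)
```

Because the synthetic indicators were built from two underlying factors, the first two ratios account for most of the variance, which is the same pattern that justifies keeping only two components here.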
What real-world characteristics do these components represent? Formally, they are just statistical measures, but it is fairly clear when looking at the clustering results that they roughly correspond to poverty and urbanisation, respectively. This shouldn't be too surprising: life is very different between rich and poor communities, and also for residents of urban versus suburban or rural locations. (The real-world correlate of Component 3 is less clear, but inspection of the clusters suggests that, like Component 2, it may be related to population density.)
Having chosen to use two statistical components, we also need to decide how many clusters or segments to apply to the population of schools. Figure 1 is a so-called 'elbow plot' that shows how well the clusters represent their constituent schools as we increase their number. This uses a statistical measure known as 'inertia' (only loosely related to the physical sense of that word), for which lower values indicate better representation. Sometimes the inertia reduces rapidly up to a certain number of clusters, with little improvement above this. In those cases, it is common to choose the number at which the gradient of the line flattens, since that provides the best fit to the data with the lowest number of clusters. However, in this case there is no very clear inflection point, or 'elbow', in the line, so we could plausibly choose almost any number of clusters within the range shown. In the analysis that follows we will therefore explore the effects of varying the number of clusters between 2 and 6.
Figure 1: K-means inertia measure against number of clusters
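Inertia is simply the sum of squared distances from each school to its nearest cluster centre. The sketch below shows how the values behind an elbow plot could be produced. It uses synthetic data and a compact standard-library k-means with a few random restarts; the names `kmeans_inertia` and `inertias` and the choice of five restarts are assumptions for illustration, not the settings used for Figure 1.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_inertia(points, k, n_init=5, n_iter=100, seed=0):
    """Lowest inertia found over n_init random starts of Lloyd's algorithm."""
    rng = random.Random(seed)
    best = float("inf")
    for _start in range(n_init):
        centroids = rng.sample(points, k)
        for _step in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                clusters[nearest].append(p)
            updated = [
                tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[c]
                for c, members in enumerate(clusters)
            ]
            if updated == centroids:
                break
            centroids = updated
        # Inertia: squared distance from every point to its nearest centroid.
        inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, inertia)
    return best

# Synthetic data: three loose groups of 40 points each.
data_rng = random.Random(2)
schools = []
for cx, cy in [(-4, 0), (0, 4), (4, 0)]:
    schools += [(data_rng.gauss(cx, 0.7), data_rng.gauss(cy, 0.7)) for _ in range(40)]

# One inertia value per candidate number of clusters: the elbow-plot data.
inertias = {k: kmeans_inertia(schools, k) for k in range(1, 7)}
```

Plotting `inertias` against `k` gives the elbow plot. With cleanly separated synthetic groups the bend is obvious; with real school data, as noted above, it is not.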
Two by two
To give a sense of the effects of clustering, let's first see what two clusters look like. Figure 2 shows how these divide up based on the two principal statistical components, with each dot representing a school. They are mostly separated by Principal Component 1 (horizontal axis) rather than Principal Component 2 (vertical axis). The two clusters are not completely distinct, and there is some intermingling. In that sense, grouping schools in this way is a bit more like slicing pizza (you could choose to cut in a range of different places) than breaking off dough balls (which are naturally lumpy). This doesn't make the clusters arbitrary – far from it – but it does mean that we shouldn't read too much significance into which side of the boundary individual schools fall on.
Figure 2: Two school clusters shown by their two principal statistical components
Figure 3 shows the locations of schools in Cluster 1. In general, these are in poorer areas, including cities in the North and Midlands, but excluding most schools in London. On average, these localities have worse IMD measures, especially higher crime rates, but also lower levels of education, employment, health and income. They also have very low POLAR4 scores (32% versus 52% for Cluster 2). Conversely, the housing indicator is good, probably at least in part because homes are relatively cheap in these areas. Within schools, Pupil Premium rates are high (an average of 30% versus 23%) and almost all educational indicators are below average (eg, Attainment 8 is 44 versus 51 and Progress 8 is -0.16 versus +0.20), with exclusion rates particularly high (fixed-term exclusions 26% versus 12%). Cluster 2 – which of course is composed of all those schools not in Cluster 1 – has precisely complementary characteristics. Note that in both cases the schools are scattered all over the country, albeit unevenly.
Figure 3: Locations of schools in Clusters 1 and 2
Fab four?
But using two clusters achieves little more than recapitulating the traditional disadvantaged / not disadvantaged distinction by other means, so how about dialling it up to four? As we shall see, this starts to separate out different geographical groups too, especially across the urban / suburban divide. Figure 4 shows the resulting cluster memberships plotted against the two principal components.
Figure 4: Four school clusters shown by their two principal statistical components
Figure 5 shows the locations of schools in these four clusters.
Cluster 1 schools are overwhelmingly in or around London and tend to have good scores across most IMD measures, though environment is moderately poor and housing deprivation is high (in large part because homes are expensive). POLAR4 scores are very high (average of 52%). Academic attainment and progress rates are high (Attainment 8 is 52, Progress 8 is +0.29) and exclusions are low (fixed-term exclusions are 11%), even though Pupil Premium rates are also moderately high (28%). In other words, poverty and wealth tend to coexist in these areas, and educational engagement and outcomes are generally good, especially given the high incidence of income deprivation.
Cluster 2 schools tend to be in cities in the Midlands and the North, or in other disadvantaged areas around the coast. Average IMD scores are uniformly bad and POLAR4 scores are low (33%). Pupil Premium rates are high (35%), as are exclusion rates (26%). Academic attainment (44) and progress (-0.14) are both low.
Figure 5: Locations of schools in Clusters 1 to 4
Schools in Cluster 3 are located in towns and around major conurbations such as London, Birmingham and Manchester. They have very low Pupil Premium rates (19%), but only moderately high POLAR4 rates (43%) and only moderately good academic outcomes (Attainment 8 is 49, Progress 8 is +0.08). IMD metrics are generally good with the exception of housing (again, because it is expensive).
Cluster 4 schools are also mostly outside cities, but grouped in particular locations in the North, the Midlands and to the east of London. They have moderate Pupil Premium levels (29%) and IMD scores are mostly mediocre rather than low, though environment and housing are good. However, POLAR4 is very low (29%) and school exclusions are very high (also 29%), while academic attainment (44) and progress (-0.21) are both very low.
What we are starting to see here is that the relationships between poverty and education are more complicated than is often appreciated. True, richer areas tend to do better. But some urban areas – notably London – do well despite having high levels of income deprivation (see Clusters 1 and 2), while suburban and rural areas tend to underperform by comparison, especially after allowing for their relative poverty levels (see Clusters 3 and 4).
V6
Finally, let's crank the cluster number up to six. Among other things, this provides a bit more nuance on the poverty-affluence axis. Figure 6 shows the resulting cluster memberships plotted against the two principal components.
Figure 6: Six school clusters shown by their two principal statistical components
Just for fun, Figure 7 shows what six clusters look like when generated using three principal components. The extra dimension aside, the resulting clusters are actually very similar to the corresponding clusters created using just two principal components: overlaps in school membership are all over 85% and most of them (four out of six clusters) are well over 90%. For this reason, we will continue to use the clusters created using two principal components.
Figure 7: Six school clusters shown by their three principal statistical components
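One plausible way to measure the membership overlap between two sets of clusters is sketched below. The exact definition used for the percentages quoted above is not spelled out in this post, so treat this as an assumption: for each cluster in the first set, we find the cluster in the second set that shares the most schools and report the share of the first cluster's schools that it contains. The label lists are made up.

```python
from collections import Counter

def cluster_overlaps(labels_a, labels_b):
    """For each cluster in labels_a, return (best-matching cluster in labels_b,
    share of that cluster's schools found in the best match)."""
    pair_counts = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    overlaps = {}
    for a, size in sizes_a.items():
        best_b, shared = max(
            ((b, n) for (a2, b), n in pair_counts.items() if a2 == a),
            key=lambda item: item[1],
        )
        overlaps[a] = (best_b, shared / size)
    return overlaps

# Hypothetical cluster labels for the same ten schools under two clusterings.
# Cluster numbers are arbitrary, so cluster 1 in one run may be cluster 2 in the other.
labels_2pc = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
labels_3pc = [2, 2, 2, 3, 1, 1, 1, 3, 3, 3]

overlaps = cluster_overlaps(labels_2pc, labels_3pc)
```

Matching clusters by shared membership rather than by label matters because k-means numbers its clusters arbitrarily from one run to the next.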
Figure 8 shows the locations of schools in these six clusters. At the risk of oversimplifying, we will assign each one a descriptive name in order to distinguish them more easily. It is important to emphasise that these names are not definitions, just labels. The clusters are defined only by the IMD and POLAR4 characteristics of the local areas around each of their constituent schools.
We will call Cluster 1 "Suburban". (It includes rural areas too, but these account for only around 7% of the total school population compared to over 40% for the suburbs, so we will use "suburban" in all of our shorthand descriptions.) Broadly speaking, these represent middle England outside cities. IMD indicators and POLAR4 (35%) are mostly unexceptional. The same goes for school measures: the average Pupil Premium rate is 24%, average Attainment 8 is 46 and average Progress 8 is -0.09. The mean fixed-term exclusion rate is 21% and the absence rate is 9% of sessions.
Cluster 2, "Affluent Suburban", represents richer suburban and rural neighbourhoods. IMD measures are uniformly good, with the exception of housing, which is expensive. POLAR4 is quite high (50%). Pupil Premium rates are very low (17%), while academic attainment (51) and progress (+0.14) are reasonably good. Rates of exclusion (13%) and absence (8%) are low.
Cluster 3, "Affluent Urban", represents richer areas in cities, especially London. These schools tend to do better than those in Cluster 2 despite having higher levels of income deprivation. With the exception of environment and housing, IMD measures are good, and POLAR4 is exceptionally high (71%). Academic attainment (54) and progress (+0.38) are both very good even though Pupil Premium rates are quite high (27%). Rates of exclusion (11%) and absence (8%) are both low.
Figure 8: Locations of schools in Clusters 1 to 6
Locations in Cluster 4, "Poor Urban", have uniformly poor IMD metrics, with the exception of housing, which is cheap. POLAR4 rates are rather low (33%). Pupil Premium rates are high (35%), while academic attainment (44) and progress (-0.15) are both moderately low. Absence (10%) and exclusion (24%) rates are quite high.
Cluster 5, "Poor Suburban", has good environment and housing scores, but poor values for other IMD metrics and very low POLAR4 scores (27%). Pupil Premium rates are high (34%). The same goes for rates of absence (10%) and exclusion (35%), while academic attainment (42) and progress (-0.26) are both low. Taking these two clusters together, we once again see urban areas outperforming suburban and rural ones with similar levels of income deprivation.
Cluster 6, "Urban", has moderate IMD scores, with the exception of housing, which is poor (ie, expensive). POLAR4 (48%) is somewhat high. Pupil Premium rates (32%) are also quite high, but rates of absence (8%) and exclusion (13%) are both low, while academic attainment (49) and progress (+0.21) are both quite good. This is another example of an urban group that appears to outperform its socioeconomic fundamentals in terms of educational outcomes.
What k-means might mean
It is important to emphasise that the educational metrics mentioned here – attainment, progress, exclusions and absences – were not used to create the clusters, yet they display clear disparities across them. This is a reminder that educational effectiveness and outcomes do indeed correlate with social factors. Furthermore, this way of grouping schools illustrates the importance not just of income but also of place – in terms of the level of urbanisation rather than the name of the region or local authority area. Indeed, there appears to be an interaction between the two, with some urban areas performing exceptionally well despite high levels of income deprivation. From this analysis alone we cannot say exactly why, but perhaps we should be considering the educational impact of the cultural and social capital often associated with more densely populated places, not just the single dimension of affluence and poverty that currently gets all the attention.
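The two-step logic (cluster on area characteristics first, then compare educational metrics that played no part in the clustering) can be sketched as a simple per-cluster average. The school records and field names below are hypothetical and are not drawn from the data behind this post.

```python
from collections import defaultdict
from statistics import mean

def summarise_by_cluster(schools, metric):
    """Average a held-out metric within each cluster."""
    groups = defaultdict(list)
    for school in schools:
        groups[school["cluster"]].append(school[metric])
    return {cluster: mean(values) for cluster, values in sorted(groups.items())}

# Made-up records: the cluster label comes from area data only;
# attainment8 was never seen by the clustering step.
schools = [
    {"cluster": 3, "attainment8": 54.0},
    {"cluster": 3, "attainment8": 56.0},
    {"cluster": 5, "attainment8": 41.0},
    {"cluster": 5, "attainment8": 43.0},
]

attainment_by_cluster = summarise_by_cluster(schools, "attainment8")
```

Any gap between clusters in such a summary reflects an association with local conditions rather than something built in by the clustering itself.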
Either way, as we have hopefully shown, it makes little sense to segment schools based solely on their FSM or PP rates. As well as being crude threshold measures, they hide important information about different forms of deprivation. Neither is it sensible to simply rely on the part of the country, rather than the type of place, in which the school is located. The approach presented here in effect combines income, geography and other factors. In an age of plentiful data, this more nuanced view is what we should mean by 'similar' schools, at least when it comes to the local socioeconomic environments they experience.
The work presented here is just a start and we hope to continue developing and applying it. In particular, it would be interesting to see how robust the clustering is to changes in the underlying data and algorithmic parameters, as well as to repeated rounds of clustering (given that k-means starts from randomly chosen initial cluster centres, so any single run might settle on a local optimum rather than a global one). We would also like to further explore the educational and other characteristics of these and similar clusters, as well as their possible policy implications. Of course, we can also increase the number of clusters further to cater for situations in which we want to discriminate more at the cost of additional complexity.
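As one possible way to quantify robustness across repeated runs, the sketch below computes the Rand index: the fraction of school pairs that two clusterings treat the same way (together in both, or apart in both). This is a suggestion rather than something done in this analysis, and the label lists are invented.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree."""
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b  # True counts as 1
        total += 1
    return agree / total

# Hypothetical labels for six schools from three separate k-means runs.
run1 = [0, 0, 1, 1, 2, 2]
run2 = [1, 1, 0, 0, 2, 2]  # same grouping as run1, different label numbers
run3 = [0, 0, 1, 1, 2, 0]  # the last school has moved cluster

identical = rand_index(run1, run2)
shifted = rand_index(run1, run3)
```

A value of 1 means two runs produced the same grouping regardless of how the clusters happen to be numbered. Lower values indicate that some schools changed company between runs.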
In the meantime, subscribers to SchoolDash Insights can explore these trends further, especially in the Schools section. Non-subscribers can request a trial account or demo. To keep up to date with more analyses like this one, sign up for our free monthly-ish newsletter. We also welcome questions or comments; please write to: [email protected].
Footnotes: