This text is based on what I wrote for my Manchester United 2023 Summer data scouting project. It offers an assessment of David De Gea, Man Utd’s main goalkeeper for the past decade (he left this summer), while discussing the difficulties of (standard) data analysis when it comes to goalkeepers’ performance. It is half player analysis, half critical methodological review. (There is a similar problem with defenders, especially centre-backs, as Ben Griffis explains here.)
For me it would be wrong to dismiss and disrespect what David (De Gea) has done for this club, which he clearly loves deeply, because of the completely wrongheaded decisions the club made about him over the past five years, i.e. keeping him year after year as our main goalkeeper and paying him outrageous amounts of money (even by the ridiculously inflated standards of modern football economics) despite all footballing sense suggesting the need to move on. Dave has been one of this club’s greats despite overstaying his time here by 5-6 years, and he was undoubtedly one of the best goalkeepers in the world in the 2010s, leaving aside tactical considerations that more than anything else were undermined by our club’s completely dysfunctional planning and decision-making in the Woodward era. Let’s leave the 2011-2018 period aside, because it’s after his extraordinary 2017/18 season, where he single-handedly launched us to 2nd place (according to xG we would have been fifth), that it went downhill. The point is not to pretend we should have predicted how things would go after that epic season, but that a club with a coherent footballing project and strategy wouldn’t have made this mistake to begin with. So let’s briefly talk about De Gea’s time at Utd since 2017.
DDG has a very peculiar goalkeeping style and technique, which produces some very impressive reflex saves but simply doesn’t match the skillset required of a top 3 PL team’s GK since perhaps the mid 2010s (let’s presume that he was able to “get away with it” until then), i.e. that of a so-called “modern goalkeeper”. DDG’s profile is that of an epic shot-stopper springing off his line at the last second, with uncanny reflexes and heroic saves. Even when his overall shot-stopping is unconvincing, he’s able to produce these remarkable saves. But the problem is that good shot-stopping is just as much about saving what you’re expected to save, i.e. “easier shots”, as about pulling out extraordinary highlights-reel-worthy saves. So at times when the Utd defence has let tons of attacks slip through the back four, DDG has often looked good by stopping shots that would have been expected to go in. But since 2018 he’s also let many goals in because of moments of poor concentration, positioning, and so on. The occasional GK mistake is always gonna happen, even at the very top level; e.g. Neuer has had his fair share of poor decisions or slips in concentration over the years, but they came far apart, in between long periods of utterly dominant and focused performance. The norm since 2017/18, by contrast, has been for DDG to be quite average in terms of standard shot-stopping/saving stats, despite making flashy saves regularly (but as we’ll see below, this isn’t necessarily a sign of good goalkeeping).
We will talk about more precise data analytics approaches further below, but as you can see, since the peak of 2017/18 (98th perc.) his PSxG-GA (post-shot expected goals minus goals allowed) has only jumped above the 60s once, and that was last year, during our worst PL season on record. And that’s without looking at his percentile rankings for distribution and proactive shot-preventing actions (graph below), which are pretty much an insult to our club. Simply put, “de Gea’s shot stopping peaks have coincided with #MUFC’s worst defensive forms (17/18 & 21/22) & he’s been average even at stopping at other times. His claiming, distribution & sweeping are consistently rock bottom” (Varun Vasudevan).
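For readers who haven’t worked with these numbers, here is a minimal sketch of how a PSxG-GA figure and its percentile rank can be computed. The function names and all the numbers are my own illustrative assumptions, not real data from any provider, and providers may define percentiles slightly differently (here: the share of keepers strictly below the value).

```python
from bisect import bisect_left


def psxg_minus_ga(psxg_faced, goals_allowed):
    """Post-shot expected goals faced minus goals allowed.

    Positive means the keeper conceded fewer goals than the shots on
    target "should" have produced; negative means more.
    """
    return psxg_faced - goals_allowed


def percentile_rank(value, population):
    """Percentage of the population strictly below `value` (0-100)."""
    ordered = sorted(population)
    below = bisect_left(ordered, value)  # count of entries < value
    return 100.0 * below / len(ordered)


# Hypothetical season totals, for illustration only.
keeper_psxg_ga = psxg_minus_ga(psxg_faced=40.0, goals_allowed=35)

# Hypothetical PSxG-GA values for a small pool of keepers, including ours.
league_pool = [-4.0, -1.5, 0.0, 2.0, keeper_psxg_ga]

print(keeper_psxg_ga)
print(percentile_rank(keeper_psxg_ga, league_pool))
```

Note that a percentile is only as meaningful as the pool it is computed against (league, minutes threshold, season), which is one reason the same keeper can look different on different sites.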
Despite winning the Golden Glove, which is a misleading award that credits a single player (the GK) for the defensive solidity of the whole team (to be clear, a GK might play a big role but he’s never gonna be the only reason for a team getting the most clean sheets in a given season), this 2022/23 season exemplified the massive issues with DDG. His poor distribution not only undermines our team’s tactical progress in terms of implementing the type of build-up and progression that ETH wants, it also costs us many goals that result directly from misplaced, poorly selected or poorly executed passes from the GK. He was also the worst PL goalkeeper when it comes to claiming crosses, as shown in this map from Statsbomb:
The defensive AND in-possession flaws of DDG, which in 2022/23 included underperforming at shot-stopping (Post-Shot Expected Goals), are reflected quite strikingly in the following radars and stats mapping percentile ranks:
However, it’s important to say here that conventional approaches for judging goalkeeping performance, including standard metrics provided by Opta and Wyscout such as post-shot xG or save%, have some serious flaws and limitations, and fail to really examine the nuances and complexity of what a GK has to do to prevent, approach and stop opponents’ shots. Although, as shown above, DDG doesn’t rank particularly well in terms of post-shot expected goals, that metric isn’t even a good or reliable way to examine GK performance, because as Smarterscout explain in their FAQ, “they assume the goalkeeper is not part of the duel with the striker until the shot is struck”. The issue with this is that, obviously, an efficient high-level GK has a lot more to do than just wait for a shot and save it: from preventing the shot from happening in the first place (through sweeping, cross-claiming and other defensive actions, but also by not giving the ball away cheaply during build-up) to getting prepared for the shot via smart positioning. Talking about DDG’s 2021/2022 season, Smarterscout therefore questioned the claims that he was “back to his best” on the basis of PSxG alone:
In a nutshell, we think that post-shot xG makes a GK look good when he or she takes the wrong approach to a shot and then miraculously saves it. Do this enough times, and post-shot xG will make you look like a phenomenon.
And again after the FA Cup final loss against City, they stated that “he makes saves more difficult before he makes saves”. Overall, their model shows a decline in general shot-stopping (taking into account pre-shot approaching/preventing) since the very high peak of 2017/2018:
Currently, John Harrison’s “Goalkeeper xG model” is probably the best attempt at constructing a coherent and meaningful data analytics approach for assessing goalkeeping performance: it “looks at every shot faced, cross faced, through ball faced & distribution attempted and calculates the probability of a goal occurring for & against a GK’s team before the event & after the event occurs” and gives each goalkeeper an “Overall Value in Goals” score, which makes it possible to compare GKs of different styles (and from teams w/ different tactics that impact what the GK is asked to do). What’s a bit confusing here is that their first ranking of PL GKs, done at the half-way point of the season, showed DDG’s level to be frankly mediocre (for a team like Man Utd, which aims to be at the top):
But before the FA Cup final, while assessing his overall season, their model ranked him (slightly) above average, with an ‘overall goalkeeping value’ (the third column in the table above) of +1.90. If I understood their explanation correctly, what changed was that in a few games during the second half of the season (home fixtures against Fulham, Leicester City and West Ham United), he made some low-probability saves (i.e. saves of shots that are more likely to be scored than saved) that basically moved him from being worth -2.37 to +1.90 goals to his team in terms of OVG. It’s likely not only these standout saves that explain this statistical change, because their model takes into account a wide range of variables, but they certainly contributed to it. In any case, I think the most generous thing that can be said about (post-2018) DDG in terms of what data/statistical analysis shows is that he does have a certain ability to occasionally make epic last-ditch saves that wouldn’t be expected from most goalkeepers, but that generally speaking his shot-preventing, his defensive actions and proactive cross-claiming and sweeping, and his technical/distribution ability are nowhere near good enough for a title- and trophy-challenging club like Man Utd (or like what it’s supposed to be).
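To make the “before the event & after the event” idea more concrete, here is a toy sketch of how I read that description. It is emphatically not John Harrison’s actual model: the `Event` fields, the way value is credited and every probability below are my own assumptions, chosen only to illustrate the mechanics.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """One goalkeeper involvement, with goal probabilities before/after it.

    All fields are hypothetical; a real model would estimate them from data.
    """
    label: str
    p_against_before: float  # chance of conceding before the GK acts
    p_against_after: float   # chance of conceding after the GK acts
    p_for_before: float = 0.0  # chance of own team scoring, before
    p_for_after: float = 0.0   # chance of own team scoring, after


def event_value(event):
    """Goals of value added: less likely to concede, more likely to score."""
    conceded_less = event.p_against_before - event.p_against_after
    scored_more = event.p_for_after - event.p_for_before
    return conceded_less + scored_more


def overall_value(events):
    """Sum of per-event values, in goals."""
    return sum(event_value(e) for e in events)


# Made-up events for illustration. A conceded goal would have
# p_against_after = 1.0; a save or a clean claim takes it to 0.0.
events = [
    Event("routine save", 0.10, 0.0),
    Event("low-probability save", 0.70, 0.0),
    Event("misplaced pass in build-up", 0.02, 0.15),
    Event("claimed cross", 0.08, 0.0),
]

for e in events:
    print(f"{e.label}: {event_value(e):+.2f}")
print(f"total: {overall_value(events):+.2f}")
```

Under this simplified accounting, a single save of a shot with a 0.70 chance of going in is worth +0.70 goals, so the swing from -2.37 to +1.90 (4.27 goals) would correspond to roughly six such saves. That is only an order-of-magnitude illustration, but it shows how a handful of spectacular saves can move a season total a long way.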
As we saw, most existing stats and data approaches for judging goalkeeping performance are either flawed or very tricky to use. Let me give you a few more examples. It’s beyond obvious that DDG is very poor at distribution, and it’s equally clear that Diogo Costa is one of the very best in the world at this. Unai Simon, David Raya and Alisson are also all clearly more reliable and skilled than DDG, yet if you use @DataMB_’s Plot comparison tool with simple metrics like passes per 90 x %accurate passes, you wouldn’t see a massive difference between them. It’s particularly striking that according to this data (from Wyscout) Costa and DDG have a very similar percentage of accurate passes (around 86%); and the fact that Raya has a lower % than DDG would be shocking if you took stats at face value (i.e. as a definitive indication of who’s better, without any context or further analysis).
Likewise, although Ederson certainly didn’t have his best season at Man City in 2022/23, the Radar comparison between him and DDG might lead one to think there isn’t much of a difference in their performance and abilities. Ederson even ranks quite a bit below DDG for prevented goals, according to this Wyscout data!
Wyscout data, according to @DataMB_’s website, also puts Dean Henderson in the 95th percentile for short and medium passes completion rate (%), higher than Ederson, Alisson, Costa and Onana. You hopefully get my point: it’s not enough to just pick a couple of seemingly useful metrics and jump to conclusions, as if for instance DDG’s distribution ability were anywhere near Costa’s, or Henderson were some kind of short pass genius. For passing, ball control and technique, this trap is pretty easily overcome if you start analysing clips and matches. And of course you have to put any piece of data back into the context of the player’s team’s tactics, overall league and team performance levels, and so on. So for instance, Raya has a somewhat lower passing accuracy than some of the top GKs across Europe; but that’s largely due to Thomas Frank’s peculiar tactics at Brentford, which since they were promoted to the PL have relied on a lot of long balls from the GK. Long passes from GKs are obviously harder to complete consistently, and this approach also minimises the number of shorter and medium-range passes that a GK with Raya’s ball-playing ability is very comfortable making.
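The Raya point can be shown with a few lines of arithmetic. In the sketch below, every pass count is invented for illustration (these are not Wyscout numbers and not the real figures of any keeper); the only thing it demonstrates is that a keeper can be more accurate than another at both short/medium and long passes and still end up with a lower overall pass completion %, simply because his team asks him to go long far more often.

```python
def accuracy(completed, attempted):
    """Completion rate as a fraction between 0 and 1."""
    return completed / attempted


def overall_accuracy(buckets):
    """Overall completion rate across all pass-length buckets.

    `buckets` maps a bucket name to a (completed, attempted) pair.
    """
    completed = sum(c for c, _ in buckets.values())
    attempted = sum(a for _, a in buckets.values())
    return completed / attempted


# Hypothetical keeper in a long-ball system: mostly long passes.
keeper_long_ball = {"short_medium": (98, 100), "long": (165, 300)}

# Hypothetical keeper in a short-passing system: mostly short passes.
keeper_short_game = {"short_medium": (329, 350), "long": (25, 50)}

for name, buckets in [("long-ball system", keeper_long_ball),
                      ("short-passing system", keeper_short_game)]:
    per_bucket = {k: round(accuracy(c, a), 3) for k, (c, a) in buckets.items()}
    print(name, per_bucket, "overall:", round(overall_accuracy(buckets), 3))
```

The first keeper is better in each bucket taken separately, yet his overall figure is much lower. That is why splitting accuracy by pass length (and then watching the clips) tells you far more than a single headline percentage.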