Second Serves: The Need for Speed

As players progress in their tennis careers, arguably the hardest transition they go through is the jump from high-level junior tennis to the professional ranks. The world of fancy hotels, top-notch facilities, and a feeling of proximity to the top of our sport is replaced by the reality of Futures – the minor leagues of professional tennis. You need to climb the rankings ladder again by competing at remote tennis clubs, where time on the practice court is often wishful thinking, and for the first time you’re playing against opponents who are literally fighting for their livelihood.

Besides needing to adjust to this new reality mentally and emotionally, I often get asked what the main differences are in the actual game, on-court, between high-level junior and high-level professional players. One area is the speed of the game, and I’d like to illustrate this point using second serve speed data from the men’s and boys’ draws from this year’s Wimbledon.

Below is a table that has the average second serve speeds from the boys’ draw on the left, and the men’s draw on the right. I looked at the round of 16 onwards in both cases, for 30 total data points each.

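Juniors: Avg 2nd Serve Speed (mph) | Round | Men: Avg 2nd Serve Speed (mph) | Round
91 | R16 | 104 | R16
103 | R16 | 106 | R16
92 | R16 | 107 | R16
100 | R16 | 104 | R16
98 | R16 | 97 | R16
103 | R16 | 97 | R16
96 | R16 | 100 | R16
91 | R16 | 99 | R16
97 | R16 | 103 | R16
100 | R16 | 98 | R16
95 | R16 | 101 | R16
101 | R16 | 98 | R16
91 | R16 | 106 | R16
93 | R16 | 91 | R16
97 | R16 | 97 | R16
98 | R16 | 104 | R16
97 | QF | 104 | QF
96 | QF | 110 | QF
104 | QF | 100 | QF
93 | QF | 101 | QF
103 | QF | 106 | QF
100 | QF | 103 | QF
88 | QF | 96 | QF
88 | QF | 107 | QF
97 | SF | 110 | SF
104 | SF | 105 | SF
99 | SF | 107 | SF
91 | SF | 106 | SF
95 | F | 108 | F
91 | F | 105 | F
96.4 | Mean | 102.67 | Mean
22.52 | Variance | 20.99 | Variance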
Data Courtesy of Wimbledon.com

Without going too deep into statistics, we would use something called an unpaired t-test to determine whether the difference in means between the two groups is statistically significant. In our particular case – we have 30 observations in each category, with the means and variances listed above – we get a p-value of less than 0.0001. In other words, if there were no real difference between the two groups, a gap this large would show up by chance less than once in 10,000 samples, so we do have a statistically significant result. In layman’s terms, we can be fairly certain that the top pros hit their second serves at a higher speed than the top juniors.
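For readers who want to check the numbers, here is a minimal sketch of the test statistic using only Python's standard library. It assumes the equal-variance (pooled) form of the unpaired t-test; the article does not say which form was used, and with 30 observations per group the pooled and unequal-variance versions give the same t value. The function name `pooled_t_statistic` is my own.

```python
import math
from statistics import mean, variance

# Average second serve speeds (mph), copied from the table (R16 through the final).
juniors = [91, 103, 92, 100, 98, 103, 96, 91, 97, 100, 95, 101, 91, 93, 97, 98,
           97, 96, 104, 93, 103, 100, 88, 88, 97, 104, 99, 91, 95, 91]
men = [104, 106, 107, 104, 97, 97, 100, 99, 103, 98, 101, 98, 106, 91, 97, 104,
       104, 110, 100, 101, 106, 103, 96, 107, 110, 105, 107, 106, 108, 105]


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for an unpaired, equal-variance t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled sample variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(b) - mean(a)) / std_err, df


t_stat, df = pooled_t_statistic(juniors, men)
print(f"juniors: mean={mean(juniors):.2f}, variance={variance(juniors):.2f}")
print(f"men:     mean={mean(men):.2f}, variance={variance(men):.2f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

A t value of roughly 5.2 on 58 degrees of freedom sits far beyond the two-sided 5% cutoff of about 2.0, which is why the p-value comes out so small.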

I would argue that there are three components that go into an effective second serve:

  • Speed
  • Placement/depth
  • Variety

All three play a role in keeping the opponent off-balance, and in starting the point on a roughly equal footing. Greater speed gives a greater margin for error; all else equal, I’d rather miss my spot with a 100-mph second serve than with an 85-mph one.

The sample that I used has obvious limitations: I didn’t look at first serves, I didn’t look at the whole draw, and I only considered one tournament, on grass. But I think that it still illustrates, or at least suggests, that the speed of the game is different between the pro game and the juniors. If I were coaching a junior with professional tennis aspirations right now, getting his average second serve speed into the neighbourhood of 100 mph would certainly be one of the metrics on my list.

Does Success at Les Petits As Predict Success in the Pros?

Les Petits As is widely considered one of the most important junior tennis tournaments in the world. Held in Tarbes, France, it is the unofficial world indoor championship in the Under-14 category. The list of singles winners, on the men’s side, includes well-known legends of our sport such as Rafael Nadal, Richard Gasquet, and Juan Carlos Ferrero. But rather than relying on just a couple of names, I wanted to take a wider view and see whether success at Les Petits As can serve as a predictor of success at the professional level.

Before going over the findings, there are a couple of choices I had to make that limit the number of players I considered. First, I decided to only look at the singles draws of Les Petits As, and the career high ATP singles rankings of the athletes in question. Second, I have defined success at Les Petits As as making the quarter-finals or better. This would give me 8 players per year and, looking at 11 historic draws (years 2007-2017), 88 players in total. A good enough sample size, but not too overwhelming to go through in an afternoon. Finally, I defined success on the professional tour as making the ATP Top 100 or better.

In short, is making the quarter-finals of Les Petits As a sure-fire predictor of future professional success? It is not. Out of the 85 unique players – Frances Tiafoe, Borna Coric, and Nikolay Vylegzhanin made the quarter-finals or better twice – 16 have made the ATP Top 100 as of April 14th, 2025. Here is the full list:

Name | Nationality | LPA Result | Year of Result | Career High ATP Rank
Holger Rune | DEN | SF | 2017 | 4
Luca Nardi | ITA | W | 2017 | 67
Hamad Medjedovic | SRB | F | 2017 | 71
Chun Hsin Tseng | TPE | W | 2015 | 83
Miomir Kecmanovic | SRB | F | 2013 | 27
Alex De Minaur | AUS | QF | 2013 | 6
Alexei Popyrin | AUS | QF | 2013 | 23
Frances Tiafoe | USA | W | 2012 | 10
Mikael Ymer | SWE | QF | 2012 | 50
Michael Mmoh | USA | SF | 2012 | 81
Hubert Hurkacz | POL | QF | 2011 | 6
Alexander Zverev | GER | SF | 2011 | 2
Quentin Halys | FRA | W | 2010 | 54
Borna Coric | CRO | SF | 2010 | 12
Liam Broady | GBR | F | 2008 | 93
Jiri Vesely | CZE | QF | 2007 | 35

16 out of 85 is a success rate of less than 20%. There are a few players, especially from the more recent editions of Les Petits As, who might still crack the Top 100; Harold Mayot, for example, who made the semifinals of the 2016 edition of Les Petits As, has a career high of 103. But the overall picture stays the same. If you pick a player at random, and all you know about him is that he made the quarter-finals of Les Petits As, there is a less than 1 in 5 chance that he’ll make it to the Top 100, based on this particular set of data.

In fact, picking a random quarter-finalist from the 11 editions of Les Petits As from 2007-2017, you’d be almost as likely to find someone who didn’t earn an ATP ranking at all as you would a future Top 100 player. 14 out of the 85 players have not achieved an ATP ranking in their career as of yet.
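The two proportions are easy to reproduce. This is just the arithmetic on the counts quoted above; the variable names are my own.

```python
# Counts quoted in the article for the 2007-2017 quarter-finalists.
unique_players = 85
reached_top_100 = 16
never_ranked = 14

top_100_rate = reached_top_100 / unique_players
unranked_rate = never_ranked / unique_players

print(f"Top 100:    {top_100_rate:.1%}")
print(f"No ranking: {unranked_rate:.1%}")
```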

Finally, if we assign a ranking value of 0 to the players who have not earned an ATP ranking, the median career best ranking for the 85 players is #215; roughly speaking, a challenger-level player.

What is there to learn from the above exercise? If a player reaches the quarter-finals, or better, at Les Petits As, he will most likely earn an ATP ranking. But the likelihood of that player cracking the Top 100 is about 1 in 5. It’s a long and winding road to the top, even in tennis. Even the best 14-year-olds are just at the start of their journey.

The Predictable Carlos Alcaraz

One of the things that makes watching Carlos Alcaraz play tennis so exciting is the variety in his game. When his feet are set for a forehand, he is equally adept at blasting a 100 mph missile or using his touch to send a soft drop shot barely over the net. He can slice, come to the net, lob, defend exceptionally well... there are not many holes to exploit in his game. However, during his 2025 Australian Open quarter-final defeat at the hands of Novak Djokovic, he was too predictable on one particular shot. And in a match decided by a few points, that proved too costly.

Here are the serving statistics from their match; Djokovic on the left hand side, Alcaraz on the right.

Courtesy Ausopen.com

Alcaraz had the upper hand when the first serve was put in play, winning 67% of those points compared to 63% for Djokovic. But Alcaraz couldn’t get anything going behind his second serve, winning only 33% of the points, well below the 58% of Djokovic. Djokovic is one of the best returners in the history of our sport, but Alcaraz’ predictable second serve placement helped him out last night.

Here is where Novak Djokovic aimed his second serves against Alcaraz:

Courtesy Ausopen.com

On the deuce side, Djokovic served wide on 48% of his second serves, 24% into the body, and 28% into the Alcaraz backhand. On the ad side, the breakdown was 31%/46%/23%. The lowest overall percentage is 23% – out wide on the ad side. That is still almost one out of every four second serves on the ad side, and it forces the returner to cover the whole box.

By contrast, here is the second serve placement map for Alcaraz:

Courtesy Ausopen.com

In the deuce court, Djokovic could have eliminated the wide third of the box completely when getting ready to receive Alcaraz’ second serve. On the ad side, only 9% of Alcaraz’ second serves went down the T, into the Djokovic forehand.

By completely avoiding the Djokovic forehand return, Alcaraz gives Djokovic an invitation to set up the point however he likes. Djokovic can either step in, and take his backhand return early – made easier by the fact that he can wait in his backhand grip and give up a third of the box. Or, he can back up to get a forehand return anyway – here he can start moving early, while Alcaraz’ ball toss is still in the air, because he knows that the serve will be aimed toward a particular area of the box.

Alcaraz has made some changes to his serve technique before the Australian Open already. He is one of the most electric players on tour, and the scary thought is that, at such a young age, he still has room to improve. Mixing up his second serve placement, making it more unpredictable, would complement the rest of his game, and help even out the second serve battleground in his next encounter with Novak Djokovic.

Not a Fan of Dirt

Clay is certainly not Daniil Medvedev’s favorite surface, and he’ll happily tell you so. Even though he won the Masters 1000 in Rome this year, Medvedev’s preferred surface is hard. He won his only Grand Slam title on the hard courts of the US Open in 2021, and made the finals in New York in 2019 as well as at the 2021 and 2022 Australian Open championships.

After making the finals of the 2023 ATP 1000 in Indian Wells, and taking home the title in Miami, Medvedev lost in the quarter finals of Monte Carlo to Holger Rune, and in the round of 16 in Madrid to Aslan Karatsev. Before going on to win the title in Rome, he stated that he wanted to focus on more spin and better sliding on clay.

Without watching the matches, it is impossible to judge whether Medvedev looked more comfortable sliding. I did, however, want to check whether there was any significant difference in the spin rates of Medvedev’s groundstrokes on the hard courts as opposed to the clay courts.

In order to do that, I looked at Medvedev’s forehand and backhand spin rates in Indian Wells and Miami this year (hard court events), and compared them to the spin rates of his groundstrokes from Monte Carlo, Madrid, and Rome (clay court events). I ended up having 10 data points for the hard courts, and 12 data points for the clay courts. To test for statistical significance, I used the unpaired t-test for both the forehands and backhands, at the 95% confidence level (a p-value below 0.05 counts as significant).

Forehands

Statistic | FH Hard Courts | FH Clay Courts
Mean | 2,361.60 rpm | 2,297.92 rpm
St. Deviation | 98.20 rpm | 161.47 rpm
Sample Size | 10 | 12
p-value | 0.2894
Data courtesy of ATPTour.com

On the forehand side, there was no significant difference between the spin rates on the hard courts and on the clay. Overall, Medvedev’s forehand was the “spinniest” during his R32 match against Ilya Ivashka in Indian Wells, where it averaged 2,511 rpm. On the other hand, it was the “flattest” in his R32 match on the clay courts of Madrid against Alexander Shevchenko, with a 1,992 rpm average.

Backhands

Statistic | BH Hard Courts | BH Clay Courts
Mean | 1,486.10 rpm | 1,387.25 rpm
St. Deviation | 107.15 rpm | 77.21 rpm
Sample Size | 10 | 12
p-value | 0.0207
Data courtesy of ATPTour.com

On the backhand side, there is a statistically significant difference between the two data sets, but it is in the opposite direction from what Medvedev intended. In these particular samples, his backhand ended up being flatter on the clay courts than on the hard courts.
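Both p-values can be recomputed from the summary statistics alone. The sketch below assumes the equal-variance (pooled) form of the unpaired t-test, which the article does not spell out, and it gets the p-value by numerically integrating the t density with Simpson's rule so that nothing beyond the standard library is needed. All function names are my own.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density between 0 and |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_area


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired equal-variance t-test from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_err
    return t, two_sided_p(t, df)


# Hard court vs clay court spin rates (rpm) from the two tables.
fh_t, fh_p = pooled_t_test(2361.60, 98.20, 10, 2297.92, 161.47, 12)
bh_t, bh_p = pooled_t_test(1486.10, 107.15, 10, 1387.25, 77.21, 12)
print(f"forehand: t={fh_t:.2f}, p={fh_p:.4f}")
print(f"backhand: t={bh_t:.2f}, p={bh_p:.4f}")
```

Under that assumption the forehand comparison lands well above the 0.05 cutoff and the backhand comparison below it, in line with the tables.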

Medvedev is not known for using a lot of slice backhands at all, so I doubt that the difference in spin rates on the backhand side would be due to him hitting more slices on one surface or the other. Adding more spin to a stroke would usually involve a change to the grip or the racket path of a stroke (or both). The backhand is arguably one of Medvedev’s biggest weapons, and maybe he wasn’t comfortable changing any of those variables to generate more spin.

It’s hard to say whether Medvedev’s flat backhand is one of the reasons for his relative clay court struggles. He hits one of the flattest backhands on the tour, and on the hard courts his stroke will stay low after the bounce, making an aggressive response from the opponent challenging. On clay courts, however, the backhand will bounce a little higher, closer to the preferred strike zone of the opponent around his waist, making it easier to attack. In the end, regardless of how Medvedev planned on adding more spin to his groundstrokes, his strategy didn’t work this year. With the grass court and hard court swings yet to come, Medvedev can at least look forward to his favorite part of the year.

Beneath The Surface Of Tennis

When trying to forecast the outcomes of our decisions, one of the more widely used mental models differentiates between first-order and second-order consequences. First-order consequences are the more obvious, surface-level ones. They also tend to occur relatively soon after the decision is made. Understanding second-order consequences requires a deeper analysis, looking beyond the obvious. Second-order consequences also tend to manifest themselves only after a certain period of time has elapsed since the decision was made.

For example, let’s say a tennis player decides to skip a warm-up at the beginning of a training session. Through a first-order lens, this might be a positive; the player has an extra 10 or 15 minutes to work on her strokes. But through a second-order lens, the skipped warm-up looks more like a negative; the player increases her chances of injury, is not physically prepared for the first few drills of the session, and might learn to place a lesser importance on her physical preparation overall.

A few examples of first-order thinking in the tennis world that have recently caught my eye are:

“Don’t work on overheads, you never hit them in matches.” This statement is true on the surface. In matches, we hit way more groundstrokes, serves, and returns than we do overheads. Practice time is limited, so why waste it on a stroke that we might hit once a set? If we look a bit deeper though, we realize that a player who doesn’t spend any time on her overheads will not be confident in that stroke. As a result, that player won’t be comfortable coming to the net during matches. Spending time on the overhead, even though it is a relatively infrequently hit shot, is an important aspect of developing attacking tennis players with a well-rounded game.

“Focus only on singles.” When playing tournaments, doubles can often be an afterthought. After all, the majority of practice time is traditionally spent on singles, and very few juniors initially aspire to be outstanding doubles players. When looking a bit deeper though, if doubles is not taken seriously, we miss out on an opportunity to work on serves and returns – the two most important strokes in tennis – in a match situation. We don’t take advantage of thinking through movement, angles, and positioning in a different way than in singles. Finally, we don’t learn how to win and lose as a team, and how to communicate with our partner – all skills that extend beyond the tennis court.

“Only practice with players better than you.” Initially, that seems to make sense. When training with somebody who is better than us – hits the ball harder, is in better shape, can exploit our weaknesses effectively – we can clearly see which areas of our game need improvement. In a group setting, if we are one of the weaker players, it forces us to focus harder and push ourselves more than if we were one of the stronger players. Upon a deeper examination though, playing with weaker players has a lot of positives too. We can adjust our game style to work on things that are outside of our comfort zone. For example, serve and volley once a game. Play only through the middle. Tell yourself that you’re not allowed to hit winners and instead outlast your opponent. Play with only one serve. Start every game at 0-30. There are countless ways to make practice productive when we are the stronger player.

I’m sure there are a lot of other examples of first and second order thinking in tennis. If you think of some good ones, leave them in the comments!

Nick Kyrgios The Tactician

It would be an understatement to say that Nick Kyrgios is a polarizing figure in the world of professional tennis. He is a Grand Slam champion, and one of the most talented athletes in our sport. At the same time, one can never be quite sure what is going to happen when watching one of his matches. Will we see a brilliant performance, worthy of a Top 10 player? Will we see indifference, broken rackets, and arguments with the chair umpire? One is as likely as the other.

In the second round of the Miami Open, we got to see the former. Kyrgios demolished Andrey Rublev, currently ranked #7 in the world, in 52 minutes by the score of 6-3 6-0. During the course of the match, we caught a glimpse of a side of Nick Kyrgios that doesn’t get spoken about very often: Kyrgios the tactician, executing a game plan to perfection.

Going into the match, Kyrgios knew that “…he’s (Rublev) a player who relies on a bit of rhythm, so I just tried to keep the points short and sharp, just play aggressive.” I want to highlight one way that Kyrgios set himself up to do just that.

If there is a knock on Rublev’s game, it is that his second serve is relatively slow. Below is a chart comparing the average serve speeds between Kyrgios and Rublev from their match:

Average Serve Speed | Nick Kyrgios | Andrey Rublev | Difference
1st serve | 207 km/h | 193.6 km/h | 13.4 km/h
2nd serve | 171.9 km/h | 143.8 km/h | 28.1 km/h
Courtesy ATPTour.com

Attacking Rublev’s second serve would be a great starting point for keeping the rallies short and putting immediate pressure on the Russian. And Kyrgios did just that, with his positioning and impact point. This is the average return position of Kyrgios in the match when returning Rublev’s second serve:

Courtesy ATPTour.com

Two meters inside the baseline is about as aggressive as you can be with hitting the return on the rise and taking time away from the opponent. The beauty of this return strategy – when looking at it through the lens of Kyrgios’ overall tactic of keeping the points short – is that there is very little downside. If you hit a quality return, great! You’re most likely ahead in the rally, as there is a high chance that Rublev is rushed and off-balance on the first groundstroke after the serve. If you miss the return – the rally length was 1, and Rublev doesn’t get a chance to get into any sort of rhythm.

The strategy worked beautifully. Rublev won only 4 out of the 19 points played on his second serve (that’s 21%; generally speaking, we would like the server to be around the 50% range). Furthermore, almost three quarters of all points contested in the match were played in the 0-4 rally range:

Rally Length | Kyrgios Won | Rublev Won | Total | % of Total
0-4 | 42 | 23 | 65 | 73%
5-8 | 9 | 7 | 16 | 18%
9+ | 2 | 6 | 8 | 9%
Courtesy ATPTour.com
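The percentages in the rally length table follow directly from the point counts. A quick sketch of that arithmetic, with variable names of my own choosing:

```python
# Points won by rally length, taken from the table: (Kyrgios, Rublev).
rally_points = {"0-4": (42, 23), "5-8": (9, 7), "9+": (2, 6)}

total_points = sum(k + r for k, r in rally_points.values())
for bucket, (kyrgios, rublev) in rally_points.items():
    share = (kyrgios + rublev) / total_points
    print(f"{bucket:>3}: {kyrgios + rublev:2d} points, {share:.0%} of total")

# Rublev's second serve: 4 of 19 points won.
rublev_second_serve_win_rate = 4 / 19
print(f"Rublev second serve points won: {rublev_second_serve_win_rate:.0%}")
```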

Just for comparison, here is Kyrgios’ second serve return position in his third round match against Fabio Fognini. He still returned from inside the baseline, but he toned down the aggressiveness:

Courtesy ATPTour.com

In the round of 16 match, Kyrgios is set to face Jannik Sinner. I would assume that Kyrgios won’t be looking to engage in prolonged baseline exchanges in that match either. His second serve return positioning will give you a hint as to the quality of Sinner’s second serve, as well as the style of match that Kyrgios will want this to be.

Carlos Alcaraz and Second Serve Placement

After the conclusion of the 2022 Australian Open, all the talk on the men’s side – and rightfully so – has been about Rafael Nadal and his 21st Grand Slam title. As much as Nadal has meant to the sport, it is no secret that the proverbial clock is ticking, and he is at the very tail end of his magnificent career. As one of the legends of tennis nears retirement, Spanish tennis has another young superstar waiting in the wings. I would never put the label of “heir apparent” to Nadal on anyone; Nadal’s achievements are unique and unmatched. But Carlos Alcaraz might be the next Spanish player to sit on the throne of men’s professional tennis.

Alcaraz is 18 years old, and ranked #29 in the world as of January 31st, 2022. If you haven’t seen him play, he is far from your typical Spanish clay court specialist. His is a complete attacking game, with the power to dial up a first serve in the 130 mph range, control the rallies with his massive forehand, but also transition to the net and hit the occasional drop shot. The scary thing for the rest of the professional tennis world is that at 18 years of age, Alcaraz can only get better. And one area where Alcaraz can elevate his game is his 2nd serve placement.

Let’s use Alcaraz’ 3rd round match at the Australian Open against Matteo Berrettini as an example. A cursory look at the surface statistics doesn’t raise any red flags: Alcaraz won an exceptional 61% of his second serve points (anything over 50% is considered a good day in the office). Could this number have been even higher? This is the placement of all of Alcaraz’ 2nd serves from the match:

Courtesy Ausopen.com

On both sides of the court, Berrettini could have eliminated a third of the service box when he was receiving Alcaraz’ second serve. Alcaraz hit 51 second serves in the match, and only two of them were directed at Berrettini’s forehand. On the one hand, the tactic was working – as demonstrated by the 61% second serve winning percentage. On the other hand, it will free Berrettini to do something like this (at the 1 minute 46 second mark):

Berrettini breaks Alcaraz in the first game of the second set by forcing an error behind an aggressive inside-in forehand. What allowed Berrettini to set his feet and really unload on that forehand was the depth and quality of his backhand return. If Berrettini knows that Alcaraz will not serve his forehand in that situation, he can either run around the second serve to start the rally with a forehand return, or – like he did here – step inside the court, wait for the serve with a backhand grip, pick a return target before Alcaraz even hits the serve, and get on offense right away.

Was this maybe just a tactic against Berrettini? Is Alcaraz more unpredictable with his second serve location in other matches? Here is the placement of his second serves from his 2nd round match against Dusan Lajovic:

Courtesy Ausopen.com

Even in this match, Alcaraz served exclusively into the backhand and body of Lajovic. Alcaraz ended up winning this match rather easily in three sets, but even then the location of his second serves, from both sides of the court, was predictable.

How can a player get more comfortable with hitting a particular area of the service box with his second serve? The first step, as it usually does, happens in practice. You could play a practice set where you’re only allowed to hit second serves out wide in the deuce side, and down the T in the ad side. The second step would include hitting the second serve to that area in matches, in low leverage situations. For example, if you’re up 40-0 in a game, go for the second serve down the T. Even if you lose the point, you’re still heavily favored to win the game, serving at 40-15. Alcaraz won the first two sets against Lajovic 6-1 6-2. Firmly in control of the match, he still didn’t hit any second serves to the Lajovic forehand in the third set. Finally, the third step is hitting that serve in matches, in high leverage situations: tie-breaks, when facing break points etc.

For Carlos Alcaraz, the sky is the limit. He is one of my favorite players to watch and will be the face of men’s professional tennis for years to come. He already plays with an aggressive mentality from the baseline; harnessing that same mindset with his second serve locations should not be too difficult of a task.

Predicting wOBA Using Process-based Statistics

When trying to determine a batter’s overall offensive value using a single statistic, one of the most popular metrics to use is the weighted on-base average (wOBA). wOBA is calculated as a ratio of a linear combination of “outcome” statistics (unintentional walks, hit-by-pitches, singles, doubles, triples, and home runs) divided by, essentially, the number of plate appearances.
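To make the definition concrete, here is a minimal sketch of the calculation. The denominator follows the commonly published form (AB + BB - IBB + SF + HBP). The linear weights are illustrative round numbers only, since the actual weights are recalculated every season, and the season line is invented purely to exercise the function.

```python
# Illustrative linear weights only: the real weights are recalculated every season.
EXAMPLE_WEIGHTS = {"uBB": 0.69, "HBP": 0.72, "1B": 0.89, "2B": 1.27, "3B": 1.62, "HR": 2.10}


def woba(counts, ab, bb, ibb, sf, weights=EXAMPLE_WEIGHTS):
    """Weighted on-base average from a dict of outcome counts.

    counts: outcome counts keyed like the weights (uBB, HBP, 1B, 2B, 3B, HR).
    Denominator: AB + BB - IBB + SF + HBP, i.e. roughly plate appearances.
    """
    numerator = sum(weights[k] * counts.get(k, 0) for k in weights)
    denominator = ab + bb - ibb + sf + counts.get("HBP", 0)
    return numerator / denominator


# Hypothetical season line, made up purely to exercise the function.
season = {"uBB": 50, "HBP": 5, "1B": 90, "2B": 30, "3B": 3, "HR": 20}
value = woba(season, ab=520, bb=55, ibb=5, sf=5)
print(f"wOBA = {value:.3f}")
```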

With that being said, could one predict whether a given player’s wOBA will be above a certain threshold using “process” statistics such as plate discipline and batted ball parameters? In particular, if we know a player’s, say, zone contact rate, chase rate, and average exit velocity, could we predict with any confidence whether that particular player’s wOBA will be above, say, .320?

Using Statcast data and a bit of machine learning, I have decided to train a shallow neural network to try to do just that. I will post snapshots of the Jupyter Notebook throughout the analysis to make it a little easier to follow.

Dataset

My dataset was downloaded from Statcast (creating a custom leaderboard) and included all qualified batter-seasons from 2015 until June 29th, 2021 – 2015 being the first year for which Statcast data are available. This resulted in a set of 989 player-seasons.

For every player-season, I collected the following nine statistics: wOBA, exit velocity average, barrel batted rate, zone swing percentage, zone swing and miss percentage, zone contact percentage, out-of-zone swing percentage, out-of-zone swing and miss percentage, and out-of-zone contact percentage.

I also created a column that was either “1” if wOBA was >= .320 for that particular player-season, and “0” otherwise. This would be the “true label” the neural network would try to predict. I picked the .320 threshold for wOBA, as that is roughly the league average. In effect, the network would learn to differentiate between a below-average offensive performer and an above-average one. Finally, I normalized all the input columns, as, for example, the exit velocities are on a different scale than the statistics expressed in percentages.
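A minimal sketch of those two preparation steps is below. The article does not say which normalization was used, so z-scoring (zero mean, unit standard deviation) is an assumption on my part, and the sample rows are invented for illustration.

```python
from statistics import mean, pstdev

WOBA_THRESHOLD = 0.320  # roughly league average, per the article


def make_label(woba):
    """1 for an above-threshold player-season, 0 otherwise."""
    return 1 if woba >= WOBA_THRESHOLD else 0


def z_score(column):
    """Rescale a column to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


# Hypothetical rows, invented for illustration:
# (wOBA, average exit velocity in mph, chase rate in %).
rows = [(0.345, 91.2, 24.0), (0.301, 87.5, 33.5), (0.320, 89.0, 28.1), (0.288, 86.1, 35.2)]

labels = [make_label(r[0]) for r in rows]
exit_velocity = z_score([r[1] for r in rows])
chase_rate = z_score([r[2] for r in rows])
print(labels)
```

After rescaling, a one-unit change means "one standard deviation" for every feature, so exit velocity in mph no longer dwarfs the percentage-based columns.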

Network Architecture

After a bit of trial and error, I settled on the following network architecture. The input layer had either six, seven, or eight units, depending on how many of the features I used in that particular scenario (this will make more sense further along in the analysis). Following the input layer, there were three fully connected layers with 8 units each, and a single-unit output layer, making the prediction. This is a binary classification problem – i.e. the network will make a prediction of either “1” if it thinks the wOBA of the batter will be greater than or equal to .320 given the input data, or “0” if it thinks the wOBA will be less than .320 – and so a single neuron in the output layer is sufficient. Below is a visual representation with six units in the input layer.

How did I arrive at 8 units in a hidden layer? Since at most I would use 8 input features, I picked that as the number of units in the first hidden layer. I wanted to keep the number of units consistent across layers for simplicity. And how did I decide on three hidden layers? I simply did a run with two hidden layers, and then one with three, and I got better results with three. Going to four started overfitting the training data, and so I settled on three hidden layers.
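The shape of this network can be sketched in plain Python. This is not the notebook code: the weights here are random and untrained, and the activation functions (ReLU in the hidden layers, a sigmoid on the output) are assumptions, since the article deliberately leaves those details out. The point is only to show the layer sizes and how a single output unit turns into a 0/1 prediction.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(inputs, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_network(n_inputs, hidden_units=8, hidden_layers=3, seed=0):
    """Input layer -> three hidden layers of 8 units -> single output unit."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [hidden_units] * hidden_layers + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_proba(network, features):
    """Forward pass: ReLU hidden layers, sigmoid output (both assumed)."""
    activations = features
    for layer in network[:-1]:
        activations = dense(activations, layer, relu)
    return dense(activations, network[-1], sigmoid)[0]


network = build_network(n_inputs=6)
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in network)
prob = predict_proba(network, [0.1, -0.4, 1.2, 0.0, -0.7, 0.3])
print(n_params, round(prob, 3), int(prob >= 0.5))
```

With six inputs this layout has 209 trainable parameters, which is small relative to a few hundred training examples and fits the "shallow network" description.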

(For the sake of brevity I won’t go into detail of activation functions, regularization, loss functions etc. here in the body of the article. I will link to the code at the bottom, and feel free to hit me up for additional details.)

With the network architecture in place, I ran through four different scenarios, or four different combinations of input features, while keeping the network architecture constant. I’ll outline the results first, followed by a brief discussion.

Scenario #1: Plate Discipline Only

With what probability could we predict whether someone’s wOBA is over .320 using only plate discipline statistics, while knowing nothing at all about what happens when bat meets ball? This was my first scenario. In particular, the input features used in the training set were – all normalized – zone contact rate, zone swing rate, zone swing and miss rate, outside zone contact rate, outside zone swing rate, and outside zone swing and miss rate.

I had 80% of my overall dataset in the training set, and 20% in the test set. The network is trained on the training set, and the test set is used to gauge the accuracy of the network on data it hasn’t seen before. This resulted in 791 items in the training set, and 198 items in the test set. Here are the results after the network has learned its parameters following 15 passes through the training set:
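The split sizes can be reproduced with a shuffled index list. The seed value and variable names are my own; the article only states the 80/20 proportions and the resulting counts.

```python
import random

N_PLAYER_SEASONS = 989  # size of the dataset described in the article
TRAIN_FRACTION = 0.8

indices = list(range(N_PLAYER_SEASONS))
random.Random(42).shuffle(indices)  # fixed seed, so the split is reproducible

n_train = int(TRAIN_FRACTION * N_PLAYER_SEASONS)
train_idx, test_idx = indices[:n_train], indices[n_train:]
print(len(train_idx), len(test_idx))
```

Fixing the seed is what makes it possible to reuse exactly the same training and test player-seasons across different feature sets.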

Test set performance for scenario #1:

That’s about a 67% prediction accuracy on the training set, and about 69% on the test set. In other words, the probability that the network will correctly predict whether a hitter’s wOBA will be above .320, using nothing more than their plate discipline statistics, is about 0.7. The fact that the training and test set accuracies are reasonably close – the test set accuracy actually being a bit higher – suggests that the network is not overfitting the training set either.
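Accuracy here simply means the share of player-seasons where the predicted label matches the true one. A small sketch with invented toy labels:

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true 0/1 labels."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Toy labels, invented for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(f"{accuracy(y_true, y_pred):.2%}")

# On a 198-item test set, 137 correct predictions would give roughly 69%.
print(f"{137 / 198:.2%}")
```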

Scenario #2: Plate Discipline + Exit Velocity

While 70% is not a bad starting point, how much more accurate could the predictions of the network get if I added a feature with some actual batted ball information? For the second scenario, I added a seventh feature – the average normalized exit velocity. Here is the performance on the training set.

(As a side note, the training and test set splits were fixed for all the different scenarios. What this means is that the same 791 player-seasons were used in the training set every time.)

Test set performance for scenario #2:

The accuracy increased on both the training set and the test set; we’re now in the ballpark of 0.7 – 0.75 probability of the network making the correct prediction as to whether someone’s wOBA will be above .320 or not. Intuitively this makes sense: wOBA is calculated based on batted ball outcomes (and walks), and so adding a relevant batted ball parameter as a feature – such as exit velocity – should increase the accuracy of any wOBA prediction.

Scenario #3: Plate Discipline + Barrel Rate

Would using the barrel rate instead of the exit velocity lead to more accurate predictions? After all, the barrel rate combines two batted ball features – exit velocity and launch angle. Maybe the addition of the launch angle component would help improve accuracy. For scenario #3, I used seven features in the input layer again: the six plate discipline statistics, and the normalized barrel rate. Here is the performance on the training set:

Test set performance for scenario #3:

Using the barrel rate as the seventh feature made the predictions more accurate than using the plate discipline statistics alone, but less accurate than using the average exit velocity as the seventh feature. As to why average exit velocity led to better predictions than barrel rate – I’m guessing it’s because it is a more granular feature.

Let’s say Batter A hits three balls – a “barrel” at 97mph, and two “non-barrels” at 92mph. And let’s say Batter B hits three balls – a “barrel” at 97mph, and two “non-barrels” at 82mph. Their barrel rate will be the same, yet the average exit velocity will be different. Either way, the exit velocity provided the network with “more useful” information than the barrel rate did.
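The same example in code, using the exit velocities and barrel counts exactly as given in the text; the helper names are my own.

```python
from statistics import mean

# (exit velocity in mph, counted as a barrel?) for the two batters in the example.
batter_a = [(97, True), (92, False), (92, False)]
batter_b = [(97, True), (82, False), (82, False)]


def barrel_rate(batted_balls):
    """Share of batted balls that were barrels."""
    return sum(is_barrel for _, is_barrel in batted_balls) / len(batted_balls)


def average_exit_velocity(batted_balls):
    """Mean exit velocity across all batted balls."""
    return mean(ev for ev, _ in batted_balls)


for name, balls in (("A", batter_a), ("B", batter_b)):
    print(f"Batter {name}: barrel rate {barrel_rate(balls):.1%}, "
          f"avg EV {average_exit_velocity(balls):.1f} mph")
```

The barrel rate cannot tell the two batters apart, while the average exit velocity separates them by more than 6 mph.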

Scenario #4: Plate Discipline + Exit Velocity + Barrel Rate

For the final scenario, I used eight input features: the six plate discipline measures, the average normalized exit velocity, and the normalized barrel rate. Theoretically, this should lead to the most accurate prediction, as we’re adding the most detailed batted ball information to the plate discipline measures. This is the performance of the network on the training set:

Test set performance for scenario #4:

Utilizing all eight of the available features puts us in the ballpark of 80% accuracy of predictions. The fact that adding the barrel rate increased the accuracy compared to the exit velocity alone passes the smell test: while barrel rate contains some of the exit velocity information, it is sufficiently distinct from exit velocity that having it as a separate feature proved useful.

Summary & Discussion

| Scenario | Training Set Accuracy | Test Set Accuracy |
| --- | --- | --- |
| Plate Discipline | 66.92% | 69.19% |
| Plate Discipline + EV | 73.52% | 75.25% |
| Plate Discipline + Barrel | 70.16% | 71.72% |
| Plate Discipline + EV + Barrel | 77.57% | 81.31% |

As it turns out, one can predict with about 80% accuracy whether someone will be an above average offensive contributor using their plate discipline statistics, their average exit velocity, and their barrel rate in this particular setup. One of the advantages of using a neural network is that the network is able to learn the various non-linear interplays between the input features. For example: let’s say a player has a relatively high out-of-zone chase rate. How high of an outside-of-zone contact rate would he need to have, keeping everything else constant, to get his wOBA over .320? Is it realistic? Or let’s say a player is currently sitting at a wOBA of .310. If we keep his plate discipline statistics constant, how much harder would he have to hit the ball to get his wOBA over .320? There are usually multiple avenues to improve a batter’s performance. Once the network is trained, its predictions can serve as a starting point in evaluating which of the avenues to explore, and which would require an improvement that might be beyond the batter’s reach.
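The "what would he need to change" questions amount to sweeping one input while holding the others fixed. Below is a rough sketch of that idea. The trained network is not reproduced here, so `toy_model` is a hypothetical stand-in with made-up weights, and the feature names and z-score grid are assumptions for illustration only.

```python
import math


def toy_model(features):
    """Hypothetical stand-in for the trained network: returns P(wOBA > .320)."""
    score = 1.2 * features["exit_velocity_z"] - 0.8 * features["chase_rate_z"]
    return 1.0 / (1.0 + math.exp(-score))


def smallest_value_to_flip(predict, features, name, candidates, threshold=0.5):
    """Sweep one feature, holding the rest fixed. Return the smallest candidate
    whose prediction reaches the threshold, or None if none of them does."""
    for value in sorted(candidates):
        trial = dict(features, **{name: value})
        if predict(trial) >= threshold:
            return value
    return None


# Made-up player: below-average exit velocity, above-average chase rate.
player = {"exit_velocity_z": -0.5, "chase_rate_z": 0.5}
grid = [x / 10 for x in range(-20, 21)]  # z-scores from -2.0 to 2.0

needed = smallest_value_to_flip(toy_model, player, "exit_velocity_z", grid)
print(needed)  # 0.4 for this toy model
```

A result of `None` would be the signal that the required improvement lies outside the range considered realistic, which is the "beyond the batter’s reach" case.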

To further improve the performance of the network past 80% accuracy, there are two paths one could take: either change the network architecture (the number of hidden layers, the number of units in a layer, the activation functions, etc.), or use additional features that the network could find useful. For example, one could incorporate the percentage breakdown of pull-straight-opposite field hits for a batter. A batted ball with a certain exit velocity and launch angle hit directly over second base could be a single, while a batted ball with the same characteristics hit down the line could go for extra bases. Furthermore, since the test set accuracy actually exceeds the training set accuracy in all four scenarios, simply obtaining additional data is not likely to improve the network’s performance.
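The train-versus-test comparison can be checked directly against the accuracies in the summary table. The snippet below only restates those numbers; the dictionary layout is an arbitrary choice for this sketch.

```python
# (training accuracy %, test accuracy %) per scenario, copied from the summary table.
results = {
    "Plate Discipline": (66.92, 69.19),
    "Plate Discipline + EV": (73.52, 75.25),
    "Plate Discipline + Barrel": (70.16, 71.72),
    "Plate Discipline + EV + Barrel": (77.57, 81.31),
}

baseline_test = results["Plate Discipline"][1]

for scenario, (train_acc, test_acc) in results.items():
    gap = test_acc - train_acc      # positive means test beats training
    lift = test_acc - baseline_test  # test-set gain over plate discipline alone
    print(f"{scenario:<32} gap {gap:+.2f} pts, lift {lift:+.2f} pts")

# True when test accuracy exceeds training accuracy in every scenario.
test_beats_train_everywhere = all(test > train for train, test in results.values())
print(test_beats_train_everywhere)
```

A training accuracy well above the test accuracy would have pointed to overfitting, where more data tends to help. Here the gap runs the other way in every scenario, which is why richer features or a different architecture look like the more promising levers.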

Finally, I’m sure that this is a baby version of what major league teams use. If the network’s output and the actual wOBA of a player disagree, the player could be a candidate for regression, warrant a deeper dive into their data, or merit an additional look by the scouts. It would also be interesting to see how effective minor league plate discipline and exit velocity data would be in predicting major league wOBA using a setup similar to this one.

For those interested, the code for the neural network is available here.

Forehand: Offense, Backhand: Defense

By now it has been well established that the majority of points in men’s professional tennis are shorter than 4 total shots, or 2 per player. Yet the most exciting spectacle in a tennis match is the extended rally, with players exchanging groundstrokes, battling for court position, and looking to exploit any small opening to gain the upper hand. Unreturned serves might be more common, but the long points earn the standing ovations and make the highlight reels.

With that being said, I wanted to see which groundstroke statistics correlated the most with actually winning the match in the 2021 French Open men’s singles main draw. Let’s use Novak Djokovic’s finals victory over Stefanos Tsitsipas as an example.

Courtesy RolandGarros.com

For the purpose of this analysis, I will group winners and forcing shots into one category: “offense.”

Looking at the table above, we see that Djokovic hit more forehand and fewer backhand groundstrokes overall than Tsitsipas did. Digging a bit deeper, Djokovic led the forehand offense category 47-32, but he also made more unforced errors on the forehand than Tsitsipas, 20-19. Switching over to the backhand wing, Djokovic is once again better than Tsitsipas in the offense category, 10-6, and he also made fewer backhand unforced errors than Tsitsipas, 10-13.

Which of these statistics have the most predictive power? In particular, if you didn’t know the outcome of the match, and could only pick a few of these comparisons to help you make an educated guess, which ones should you look at?

The groundstroke statistics are available for only 55 of the men’s singles matches contested in the French Open main draw, fewer than half of the matches played. Despite the limited sample size, there are some interesting patterns worth exploring.

Forehand Raw Count

Let’s start with just looking at the number of forehands hit in a match. The forehand is understood to be the “sword” – an attacking weapon, while the backhand is more of a “shield” – a predominantly defensive tool in a player’s arsenal. Would it be enough just to see which player used the sword more?

| | Winner More Forehands Hit | Loser More Forehands Hit | Equal |
| --- | --- | --- | --- |
| Match Count | 30 | 24 | 1 |

It turns out that just knowing who hit more forehands in a match provides little information about the outcome. In only about 55% of the matches in the dataset did the eventual winner hit more forehands than their opponent, which is only slightly better than a coin flip. Intuitively, this makes sense: one, we don’t know anything about the outcome of those forehands. And two, some players are more comfortable with their backhand than others, utilizing, for example, the run-around forehand less, thus decreasing their forehand count. Daniil Medvedev would be an example of such a player.
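One way to put a number on "slightly better than a coin flip" is an exact sign test on the forehand counts. This is a sketch under assumptions that are not part of the original analysis: the single tied match is dropped, the matches are treated as independent, and the helper `two_sided_binomial_p` is written just for this illustration.

```python
from math import comb


def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n flips of a fair coin."""
    tail = max(k, n - k)
    upper = sum(comb(n, i) for i in range(tail, n + 1)) / 2 ** n
    return min(1.0, 2 * upper)  # the fair-coin distribution is symmetric


winner_more_forehands = 30
loser_more_forehands = 24
decided = winner_more_forehands + loser_more_forehands  # the tied match is dropped

p_value = two_sided_binomial_p(winner_more_forehands, decided)
print(f"{winner_more_forehands} of {decided}: p = {p_value:.2f}")
```

A p-value far above the usual 0.05 cutoff means a 30-24 split is well within what pure chance would produce, which agrees with the reading that the raw forehand count says little about who won.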

To get more predictive utility, we’ll need to incorporate the outcome of the rally into the analysis.

Forehand Offense and Unforced Errors

Looking at the forehand outcomes first, this is how often the winner of the match amassed more combined winners and forcing shots than the opponent.

| | Winner More FH Offense | Loser More FH Offense | Equal |
| --- | --- | --- | --- |
| Match Count | 41 | 11 | 3 |

Conversely, this is how often the winner of the match made fewer forehand unforced errors than the opponent.

| | Winner Fewer FH UE | Loser Fewer FH UE | Equal |
| --- | --- | --- | --- |
| Match Count | 35 | 17 | 3 |

It is not surprising that taking into account the outcome of the rally improves the predictive power of the statistic. Both the offense and the unforced errors are a significant improvement over the raw count of forehands. However, it is the offensive component that correlates more strongly with winning the match. About 75% of the time, the winner of the match amassed more winners and forcing shots on the forehand side, compared to about 64% of the time that the winner made fewer unforced errors.

Why is it slightly easier to overcome a higher rate of unforced errors on the forehand and still win the match, as opposed to overcoming a lower offensive rate? I think that it has to do with controlling the rally. If I make unforced errors on the forehand, on at least some of those my feet are set, I am on offense, and I just go for too much and miss by a close margin. I can “get away” with those mistakes, as long as I keep generating the offense, getting ahead in the rally, and accumulating winners and forcing errors. Looking at the finals match, Djokovic made more unforced errors on the forehand than Tsitsipas; yet he led the offensive category by a significant margin, signaling that he was controlling more of the rallies with his forehand. It is this control that ultimately helped tilt the match in his favor.

Backhand Offense and Unforced Errors

Shifting gears to the backhand side, here is how often the winner of the match led in the offensive category.

| | Winner More BH Offense | Loser More BH Offense | Equal |
| --- | --- | --- | --- |
| Match Count | 28 | 22 | 5 |

This was probably the biggest surprise for me in the dataset. Knowing which player generated more winners and forcing shots on the backhand side told you almost nothing about the outcome in this particular set of matches. The winner of the match had more backhand winners and forcing shots in only about 51% of the matches. You would actually have a better chance predicting the winner of the match using the raw forehand count – a process statistic, as opposed to a backhand outcome statistic.

Why is that the case? My guess is that backhand winners and forcing shots are relatively infrequent events, similar to aces. They simply don’t account for a large enough percentage of points to bear a significant weight on the outcome of the match. In the present dataset, the match winners averaged about 20 forehand winners and forcing shots per match, but only about 10 backhand winners and forcing shots. Looking once again at the Djokovic vs Tsitsipas statistics, Djokovic had 47 winners and forcing shots on the forehand, compared to just 10 on the backhand.

If looking at the backhand offense tells us little, how about looking at backhand unforced errors?

| | Winner Fewer BH UE | Loser Fewer BH UE | Equal |
| --- | --- | --- | --- |
| Match Count | 39 | 14 | 2 |

The backhand unforced error rate was the second best predictor of the ultimate match winner, behind only the forehand offense. This is a great illustration of the backhand’s function as a shield. Since players in general don’t finish a ton of rallies with their backhand (as seen in the backhand offense table), it is important that the stroke be dependable in a neutral rally – i.e. limiting the unforced errors. The backhand is used more as a “linking” shot, a bridge between defense and neutral, and as a transition from neutral to offense before the forehand is used to finish the job. Its effectiveness as a link is better highlighted when looking at unforced errors – when the stroke is not under pressure – as opposed to looking at finishing statistics such as winners and forcing shots.
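To see all five comparisons side by side, the counts from the tables can be converted to shares of the 55 matches and sorted. The labels below are shorthand chosen for this sketch; the counts are the "winner led" cells from each table.

```python
TOTAL_MATCHES = 55

# Matches in which the eventual winner led the category (for unforced errors: had fewer).
winner_led = {
    "Forehand raw count": 30,
    "Forehand offense": 41,
    "Forehand unforced errors": 35,
    "Backhand offense": 28,
    "Backhand unforced errors": 39,
}

ranked = sorted(winner_led.items(), key=lambda item: item[1], reverse=True)

for name, count in ranked:
    print(f"{name:<28}{count / TOTAL_MATCHES:6.1%}")
# Forehand offense             74.5%
# Backhand unforced errors     70.9%
# Forehand unforced errors     63.6%
# Forehand raw count           54.5%
# Backhand offense             50.9%
```

The ordering matches the narrative: forehand offense first, backhand unforced errors second, and backhand offense last, below even the raw forehand count.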

It would be great to see if these patterns held over all of the 120+ singles matches played at this year’s French Open. Regardless, the results of the analysis pass the smell test: to guess who might have won a particular tennis match, look at who inflicted more damage with their forehand, and who hurt themselves less with their backhand. More often than not, you’ll be on the right track.

Ruminating on the Underhand Serve

Almost every sport has a set of unwritten rules that the competitors are expected to abide by. In baseball, baserunners are discouraged from stealing bases in the late innings of a blowout. In American football, teams will often take a knee instead of running an offensive play if the game is already decided late in the fourth quarter. In soccer, a team in possession of the ball is expected to kick it out of bounds in order to allow medical treatment of an injured opponent. The application of the unwritten rules is nuanced and not universally agreed upon, even among the competitors themselves. The common thread among all these “agreements,” though, is showing a level of respect for the opponent and the sport.

There are numerous examples of unwritten rules in tennis. For example, tennis players are expected to avoid trying to aim at their opponent with the ball during a rally, if there is an option to go around them. They are expected to apologize after they hit the net tape with a stroke, and the ball rolls over onto the opponent’s side. Players shake hands at the end of the match, regardless of how heated the competition might have gotten.

And then there is the underhand serve.

This is Dominic Thiem’s second serve return position in his third round match in the ATP Masters 1000 in Rome against Lorenzo Sonego:

Courtesy ATPTour.com

I’m certainly not picking on Thiem. There are plenty of other players who prefer to hit their returns from way behind the baseline: Rafael Nadal, Daniil Medvedev, and Stefanos Tsitsipas all come to mind. The reason for this is purely tactical; it gives them more time to react to the serve, potentially hit a forehand on the return, and gain the upper hand in the rally.

If you knew nothing about tennis, you might think to yourself: if my opponent wants to return from that far back, it would seem logical for me to try and entice them to return from closer to the baseline. Just like if my opponent wants to hit more forehands, I will aim at their backhand. I am trying to get my opponent to do the things they are not comfortable doing on the court.

You do have a potential weapon in your arsenal to help you accomplish just that: the underhand serve. But you’re not supposed to use it. Take a look at the reaction Nick Kyrgios got from Nadal when he used the underhand serve during their Wimbledon encounter in 2019:

Regardless of the history between Nadal and Kyrgios, the underhand serve is regarded as a sign of disrespect towards the opponent; they are “not worthy” of you hitting a “proper” serve against them. I would argue that the underhand serve could become a legitimate tactic in an era where, especially on clay, many players choose to set up for the return way back behind the baseline. What would have to happen for the underhand second serve to become less taboo? I think any of the following three developments would speed up the process.

A Respected Ambassador

If someone like Nick Kyrgios or Alexander Bublik decides to serve underhand, the stigma associated with that shot is reinforced. They are “young and brash,” “disrespecting the opponent and the game,” and “immature.” Both Bublik and Kyrgios are known for having tanked matches, and making eyebrow-raising statements in some of their post-match press conferences; their underhand second serves are then viewed through that same lens.

If Roger Federer or Serena Williams decided to serve underhand, the lens would change. Both Federer and Williams have amassed so much credit over their respective years of dominating the sport that their reputations are bulletproof. Do you still remember the SABR?

Contrast the crowd’s reaction to Federer’s unconventional tactic to the reaction Kyrgios got when hitting the underhand serve. Admittedly, it is not an apples to apples comparison. In the SABR, Federer is trying to get Djokovic out of his rhythm, sure. Yet he still puts himself at a bit of a disadvantage by decreasing the time he has to react to Djokovic’s serve by returning so close to the service line. In other words, during the SABR, you react. During the underhand serve, you are completely in control of the shot. Regardless of that particular difference between the SABR and the underhand serve, just listen to the different reactions of the commentators and the crowd. What would the reaction be if Federer served underhand to Nadal at this year’s Wimbledon? You tell me.

Genuine Tactic

There is one more subtle difference between the two tactics, and that is the timing of their deployment in the videos above. Notice the score when Kyrgios serves underhand to Nadal: 2-5, 40:0. Serving at 40:0 on grass, Kyrgios is an overwhelming favorite to win the game. Even if he loses the point, at 40:15 he is still well ahead in the game. By the same token, once he wins the point, Nadal is clearly favored to win the set when serving at 5-3. In a way, that 2-5 40:0 point is largely irrelevant to the outcome of the first set.

Contrast that with the timing of Federer’s SABRs in the video above. The first came at 1-1, 15:30, and the other with Federer up 3:1 in the tiebreak. Both are tremendously important points. If Federer goes up to 15:40 in the third game, he has a good chance of going up a break in the match. Similarly, going up 4:1 in the tiebreak, with the serve, puts Federer well on his way to securing the first set 7:6.

Looking at their respective strategies from this angle, it really does seem like Federer’s SABR was a legitimate tactic deployed to surprise Djokovic and gain an advantage in the match. Kyrgios’ underhand serve was used in a situation that really didn’t matter.

I think that if the underhand second serve is used as a real, genuine tactic, some of the players’ and fans’ aversion would be muted. What is a “genuine tactic?” That is hard to pinpoint, but some things to look out for would be:

  • Used early in the match to force the opponent to adjust
  • Used throughout the match if the opponent doesn’t adjust; for example, using the underhand serve twice a game instead of twice a set
  • Used on important points
  • Used in multiple matches against a variety of opponents

Returner is Not a Victim

This last point falls more on the fans and members of the media than on the players themselves: simply recognize that the returner can adjust their position if they don’t like being served to underhand. Likewise, if Djokovic didn’t like Federer’s SABRs, he could hit his second serve a little harder and aim it at Federer’s body. Once Federer sees that his strategy is not having the desired effect, he’ll stop doing it. If I don’t like somebody slicing backhands, I can hit through their forehand. Tennis is a game of adjustments, and accepting that one’s return position is a variable that the opponent might want to exploit would go a long way towards freeing up some players to hit more underhand serves without fearing the crowd’s reaction.

There are plenty of examples of tactics evolving in various sports around the world. In basketball, once the sport recognized and embraced the value of the three point shot, the game evolved into a wide-open, pace and space sprint, as opposed to the slow slog of yesteryear when games were dominated by battles in the post. In American football, multiple wide receiver formations are much more prevalent in today’s era of the pass than the run-heavy, multiple tight end sets of years past. In baseball, you might see the shortstop line up anywhere on the infield on defense these days, depending on who the batter is. Seeing a little more of the underhand serve would be a welcome sight for yours truly; a wrinkle, and a new tactical element in a game that has recently been a little lacking in variety for my taste.