The Batting Average is an obviously flawed metric. To demonstrate this, I will compare the career numbers of six batters. These are batters with widely varying 'Not out %' values.
Batter | Innings | Not outs | NO % | Runs | Average |
---|---|---|---|---|---|
Brian Lara | 232 | 6 | 2.6% | 11953 | 52.89 |
Saeed Anwar | 91 | 2 | 2.2% | 4052 | 45.53 |
Sachin Tendulkar | 329 | 33 | 10.0% | 15921 | 53.79 |
Herb Sutcliffe | 84 | 9 | 10.7% | 4555 | 60.73 |
Steve Waugh | 260 | 46 | 17.7% | 10927 | 51.06 |
Andy Flower | 112 | 19 | 17.0% | 4794 | 51.55 |
Steve Waugh and Andy Flower have way-too-high averages. Lara and Saeed Anwar have way-too-low averages. Tendulkar and Sutcliffe have acceptable values. Since these are base numbers, contextual factors do not come into the picture.
The Batting Average metric is intrinsically unfair to batters with low 'Not out %' values. The alternative plain-vanilla RpI (Runs per Innings) would swing the pendulum the other way - it would be grossly unfair to Steve Waugh and Andy Flower, since not-out scores such as 16* and 8* would be taken as completed innings. What is needed is something in the middle - logical, fair and accurate.
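To make the two extremes concrete, here is a minimal sketch that computes both the conventional Average and the plain RpI for two of the batters in the table above. The function names are my own labels for illustration; the career figures are copied from the table.

```python
# Career figures copied from the table above: (innings, not outs, runs).
CAREERS = {
    "Brian Lara": (232, 6, 11953),
    "Steve Waugh": (260, 46, 10927),
}


def batting_average(innings, not_outs, runs):
    """Conventional average: runs divided by the number of dismissals."""
    return runs / (innings - not_outs)


def runs_per_innings(innings, runs):
    """Plain RpI: every innings, out or not out, counts as 1.0."""
    return runs / innings


for name, (innings, not_outs, runs) in CAREERS.items():
    avg = batting_average(innings, not_outs, runs)
    rpi = runs_per_innings(innings, runs)
    print(f"{name}: Average {avg:.2f}, RpI {rpi:.2f}, RpI/Average {rpi / avg:.1%}")
```

The gap between the two figures is small for Lara and large for Waugh, which is the spread a middle-of-the-road measure has to bridge.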
Hence, I have developed a new metric - the Weighted Batting Average (WBA). In order to negate the huge disadvantage faced by top-order batters, the WBA is calculated as explained below.
1. The methodology should be to assign each of the Not Out innings an Innings count (IC).
2. The Innings count should vary between 0.0 and 1.0. It should follow a steadily rising graph.
3. Effort should be made to avoid very small IC values such as 0.05, 0.1 etc. Low not outs should be accorded due recognition as far as the IC is concerned.
4. The IC calculation should be simple and easy to understand. Any reader of mine, even without extensive Excel knowledge, should be able to work it out.
So, after looking at all the batter-related variables, I have come to the conclusion that the Highest Score is the constant denominator as far as IC is concerned. It represents the best that the batter has achieved, it is a high spot in terms of his career, is higher than (or equal to) all Not-out-runs, and will work well in almost all situations.
However, when I did my first run using the HS as the denominator, I ran into a problem. There is no problem with the high Not out scores. However, because of the high value associated with HS, the low Not out scores had very low IC values, as shown below. This is not correct.
NOI | HS | IC |
---|---|---|
30 | 334 | 0.09 |
37 | 334 | 0.11 |
56 | 334 | 0.17 |
57 | 334 | 0.17 |
102 | 334 | 0.31 |
103 | 334 | 0.31 |
127 | 334 | 0.38 |
144 | 334 | 0.43 |
173 | 334 | 0.52 |
299 | 334 | 0.90 |
Total | | 3.39 |

I did not want to give up the good-as-gold HS value in the denominator. How do I dampen the effect of the low Not-out runs and the high HS? As usual, at 3 am, I got the solution. Why not use the square root values? It is a standard dampening tool. I tried it and the results for Bradman are given below.
NOI | HS | IC |
---|---|---|
30 | 334 | 0.30 |
37 | 334 | 0.33 |
56 | 334 | 0.41 |
57 | 334 | 0.41 |
102 | 334 | 0.55 |
103 | 334 | 0.56 |
127 | 334 | 0.62 |
144 | 334 | 0.66 |
173 | 334 | 0.72 |
299 | 334 | 0.94 |
Total | | 5.50 |

Note the dramatic improvement in the four low sub-100 not outs. 0.3 to 0.4 is a great improvement over the earlier low values. Now, let us look at a couple of other situations.
153/400 is 0.38 and is clearly quite low for this significant innings. Sqrt(153/400) works out to 0.62, eminently acceptable.
73/281 is 0.26 and is clearly quite low for this match-winning innings. Sqrt(73/281) works out to 0.51, more acceptable.
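The square-root dampening can be sketched in a few lines. This is only an illustration of the arithmetic described above; the function name `innings_count` is my own label. Values are computed unrounded, so a figure may differ in the last digit from a table that rounds or truncates each entry.

```python
import math


def innings_count(not_out_runs, highest_score):
    """IC for a not-out innings: square root of (not-out runs / career HS)."""
    return math.sqrt(not_out_runs / highest_score)


# Bradman's ten not-out scores and his career highest score.
BRADMAN_NOT_OUTS = [30, 37, 56, 57, 102, 103, 127, 144, 173, 299]
BRADMAN_HS = 334

for runs in BRADMAN_NOT_OUTS:
    linear = runs / BRADMAN_HS
    damped = innings_count(runs, BRADMAN_HS)
    print(f"{runs:>3}*  linear {linear:.2f}  square root {damped:.2f}")

total_ic = sum(innings_count(runs, BRADMAN_HS) for runs in BRADMAN_NOT_OUTS)
print(f"Total IC for the ten not outs: {total_ic:.2f}")
```

The same function covers the two single-innings cases quoted above: `innings_count(153, 400)` and `innings_count(73, 281)`.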
It is true that Mark Waugh's HS is 153 and his Not out inns are likely to have higher IC values. It is also true that Wasim Akram's HS of 257 will result in lower IC values. But let us also not forget that this is what the batters themselves achieved. And they pay for it or benefit by it. Now for a table which lists the current WBA and revised WBA for a few top batters.
Batter | Prev | Curr |
---|---|---|
Bradman | 89.55 | 92.74 |
Lara | 51.86 | 52.25 |
Tendulkar | 49.51 | 50.82 |
Hayden | 48.58 | 49.66 |
SR Waugh | 43.47 | 45.71 |
Sangakkara | 54.40 | 55.41 |
Steve Smith | 51.94 | 54.62 |
Labuschagne | 50.76 | 51.38 |

It can be seen that there is an increase of around 2-3% in the value.

For the past few years, I have used Runs per Innings to determine the WBA. However, I have now reverted to the most accurate way of working out the WBA. Runs per Dismissal is a 100% accurate measure.
- All dismissals, irrespective of the score, have an innings count of 1.0.
- All not-out innings above the average Runs per Dismissal have an innings count of 1.0.
- All not-out innings below the average Runs per Dismissal are assigned proportionate innings values between 0.0 and 1.0.
- The WBA is then calculated dividing the Runs scored by the derived Weighted Innings count.
Let me illustrate this concept with Don Bradman's career. He played 80 innings and was undefeated 10 times - 30*, 37*, 56*, 57*, 102*, 103*, 127*, 144*, 173* and 299*. His career Runs per Dismissal is 83.83 (=5868/70). The last six innings are above the RpD and are considered as completed innings. Only the first four innings are to be prorated. The Weighted Batting Average is 89.55 (=6996/78.15). To explain, 78.15 = 70 (Dismissed inns) + 6 (Above RpD) + 2.15 (Pro-rated: (30+37+56+57)/83.83 = 180/83.83).
Nothing can be fairer. Bradman does not lose in unbeaten innings such as 299* or 173* since those are counted only as 1.0. He does not lose out on innings such as 30* since that innings count is taken as 0.358, which is very fair. Of course, when he was dismissed at 12, his innings count is taken as 1.0. That is how it should be since he was dismissed.
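The Runs-per-Dismissal working above can be reproduced with a short sketch. The function name and its return shape are my own choices for illustration. Run on Bradman's figures it gives an RpD of 83.83 and a weighted innings count of about 78.15; the resulting WBA comes out at about 89.5, with the small difference in the second decimal from the quoted value presumably down to rounding in the working.

```python
def weighted_batting_average(total_runs, dismissals, not_out_scores):
    """Return (RpD, weighted innings count, WBA) for a career.

    Dismissed innings count as 1.0 each. A not-out innings counts as 1.0
    if it is at or above the RpD, else as (score / RpD).
    """
    dismissed_runs = total_runs - sum(not_out_scores)
    rpd = dismissed_runs / dismissals
    not_out_count = sum(min(1.0, score / rpd) for score in not_out_scores)
    weighted_innings = dismissals + not_out_count
    return rpd, weighted_innings, total_runs / weighted_innings


# Bradman: 6996 runs, 70 dismissals, ten not-out innings.
bradman_not_outs = [30, 37, 56, 57, 102, 103, 127, 144, 173, 299]
rpd, weighted_innings, wba = weighted_batting_average(6996, 70, bradman_not_outs)
print(f"RpD {rpd:.2f}, weighted innings {weighted_innings:.2f}, WBA {wba:.2f}")
```

Using `min(1.0, score / rpd)` keeps the rule in one expression: a big unbeaten innings can never count as more than one innings.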
On average, the WBA to Batting Average ratio ranges from 100.0% for Labuschagne/Kaushal Silva with zero not outs to 78.3% for Shaun Pollock (25.5% not outs).
Going back to our examples, let us see how the WBA works out for the selected batters.
Batter | Batting Average | WBA | WBA-Ave % | NO % |
---|---|---|---|---|
Brian Lara | 52.89 | 51.86 | 98.0% | 2.6% |
Saeed Anwar | 45.53 | 44.53 | 97.8% | 2.2% |
Sachin Tendulkar | 53.79 | 49.51 | 92.0% | 10.0% |
Herb Sutcliffe | 60.73 | 56.21 | 92.6% | 10.7% |
Steve Waugh | 51.06 | 43.47 | 85.1% | 17.7% |
Andy Flower | 51.55 | 44.18 | 85.7% | 17.0% |
It is clear that these numbers are very fair and equitable. Look at how the maximum benefits accrue to those batters with fewer not outs. There seems to be a clear inverse correlation between "Not outs %" and "WBA-Ave %" values. Those with a high number of not outs do not lose out - rather, they do not gain in an undeserved manner, as was happening with Batting Average. The WBA value is always lower than the Batting Average. The relevant factor is the extent of the drop.
The PQI (Pitch Quality Index) is one of the most powerful and solid of my derived measures. The PQI is an excellent measure to determine the true nature of the pitch. Because the expected performances of the players are built in, PQI allows for normalization across players at the extreme ends of the spectrum. There will not be dumping of Bradman/Basher into the same basket, nor Barnes/Boje.
The PQI is dependent on six values:
- The Expected runs from the batters (based on Career-Location WBA values),
- Actual runs of the batters,
- Expected wickets from the bowlers (based on Career-Location BpW figures),
- Actual wickets of the bowlers,
- Base RpW (for the concerned period) and
- Actual RpW values.
Actual Career figures are used rather than CTD figures since this is an estimate value and I wanted some stability in the calculations. All these figures are used to derive the PQI. To start with, the following ratios are determined.
- Actual runs / Expected runs.
- Expected wickets / Actual wickets.
- Actual RpW / Base RpW (For the concerned period).
Each ratio is capped at 3.0. A Ratio value greater than 1.0 indicates a batting-friendly pitch and lower than 1.0 indicates a bowling-friendly pitch. To explain, let us say that the expected runs are 250 and the actual runs scored are 350. The Ratio is 1.4, indicating batting was easier. If 200 runs are scored, the Ratio is 0.8 - tougher batting conditions. Let us say that the expected wickets are 6.0 and the actual wickets captured are 4.5. The Ratio is 1.333, indicating batting was easier. If 8 wickets are captured, the Ratio is 0.750. RpW is self-explanatory. A Ratio greater than 1.0 indicates that more runs have been scored per wicket than the base.
Since each of these PQI-R values has a range of 0 to 3.0, the sum of the PQI-R values, which is the first-level PQI, has a range of 0.0 to 9.0. However, since the Median of this distribution is around 3.0, a mapping is done to work out a 100-based PQI. The raw PQI of 3.0 maps on to 50.0. Interpolation is done on either side, over the ranges 0.0 to 3.0 and 3.0 to 9.0 respectively. A low PQI indicates a bowler-friendly pitch and a high PQI indicates a batting pitch. During the past 147 years, the extreme PQI values are 93.4 (Test #418-1955. Scores NZ 450/2, Ind 531/7 and NZ 112/1) and 14.5 (Test #216-1932. Scores SA: 36, Aus: 153 and SA: 45).
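The ratio-and-mapping steps can be sketched as below. Two things here are assumptions on my part, not taken from the text: the mapping is treated as a straight line on each side of the 3.0 anchor (0.0 to 0, 3.0 to 50, 9.0 to 100), and the RpW figures in the example call are hypothetical. The function names are also just labels for illustration.

```python
RATIO_CAP = 3.0


def capped_ratio(numerator, denominator):
    """One PQI-R value: a simple ratio, capped at 3.0."""
    return min(numerator / denominator, RATIO_CAP)


def raw_pqi(actual_runs, expected_runs, expected_wkts, actual_wkts,
            actual_rpw, base_rpw):
    """First-level PQI: the sum of the three capped ratios (0.0 to 9.0)."""
    return (capped_ratio(actual_runs, expected_runs)
            + capped_ratio(expected_wkts, actual_wkts)
            + capped_ratio(actual_rpw, base_rpw))


def scaled_pqi(raw):
    """Map a raw PQI onto a 100-based scale, with 3.0 landing on 50.0.

    ASSUMPTION: a straight line on each side of the anchor.
    """
    if raw <= 3.0:
        return raw / 3.0 * 50.0
    return 50.0 + (raw - 3.0) / 6.0 * 50.0


# Runs and wickets from the worked example; the RpW pair (35.0 against a
# base of 30.0) is hypothetical.
raw = raw_pqi(350, 250, 6.0, 4.5, 35.0, 30.0)
print(f"raw PQI {raw:.2f}, 100-based PQI {scaled_pqi(raw):.1f}")
```

A real implementation would also need a rule for an innings in which no wickets fell, since the wickets ratio divides by the actual wickets.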
My first cut was limited to the Match PQI. However, after many requests, consultations and discussions, I have established that the pitch changes significantly during the course of the match and I need to develop individual PQI values for the first half of the match and second half of the match, where appropriate. These are called PQI-1,2 and PQI-3,4 respectively.
To give the readers an idea of the variations that can occur in the PQI values across a match, I worked out a ratio called the PQI-Ratio (PQI-R) which is PQI-3,4/ PQI-1,2. The highest value for this ratio is 3.322, in Melbourne in 1901-02: PQI-1,2 = 17.2, PQI-3,4 = 57.1. The scores were: Australia 112, England 61 and Australia 353, England 175. The lowest value is 0.418, at Trent Bridge, 1951: PQI-1,2 = 75.7, PQI-3,4 = 31.7. The scores were: South Africa 483 for 9, England 419 for 9 and South Africa 121, England 114.
These points are based on a complex performance analysis of the scores. A detailed description of the process would take too much space. It is a "Contribution" exercise jointly undertaken by me and Milind Pandit. Suffice it to say that a very complex five-stage allocation process allocates points for a Match, Teams, Innings, Functions and Players. We move from the Match level in a top-down approach. The overall guidelines and the cornerstones of allocation are given below.
- All results (excluding #1483 and #1814 for obvious reasons) will get allocated 100 points.
- The two teams playing a tied match will be allocated 50.0 TPP each. These are considered as results.
- The drawn matches will get allocations below 100, depending on the extent of completion of the match. The points allocated range from 99.8 to 0.4. The team which is ahead in the match will get more points. Please see Test #616 below. South Africa got close to the points for an Innings win. A two-innings draw will get 50 points if the lead has been taken.
- All innings wins are allotted 75 or more points. However, depending on the margins and match scoring patterns, wins by runs could get more than 75 points. For instance, Australia received 75.1 points for their win by an innings and 5 runs over Pakistan (Test #2368) while they secured 76.0 points for their 296-run win over New Zealand (Test #2374).
- The actual point allocations depend on the scoring patterns in the Test. Two 5-wicket wins may get differing points allocations. A win by an innings and 100 runs could get more points than a win by an innings and 200 runs.
A few point allocations are provided below.
- 50.0 : 50.0 for a tie (the anchor value - Tests #498/1052),
- 50.1 : 49.9 for a 1-run win (Test #1210),
- 50.4 : 49.6 for a 1-wicket win (Test #2347),
- 53.9 : 46.1 for a 13-run win (Test #25),
- 58.1 : 41.9 for a 5-wicket win (Test #2303),
- 71.2 : 28.8 for a 9-wicket win (Test #2248),
- 90.6 : 9.4 for a win by an innings and 198 runs (Test #1617),
- 94.2 : 5.8 for a 675-run win (Test #176),
- 50.6 : 49.2 for India for the draw with scores level and 9 wickets down (Test #2019),
- 49.0 : 46.5 for Australia for the recent fighting draw by India at SCG (Test #2402),
- 74.5 : 23.1 for South Africa for the near-win draw at Wanderers (Test #616 - Aus 143 & 148/8, Saf 332),
- 43.4 : 30.9 for a three-innings draw (Test #2335) and finally,
- 0.3 : 0.1 for a 10-ball draw (Test #1907).
The HSI is a composite of the high-score element and the support available. The purpose of this index is to recognize the batter's contributions in differing situations. Let me outline below various situations enveloping a score of 100. A 100 as the top score does not provide enough information by itself. It could be out of a team score of 200 or 500. It could be supported by an innings close to 100, by a 50 or by a 10. It could be part of 300 for 1 or 400 for 5 or 200 all out.
- Batter scores 100 out of 180. Next highest score is 20.
- Batter scores 100 out of 180. Next highest score is 35.
- Batter scores 100 out of 300. Next highest score is 60.
- Batter scores 100 out of 300. Next highest score is 95.
- Batter scores 100 out of 450. Next highest score is 75.
- Batter scores 100 out of 450. The highest score is 120.
- Batter scores 100 out of 550. The highest score is 150.
In each case, the batter's contribution is different and moves from an outstanding contribution in the first instance to a rather stress-free supporting one in the last case. The idea behind HSI is to create an Index which recognizes this fact. At the end of this section, I will post the HSI values for these innings.
With this background let me show the working of the HSI.
Top batter HSI = (batter score / Next batter score) * (batter score / Team score). This incorporates both components.
Other batters HSI = (batter score / Team score).
Let me try to describe the HSI in a visual manner. If we represent the numbers on a linear/vertical scale, the team score is at the top. The batter score is in the middle and the next highest score is below this. The HSI value increases as the distance between the batter score and the team score decreases. Similarly the HSI value increases as the distance between the batter score and the next highest score increases. Thus the HSI is dependent on how far away these two values are from the batter score.
A reader could very well ask why it is necessary to include the multiplication factor (batter score/Team score), in effect reducing the HSI value. This is to recognize situations such as "100 out of 200 with next best score of 50" and "100 out of 400 with next best score of 50". The two HSI values cannot be the same.
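The two formulas, and the reason for the second multiplier, can be seen in a minimal sketch. The function name and the `is_top_scorer` flag are my own labels for illustration; the arithmetic is the one given above.

```python
def hsi(score, team_score, other_top_score, is_top_scorer=True):
    """HSI for a single innings.

    For the top scorer, other_top_score is the next highest score and
    both components apply. For any other batter only the share of the
    team score counts, and other_top_score is not used.
    """
    if is_top_scorer:
        return (score / other_top_score) * (score / team_score)
    return score / team_score


# Same 100 and the same next-best 50, but out of different team scores.
print(f"100 out of 200, next best 50: {hsi(100, 200, 50):.2f}")
print(f"100 out of 400, next best 50: {hsi(100, 400, 50):.2f}")
# A 100 that is not the top score: only the share of the team score counts.
print(f"100 out of 450, top score 120: {hsi(100, 450, 120, is_top_scorer=False):.2f}")
```

Without the (batter score / Team score) factor the first two calls would return the same value, which is exactly the situation the multiplier is there to separate.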
One major problem, specific to Tests, has to be addressed and solved. It is best explained with an example. Let us say that Australia need 50 to win and reach 50 for 1, with Warner scoring 40, Harris 5 and 5 extras. Warner's innings would get a HSI of 6.4 (8.0*0.8). This is totally unrealistic: it is almost at the same level as the current highest HSI value. But there are also situations such as Hutton scoring 30 out of 52 all out, Kohli scoring 105 out of 166/3, McCabe scoring 189 out of 274/3 and so on. And in the same example, what if Australia collapsed but still won by 3 wickets at 50 for 7, with Labuschagne scoring 30 and the next highest score being 5? He would have a correct HSI of 3.6 (6.0*0.6). All these situations have to be taken care of.
I analyzed this problem in many ways and tried various options. Only when I analyzed all innings in which fewer than 10 wickets were lost did I realize that the loss of 5 wickets was the separation point. Loss of 5 wickets meant that the top order had had their say and all support innings would be from the lower order. So I decided that all innings in which 5 or fewer wickets fell would have their HSI values reduced by a factor. But what about 274 for 3 or 450 for 2 and so on? So I set a limit of 200 runs: the adjustment applies only to team scores below that limit. It has worked very well.
In the previous examples, Warner's HSI would be multiplied by 0.167 (1/6) and Kohli's by 0.5 (3/6). Hutton, McCabe and Labuschagne would retain their values. This is exactly as it should be.
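A sketch of this adjustment follows. The rule in the code is inferred from the two multipliers quoted above (1/6 for 50 for 1, 3/6 for 166 for 3): the factor is taken to be wickets lost divided by 6, applied when 5 or fewer wickets fell and the team score is below 200. Whether a score of exactly 200 is adjusted is not stated, so the strict comparison is an assumption, as are the function and parameter names.

```python
def adjustment_factor(team_score, wickets_lost, run_limit=200, wicket_limit=5):
    """Multiplier applied to the HSI of a short, low-scoring innings.

    ASSUMPTION: factor = wickets lost / 6, used only when 5 or fewer
    wickets fell and the team score is below 200.
    """
    if wickets_lost <= wicket_limit and team_score < run_limit:
        return wickets_lost / (wicket_limit + 1)
    return 1.0


def top_scorer_hsi(score, next_score, team_score, wickets_lost):
    """Top-scorer HSI with the short-innings adjustment applied."""
    raw = (score / next_score) * (score / team_score)
    return raw * adjustment_factor(team_score, wickets_lost)


print(f"Warner, 40 out of 50 for 1: {top_scorer_hsi(40, 5, 50, 1):.2f}")
print(f"Labuschagne, 30 out of 50 for 7: {top_scorer_hsi(30, 5, 50, 7):.2f}")
print(f"Factor for 166 for 3: {adjustment_factor(166, 3):.3f}")
print(f"Factor for 274 for 3: {adjustment_factor(274, 3):.3f}")
print(f"Factor for 52 all out: {adjustment_factor(52, 10):.3f}")
```

Keeping the factor in its own function makes it easy to check the five cases in the text one at a time before applying it to the HSI itself.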
Now for the HSI values for all the innings shown above.
- Batter scores 100 out of 180. Next highest score is 20. HSI is 2.78.
- Batter scores 100 out of 180. Next highest score is 35. HSI is 1.59.
- Batter scores 100 out of 300. Next highest score is 60. HSI is 0.56.
- Batter scores 100 out of 300. Next highest score is 95. HSI is 0.35.
- Batter scores 100 out of 450. Next highest score is 75. HSI is 0.29.
- Batter scores 100 out of 450. The highest score is 120. HSI is 0.22.
- Batter scores 100 out of 550. The highest score is 150. HSI is 0.18.
The highest HSI value is that of Charles Bannerman, who scored the first century ever on the second day of Test cricket. He top-scored with 165 out of a team score of 245 (8 extras). The next highest score is 18. The HSI worked out to 6.38 (165/18 * 165/237). The second highest value of 4.74 is for the innings of Allan Border, who scored 163 out of 308 (28 extras), with the next highest score of 20. An equal HSI value is for Yallop's 121, which was made out of 193 and a supporting innings of 16. Very close is Kapil Dev's 129 out of 207 with the next highest score of 17. That secured a HSI of 4.73. Sehwag's 151 out of 243 with a supporting innings of 20, fetched a HSI of 4.69.
HSI enables us to quantify the support received very effectively. It lets us set a lone-ranger effort such as Laxman's 167 at the SCG against his own 281, which was played with excellent support from Dravid. The comparison could not be more dramatic: the 167 got a HSI of 4.64 and the 281, 0.70.
I have highlighted two recent Travis Head classics to illustrate the HSI concept further. His 140 at Adelaide had a HSI of 0.97 = (140/64) * (140/316). His equally invaluable 152 at Brisbane had a HSI of 0.56 = (152/101) * (152/410). Note the subtle manner in which the HSI for the 140 is nearly double that of the other innings.
This is a peer comparison index for batters, within a match. The index is IPV (Innings Peer Value). It is based on Team-based computations. The IPV refers to a single batter innings.
The basic premise is simple. Let us say the batter has scored 120 runs in one innings. If his team-mates average 20 runs per innings, he has performed 6 times better. If they have scored 48 runs per innings, he has performed 2.5 times better. If his score was 10, he would have performed half as good (0.5) as his team-mates in the first case and only around one fifth as good (0.208) in the second case.
No issues with the numerator: the runs scored by the batter. The denominator is tricky. Let me first explain what I have used and then provide the reasons. I have taken the runs scored by the team in both team innings (excluding extras), subtracted the batter's score and divided that by the number of innings minus 1. In other words,
Other batter RpI = (Team Runs for both innings - Team extras for both innings - batter score) /
(No of players who batted in both team innings - 1)
IPV = batter score / Other batter RpI.
Why RpI and not RpW? If say, the five scores in an innings (Test #450) are 260, 25, 365*, 39 and 88*, the average is 155.4 (777/5). It would be quite silly to use 259.0 (777/3). Sobers scored 365 and remained not out. That innings had better be counted. Recently in Test #2049, the scores were 131, 0, 311* and 182*. The average is 156 (624/4) and certainly not 312 (624/2).
Thus it can be seen that the concept of peer comparison is implemented properly by excluding the subject innings from the total runs. What about the subject batter's other innings? No problems at all. As far as this analysis is concerned, the other innings is what it is: Another innings made at a different time, albeit by the same batter. There is no conflict. Each innings stands independently.
The working out of IPV values is very easy. Just to show how easy this is, I will take the reader through a recent Test: #2402, the SCG Test between Australia and India.
First, take one Indian batter's innings, say, Rishabh Pant's 97 in the second innings. India scored 244 and 334. A total of 18 innings were played. Pant's IPV works out to 3.57 = 97 / 27.12, 27.12 being the "Other batters RpI" = {(244-9)+(334-11)-97} / 17.
Now let us take an Australian batter's innings, say, Steven Smith's 131 in the first innings. Australia scored 338 and 312. A total of 18 innings were played. Smith's IPV works out to 4.45 = 131 / 29.47, 29.47 being the "Other batters RpI" = {(338-10)+(312-8)-131} / 17. Note how Smith's 'peers' have batted slightly better than Pant's 'peers'.
This measure is totally dynamic in that each innings has a different set of figures to work with. For the second innings, the IPV for Smith's 81 is based on taking the 81 off the two totals, and the IPV for that innings is 2.50.
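The three IPV workings for Test #2402 can be reproduced with a short sketch. The function and parameter names are my own labels for illustration; the totals, extras and innings counts are the ones quoted above. Results are computed unrounded, so the last digit may differ slightly from a hand-worked figure.

```python
def ipv(score, team_totals, team_extras, innings_batted):
    """Innings Peer Value: the batter's score against his team-mates' RpI.

    team_totals and team_extras each hold one figure per team innings;
    innings_batted is the number of individual innings across both.
    """
    runs_off_the_bat = sum(team_totals) - sum(team_extras)
    other_batters_rpi = (runs_off_the_bat - score) / (innings_batted - 1)
    return score / other_batters_rpi


# Test #2402, SCG. Totals, extras and innings counts as quoted in the text.
pant_97 = ipv(97, (244, 334), (9, 11), 18)
smith_131 = ipv(131, (338, 312), (10, 8), 18)
smith_81 = ipv(81, (338, 312), (10, 8), 18)
print(f"Pant 97: {pant_97:.3f}")
print(f"Smith 131: {smith_131:.3f}")
print(f"Smith 81: {smith_81:.3f}")
```

Only the subject innings is removed from the pool each time, which is why Smith's two innings are measured against slightly different peer figures.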