Baseball players’ compensation statistical analysis.
Introduction.
Statistical concepts can be
applied to the game of baseball, and as such, one is able to measure the
individual productivity of the players in relation to their compensation. This statistical
appraisal is facilitated by the availability of ample data about the
performance and productivity of an individual. Moreover, the marginal revenue
product (MRP) of an individual baseball player is relatively independent, and
hence the contribution of this player to the team can be easily evaluated. It
thus follows that one can estimate the marginal revenue product of each
baseball player, and then make a comparison between it and the salary that the
player receives (Fields 3).
MRP and the Team Model.
An additional unit of labor generates additional income for a business entity, since the services of the additional worker are used to create a product that will be sold. This marginal income depends on two variables: the marginal product (the change in physical output) and the marginal revenue generated by each unit of that output. The relationship is given by the equation below.
Marginal income = K (constant) × marginal product × marginal revenue.
The marginal revenue product (MRP) is the additional income derived from employing an additional worker. Theoretically, a business entity can afford to pay an employee a wage that roughly equals his MRP, and the theory of competitive labor markets postulates that the remuneration of an employee must equal his MRP (Fields 7).
In a baseball team, the MRP of a player depends on his playing skills and performance, which determine his contribution to team performance and, through it, to team revenue. The effect of a player's MRP on the team is either direct or indirect, as explained hereafter. Superb baseball skills contribute to an improvement in team performance, which increases the number of victories for the team. This translates into an increase in broadcast revenues and gate receipts. Thus, the market worth of an individual player is defined by the amount of revenue that the team accrues from that player (Fields 8).
The MRP of a player is also
related to his contribution to the performance variables of the team. These
performance variables determine the winning percentages. Winning percentages
affect the team revenue. It is thus apparent that the team revenue is related
(albeit indirectly) to performance variables. Hence, the assumption that the
production function of a team is linear can be expressed mathematically as
(Fields 8):
$$\mathrm{WINPCT} = \alpha_0 + \beta_1\,\mathrm{RC} + \beta_2\,\mathrm{ERA} + \beta_3\,\mathrm{NATLG} + \beta_4\,\mathrm{CONT} + \beta_5\,\mathrm{OUT} + e \tag{1}$$
Where:
- WINPCT is the percentage of games that the team wins.
- RC is the total runs that the team has created in a particular season.
- ERA is the average number of earned runs that the team's pitchers allow per 9-inning game.
- NATLG is considered to be 1 if the team is in the National League; otherwise it is considered to be 0.
- CONT is considered to be 1 if the team finished within 5 games of first place in the division; otherwise it is considered to be 0.
- OUT is considered to be 1 if the team finished 20 or more games out of first place in the division; otherwise it is considered to be 0.
The number of runs created is a useful and reliable measure of overall offensive production. ERA serves as the most apposite defensive measure (especially for team pitching), since it reflects how well runs have been prevented from scoring. However, ERA excludes runs that score as a result of fielding errors. The performance variables are as follows: hitting, walks, slugging averages, stolen bases, pitching and runs (Fields 8).
Usually, two runs are sufficient to enable a team to win many of its games in the season, and this obviates the contribution made by hitting and pitching performance to the outcome of such games. This necessitates dummy variables that account for these factors. The dummy variables introduced in equation (1) are OUT and CONT. The CONT variable accounts for team morale. Team morale is determined by the quality of team management, instantaneous decision making and the composure of players; as such, it significantly influences the probability of a team winning a majority of close matches. The OUT variable accounts for the price of disappointment after losses, and the cost of buying minor league players. NATLG accounts for the quality of play. Usually, the American League has more runs than the National League (Fields 8).
Meta-analysis has shown that the variation in total team revenue is a linear function of team characteristics and WINPCT; it is thus expressed mathematically as follows (Fields 9):
$$\mathrm{TOTREV} = \alpha_0 + \sigma_1\,\mathrm{WINPCT} + \sigma_2\,\mathrm{NATLG} + \gamma\,\mathrm{TEAM} + \gamma\,\mathrm{YEAR} + e \tag{2}$$
Where:
- TOTREV is the total operating revenues of the team.
- WINPCT is the winning percentage of the team.
- NATLG is considered to be 1 if the team is in the National League; otherwise it is considered to be 0.
- TEAM is the vector of the team dummies.
- YEAR is the vector of the year dummies.
The equation above (Equation 2) is based on the following set of hypotheses (Fields 9):
i. Fan attendance is directly related to TOTREV.
ii. Both fan attendance and TOTREV are positively influenced by the number of team wins, since fans respond positively to winning teams.
iii. The partial coefficient of TOTREV with respect to WINPCT provides a measure of team marginal revenue.
iv. The team dummies adjust for inter-team differences, while the year dummies account for differences in total revenue across years.
Projected MRP for baseball players is
calculated using both equation 1 and 2 (Fields 9).
Statistics and variable creation.
The data used cover an entire decade (1990-1999). However, random sampling has been used to create an abridged version of the statistics collected within this period. The remunerations have been adjusted, using the appropriate weights, into constant 1999 dollars. Variable creation is used to adjust for differences among different measures, such as hitting and pitching. Hence, variable creation enables individual hitting performance to be measured using Runs Created (RC) as per the equation below (Fields 10):
$$\mathrm{RC} = \frac{\text{Total bases} \times (\text{Hits} + \text{Walks})}{\text{Walks} + \text{At bats}} \tag{3}$$
The above equation enables a team to compute the total runs that each player has created for it, while eliminating dependency on teammates. Runs batted in (RBI) provide an effective measure of the offensive capabilities of a player, but they are influenced by dependency, because they partly measure the number of chances that a player had to drive in runners (Fields 10).
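The Runs Created formula in equation (3) can be sketched in a few lines of Python. The function name, argument names and the sample batting line are hypothetical and chosen only for illustration; they do not come from Fields.

```python
def runs_created(total_bases, hits, walks, at_bats):
    """Basic Runs Created, following equation (3):
    total bases x (hits + walks) / (walks + at bats)."""
    opportunities = walks + at_bats
    if opportunities == 0:
        raise ValueError("a player with no at bats and no walks has no RC")
    return total_bases * (hits + walks) / opportunities


# Hypothetical season line (not taken from the tables in this post).
example_rc = runs_created(total_bases=250, hits=150, walks=50, at_bats=500)
print(round(example_rc, 1))
```

Because the formula uses only the player's own hits, walks, at bats and total bases, it does not depend on how often teammates were on base, which is the dependency that affects RBI.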
Individual pitching performance is measured using Earned Run Average (ERA) as per the equation below (Fields 11):
$$\mathrm{ERA} = \frac{9 \times \text{Earned runs}}{\text{Innings pitched}} \tag{4}$$
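As a minimal sketch, the ERA calculation can be written as follows. The function name and the sample figures are hypothetical and serve only to show the arithmetic.

```python
def earned_run_average(earned_runs, innings_pitched):
    """Earned Run Average: earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher: 60 earned runs allowed over 200 innings.
example_era = earned_run_average(earned_runs=60, innings_pitched=200)
print(round(example_era, 2))
```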
The descriptive statistics of the relevant baseball games have been collected, collated and analyzed by various statistical agencies. Random samples have been extracted from the collated descriptive statistics, and tabulated as per the tables below (Fields 12).
Table 1 below uses variable creation to adjust for errors, dependencies and other anomalies that exist in baseball games (Fields 21).
Table 1: RBI, Runs, and Runs Created during 1990 and 1999 by selected RBI Leaders (Fields 21)

| PLAYER | RBI | R | RC |
|---|---|---|---|
| Selected 1990 Scores | | | |
| Bonilla, B | 120 | 112 | 129 |
| Bonds, B | 114 | 104 | 191 |
| McGwire, M | 108 | 87 | 142 |
| Sandberg, R | 100 | 116 | 156 |
| Selected 1999 Scores | | | |
| Palmero, C | 142 | 96 | 204 |
| Williams, M | 142 | 98 | 138 |
| DelGado, C | 134 | 113 | 157 |
| Guerrero, V | 131 | 102 | 173 |

Note:
- RBI is Runs Batted In.
- R refers to Runs Scored.
- RC is the Runs Created as calculated using equation (3) above.
Table 2: Team Revenues and Statistics from 1990-99 (An abridged version) (Fields 23).

| TEAM | REV | WINPCT | RC | ERA | OUT | CONT |
|---|---|---|---|---|---|---|
| Baltimore | 97.5 | .513 | 792.8 | 4.392 | .300 | .200 |
| California | 61.9 | .473 | 730.5 | 4.471 | .400 | .300 |
| Cleveland | 83.2 | .534 | 851.8 | 4.380 | .200 | .600 |
| Detroit | 54.6 | .452 | 775.9 | 5.011 | .500 | 0 |
| Minnesota | 47.8 | .462 | 758.9 | 4.776 | .500 | .100 |
| Oakland | 61.1 | .496 | 766.7 | 4.639 | .200 | .300 |
| Tampa Bay | 79.2 | .407 | 760.8 | 4.705 | .100 | 0 |
| Atlanta | 85.9 | .596 | 759.8 | 3.498 | .100 | .800 |
| Colorado | 91.9 | .478 | 875.8 | 5.344 | .429 | .286 |
| Philadelphia | 61.3 | .471 | 712.7 | 4.298 | .700 | .100 |
| Florida | 60.4 | .442 | 683.5 | 4.397 | .571 | 0 |
| San Diego | 56.9 | .484 | 712.1 | 4.007 | .200 | .200 |
| Los Angeles | 91.9 | .513 | 705.9 | 3.692 | .300 | .600 |

Note:
- Revenues are in millions of 1999 dollars using the CPI Index.
- WINPCT is winning percentage.
- RC is Runs Created.
- ERA is Earned Run Average.
- OUT is a dummy variable for teams finishing twenty or more games out of first place in their respective division.
- CONT is a dummy variable for teams finishing within five games of first place in the division.
Table 3: Estimates of MRP, Actual Salaries, and Production Statistics (Fields 28).

PITCHERS

| PLAYER | ERA | IP% | MRP | SALARY |
|---|---|---|---|---|
| Roger Clemens | 2.41 | 0.19 | 7,487,971 | 3,180,323 |
| John Smiley | 3.86 | 0.11 | 3,627,553 | 5,592,679 |
| Denny Neagle | 2.97 | 0.16 | 4,777,413 | 2,442,192 |

HITTERS

| PLAYER | RC | MRP | SALARY |
|---|---|---|---|
| Javier Lopez | 38.4 | 871,398 | 125,533 |
| Bill Spiers | 37.8 | 858,150 | 424,729 |
| Ozzie Guillen | 28.1 | 657,806 | 500,000 |

Note:
- ERA is earned run average.
- IP% is percentage of team innings pitched.
- RC is runs created.
- MRP is estimated marginal revenue product in real 1999 dollars.
- SALARY is the actual salary in real 1999 dollars.
The following two
tables show descriptive statistics that have been collated and analyzed.
Table 4: Compilation of the Average Total Team Revenues and other Descriptive Statistics by Year (Fields 24).

| YEAR | REV | RC | ERA | OUT | CONT |
|---|---|---|---|---|---|
| 1990 | 66.1 | 725.6 | 3.86 | .308 | .269 |
| 1991 | 70.8 | 720.5 | 3.91 | .346 | .192 |
| 1992 | 72.4 | 706.1 | 3.74 | .423 | .192 |
| 1993 | 73.1 | 776.7 | 4.18 | .393 | .214 |
| 1994 | 45.4 | 590.9 | 4.51 | .071 | .464 |
| 1995 | 55.1 | 728.3 | 4.45 | .428 | .321 |
| 1996 | 70.0 | 848.1 | 4.62 | .214 | .418 |
| 1997 | 82.1 | 817.7 | 4.39 | .179 | .321 |
| 1998 | 84.5 | 815.0 | 4.43 | .433 | .233 |
| 1999 | 94.6 | 865.0 | 4.71 | .533 | .267 |
Table 5: Average Salary among Professional Baseball Players in Constant 1999 Dollars (Fields 25).

| YEAR | HITTERS | PITCHERS |
|---|---|---|
| 1990 | $827,507 (745,348) | $763,856 (687,340) |
| 1991 | $1,122,047 (1,123,599) | $1,149,394 (1,135,239) |
| 1992 | $1,277,637 (1,411,698) | $1,320,195 (1,417,248) |
| 1993 | $1,257,828 (1,504,463) | $1,259,855 (1,492,623) |
| 1994 | $1,313,949 (1,545,810) | $1,249,382 (1,502,080) |
| 1995 | $1,213,318 (1,784,463) | $1,062,461 (1,643,588) |
| 1996 | $1,086,363 (1,623,599) | $858,016 (1,312,878) |
| 1997 | $1,304,520 (1,805,201) | $1,060,511 (1,512,825) |
| 1998 | $1,468,324 (1,989,561) | $1,221,230 (1,620,667) |
| 1999 | $1,748,757 (2,124,505) | $1,656,944 (2,004,616) |

Note: Standard deviations are in parentheses.
Statistical analysis.
The winning percentage function for a baseball team was estimated as shown below (Fields 13):
$$\mathrm{WINPCT} = 0.547 + 0.000235\,\mathrm{RC} - 0.052\,\mathrm{ERA} - 0.004\,\mathrm{NATLG} + 0.046\,\mathrm{CONT} - 0.043\,\mathrm{OUT} \tag{4}$$
Standard errors: constant 0.021; RC 0.000024; ERA 0.004; NATLG 0.004; CONT 0.005; OUT 0.005.
Each coefficient in equation (4) gives the change in the winning percentage (WINPCT) for a one-unit change in the corresponding variable. For instance, a single run created (RC) increases the winning percentage by 0.000235. The NATLG coefficient (0.004) is no larger than its standard error, so the difference in WINPCT between the two leagues is not significant. Based on equation (4), a contending team is likely to finish 0.046 above other teams that have an equivalent player performance. The estimates of the revenue function (equation 2) show that an increase of 0.1 in WINPCT increases the total team revenue by $9,630,000. NATLG shows that there is no statistically significant difference between total team revenues collected in the American League and the National League. Year dummies show that the 1994-1995 strike did have an adverse effect on total team revenue (Fields 14).
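To make the estimated winning percentage function concrete, the sketch below plugs one row of Table 2 into it. Two points are assumptions rather than statements from Fields: the CONT coefficient is entered as +0.046, following the text's remark that a contender finishes 0.046 above comparable teams, and the decade averages for Atlanta in Table 2 (including fractional OUT and CONT values) are used as if they were a single season.

```python
# Coefficients of the estimated winning percentage function quoted above.
# Assumption: CONT is taken as positive (+0.046), matching the statement
# that a contending team finishes 0.046 above comparable teams.
COEFFICIENTS = {
    "const": 0.547,
    "RC": 0.000235,
    "ERA": -0.052,
    "NATLG": -0.004,
    "CONT": 0.046,
    "OUT": -0.043,
}


def predicted_winpct(rc, era, natlg, cont, out):
    """Evaluate the linear winning percentage function for one team."""
    c = COEFFICIENTS
    return (c["const"] + c["RC"] * rc + c["ERA"] * era
            + c["NATLG"] * natlg + c["CONT"] * cont + c["OUT"] * out)


# Atlanta row of Table 2 (National League team, decade averages).
atlanta = predicted_winpct(rc=759.8, era=3.498, natlg=1, cont=0.800, out=0.100)
print(f"predicted {atlanta:.3f} vs. .596 reported in Table 2")
```

The prediction comes out somewhat below the .596 in Table 2, which is unsurprising given that averaged inputs are being fed into a function estimated on season-level data.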
The MRP calculation assumes that no externalities influence individual performance, so that the linear sum of the individual performances in a team provides a measure of team performance. The calculation of an individual MRP uses equation 1 and equation 2, as follows. An increase of 0.001 in WINPCT raises TOTREV by $96,306. RC is the most apposite measure for a hitter, while ERA is the most apposite measure for a pitcher. An increase of a single unit in RC increases the team’s WINPCT by 0.000235, while a decrease of a single unit in ERA increases the team’s WINPCT by 0.054. Since revenue is valued per 0.001 of WINPCT, the player's WINPCT contribution is expressed in thousandths, and the MRP for a hitter is (Fields 14):
$$\text{MRP of a hitter} = \frac{0.000235 \times \text{annual RC}}{0.001} \times \$96{,}306 \tag{5}$$
Hence, a high RC increases the value of MRP.
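The hitter calculation can be checked against Table 3 with a short sketch. It assumes that $96,306 is the revenue gained per 0.001 of WINPCT, which is consistent with the $9,630,000 per 0.1 quoted earlier; the function and constant names are hypothetical.

```python
RC_COEFFICIENT = 0.000235          # WINPCT gained per run created
REVENUE_PER_THOUSANDTH = 96_306    # assumed: dollars per 0.001 of WINPCT


def hitter_mrp(annual_rc):
    """Projected MRP of a hitter from his annual Runs Created."""
    winpct_gain = RC_COEFFICIENT * annual_rc
    return winpct_gain / 0.001 * REVENUE_PER_THOUSANDTH


# (player, RC, MRP reported in Table 3)
TABLE_3_HITTERS = [
    ("Javier Lopez", 38.4, 871_398),
    ("Bill Spiers", 37.8, 858_150),
    ("Ozzie Guillen", 28.1, 657_806),
]

for name, rc, reported in TABLE_3_HITTERS:
    print(f"{name}: computed {hitter_mrp(rc):,.0f}, reported {reported:,}")
```

The computed values land within a few percent of the MRP column in Table 3; the remaining gap may reflect rounding of the published coefficient and RC figures.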
The total ERA of a team is the weighted average of its pitchers’ ERAs. Each pitcher's ERA is weighted by his IP% (his share of team innings pitched), so the individual ERA productivity function must be multiplied by his IP%. With the WINPCT term expressed in thousandths, since $96,306 is the revenue per 0.001 of WINPCT, this gives (Fields 15):
$$\text{MRP of a pitcher} = \$96{,}306 \times \mathrm{IP\%} \times \frac{0.547 - (0.054 \times \mathrm{ERA})}{0.001} \tag{6}$$
Thus, a high ERA reduces the MRP.
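A similar sketch applies the pitcher formula to the three pitchers in Table 3. As with the hitters, it assumes that $96,306 is the revenue per 0.001 of WINPCT; the names used are hypothetical, and the 0.054 ERA coefficient is the one quoted in the text for this formula.

```python
INTERCEPT = 0.547
ERA_COEFFICIENT = 0.054            # value used in the pitcher formula above
REVENUE_PER_THOUSANDTH = 96_306    # assumed: dollars per 0.001 of WINPCT


def pitcher_mrp(era, ip_share):
    """Projected MRP of a pitcher from his ERA and share of team innings."""
    winpct_term = ip_share * (INTERCEPT - ERA_COEFFICIENT * era)
    return winpct_term / 0.001 * REVENUE_PER_THOUSANDTH


# (player, ERA, IP%, MRP reported in Table 3)
TABLE_3_PITCHERS = [
    ("Roger Clemens", 2.41, 0.19, 7_487_971),
    ("John Smiley", 3.86, 0.11, 3_627_553),
    ("Denny Neagle", 2.97, 0.16, 4_777_413),
]

for name, era, ip_share, reported in TABLE_3_PITCHERS:
    print(f"{name}: computed {pitcher_mrp(era, ip_share):,.0f}, "
          f"reported {reported:,}")
```

For Clemens and Smiley the result is within about 2% of the MRP column, while for Neagle it is noticeably higher. The gap may reflect rounding of IP% in the table or details of the original calculation that are not reproduced here.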
Studies
have shown that miscellaneous inputs such as trading capabilities, managerial
performance and investment in stadiums, do not influence the MRP (Fields 15).
Statistical regression analysis.
Statistical analysis of the compensation of baseball players uses linear regression, as explained below. Salary is related to MRP by regressing salary on projected MRP as follows (Fields 15):
$$\mathrm{Salary}_i = \alpha + \mu\,\mathrm{MRP}_i + e_i \tag{7}$$
Where: i denotes the individual player, MRP_i is his projected MRP, and µ is the coefficient on projected MRP.
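The regression of salary on projected MRP can be illustrated with ordinary least squares on the six players listed in Table 3. This is only a sketch: pooling three pitchers and three hitters into one six-point sample is an assumption made here for illustration, and the resulting slope is not Fields' estimate.

```python
# (projected MRP, actual salary) pairs from Table 3, in 1999 dollars.
TABLE_3_PAIRS = [
    (7_487_971, 3_180_323),  # Roger Clemens
    (3_627_553, 5_592_679),  # John Smiley
    (4_777_413, 2_442_192),  # Denny Neagle
    (871_398, 125_533),      # Javier Lopez
    (858_150, 424_729),      # Bill Spiers
    (657_806, 500_000),      # Ozzie Guillen
]


def ols_fit(pairs):
    """Simple least-squares fit of y = alpha + mu * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    mu = sxy / sxx
    alpha = mean_y - mu * mean_x
    return alpha, mu


alpha, mu = ols_fit(TABLE_3_PAIRS)
print(f"alpha = {alpha:,.0f}, mu = {mu:.2f}")
```

On this small sample the slope comes out at roughly 0.5, that is, below 1, which points in the same direction as the argument that salaries fall short of projected MRP.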
If the player is paid his full marginal revenue product, then µ=1. Based on Table 3, µ is biased downwards, which indicates that the players are underpaid. For instance, the coefficient (µ) of MRP for Ozzie Guillen is 0.76 (500,000/657,806), and this implies that he was paid 76% of his estimated value. This µ falls short of 1 by 0.24 (1 - 0.76 = 0.24), and this shows that the player is underpaid. From the perspective of the performance of the baseball team, such a player is worth more than his remuneration.
For pitchers like Roger Clemens, the µ is:
µ=3,180,323/7,487,971= 0.42.
This implies that the pitcher is underpaid. Moreover, Table 3 shows that pitchers are more underpaid as compared to hitters.
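The salary-to-MRP ratios quoted above can be reproduced for every player in Table 3 with a few lines of Python. The dictionary and variable names are hypothetical; the figures are copied from the table.

```python
# player: (projected MRP, actual salary), both in 1999 dollars (Table 3).
TABLE_3 = {
    "Roger Clemens": (7_487_971, 3_180_323),
    "John Smiley": (3_627_553, 5_592_679),
    "Denny Neagle": (4_777_413, 2_442_192),
    "Javier Lopez": (871_398, 125_533),
    "Bill Spiers": (858_150, 424_729),
    "Ozzie Guillen": (657_806, 500_000),
}

# Ratio of salary to projected MRP; a value of 1 means pay equals MRP.
ratios = {name: salary / mrp for name, (mrp, salary) in TABLE_3.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

A ratio below 1 means the salary is less than the projected MRP, and a ratio above 1 (as for John Smiley in this table) means the salary exceeds it.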
Conclusion.
Statistical concepts can be
applied to the game of baseball, and as such, one is able to measure the
individual productivity of the players in relation to their compensation. The
MRP of an individual baseball player is relatively independent, and hence the
contribution of this player to the team can be easily evaluated. Theoretically,
a business entity is able to pay the employee a wage that roughly equals his
MRP. If the player is paid his full marginal revenue product, then µ is equal to 1. However, µ is less than 1, and this shows that the baseball players are underpaid. Also, pitchers are more underpaid as compared to hitters.
Works cited.
Fields, Brian. "Estimating the value of Major League Baseball players." PhD Thesis, Greenville: East Carolina University Press, 2007. Print.