Monday, August 28, 2017

Finding complexity in analysis: sports (American football) examples

This post is a bit disjointed.  Mostly I'm interested in some sports stuff I was talking about with my nephew, but in the context of this blog, I thought it might be interesting because it shows some of the complexity that comes in analysis of various ideas--a complexity that gets a lot of researchers stuck.  Unfortunately, this blog post doesn't offer any real answers for researchers. 
My nephew and I were discussing football. He said, in reference to quarterback play in particular, that talent basically comes in two tiers: those who have it and those who don’t. There’s a part of me that has sympathy with this argument: I do think that there are some guys who “get it” and others who don’t, guys who make the right plays at the right time and guys who don’t.
There’s another part of me that thinks it’s a lot more complex.
I am wary, for example, of psychological effects, like the influence of one strikingly important play. Tony Romo, for example, will never live down the fumbled kick hold against Seattle. Is that one play representative of Romo’s ability, or does the psychological impact of this striking event lead to over-emphasizing the one play in the larger evaluation of Romo?
The question of sports evaluation is one that interests me a lot and one that I have thought about discussing in this blog, because it’s a process of research and there are interesting issues that come up. Interesting to me, anyway.  Questions of evaluation and measurement are crucial to many areas of research, and both evaluation and measurement often involve questions of definition that are generally interesting. 
I don’t really agree with my nephew’s assertion, and I’m just going to work through that a little bit, partly as an illustration of general questions of reasoning.

Generally speaking, a lot of scholarship and research grows out of an assertion that is interesting and seems problematic.  There is the claim that there are basically two tiers of QB talent, and we might ask if there are any reasons that an observer would think this even if it’s not the case. 
We might observe two tiers of quarterback play for (at least) two reasons: one is that there actually are two tiers of talent, another is that there is some sort of threshold effect—a level of play over which QBs can be effective, and below which they are not.  Such a performance threshold would divide QBs into those who succeeded and those who didn’t, and would lead to two apparent tiers of talent.
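To make the threshold idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the "talent" scores are made-up numbers drawn from a single bell curve, and the threshold of 1.0 is arbitrary. The point is only that a continuous spread of ability, passed through a pass/fail cutoff, produces what looks like two groups.

```python
import random


def simulate_talent(n, seed=0):
    """Draw n hypothetical talent scores from one bell curve (no tiers built in)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def classify(talents, threshold):
    """Label each score by whether it clears the (assumed) performance threshold."""
    return ["effective" if t >= threshold else "ineffective" for t in talents]


talents = simulate_talent(1000)
labels = classify(talents, threshold=1.0)

# The underlying scores vary smoothly, but an observer only sees two labels.
print("effective:", labels.count("effective"))
print("ineffective:", labels.count("ineffective"))
```

Nothing here says real quarterback talent is bell-shaped. The sketch only shows that seeing two groups does not, by itself, tell us there are two tiers underneath.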

All of that, however, depends on implicitly thinking about talent as a single unified construct. Maybe “talent” is complex? This is similar to “intelligence”: we discuss it as a general ability, but maybe it’s actually a complex of different abilities?
When we look at quarterback play, there are actually several distinct dimensions of significance. We might say there is size, strength, speed, and intelligence. We might get a finer set of criteria:
  • arm strength
  • strength to withstand contact
  • running speed
  • running quickness/direction change
  • footwork
  • decision making
Some of these may themselves be complex: arm strength, for example, might be seen as a reduction of a whole range of concerns in throwing the ball: velocity, placement, touch (the ability to suit velocity and angle to circumstance). Or for decision making, there are concerns for reading the defense, for making adjustments/audibles, for making choices of when and where to throw the ball or to run. Other ideas from sports might also be worthy of consideration: consistency, response to pressure (choker?), field vision.
It makes sense to talk about “talent” as a unified thing, because often that’s all that casual conversation needs.  But when you start to look at things as a researcher, and examine things, often a lot of complexity emerges that muddies the waters of simple ideas like talent.

One thing that muddies the waters in trying to evaluate “talent” is the fact that a lot of talent manifests in uncertain ways. A good free-throw shooter in basketball doesn’t hit 100% of free throws. A good quarterback connects on maybe 70% of his passes. How we evaluate an athlete often depends on a small sample of performances that might be indicative of the underlying talent. A quarterback who completes 70% of passes might complete 22 of 25 one night, and 13 of 25 on another. The actual performance is not a perfect reflection of underlying talent. This is particularly true of football, where the coordination of a team is of crucial importance. Tony Romo, mentioned above, may partly be forever remembered as having muffed the kick snap because of the highly controversial Dez Bryant incompletion against the Packers: had Bryant held the ball more firmly, or had it been ruled a completion, the Cowboys might well have taken the lead and won, and suddenly Romo would be remembered for a big fourth-down throw and a comeback playoff win. Romo made a good pass on that play. It wasn’t the best pass ever, but it was a good pass.
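The 22-of-25 and 13-of-25 nights can be put in rough numbers. The sketch below assumes each pass is an independent coin flip with a 70% chance of a completion, which is a big simplification (real passes depend on the defense, the receiver, the situation), and uses the binomial formula to ask how likely such nights are for the same underlying passer.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k completions in n independent attempts at rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(k, n, p):
    """Probability of k or fewer completions."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def prob_at_least(k, n, p):
    """Probability of k or more completions."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


n, p = 25, 0.70  # 25 attempts, assumed true completion rate of 70%
print(f"P(13 or fewer of 25) = {prob_at_most(13, n, p):.3f}")
print(f"P(22 or more of 25)  = {prob_at_least(22, n, p):.3f}")
```

Under these assumptions both kinds of night are uncommon in any single game but not impossible, so over a season of starts a 70% passer could plausibly look both brilliant and awful without his ability changing at all.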

The gap between underlying ability and actual results can make it difficult to assess underlying talent. For example, the question of Teddy Bridgewater’s ability to make it as an NFL quarterback may be moot: with the injury Bridgewater may never return to the level he previously attained.  Bridgewater was on the verge of making it as a young quarterback. His arm strength was suspect, but the rest of his ability set seemed more than adequate. If he never gets back to the starting position he once held, we can never know whether he had talent or not: would he have grown enough to excel, or would he have always played at the edge of competence, no matter his development?
Bridgewater is a good example of why I don’t think the two-tiers-of-talent theory holds up, but maybe also of why it seems plausible. Bridgewater was generally suspect for two reasons: his arm strength, and his more general physical stature/strength. If things went right for Bridgewater, he had more than sufficient talent. But in moments when he was stretched, he had to play at the edge of his ability, and that can cause breakdowns that look glaring, especially when we take into account that football results often depend on only a small number of plays. Suppose Bridgewater is able to complete 70% of passes under 15 yards downfield, but, due to his ‘poor’ arm strength, he can only hit 50% on passes over 15 yards, whereas a comparably accurate strong-armed passer, who also hits 70% of short passes, only drops off to 60% on the longer throws. In contexts where the game may ride on one play, that 10-percentage-point gap will probably lead to significantly different won-loss records. Indeed, it might well be the difference between holding a job as an NFL starter and losing it, but it’s hard to look at two quarterbacks who are in many respects comparable in performance and say that one guy has NFL talent while Bridgewater doesn’t.
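A quick bit of arithmetic shows how similar these two hypothetical passers would look overall. The 70%, 50%, and 60% figures come from the example above; the share of throws that travel over 15 yards (20% here) is purely my assumption for illustration.

```python
def overall_rate(short_rate, long_rate, long_share):
    """Blend short- and long-pass completion rates by the share of long throws."""
    return (1 - long_share) * short_rate + long_share * long_rate


LONG_SHARE = 0.20  # assumed fraction of passes thrown over 15 yards

weaker_arm = overall_rate(0.70, 0.50, LONG_SHARE)
stronger_arm = overall_rate(0.70, 0.60, LONG_SHARE)

print(f"weaker arm overall:   {weaker_arm:.2f}")
print(f"stronger arm overall: {stronger_arm:.2f}")
print(f"gap on a single long throw: {0.60 - 0.50:.2f}")
```

With that assumed mix, the overall completion rates differ by only a couple of points, even though the gap on any one deep throw is five times larger. That is roughly the situation I have in mind: comparable passers on paper, with the difference concentrated in a handful of plays.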

An example that my nephew and I discussed was Eli Manning. Manning clearly has the ability to make big throws in big games. Manning is also notoriously inconsistent in his play (he is consistent in playing every game; he doesn’t miss starts), and this has worked against him throughout his career, showing up in high interception rates in particular. My nephew suggested that talent and consistency were separate. But I wonder whether consistency is part of talent. If Manning threw a few more interceptions, he would not be able to hold a starting job. If Manning had not won two Super Bowls, he might have lost his job before now.
The problem in trying to analyze this, again, is that our data is probabilistic: Manning’s actual performance is our guide to guessing his underlying talent, but to what extent is that assessment based on some element of luck—on the receiver making the good catch in the big game and the drop in the unimportant game, rather than the other way around? David Tyree didn’t make a lot of big catches in small games, but he made the helmet catch, and as a result, Eli has one of his Super Bowl MVP trophies (Manning deserved it, but if Tyree doesn’t hold that ball, Manning doesn’t win it).

Anyway, this blog post is really a bunch of rather confused notes, but I wanted to try to sort out some of the different ideas I was thinking while I was texting with my nephew. And it seems to me that it might serve as an example for more general issues for research—in particular the questions of how to define terms, and the ways that under analysis terms often reveal unanticipated complexity, but also some of the other ideas that researchers want to take into account: are observations characteristic, or are they the result of some random variation? Can we understand an apparent observation in terms of some complex behavior (i.e., does a “two tiers of talent” theory come out of some threshold effect)?

The complexity that is revealed by close examination is often frustrating and intimidating: if you want a simple answer, it is frustrating, and if you want to do research, it can be intimidating because of all that needs to be done.
