Numbers (still) don't speak for themselves
Photo by Liam McGarry on Unsplash (https://bit.ly/4aYIrb4)


May 2024 Issue


I'm a psychology professor, and when I teach introductory psychology and lifespan development, we talk about egocentrism, which refers to a person's assumption that other people have the same perspective that they do. Sometimes this is literal: a person may mistakenly assume that "right" and "left" always mean from their perspective and not consider that those change according to the direction that a person is facing. That's why (a) English theatre directions specify "stage left" and "stage right," which are from the perspective of an actor or dancer facing the audience; (b) watercraft, aircraft, and spacecraft specify "port" (left) and "starboard" (right) from the perspective of a person on board and facing forward to the bow; and (c) before surgery, medical personnel will typically mark the correct side of a patient's body with a Sharpie – and specifically a Sharpie, due to its alcohol-based ink – as there are more left/right confusions, or "wrong-site adverse events," than I would have thought possible (with estimates of over 1,300 per year in the US).

But aside from the literal "your left or my left?" confusions, there is the matter of metaphorical egocentrism, or situations where one person assumes that another has the same information or reaches the same conclusions that they do. In developmental psychology, this kind of egocentrism is more common among young children, who might be disappointed when they're talking with you on a voice call and you don't comment on their new shoes that you can't see. Children eventually develop and learn enough to get past this naïve form of egocentrism, but it still shows up with adults in a surprising number of situations. Most adults know that their cars need oil and that it needs to be topped off or replaced from time to time – although my electric Fiat 500e thankfully doesn't need any oil – and they may assume this is common knowledge. On the other hand, my wife and I know from our own and our children's experiences that this is not common knowledge, and we have the repair receipts to prove it. And then, two years ago, I managed to ride my bicycle 100 miles through Utah's West Desert, to the edge of the Bonneville Salt Flats, before I learned the hard way what everybody else already seemed to know: drinking gallons of water is not enough; you also need electrolytes. Fortunately, my wife was monitoring my ride and was able to pick me up after I bonked. I'm much better at proper nutrition and hydration now, and I hope to finish that ride this summer.

[As a note, the developmental version of egocentrism is not to be confused with being a vain, self-centered, narcissistic "egotist." That's closer to a personality disorder and not a developmental phenomenon.]

Egocentrism can show up in data work in several ways:

  • The assumption that everybody is familiar with the same software, languages, and procedures that you are

  • The assumption that everybody values the same software, languages, and procedures that you do

  • The assumption that everybody gets the same insights from a data analysis that you do

I'll talk about each of these in turn.

Assumption of familiarity

Photo by Redd F on Unsplash (https://bit.ly/3VcTGHe)

Many years ago I went to weekly data visualization brown bag meetings sponsored by a computer science professor at another university. This person was extraordinarily talented and well-respected within the field. During one meeting, I mentioned that a particular data problem could be analyzed with a t-test, to which the professor responded "what's a t-test?" Given my background in experimental psychology, where t-tests are among the most basic and common inferential tests used, this felt a little like somebody saying, for example, "what's data?" I had assumed that this professor knew everything that I knew and a lot more beyond that. That was a foolish assumption: we were from completely different backgrounds, with different methods and different goals. Of course our knowledge would be different. I should have known better.
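For readers on the other side of that exchange: the independent-samples t-test asks whether the means of two groups differ by more than chance would explain. Here is a minimal sketch using only Python's standard library; the two groups of "reaction times" are made-up illustration data, not from any study mentioned here.

```python
import math
import statistics

def t_test_ind(a, b):
    """Student's t statistic for two independent samples
    (equal-variance, pooled) plus the degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Pooled variance weights each sample's variance by its df
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (mean_a - mean_b) / se
    return t, na + nb - 2

# Hypothetical reaction times (ms) under two conditions
control = [512, 498, 530, 505, 521, 517, 509]
treatment = [489, 476, 501, 482, 495, 488, 479]

t, df = t_test_ind(control, treatment)
print(f"t({df}) = {t:.2f}")  # a large |t| suggests the means differ
```

In practice one would use a library routine (e.g., SciPy's `ttest_ind`) and report the p-value; the point of the sketch is just what the test is doing under the hood.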

The assumption of familiarity as a form of egocentrism can lead to an endless number of miscommunications and misjudgments in data work. My main data tools, in order of complexity and frequency of use, are Google Sheets, jamovi, SPSS, and R. I have limited experience with SAS, Python, and other data tools. I also focus on basic descriptive analyses, and, where necessary, basic inferential procedures like correlation, regression, and ANOVA. My work falls squarely in the analytical paradigm. I have limited need for – and hence limited experience with – procedures that are common in machine learning or AI, such as random forests, neural networks, and transformers. It would be thoughtless of me to assume that people who specialize in machine learning would know what I know about, say, scale construction and internal consistency measures, just as it would be thoughtless of them to assume that I know what they work on daily in their projects.
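To make that concrete for the machine-learning reader: one common internal consistency measure is Cronbach's alpha, which asks how well the items of a questionnaire scale hang together. A minimal sketch, standard library only; the five-respondent, three-item Likert scale below is made-up illustration data.

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.
    items: one list of scores per scale item (columns of the data)."""
    k = len(items)
    item_vars = sum(statistics.variance(col) for col in items)
    # Each respondent's total score across all items
    totals = [sum(scores) for scores in zip(*items)]
    total_var = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Three hypothetical Likert items answered by five respondents
item1 = [4, 5, 3, 4, 2]
item2 = [4, 4, 3, 5, 2]
item3 = [5, 5, 2, 4, 3]

alpha = cronbach_alpha([item1, item2, item3])
print(f"alpha = {alpha:.2f}")  # values near 1 indicate consistency
```

Rules of thumb for "acceptable" alpha vary by field, which is itself a small example of the familiarity gap this section describes.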

Assumption of values

Photo by Resume Genius on Unsplash (https://bit.ly/3KyFG5v)

Probably even more important than the assumption of familiarity – "I know how to work with R, so I assume that you do, too" – is the assumption of values – "I think R is the best data tool there is, so I assume that you do, too." Actually, there are two versions of this assumption:

  • A descriptive assumption, which simply describes what the state of affairs currently is, like the example above

  • A prescriptive assumption, which takes the form of "I love R so you should love R, too"

That second version, which is about what I think you should do, is problematic. That's what makes politics and religion such uncomfortable topics at family gatherings; it's not just that I think or believe X, but the unstated corollary is that if you don't, then it's because you're either poorly informed about the superior nature of X (and I need to set you straight) or that you're a pigheaded, belligerent person (and I need to put you in your place). Neither situation ends well.

I hope the solution here is obvious: people have their own preferences and, whenever possible, those preferences should be respected. There may be exceptions, such as when a client requests that you work with a specific tool, or your employer requires a tool to facilitate collaboration or prohibits a tool because of security issues. Fair enough. But, otherwise, let people do their work.

Assumption of insights

Photo by Jen Theodore on Unsplash (https://bit.ly/4bVgzWD)

This last assumption gets directly to the title of this issue of the #DataIsForDoing newsletter: As much as it might make our data work easier, numbers (still) don't speak for themselves. Their relevance, their meaning, their insights are not self-evident. If they were, then all of us who work with data would be out of jobs.

The place where this assumption is most obvious is in presentations that don't include specific, actionable insights based on the analysis. If your client needs a yes or no – "Should we open a new location in this area?" or "Should we cancel this project that we've sunk enormous amounts of money into?" – then you owe them the courtesy of a recognizable yes or no response. Showing them pages and pages of tables and graphs so they can "form their own impressions" is just handing off the work that you were hired for, as is the assumption that they see the same things in the data that you do. It reminds me of when my students have submitted papers with extended quotes from the results section of another paper: the quote means that they didn't understand what they read, and they're just letting me deal with it. In the student's case, that misses the point of the paper, and in the analyst's case, that misses the point of the project.

Overcoming egocentrism

Photo by Mika Baumeister on Unsplash (https://bit.ly/3XgTvx0)

So, what do we do about it? The first step should be to check our work and make sure that (a) we understand the question that the client wanted answered; and (b) there is a clearly labeled section in our presentation or report that has the actionable insights. Ideally, those two sections would be on slides or pages titled "Research Question" and "Actionable Insights," respectively. Really, there's no such thing as being too clear about these. Don't assume that your client will follow your line of reasoning; map it out for them. Don't assume that they will see the implications of your analysis; spell them out for them. Close the gaps and, most likely, your work will have greater impact and your clients will be more grateful for your special skills and insights.
