You're Probably Not Popular, and Why That Helps People
Model Building #1 - Social Networks and Centrality
The transition from sports analytics to some of the other fields I’m interested in, like economics, can be explained with a simple sentence: they’re all roughly the same. If you had asked me about this a few months ago, I wouldn’t have given this answer, but alas, with more experience comes better insight. At its core, sports analytics boils down to human decision making: what factors can push humans away from otherwise optimal decisions, and how teams can best optimize the value of a single possession. This is why the sports world, even with its relatively “low stakes”¹, has gargantuan thought experiments that attract bright minds.
My interest in human decision making has compelled me to write about some of the ways humans can build models to solve bigger problems.
The first concept I’ll discuss is a counterintuitive - albeit elegant - rule for social networks. Feld’s friendship paradox is the phenomenon where, on average, your friends have more friends than you do. I call this paradox counterintuitive because most of you probably don’t feel less connected than your peers. The mathematics behind it, however, comes down to sampling bias.
This sampling bias arises because a person with many friends appears in many people’s friend lists, while a person with few friends appears in only a few. When we average over friends rather than over individuals, highly connected people are counted once for every friendship they have, so they are over-represented in the sample. As a result, the typical person’s friends look more popular than the typical person, even though nobody’s own friend count has changed. An example makes this concrete.
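Zachary’s Karate Club is a famous example of a social network. In the commonly used version of this network there are n = 34 nodes and m = 78 edges, giving a mean degree of:

$$\langle k \rangle = \frac{1}{n}\sum_{i=1}^{n} k_i = \frac{2m}{n} = \frac{2 \cdot 78}{34} \approx 4.59$$

where $k_i$ is the degree of node $i$.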
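The degree $k_i$ of each node can be thought of as the number of friends an individual has. Writing $\langle k \rangle$ for the mean degree, the average number of friends of friends is:

$$\langle k_{\text{friend}} \rangle = \frac{\sum_{i=1}^{n} k_i^2}{\sum_{i=1}^{n} k_i} = \frac{\langle k^2 \rangle}{\langle k \rangle} = \langle k \rangle + \frac{\sigma_k^2}{\langle k \rangle}$$

where $\sigma_k^2$ is the variance of the degrees. Because the variance cannot be negative, this average is never below $\langle k \rangle$, and it exceeds it whenever people differ in how many friends they have.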
The counterintuitive aspect of the friendship paradox arises from how the expected value is calculated: each person is weighted by the share of all friendships they hold, rather than being counted once.
When assessing centrality or influence within a social network, it matters how the average is taken. If we simply average degrees across individuals, every person counts once, and the many sparsely connected individuals pull the figure down. That understates how often highly connected individuals (or hubs) actually show up among people’s friends, and therefore the impact they have on the network.
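In this context, determining the likelihood of selecting a specific node is vital. Each node's contribution to the average should reflect its probability of being chosen as someone's friend, which is proportional to its degree. Therefore, we calculate the probability of selecting node $i$ as:

$$p_i = \frac{k_i}{\sum_{j=1}^{n} k_j}$$

where $k_i$ is the degree of node $i$. The expected degree of a selected friend is then $\sum_{i} p_i k_i$.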
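To make the degree-weighted average concrete, here is a minimal Python sketch on a made-up five-person network. The names and edges are hypothetical and are not taken from the karate club data. It computes the plain mean degree and the mean degree of a friend, where each person is counted in proportion to their degree.

```python
from collections import defaultdict

def degree_stats(edges):
    """Return (mean degree, degree-weighted mean degree of a friend)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = sum(degree.values())            # equals 2 * number of edges
    mean_degree = total / len(degree)
    # A node is picked as "someone's friend" with probability k_i / total,
    # so the expected degree of a friend is the sum of k_i * (k_i / total).
    mean_friend_degree = sum(k * k for k in degree.values()) / total
    return mean_degree, mean_friend_degree

# Hypothetical toy network: "hub" knows everyone, plus one extra tie.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("a", "b")]
mean_k, mean_friend_k = degree_stats(toy_edges)
print(f"mean degree: {mean_k:.2f}, mean friend degree: {mean_friend_k:.2f}")
# prints "mean degree: 2.00, mean friend degree: 2.60"
```

Four of the five people here have fewer friends than the 2.60 a typical friend has, which is the paradox in miniature.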
Experiment 1.1: Social Networks
To demonstrate this (and impose the harsh truth), I created a computer simulation. This simulation explores the dynamics of friendship formation within a large social network, focusing on the influence of personality traits, specifically extraversion, on social connections. By modeling how individuals interact based on their levels of extraversion, the simulation illustrates the principles behind Feld's friendship paradox, which suggests that people often find that their friends have more friends than they do.
Methodology:
The simulation begins by creating a large social network composed of numerous individuals. Each individual is assigned an extraversion score, which reflects their propensity to form connections with others. These scores are generated using a power-law distribution², leading to a wide range of connectivity potential among the individuals in the network.
Friendship formation occurs based on these extraversion scores, where individuals with higher scores are more likely to forge new friendships. However, the probability of forming additional connections diminishes as a person’s number of existing friendships increases. This reflects a realistic social dynamic where individuals may have limits on the number of friends they can effectively manage.
As friendships are formed, the simulation calculates metrics that reveal the average number of friends individuals have compared to the average number of friends their friends possess. This part captures the essence of the friendship paradox: while individuals may have a certain number of friends, their friends are likely to be more connected within the network.
Throughout the simulation, the evolving social network is visualized, allowing observers to see how connections grow and change over time. The results are tracked through metrics that display trends in average individual degrees, average friend degrees, and the friendship paradox ratio.
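The methodology can be sketched roughly as follows. This is not the original simulation code: the population size, step count, Pareto exponent `alpha`, the `cap` used to squash scores into (0, 1], and the `saturation` damping term are all assumptions chosen for illustration, and the visualization is left out.

```python
import random

def simulate(n_people=500, steps=20000, alpha=2.5, cap=5.0, saturation=10.0, seed=0):
    """Grow a random friendship network; return (mean degree, mean friend degree)."""
    rng = random.Random(seed)
    # Power-law (Pareto) extraversion scores, clamped into (0, 1].
    extraversion = [min(1.0, rng.paretovariate(alpha) / cap) for _ in range(n_people)]
    friends = [set() for _ in range(n_people)]

    def willingness(i):
        # More extraverted people are more willing to form a tie, but each
        # existing friend lowers that willingness (assumed damping form).
        return extraversion[i] / (1.0 + len(friends[i]) / saturation)

    for _ in range(steps):
        a, b = rng.sample(range(n_people), 2)   # two distinct people meet
        if b not in friends[a] and rng.random() < willingness(a) * willingness(b):
            friends[a].add(b)
            friends[b].add(a)

    degrees = [len(f) for f in friends]
    total = sum(degrees)
    if total == 0:
        return 0.0, 0.0
    mean_degree = total / n_people
    # Each person is weighted by their degree, as in the friendship paradox.
    mean_friend_degree = sum(k * k for k in degrees) / total
    return mean_degree, mean_friend_degree

mean_k, mean_friend_k = simulate()
print(f"mean degree: {mean_k:.2f}")
print(f"mean friend degree: {mean_friend_k:.2f}")
if mean_k > 0:
    print(f"paradox ratio: {mean_friend_k / mean_k:.2f}")
```

The paradox ratio here is the mean friend degree divided by the mean degree. It can never fall below 1, and how far above 1 it sits depends on how unequal the degrees turn out to be under the chosen parameters.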
Besides this morose conclusion, the applications of this paradox are broad. When identifying influential people in an online social network, one’s influence can be gauged by how many friends of friends one has. An “unpopular” person can suddenly be quite useful.
However, when calculating influence in person (a much more important use case in epidemiological theory), one’s closeness with their friends should be examined. In a pandemic like COVID, for instance, a person with many acquaintances does not pose as much risk as someone with more close friends but significantly fewer acquaintances. A rudimentary approach to this involves weighting the strength of friendships.
Experiment 1.2: Weighted Friendship Network
The simulation incorporates weights on friendships to represent the strength or significance of each connection. Initially, individuals can establish up to three strong friendships, indicated by higher weights, which acknowledges the deeper social ties formed during these initial connections. After reaching three friendships, the weights decrease substantially, reflecting the common social phenomenon where additional friendships might be perceived as less significant.
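One way to read the weighting rule is sketched below. The specific weights (`strong=1.0`, `weak=0.2`), the requirement that both people still be under the three-friend limit for a tie to count as strong, and the example formation order are all assumptions made for illustration. The text does not say how the weighted paradox ratio itself is computed, so the sketch stops at each person's total tie strength.

```python
def assign_weights(edges_in_order, strong_limit=3, strong=1.0, weak=0.2):
    """Weight each friendship by how early it came for both people involved."""
    count = {}      # friendships formed so far, per person
    weights = {}
    for u, v in edges_in_order:
        both_early = count.get(u, 0) < strong_limit and count.get(v, 0) < strong_limit
        weights[(u, v)] = strong if both_early else weak
        count[u] = count.get(u, 0) + 1
        count[v] = count.get(v, 0) + 1
    return weights

def strength(weights):
    """Sum of tie weights per person (the weighted analogue of degree)."""
    totals = {}
    for (u, v), w in weights.items():
        totals[u] = totals.get(u, 0.0) + w
        totals[v] = totals.get(v, 0.0) + w
    return totals

# Hypothetical formation order: "hub" befriends five people, one after another.
order = [("hub", p) for p in ["a", "b", "c", "d", "e"]]
weights = assign_weights(order)
strengths = strength(weights)
for person, s in strengths.items():
    print(f"{person}: {s:.1f}")
```

In this toy case the hub has five friends but a strength of only about 3.4, because its fourth and fifth friendships are discounted.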
The quick decline in the friendship paradox ratio illustrates the effect of weights on the network. By adding weights of closeness, we can help less connected individuals regain some of their influence. Quality over quantity.
Most publicly available data does not provide insights into relationship strength, which could be gleaned from interactions such as messages sent between users, likes, comments, and other engagement metrics. Despite this limitation, the practice of weighting networks can be invaluable across various fields. Rather than merely seeking to "flatten the curve," we should recognize that the complexity introduced by weighting relationships can lead to more robust and insightful analyses of social dynamics.
This concept might seem trivial for a first topic, but the elegance of logic in the background reveals profound insights about human behavior and social dynamics. Most of all though, the simplicity makes it a worthy starting point.
¹ There are arguments to be made that it is ‘high stakes’, but I’ll stick with my interpretation of impact on the world as a whole.
² While not entirely realistic, the power-law distribution helped keep connectivity lower, so that the network did not reach an equilibrium early in the iterations.