Dissecting the Anatomy of Twitter Elites
The ease of access to information on social ties through online social networks, together with advances in computing over the past 15 years, has allowed researchers to study large social networks. However, our fascination with big data has led us to neglect one important aspect of social networks, namely the elite network. While social scientists devoted a great deal of attention to elite networks over the past century, not much research has been done on this topic recently.
In this research we analyze the structural properties of the Twitter elite network and its sub-structures. We define elites as the accounts with the largest number of followers. We use the induced follow and retweet graph of the Twitter elite in our analysis. We start with the induced graph of 500 elite accounts and study the structural changes as the network expands to include more elites. We identify groups of tightly connected accounts using consensus clustering based on many runs of a non-deterministic community detection algorithm. These identified elements show substantial social coherence, and many of them could not have been identified using well-known community detection algorithms. Moreover, we find that although expansion coincides with an increase in density, the sub-structural properties of the network remain stable. Finally, we pay close attention to individual accounts, identifying important elites using follow relations and retweets. We find these super-elites to be of different types when focusing on different relations.
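As a rough illustration of what an induced elite graph means here, the following Python sketch keeps the k accounts with the most followers and only the follow edges between them, then reports the directed density. The function names, the toy data, and the tie-breaking rule are assumptions made for this example; they are not taken from the study's own pipeline.

```python
def induced_elite_graph(follower_counts, follow_edges, k):
    """Return (elites, edges) for the k accounts with the most followers.

    follower_counts: dict mapping account -> number of followers
    follow_edges:    iterable of (follower, followee) pairs
    """
    # Rank by follower count; ties are broken by account name (an assumption).
    ranked = sorted(follower_counts, key=lambda a: (-follower_counts[a], a))
    elites = set(ranked[:k])
    # Keep only edges whose two endpoints are both elites (induced subgraph).
    edges = {(u, v) for (u, v) in follow_edges
             if u in elites and v in elites and u != v}
    return elites, edges


def directed_density(n_nodes, n_edges):
    """Fraction of possible directed edges that are present."""
    if n_nodes < 2:
        return 0.0
    return n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical toy data, for illustration only.
follower_counts = {"a": 900, "b": 800, "c": 700, "d": 100}
follow_edges = [("a", "b"), ("a", "c"), ("c", "a"),
                ("b", "c"), ("c", "b"), ("d", "a")]

for k in (2, 3):
    elites, edges = induced_elite_graph(follower_counts, follow_edges, k)
    print(k, sorted(elites), directed_density(len(elites), len(edges)))
```

Calling the function with increasing k gives successive views of the network, so structural measures such as density can be compared from one view to the next.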
When examining graph representations of networked systems (e.g., social networks, biological networks), a popular approach is to identify different “communities” (i.e., groups of nodes that satisfy certain inter- and intra-group connectivity properties) in an attempt to glean insight into the fine structure of the systems under study. This approach is greatly facilitated by a number of popular community detection (CD) algorithms. A curious feature of many of these CD algorithms is that different runs of a single algorithm often result in vastly different sets of communities. This intriguing and at the same time troubling observation raises the question of whether these detected communities have any “meaning” beyond satisfying a certain level of goodness with respect to connectivity.
To minimize the effect of non-deterministic variations in the identified communities, we adopt the following strategy: we run the Combo community detection technique on each view of the elite network n times and record the communities that each node is mapped to across the runs in a vector with n values, called the ‘community vector’. Then, we group all the nodes that are consistently (i.e., all n times) mapped to the same community (i.e., have the same community vector) and refer to such a group as a Stable Element. Clearly, increasing n is more restrictive and may lead to smaller stable elements, since additional runs can only split an element into two (or more) smaller ones. We conservatively use n = 100 in our analysis. This process also results in nodes for which no other node has the same community vector. We group these nodes together with the nodes in stable elements of size smaller than 10 and refer to them as the Unstable Element.
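The grouping step described above can be sketched in a few lines of Python. The sketch assumes the n partitions have already been computed (for example by n runs of Combo) and are supplied as a list of node-to-label dictionaries. The function name `stable_elements` and the toy data are hypothetical. The toy call uses a smaller size threshold than the value of 10 used in the analysis so that the example stays short.

```python
from collections import defaultdict


def stable_elements(runs, min_size=10):
    """Group nodes by community vector across repeated CD runs.

    runs: list of dicts, one per run, each mapping node -> community label.
          Labels only need to be consistent within a run, not across runs.
    Returns (stable, unstable): a list of node lists, and one sorted list
    holding every node whose group is smaller than min_size.
    """
    groups = defaultdict(list)
    for node in sorted(runs[0]):
        # The community vector: this node's label in each of the n runs.
        vector = tuple(run[node] for run in runs)
        groups[vector].append(node)

    stable, unstable = [], []
    for members in groups.values():
        if len(members) >= min_size:
            stable.append(members)
        else:
            unstable.extend(members)
    return stable, sorted(unstable)


# Hypothetical partitions from three runs over five nodes.
runs = [
    {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1},
    {"a": 1, "b": 1, "c": 0, "d": 0, "e": 1},
    {"a": 0, "b": 0, "c": 2, "d": 2, "e": 0},
]
stable, unstable = stable_elements(runs, min_size=2)
print(stable)    # [['a', 'b'], ['c', 'd']]
print(unstable)  # ['e']
```

Two nodes share a community vector exactly when they are co-assigned in every run, so the arbitrary numbering of communities from one run to the next does not affect the result.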
We also consider the effect of expansion on the community-level structure of the elite network. To this end, we keep track of the community of each individual node in each view. This in turn reveals the users that two communities in consecutive views have in common and shows the similarity of two communities in different views. Our visualization using a Sankey diagram shows that the majority of nodes in the communities of each view stick together to make up communities in the next view. The social and geo footprints of related communities in different views show that the theme of most communities remains the same across views, while the theme of some other communities evolves slightly as more nodes join the community or two communities merge.
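One way to compute the overlaps that feed such a Sankey diagram is sketched below in Python. It assumes each view is given as a dictionary from node to community label. The helper names and the use of Jaccard similarity are choices made for this illustration; the source does not state which similarity measure was used.

```python
from collections import Counter


def community_flows(prev_view, next_view):
    """Count nodes moving from each community in one view to each in the next.

    Nodes that appear only in next_view (new arrivals) are ignored here.
    """
    flows = Counter()
    for node, prev_comm in prev_view.items():
        if node in next_view:
            flows[(prev_comm, next_view[node])] += 1
    return flows


def jaccard(prev_view, next_view, prev_comm, next_comm):
    """Jaccard similarity between the member sets of two communities."""
    a = {n for n, c in prev_view.items() if c == prev_comm}
    b = {n for n, c in next_view.items() if c == next_comm}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical community assignments in two consecutive views.
view_1 = {"a": "P1", "b": "P1", "c": "P2", "d": "P2"}
view_2 = {"a": "N1", "b": "N1", "c": "N1", "d": "N2", "e": "N2"}

for (src, dst), count in sorted(community_flows(view_1, view_2).items()):
    print(src, "->", dst, count)
```

Each (source, target, count) triple corresponds to one ribbon of the diagram, and a ribbon that carries most of a community's nodes indicates that the community persists into the next view.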
Of course, the evolving theme of some communities is due to the arrival of many new nodes in each view of the elite network. To verify this, we expanded the views of the elite network by 20% in each step and observed that any change in the theme of an individual community occurs very slowly. This visualization nicely shows the effect of network expansion on the identified communities.
Use the table of influence measures to browse through our findings. Click on the following plots to interact with the graph. See the Technical Report for more information.