Abstract

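Within an SNS social network, consumers form sub-networks, that is, clusters built on closer ties. This study selects the sub-networks that are key to predicting the final outcome of information diffusion, designates them as predictable clusters, and examines their network characteristics using network methodology. To this end, we used big data from a Korean SNS site to derive the social network that appears on the site. To select the sub-networks that serve as predictable clusters for the information diffusion occurring in this network, we analyzed the correlation between total diffusion in the social network and diffusion in each individual sub-network. Based on two correlation analyses, we selected the predictable clusters and analyzed their network characteristics, as well as the information diffusion and information adoption characteristics within these clusters' networks.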
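The correlation-based selection step described above can be sketched roughly as follows. This is a minimal illustration rather than the study's actual procedure: the Pearson coefficient, the `threshold` value, the function names, and the toy adoption counts are all assumptions made for the example, and total diffusion is taken here simply as the sum over the listed sub-networks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no co-movement information
    return cov / sqrt(var_x * var_y)

def select_predictable(sub_series, threshold=0.9):
    """Return ids of sub-networks whose adoption curve tracks the total.

    sub_series maps a sub-network id to its per-period adoption counts.
    The threshold is an illustrative assumption, not a value from the study.
    """
    periods = len(next(iter(sub_series.values())))
    # Total diffusion per period, summed over all sub-networks.
    total = [sum(s[t] for s in sub_series.values()) for t in range(periods)]
    return sorted(k for k, s in sub_series.items()
                  if pearson(s, total) >= threshold)

# Hypothetical per-period adoption counts for three sub-networks.
adoptions = {
    "A": [1, 2, 4, 8, 4, 2],
    "B": [2, 4, 8, 16, 8, 4],
    "C": [8, 4, 2, 1, 2, 4],
}
selected = select_predictable(adoptions)
```

In this toy data, sub-networks A and B rise and fall with the total while C moves against it, so only A and B pass the threshold.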

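The results are as follows. First, among the 100 sub-networks extracted from the social network, the diffusion characteristics of those selected as predictable clusters showed significant effects for the q-value, takeoff time, and peak time. Specifically, the smaller the q-value of a sub-network selected as a predictable cluster, the higher its correlation with the total diffusion across the 100 sub-networks, making it better suited to predicting overall diffusion. Likewise, the shorter the takeoff time and the longer the peak time, the more pronounced the characteristics of a predictable cluster. Second, regarding adoption characteristics, the sub-networks selected as predictable clusters differed significantly in adoption volume. The larger the adoption volume of these sub-networks, the greater the total diffusion across the 100 sub-networks, so the characteristics of predictable clusters point toward predicting a successful quantitative increase in total diffusion. Based on these results, we draw several implications and also present the study's limitations.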

In this study, we explore whether a predictable cluster exists and identify its properties. For this purpose, we first select a certain number of individuals at random and identify their local networks. Each selected individual, together with his or her local network, becomes a sub-network within the large network of the total population. Then, we select those sub-networks whose adoption behavior correlates highly with that of the total population. Next, we run a split-half test to investigate whether the selected sub-networks perform well in predicting the adoption behavior of the total population. Finally, we identify the properties of the selected sub-networks, which we call predictable clusters, with respect to network characteristics drawn from social network theory. To measure the properties of a sub-network, we use four network indices: path length, clustering coefficient, betweenness centrality, and closeness centrality. Path length is the geodesic distance from node i to node j. Clustering coefficient is the proportion of triangles in the network, that is, sets of three nodes each of which is connected to each of the others. Betweenness centrality is the proportion of all geodesics between pairs of other nodes that include this node. Closeness centrality is the number of other nodes divided by the sum of all distances between the node and all others. In particular, we show the process of identifying predictable clusters through a social network approach.
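The four indices defined above can be illustrated with a short, dependency-free sketch. It is only one reading of those definitions and makes several assumptions: the graph is undirected, unweighted, and connected; path length is averaged over all node pairs; and the clustering coefficient is computed as the share of connected triples that close into triangles. The function names and the toy graph are hypothetical.

```python
from collections import deque
from itertools import combinations

def bfs(adj, source):
    """Return (distance, shortest-path count) from source to each node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u lies on a geodesic to v
                sigma[v] += sigma[u]
    return dist, sigma

def average_path_length(adj):
    """Mean geodesic distance over all ordered pairs of distinct nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist, _ = bfs(adj, s)
        total += sum(dist.values())      # the source itself contributes 0
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """Share of connected triples whose two end nodes are also linked."""
    closed = triples = 0
    for node, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            closed += b in adj[a]
    return closed / triples if triples else 0.0

def closeness(adj, node):
    """Number of other nodes divided by the summed distance to them."""
    dist, _ = bfs(adj, node)
    return (len(dist) - 1) / sum(dist.values())

def betweenness(adj, node):
    """Average share of geodesics between other pairs that pass through node."""
    info = {s: bfs(adj, s) for s in adj}
    others = [u for u in adj if u != node]
    score, pairs = 0.0, 0
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        pairs += 1
        if dist_s[node] + info[node][0][t] == dist_s[t]:
            score += sigma_s[node] * info[node][1][t] / sigma_s[t]
    return score / pairs

# Toy sub-network: a triangle (1, 2, 3) with node 4 attached to node 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
apl = average_path_length(graph)
cc = clustering_coefficient(graph)
close_3 = closeness(graph, 3)
between_3 = betweenness(graph, 3)
```

In the toy graph, node 3 is the only route to node 4, so it scores highest on both closeness and betweenness.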

The major findings of this study are as follows. First, the greater the q-value or peak time, the more likely the sub-network is to be a predictable cluster. Second, the smaller the takeoff time, the more likely the sub-network is to be a predictable cluster. Third, the greater the adoption volume, the more likely the sub-network is to be a predictable cluster.
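The abstract does not define the q-value or peak time. One common reading, offered here only as an assumption, is that they come from the Bass diffusion model, where p is the innovation coefficient, q is the imitation coefficient, and F(t) is the cumulative fraction of adopters at time t. Under that assumption, the adoption curve and the time at which the adoption rate peaks (for q > p) would be:

```latex
F(t) = \frac{1 - e^{-(p+q)t}}{1 + \frac{q}{p}\, e^{-(p+q)t}},
\qquad
t^{*} = \frac{\ln(q/p)}{p+q}
```

Takeoff time has no single standard closed form in this model, so its operational definition is left to the full paper.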

The implication of this study is that predictable clusters can be a reliable predictor of overall information diffusion, and that by identifying the predictable clusters, managers can develop a more efficient communication strategy early in the diffusion process. Limitations and future directions are discussed.