Abstract

Graph summarization is widely used to represent and process the large-scale graphs generated by social media, web services, and IoT applications. A summary preserves the structure and semantics of a graph while reducing its size, which improves query processing and graph analysis performance. However, as a graph grows over time, summarizing the whole graph again each time becomes increasingly expensive. In this paper, we propose a graph summarization scheme based on pattern generation that considers vertex connectivity, the number of neighboring vertices, and the ratio of vertex degrees. To summarize a graph that changes with the incoming graph stream, the proposed scheme finds 1-hop frequent patterns, represents them in a DSMatrix, expands them into N-hop patterns, and stores the results in a frequent pattern management table. The stored frequent patterns are then applied, using neighbor scores, to the graph to be summarized, producing the summary graph. In the next window, the newly generated N-hop patterns are compared against the patterns already stored in the frequent pattern table to decide whether they are frequent, which reduces pattern detection time. Performance evaluations show that the proposed scheme outperforms existing schemes in both processing time and accuracy.
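
As a rough illustration of the windowed bookkeeping described above, the following Python sketch keeps per-window support counts for 1-hop patterns so that a window slide only adjusts the counts of patterns that entered or expired, instead of recounting everything. It is not the proposed scheme itself. The class name `SlidingPatternTable`, the parameters `window_size` and `min_support`, the treatment of a 1-hop pattern as an unordered pair of vertex labels, and the sample labels are all assumptions made for this example. N-hop expansion and neighbor scores are omitted.

```python
from collections import Counter, deque


class SlidingPatternTable:
    """Hypothetical DSMatrix-like table: rows are 1-hop patterns,
    columns are the batches currently inside the sliding window."""

    def __init__(self, window_size, min_support):
        self.window_size = window_size  # number of batches kept (assumed parameter)
        self.min_support = min_support  # minimum number of batches containing a pattern
        self.window = deque()           # one set of patterns per batch (the columns)
        self.support = Counter()        # pattern -> number of columns containing it

    def push(self, edges):
        """Add one batch of (label, label) edges and slide the window if needed."""
        # Assumption: a 1-hop pattern is an unordered pair of vertex labels.
        column = {tuple(sorted(edge)) for edge in edges}
        self.window.append(column)
        self.support.update(column)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.support.subtract(expired)
            for pattern in expired:
                if self.support[pattern] <= 0:
                    del self.support[pattern]  # drop rows that left the window

    def matrix(self):
        """Bit-matrix view: pattern -> one 0/1 entry per batch in the window."""
        return {
            pattern: [int(pattern in column) for column in self.window]
            for pattern in sorted(self.support)
        }

    def frequent(self):
        """Patterns whose support in the current window reaches min_support."""
        return {p for p, count in self.support.items() if count >= self.min_support}


table = SlidingPatternTable(window_size=2, min_support=2)
table.push([("user", "post"), ("user", "user")])
table.push([("post", "user"), ("post", "tag")])
first = table.frequent()    # frequent patterns in the first full window
table.push([("tag", "post")])
second = table.frequent()   # frequent patterns after the window slides
```

In this toy run, the pattern `("post", "user")` is frequent in the first window and `("post", "tag")` becomes frequent after the slide. Only the rows touched by the arriving and expiring batches are updated, which is the kind of saving the frequent pattern table is meant to provide.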