Abstract

We introduce a new method for vertex clustering, or community detection, on directed graphs (digraphs). The method extends BlueRed, which was originally introduced for undirected graphs. Complementary to supervised and semi-supervised classification, graph clustering is indispensable to exploratory data analysis and knowledge discovery. Conventional graph clustering methods are fundamentally limited in effectiveness and efficiency, either by the resolution limit or by the difficulties of selecting a resolution parameter. BlueRed is original in its analysis, modeling, and solution approach. Its clustering process is simple and fully autonomous, requiring no parameter tuning or selection. We report two benchmark studies that evaluate the new method. The clustering results are in remarkable agreement with the ground-truth labels. We also present an important application of the new method to a U.S. patent citation graph, CITE75_99. In this graph, more than 1 million patents have no electronic record of patent class indices, yet they are cited by the other 2.7 million patents with class indices. We are able to efficiently and economically give a knowledge/semantic presentation of the patents without semantic information.