Abstract
We address the problem of encoding a graph of order \(\mathsf {n}\) into a graph of order \(\mathsf {k}<\mathsf {n}\) in a way to minimize reconstruction error. This encoding is characterized in terms of a particular factorization of the adjacency matrix of the original graph. The factorization is determined as the solution of a discrete optimization problem, which is for convenience relaxed into a continuous, but equivalent, one. Our formulation does not require to have the full graph, but it can factorize the graph also in the presence of partial information. We propose a multiplicative update rule for the optimization task resembling the ones introduced for nonnegative matrix factorization, and convergence properties are proven. Experiments are conducted to assess the effectiveness of the proposed approach.
Similar content being viewed by others
Notes
There is no explicit update rule in the paper to find the specific relaxed factorization, but it could be obtained by using Theorem 1.
This however should not be taken as a truth holding in general, for it depends also on the structure of the input graph.
For NMF and GNMF this coincides with the number of latent dimensions.
References
Airoldi EM, Blei DM, Fienberg SE, Xing EP (2008) Mixed membership stochastic blockmodels. J Mach Learn Res 9:1981–2014
Baum LE, Eagon JA (1967) An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology. Bull Am Math Soc 73:360–363
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J. Mach Learn Res 3:993–1022
Cai D, He X, Han J, Huang TS (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560
Choi Y, Szpankowski W (2012) Compression of graphical structures: fundamental limits, algorithms, and experiments. IEEE Trans Inf Theory 58(2):620–638
Deerwester S, Dumais ST, Furnas GW, Landauer TK, Harshman R (1990) Indexing by latent semantic analysis. J Am Soc Inf Sci 41(6):391–407
Dhillon IS, Guan Y, Kulis B (2004) Kernel k-means: spectral clustering and normalized cuts. Int Conf Knowl Discov Data Min 10:551–556
Ding C, He, X, Simon HD (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In: SIAM data mining conference, pp 606–610
Ding C, Li, T, Jordan MI (2008) Nonnegative matrix factorization for combinatorial optimization: spectral clustering, graph matching and clique finding. In: IEEE international conference on data mining, pp 183–192
Ding C, Li T, Peng W (2008) On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. Comput Stat Data Anal 52(8):3913–3927
Goldenberg A, Zheng AX, Fienberg SE, Airoldi EM (2010) A survey of statistical network models. Found Trends Mach Learn 2(2):129–233
Hofmann T (2000) Learning the similarity of documents: an information-geometric approach to document retrieval and categorization. Adv Neural Inf Process Syst 12:914–920
Holland PW, Laskey KB, Leinhardt S (1983) Stochastic blockmodels: first steps. Soc Netw 5(2):109–137
Horn RA, Johnson CR (1985) Matrix analysis. Cambridge University Press, Cambridge
Hubbard TJP, Murzin AG, Brenner SE, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540
Jolliffe I (1987) Principal component analysis. Springer, New York
Kuang D, Park H, Ding C (2012) Symmetric nonnegative matrix factorization for graph clustering. In: SIAM international conference data mining, pp 106–117
Lakshminarayanan B, Raich R (2010) Non-negative matrix factorization for parameter estimation in hidden Markov models. In: IEEE international workshop on machine learning for signal processing, pp 89–94
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791
Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: Advances in neural information processing systems, pp 556–562
Li P, Bu J, Yang Y, Ji R, Chen C, Cai D (2014) Discriminative orthogonal nonnegative matrix factorization with flexibility for data representation. Exp Syst Appl 41(4):1283–1293
Li P, Chen C, Bu J (2012) Clustering analysis using manifold kernel concept factorization. Neural Comput 87:120–131
Lorrain F, White HC (1971) Structural equivalence of individuals in social networks. J Math Sociol 1:49–80
Mørup M, Schmidt M (2012) Bayesian community detection. Neural Comput 24(9):2434–2456
Navlakha S, Rastogi R, Shrivastava N (2008) Graph summarization with bounded error. In: ACM SIGMOD international conference on management of data, pp 419–432
Nepusz T, Petróczi A, Négyessy L, Bazsó F (2008) Fuzzy communities and the concept of bridgeness in complex networks. Phys Rev E 77(1):016107
Nourbakhsh F, Rota Bulò, S, Pelillo M (2014) A matrix factorization approach to graph compression. In: 22nd international conference on pattern recognition. IEEE, Stockholm, Sweden, 24–28 Aug 2014
Paatero P, Tapper AU (1994) Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5:111–126
Psorakis I, Roberts S, Ebden M, Sheldon B (2011) Overlapping community detection using nonnegative matrix factorization. Phys Rev E 83(6):066114
Rota Bulò S, Lourenço A, Fred, ALN, Pelillo M (2010) Pairwise probabilistic clustering using evidence accumulation. In: International workshop on structure and synthesis pattern recognition, pp 395–404
Rota Bulò S, Pelillo M (2010) Probabilistic clustering using the baum-eagon inequality. In: International conference on pattern recognition, pp 1429–1432
Schölkopf B, Smola A, M\(\ddot{\rm l}\)ler KR (1998) Nonlinear component analysis as a kernel eigenvalue problem. Neural Comput 10(5):1299–1319
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22:888–905
Sperotto A, Pelillo M (2007) Szemerédis regularity lemma and its applications to pairwise clustering and segmentation. In: Energy minimization methods in computer vision and pattern recognition, pp 13–27
Szemerédi, E (1978) Regular partitions of graphs. In: Problèmes combinatoires et thorie des graphes. CNRS, Paris, pp 399–401
Toivonen H, Zhou F, Hartikainen A, Hinkka A (2011) Compression of weighted graphs. In: International conference on knowledge discovery and data mining, pp 965–973
Verma D, Meila M (2003) Comparison of spectral clustering methods, Technical report. University of Washington
Xu W, Gong, Y (2004) Document clustering by concept factorization. In: Proceedings of 27th annual international ACM SIGIR conference on Research and development in information retrieval, pp 202–209
Yang Z, Oja E (2012) Quadratic nonnegative matrix factorization. Pattern Recognit 45(4):1500–1510
Zhang H, Yang Z, Oja E (2014) Adaptive multiplicative updates for quadratic nonnegative matrix factorization. Neural Comput 134:206–213
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Nourbakhsh, F., Rota Bulò, S. & Pelillo, M. A matrix factorization approach to graph compression with partial information. Int. J. Mach. Learn. & Cyber. 6, 523–536 (2015). https://doi.org/10.1007/s13042-014-0286-5
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13042-014-0286-5