Plenary talk

Binary data mining



Professor Václav Snášel,

Dean of FIT, VSB-Technical University of Ostrava

Department of Computer Science, Faculty of Electrical Engineering and Computer Science,

VSB-Technical University of Ostrava, 17. listopadu 15,

708 33 Ostrava - Poruba, Czech Republic



Binary data occupy a special place in data analysis. Analysis of binary data sets, however, generally leads to NP-complete or NP-hard problems, so the focus here is on effective heuristics for reducing the problem size. Matrix factorization, or factor analysis, is an important tool for analyzing high-dimensional real-world data. There are several well-known methods and algorithms for factorizing real-valued data, but many application areas, including information retrieval, pattern recognition, and data mining, require processing binary rather than real-valued data; see [4], [5], [7], [8], [11], [14]. Unfortunately, the methods used for real matrix factorization fail in the binary case. Here we introduce the background for binary matrix factorization.

To perform object recognition of any kind, it is necessary to learn representations of the underlying characteristic components. Such components correspond to object parts, or features [10]. The data sets involved may comprise discrete attributes, such as those from market basket analysis, information retrieval, and bioinformatics, as well as continuous attributes, such as those in scientific simulations, astrophysical measurements, and sensor networks. Feature extraction applied to binary data sets serves many research and application fields, such as association rule mining [1], market basket analysis [2], and the discovery of regulation patterns in DNA microarray experiments [12]. Many of these problem areas have been described in tests of the PROXIMUS framework (e.g. [7]).

The so-called bars problem [13] is used as the benchmark: the methods analyze a set of artificial signals, each generated as a Boolean sum of a given number of bars. Here we concentrate on black-and-white pictures of bar combinations represented as binary vectors, so complex feature extraction methods are unnecessary [6].
Many applications in computer and system science involve the analysis of large-scale and often high-dimensional data. When dealing with such extensive information collections, it is usually very computationally expensive to perform operations on the raw form of the data. Therefore, suitable methods that approximate the data in lower dimensions or with lower rank are needed. In the following, we focus on the factorization of high-dimensional binary data or high-order binary tensors [3].
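As an illustration only, the bars benchmark and the Boolean matrix product that underlies binary factorization can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method presented in the talk: the function names (`bar_basis`, `boolean_product`, `integer_product`, `make_signals`, `hamming`), the square image size `n`, and the choice of drawing exactly `k` distinct bars per signal are all hypothetical choices made for the example.

```python
import random


def bar_basis(n):
    """Return the 2n bars of an n x n image, each flattened to n*n bits."""
    bars = []
    for r in range(n):  # horizontal bars: all pixels in row r
        bars.append([1 if i // n == r else 0 for i in range(n * n)])
    for c in range(n):  # vertical bars: all pixels in column c
        bars.append([1 if i % n == c else 0 for i in range(n * n)])
    return bars


def boolean_product(a, b):
    """Boolean matrix product: entry (i, j) is OR over k of a[i][k] AND b[k][j]."""
    inner, cols = len(b), len(b[0])
    return [[int(any(row[k] and b[k][j] for k in range(inner)))
             for j in range(cols)] for row in a]


def integer_product(a, b):
    """Ordinary matrix product over the integers, shown for contrast."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner))
             for j in range(cols)] for row in a]


def make_signals(n, count, k, seed=0):
    """Generate `count` signals, each the Boolean sum of k distinct bars.

    Returns (usage, signals): usage[i][b] is 1 if bar b is present in
    signal i, and signals is the Boolean product of usage and the bar basis.
    """
    rng = random.Random(seed)
    usage = []
    for _ in range(count):
        chosen = set(rng.sample(range(2 * n), k))
        usage.append([1 if b in chosen else 0 for b in range(2 * n)])
    return usage, boolean_product(usage, bar_basis(n))


def hamming(x, y):
    """Number of differing entries between two equally sized binary matrices."""
    return sum(u != v for rx, ry in zip(x, y) for u, v in zip(rx, ry))
```

For a 2 x 2 image, combining the first horizontal bar with the first vertical bar gives `[1, 1, 1, 0]` under the Boolean product but `[2, 1, 1, 0]` under the ordinary product, because the crossing pixel is counted twice. This is one simple way to see why a factorization of binary data is usually sought under Boolean rather than real arithmetic, with the Hamming distance between the data and the Boolean product of the two factors serving as the reconstruction error.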


[1]     R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases. In: VLDB ’94: Proceedings of the 20th International Conference on Very Large Data Bases, San Francisco, CA, USA, Morgan Kaufmann Publishers Inc. (1994) Pages 487-499

[2]     S. Brin, R. Motwani, J.D. Ullman, S. Tsur, Dynamic itemset counting and implication rules for market basket data. In: SIGMOD ’97: Proceedings of the 1997 ACM SIGMOD international conference on Management of data, New York, NY, USA, ACM Press (1997) Pages 255-264

[3]     L. Elden. Matrix Methods in Data Mining and Pattern Recognition. SIAM 2007.

[4]     A.A. Frolov, D. Husek, P. Muravjev, P. Polyakov, Boolean Factor Analysis by Attractor Neural Network. IEEE Transactions on Neural Networks 18(3) (2007) Pages 698-707

[5]     H. Lu, J. Vaidya and V. Atluri, Optimal Boolean Matrix Decomposition: Application to Role Engineering, ICDE 2008, in print.

[6]     D. Húsek, P. Moravec, V. Snásel, A.A. Frolov, H. Rezanková, P. Polyakov: Comparison of Neural Network Boolean Factor Analysis Method with Some Other Dimension Reduction Methods on Bars Problem. Springer, LNCS 4815, PReMI 2007: 235-243

[7]     M. Koyuturk, A. Grama, N. Ramakrishnan, Nonorthogonal decomposition of binary matrices for bounded-error data compression and analysis.  ACM Trans. Math. Softw. 32(1) (2006) Pages 33-69

[8]     P. Miettinen, T. Mielikäinen, A. Gionis, G. Das, H. Mannila: The Discrete Basis Problem. PKDD 2006: 335-346


[9]     P. Moravec and V. Snášel. Dimension Reduction Methods for Image Retrieval. In Proceedings of the Conference on Intelligent Systems Design and Applications (ISDA2006), 6 pages, Jinan, Shandong, China, October 2006. IEEE Press.

[10]  V. Snášel, P. Moravec, and J. Pokorny. Using BFA with WordNet Based Model for Web Retrieval. Journal of Digital Information Management, 4(2):107-111, 2006.

[11]  V. Snášel, D. Húsek, Alexander A. Frolov, H. Řezanková, P. Moravec, P. Polyakov: Bars Problem Solving - New Neural Network Method and Comparison. Lecture Notes in Computer Science 4827, MICAI 2007: 671-682

[12]  P.T. Spellman, G. Sherlock, M.Q. Zhang, V.R. Iyer, K. Anders, M.B. Eisen, P.O. Brown, D. Botstein, B. Futcher, Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9(12) (1998) Pages 3273-3297

[13]  M. W. Spratling: Learning Image Components for Object Recognition. Journal of Machine Learning Research 7 (2006) 793–815.

[14]  Z. Zhang, T. Li, Ch. Ding, X. Zhang, Binary Matrix Factorization with Applications, ICDM 2007


Plenary talk


Building and using virtual environments




Professor Michael R. M. Jenkin

Computer Science and Engineering
Faculty of Science and Engineering,
York University,
4700 Keele Street, Toronto, Ontario


Abstract: Virtual reality and immersive environments have been proposed for a range of tasks, from training to entertainment. In this talk I will describe the development of three large-scale virtual reality devices: IVY, a six-sided immersive projective environment; MOOG, a visual display built on a stereo head-mounted display and coupled with a physical motion base; and the Active Desktop, a large-scale immersive desk. Although each of these devices provides a compelling visual display, they do so in rather different ways and combine this visual display with other input modalities. Underlying these rather different display technologies is a common software infrastructure that allows software to be moved between the devices in a relatively straightforward manner and allows software development to take place using standard computer hardware. At York University, one application of virtual reality is the generation of conflicting sensory inputs to aid the study of basic perceptual processes, with particular emphasis on the perception of self-motion and self-orientation. These are important questions on Earth, where people make predictable errors in judgement given limited cues to their motion and orientation, and they have applications in other domains, including underwater and in outer space. I will conclude the talk with a review of some recent research into these questions and a discussion of how the virtual reality devices described in the talk (and other similar devices at York) are being used to investigate these questions in both 1g and microgravity.


Michael Jenkin is a Professor of Computer Science and Engineering and a member of the Centre for Vision Research at York University, Canada. Working in the fields of visually guided autonomous robots and virtual reality, he has published over 150 research papers, co-authored Computational Principles of Mobile Robotics with Gregory Dudek, and co-edited a series of books on human and machine vision with Laurence Harris. His current research interests include sensing strategies for AQUA, an amphibious autonomous robot being developed as a collaboration between Dalhousie University, McGill University and York University; the development of tools and techniques to support crime scene investigation; and the understanding of the perception of self-motion and orientation in unusual environments, including microgravity.


Plenary talk

Intelligent Optimization








Dr. Crina D. Grosan


Department of Computer Science

Babes-Bolyai University

Cluj-Napoca, Romania

Abstract: Optimization problems are encountered daily in each of our lives. While most of us may fail to recognize the structure of these problems, they exist at many levels of complexity. They range from relatively simple single-variable, single-objective (SO) problems to multivariate, multiobjective optimization problems (MOPs) of great complexity. While obtaining the optimal solution is the ultimate goal of any attempt to optimize an MOP, most researchers aim to find an acceptable solution. Since many real-world problems are MOPs, this talk concentrates on finding acceptable solutions to MOPs using a relatively new, innovative search approach.



Crina D. Grosan received BS and MS degrees in mathematics and a Ph.D. in computer science from Babes-Bolyai University, Cluj-Napoca, Romania in 2005. She is currently a Lecturer in the Department of Computer Science, Babes-Bolyai University. Her recent research interests include optimization, mathematical programming, numerical analysis, computational intelligence, and computational biology. Dr. Grosan has over 80 scientific publications, including over 25 journal articles and book chapters and 6 books written or edited. She serves on the editorial boards of a number of journals and on the program committees of several international conferences. More information at:


