Workshop on Nonstationary Models of Pattern Recognition and Classifier Combinations (NMRPC) Session 2
Time and Date: 13:25 - 15:05 on 8th June 2016
Room: Boardroom East
Chair: Michal Wozniak
332 | GPU-Accelerated Extreme Learning Machines for Imbalanced Data Streams with Concept Drift [abstract] Abstract: Mining data streams is one of the most vital fields in the current era of big data. Continuously arriving data may pose various problems connected to their volume, variety or velocity. In this paper we focus on two important difficulties embedded in the nature of data streams: their non-stationary nature and skewed class distributions. Such a scenario requires a classifier that is able to rapidly adapt to concept drift and is robust to the class imbalance problem. We propose to use an online version of the Extreme Learning Machine, enhanced by an efficient drift detector and a method to alleviate the bias towards the majority class. We investigate three approaches based on undersampling, oversampling and cost-sensitive adaptation. Additionally, to allow rapid updating of the proposed classifier, we show how to implement online Extreme Learning Machines on a GPU. The proposed approach allows highly efficient mining of high-speed, drifting and imbalanced data streams, with significant acceleration offered by GPU processing. |
Bartosz Krawczyk |
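The abstract above describes an online Extreme Learning Machine with a cost-sensitive correction for class imbalance. Below is a minimal, hypothetical NumPy sketch of a weighted online sequential ELM in that spirit; the paper's GPU implementation, drift detector and resampling variants are not reproduced, and all names and parameters are illustrative assumptions.

```python
import numpy as np

class WeightedOSELM:
    """Sketch of an online sequential ELM with per-sample cost weights.

    Hypothetical, simplified illustration only; not the authors' GPU version.
    """

    def __init__(self, n_inputs, n_hidden, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        # Random, fixed hidden-layer parameters (the core ELM idea).
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.beta = np.zeros((n_hidden, n_classes))
        self.P = None  # inverse correlation matrix of the hidden outputs

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)  # hidden-layer activations

    def partial_fit(self, X, y_onehot, sample_weight):
        """Update the output weights on a new data chunk."""
        H = self._hidden(X)
        C = np.diag(sample_weight)  # cost weights, e.g. higher for the minority class
        if self.P is None:
            # Initial chunk: weighted, regularised least squares.
            self.P = np.linalg.inv(H.T @ C @ H + 1e-3 * np.eye(H.shape[1]))
            self.beta = self.P @ H.T @ C @ y_onehot
        else:
            # Weighted recursive least-squares update (Woodbury identity).
            S = np.linalg.inv(np.linalg.inv(C) + H @ self.P @ H.T)
            self.P -= self.P @ H.T @ S @ H @ self.P
            self.beta += self.P @ H.T @ C @ (y_onehot - H @ self.beta)

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because only the output weights are updated per chunk, each update reduces to a few dense matrix products, which is also what makes a GPU implementation attractive.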
397 | Efficient Computation of the Tensor Chordal Kernel [abstract] Abstract: In this paper new methods for fast computation of the chordal kernels are proposed. Two versions of the chordal kernels for tensor data are discussed. These are based on different projectors of the flattened matrices obtained from the input tensors. A direct transformation of multidimensional objects into the kernel feature space leads to better data separation, which can result in higher classification accuracy. Our approach to more efficient computation of the chordal distances between tensors is based on an analysis of the tensor projectors, which exhibit different properties. Thanks to this, an efficient eigen-decomposition becomes possible, carried out with a version of the fixed-point algorithm. Experimental results show that our method achieves significant speed-up factors, depending mostly on the tensor dimensions. |
Bogusław Cyganek, Michal Wozniak |
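For illustration, a chordal kernel between tensors can be sketched as below: projectors onto the leading singular subspaces of each mode unfolding are compared via the Frobenius norm and the summed distance is passed through an RBF map. This hypothetical sketch uses a plain SVD; the paper's contribution, a faster fixed-point eigen-decomposition of the projectors, is not reproduced, and the rank and gamma parameters are assumptions.

```python
import numpy as np

def mode_unfold(tensor, mode):
    """Mode-n unfolding (matricisation) of a dense tensor."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def chordal_kernel(tensor_a, tensor_b, rank=3, gamma=1.0):
    """Sketch of a chordal kernel between two same-shaped tensors."""
    value = 0.0
    for mode in range(tensor_a.ndim):
        Ua, _, _ = np.linalg.svd(mode_unfold(tensor_a, mode), full_matrices=False)
        Ub, _, _ = np.linalg.svd(mode_unfold(tensor_b, mode), full_matrices=False)
        Pa = Ua[:, :rank] @ Ua[:, :rank].T  # projector onto the leading subspace
        Pb = Ub[:, :rank] @ Ub[:, :rank].T
        # Squared chordal distance between the two subspaces for this mode.
        value += 0.5 * np.linalg.norm(Pa - Pb, "fro") ** 2
    return np.exp(-gamma * value)
```

The cost is dominated by the decomposition of each unfolding, which is exactly the step the paper accelerates.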
375 | A New Design Based-SVM of the CNN Classifier Architecture with Dropout for Offline Arabic Handwritten Recognition [abstract] Abstract: In this paper we explore a new model that integrates two classifiers, a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), for offline Arabic handwriting recognition (OAHR), to which the dropout technique is applied. The suggested system replaces the trainable classifier of the CNN with an SVM classifier. The convolutional network is beneficial for extracting feature information, while the SVM functions as the recognizer. This model thus both automatically extracts features from the raw images and performs classification. Additionally, we protect our model against over-fitting thanks to the powerful performance of dropout. In this work, recognition of handwritten Arabic characters was evaluated; the training and test sets were taken from the HACDB and IFN/ENIT databases. Simulation results show that the new design based-SVM of the CNN classifier architecture with dropout performs significantly better than the CNN based-SVM model without dropout and the standard CNN classifier. The performance of our model is compared with character recognition accuracies obtained by state-of-the-art Arabic Optical Character Recognition systems, producing favorable results. |
Mohamed Elleuch, Rania Maalej, Monji Kherallah |
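A hedged sketch of the CNN-SVM-with-dropout idea described above, using tf.keras and scikit-learn: a CNN with dropout is trained first, and its penultimate dense layer then feeds an RBF SVM that replaces the softmax head. Layer sizes, hyperparameters and the HACDB/IFN/ENIT preprocessing are illustrative assumptions, not the authors' configuration.

```python
import tensorflow as tf
from sklearn.svm import SVC

def build_cnn(input_shape=(32, 32, 1), n_classes=28):
    """Small CNN with dropout; the 'features' layer is reused later."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Dropout(0.25),            # dropout against over-fitting
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu", name="features"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

def train_cnn_svm(x_train, y_train):
    """Train the CNN, then fit an SVM on its learned features."""
    cnn = build_cnn(x_train.shape[1:], int(y_train.max()) + 1)
    cnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    cnn.fit(x_train, y_train, epochs=10, batch_size=64, verbose=0)
    # Use the penultimate dense layer as a feature extractor ...
    extractor = tf.keras.Model(cnn.input, cnn.get_layer("features").output)
    features = extractor.predict(x_train, verbose=0)
    # ... and an RBF SVM as the final recognizer instead of the softmax layer.
    svm = SVC(kernel="rbf", C=10.0)
    svm.fit(features, y_train)
    return extractor, svm
```

At prediction time an image is passed through the extractor and the resulting feature vector is classified by the SVM, which is the hybrid arrangement the abstract describes.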
363 | Active Learning Classification of Drifted Streaming Data [abstract] Abstract: Contemporary classification systems have to make decisions not only on the basis of static data, but on data in motion as well. Objects to be recognized may arrive at a classifier continuously in the form of a data stream. Usually, we would like to start exploiting the classifier as soon as possible, so models that can improve themselves during exploitation are very desirable. Basically, we build the model on the basis of a few learning objects and then use and improve the classifier as new data arrives. This concept is still vibrant and may be used in a plethora of practical cases. When constructing such a system, we have to realize that we have limited resources (such as memory and computational power) at our disposal. Moreover, during the exploitation of a classifier system, the characteristics of the classification model may change over time. This phenomenon is called concept drift and may lead to a deep deterioration of classification performance. This work deals with data stream classification in the presence of concept drift. We propose a novel classifier training algorithm based on the sliding-window approach, which allows us to implement a forgetting mechanism, i.e., old objects coming from an outdated model are not taken into consideration during classifier updating; on the other hand, we assume that not all arriving examples can be labeled, because the labeling budget is limited. We employ the active learning paradigm to choose "interesting" objects to be labeled. The proposed approach has been evaluated in computer experiments carried out on data streams. The obtained results confirmed the usability of the proposed method for the classification of smoothly drifting data streams. |
Michal Wozniak, Pawel Ksieniewicz, Bogusław Cyganek, Andrzej Kasprzak, Krzysztof Walkowiak |
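As a rough illustration of the sliding-window plus active-learning idea in the last abstract, the sketch below keeps a fixed-size window of labeled examples (forgetting), queries labels only for uncertain examples, and respects a labeling budget. The base learner, uncertainty criterion and thresholds are assumptions and do not reproduce the authors' algorithm.

```python
from collections import deque
import numpy as np
from sklearn.naive_bayes import GaussianNB

class ActiveSlidingWindowClassifier:
    """Sketch of sliding-window stream classification with active labeling."""

    def __init__(self, window_size=500, budget=0.1, threshold=0.6):
        self.window = deque(maxlen=window_size)  # forgetting: old examples fall out
        self.budget = budget          # allowed fraction of labeled examples
        self.threshold = threshold    # query when the top posterior is below this
        self.seen = 0
        self.labeled = 0
        self.model = GaussianNB()
        self.ready = False

    def process(self, x, label_oracle):
        """Classify one incoming example and decide whether to ask for its label."""
        self.seen += 1
        pred, confidence = None, 0.0
        if self.ready:
            proba = self.model.predict_proba([x])[0]
            pred, confidence = int(np.argmax(proba)), float(np.max(proba))
        within_budget = self.labeled < self.budget * self.seen
        if within_budget and (not self.ready or confidence < self.threshold):
            # "Interesting" (uncertain) object: spend part of the budget on it.
            self.window.append((x, label_oracle(x)))
            self.labeled += 1
            X, y = zip(*self.window)
            self.model.fit(np.asarray(X), np.asarray(y))  # rebuild on the current window
            self.ready = len(set(y)) > 1
        return pred
```

Because the model is always rebuilt from the current window, examples older than the window no longer influence it, which gives a simple forgetting mechanism for smoothly drifting streams.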