
Special Issue on Multi Strategy Learning Analytics for Big Data

PREFACE


In recent years, a need has emerged for soft computing systems, evolving from the fields of science and engineering, with a growing focus on Big Data computing. Big Data is a key enabler that can bring tremendous benefits to a country: it helps uncover greater insights from existing information and serves as a useful tool for advancing technology, from empowering businesses to lifting the social and economic standards of its people.

This special issue presents multi-strategy learning and the adaptation of advanced soft computing techniques in data mining to address the issues and challenges of Big Data. It brings together nine selected papers presented at the International Workshop of Big Data Analytics (IWBDA2015), organized by UTM Big Data Centre from 17 to 18 August 2015 at Universiti Teknologi Malaysia, Kuala Lumpur. The workshop served as a platform for integrated big data analytics and data science research, aimed at understanding problems and investigating feasible solutions to Big Data issues in real-world settings, with graduate training focused on the scientific and economic transformations shaping the global future.

The article by Maryam Mousavi et al., titled “Data Stream Clustering Algorithms: A Review”, is a survey of five types of data stream clustering approaches: partitioning, hierarchical, density-based, grid-based and model-based. The paper also discusses the problems and challenges in data stream clustering, along with the strengths and weaknesses of each approach.

The second article, by Noraini Abdullah et al., entitled “Data Quality in Big Data: A Review”, reviews the characteristics of big data quality, the management processes involved, and the challenges and issues in preserving data quality. The paper also describes how such problems affect business organizations and corporate resources.

The next article, titled “Social Network Analysis for Political Blogsphere Dataset” by Nor Amalina Abdul Rahim and Sarina Sulaiman, analyzes the relationships and interactions between actors in a political blogosphere network using three social network analysis tools: ORA, NodeXL and UCINET. The analysis shows that important actors in the network can be identified using centrality measures. With more than 10000 links in the network, the authors also show how such analysis could be extended to big data analytics.

Majid Bakhtiari et al., the authors of the article titled “Lightweight Symmetric Encryption Algorithm In Big Data”, describe a proposed lightweight symmetric encryption algorithm designed to achieve faster encryption and decryption. The paper demonstrates how the proposed model outperforms other stream cipher algorithms and how it could be adopted to support the encryption of large volumes of data in a huge shared environment.

The work by Shakirah Mohd Taib et al., titled “Classifying Weather Time Series Using Feature-based Approach”, proposes integrating Symbolic Aggregate approXimation (SAX) for clustering with a feature-based approach, Time Series Bag-of-Features (TSBF), which extracts feature vectors to classify weather time series data sets. A comparison between the proposed method and Random Forest, carried out to evaluate classifier performance, is also presented.

The article titled “Enhancing Security and Privacy Protection for MapReduce Processing: The Initial Simulation Work Flow”, presented by Nurulhuda Firdaus et al., discusses their work on simulating improvements to the security and privacy access control elements of MapReduce processing on a Hadoop platform using whitelists. The paper also highlights the expected input and output of the planned simulation for ensuring data integrity in big data clusters.

The next article, entitled “Predicting the Relevance of Search Result for E-Commerce Systems” by Mohammed Zuhair Al-Taie et al., proposes an open source model to estimate the relevance of search results for online business. The proposed model combines two machine learning methods, SVM and Random Forest, with two feature schemes, word match counting and tf-idf, and is tested on real data, with the latter approach being the better of the two. A discussion of how such a model could be extended to a big data implementation is also presented.

Finally, the work entitled “Multi Strategy Approach for E-Learning Data Analytics” by Nor Bahiah Ahmad et al. proposes a framework for an adaptive learning environment for data analytics and domain model content using Self Organizing Map. This is done to better represent learning material so that it adapts to the behavior and knowledge level of students. Experimental results on UTM Moodle E-learning environment data reveal that the framework gives promising output.


Guest Editor:
Dr Aida Ali
UTM Big Data Centre
Ibnu Sina Institute for Scientific and Industrial Research
Universiti Teknologi Malaysia
E-mail: aida@utm.my