Enhanced Ensemble Mechanisms for Time Series Data Mining
In the last decade, the increasing use of temporal data, especially time series data, has prompted a great deal of research and development in the field of data mining. Time series, which are chronological sequences of observations, form an important class of temporal data, and they are ubiquitous and increasingly prevalent. For example, a random sample of 4000 graphics from 15 of the world’s newspapers and magazines published between 1974 and 1980 showed that 75% of the graphics were time series. Many data sources in fields such as medicine, education and finance naturally generate time series data (e.g. electrocardiograms (ECG), daily temperatures, weekly sales totals, and prices of mutual funds and stocks). However, extracting meaningful information from time series poses critical challenges when the data is large and dirty, or when it is accompanied by other data types (e.g. text) and additional non-time-series information. Fusing data from multiple sources is crucial for many applications. Moreover, depending on the application, the relations between time series may need to be modeled, and nonlinear relations are not easy to model. Robust approaches are therefore required to handle such complex time series data. The research will focus on developing algorithms that allow time series to be mined under these challenges. The approach will draw on tree-based ensemble learning strategies to handle a number of statistical learning tasks directly (e.g. supervised learning, clustering, and anomaly detection).
Keywords: time series, data mining, data fusion, ensemble learning, decision trees