Data

Žliobaitė, I. (2011). Combining similarity in time and space for training set formation under concept drift. Intelligent Data Analysis 15(4), p. 589-611. 


I have prepared these datasets for evaluating adaptive classifiers. Feel free to use them for research purposes. Note that ESS requires users to acknowledge the data source.

The Luxembourg dataset is constructed using European Social Survey (ESS) data. Each instance is an individual. The attributes are formed from answers to the survey questionnaire. The labels indicate high or low internet usage. The dataset has time stamps; the questionnaires were collected over a five-year period. Internet usage is expected to change over time (concept drift).
DATA description
Citation: please cite BOTH references: [1] as the data source and [2] for creating this dataset. Also, please acknowledge the Norwegian Social Science Data Services (NSD), as required by ESS policy, and inform them when the paper is published.
[1] R. Jowell and the Central Coordinating Team. European social survey 2002/2003; 2004/2005; 2006/2007. Technical Reports, London: Centre for Comparative Social Surveys, City University, 2003, 2005, 2007.
[2] Žliobaitė, I. (2011). Combining similarity in time and space for training set formation under concept drift. Intelligent Data Analysis 15(4), p. 589-611.

The Chess.com dataset is constructed using data from the chess.com portal. The data consists of game records of one player over a period from December 2007 to March 2010. A player has a rating, which changes depending on his/her results (the higher the rating, the stronger the player). A player develops skills over time and also engages in different types of tournaments and competitions. The rating and the type of game determine how the system selects an opponent. This is where concept drift is expected. The task is to predict whether the player will win or lose based on the setting. There is a natural problem of delayed labeling: the winner is known only after the game is finished. In turn-based chess, one game might last for several months.
DATA description
Citation: please cite reference [3] for creating this dataset.
[3] Žliobaitė, I. (2011). Combining similarity in time and space for training set formation under concept drift. Intelligent Data Analysis 15(4), p. 589-611.
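
The delayed labeling mentioned for the Chess.com dataset affects how an adaptive classifier can be evaluated: when a game starts, only the outcomes of games that have already finished may be used for training. The following is a minimal sketch of such a test-then-train loop, not the procedure used in [3]. The record layout (start time, end time, features, outcome), the function name prequential_delayed, the majority-class baseline, and the toy records are all assumptions made for illustration; they do not describe the actual file format of the dataset.

```python
import heapq
from collections import Counter


def prequential_delayed(stream):
    """Test-then-train accuracy of a majority-class baseline.

    stream: iterable of (start, end, features, label), sorted by start.
    A label becomes available to the model only once its game has ended.
    """
    pending = []      # min-heap of (end_time, seq, label) for unfinished games
    seen = Counter()  # label counts the model is allowed to use so far
    correct = total = 0
    for seq, (start, end, _features, label) in enumerate(stream):
        # Release the labels of games that finished by the time this one starts.
        while pending and pending[0][0] <= start:
            _, _, finished_label = heapq.heappop(pending)
            seen[finished_label] += 1
        # Predict the majority class seen so far (arbitrary default: "win").
        prediction = seen.most_common(1)[0][0] if seen else "win"
        correct += prediction == label
        total += 1
        # The true outcome is queued until the game's end time is reached.
        heapq.heappush(pending, (end, seq, label))
    return correct / total if total else 0.0


# Hypothetical toy records: (start_day, end_day, features, outcome).
games = [
    (0, 5, {}, "lose"),
    (1, 2, {}, "win"),
    (3, 4, {}, "win"),
    (6, 9, {}, "lose"),
]
accuracy = prequential_delayed(games)
```

The heap keeps pending outcomes ordered by end time, so a long game that started early does not leak its label into predictions for shorter games played in the meantime. Any incremental classifier could replace the majority-class counter.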