Influence of the virtual machine manager on data mining system performance

Dariusz Czerwiński


This paper presents a comparative analysis of the impact of the virtual machine manager on the performance of data mining systems. The discussion is based on results obtained in a test environment built on the Cloudera Hadoop distribution, used as a personal cluster. The main focus is the impact of the hypervisor on typical operations in a data mining system, such as parallelized computation, memory operations, and the use of CPU resources.


systems efficiency; virtualization; data mining systems; Cloudera Hadoop

