Accuracy on Data Streams

We compare the accuracy of Sketch with that of the alternatives by measuring statistical distance for various types of stationary and non-stationary data streams. Moreover, we compare the characteristics for a non-stationary case of Sketch with that of the alternatives. The results show that Sketch has its advantages and disadvantages in estimating the distribution of various types of stationary data streams. Sketch and SPDTw are more useful for a non-stationary data stream as they have superior accuracy compared with oKDE, except for the case of sudden concept drift.

Note that the parameters of SPDTw are deliberately selected so as to have similar accuracy as Sketch when concept drift occur. We observe that SPDTw is 2.0–3.2$\times$ inaccurate than Sketch when it estimates PDF for a stationary data stream. Moreover, this results in a significant increase in the throughput and memory usage of SPDTw.