Decaying Telco Big Data with Data Postdiction
Mokbel, Mohamed F.
Source: 2018 19th IEEE International Conference on Mobile Data Management (MDM)
In this paper, we present a novel decaying operator for Telco Big Data (TBD), coined TBD-DP (Data Postdiction). Unlike data prediction, which aims to make a statement about the future value of some tuple, data postdiction, as we formulate it, aims to make a statement about the past value of some tuple that no longer exists because it had to be deleted to free up disk space. TBD-DP relies on existing Machine Learning (ML) algorithms to abstract TBD into compact models that can be stored and queried when necessary. Our proposed TBD-DP operator has two conceptual phases: (i) in an offline phase, it uses an LSTM-based hierarchical ML algorithm to learn a tree of models (coined the TBD-DP tree) over time and space; (ii) in an online phase, it uses the TBD-DP tree to recover data within a certain accuracy. In our experimental setup, we measure the efficiency of the proposed operator using a 10 GB anonymized real telco network trace. Our experimental results in TensorFlow over HDFS are extremely encouraging: they show that TBD-DP saves an order of magnitude of storage space while maintaining high accuracy on the recovered data.
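The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the LSTM for ordinary least-squares line fits so that it runs with the standard library only, it covers the time dimension but not space, and the names `ModelNode`, `build_tree`, `postdict`, and `leaf_size` are hypothetical. The offline phase replaces a window of raw values with a tree of small models; the online phase descends the tree to estimate a value that has since been deleted.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class ModelNode:
    """One node of a hypothetical model tree covering time steps [start, end)."""
    start: int
    end: int
    slope: float
    intercept: float
    children: List["ModelNode"] = field(default_factory=list)

    def predict(self, t: int) -> float:
        return self.slope * t + self.intercept


def fit_line(values: Sequence[float], start: int) -> Tuple[float, float]:
    """Least-squares fit of value against time step (stand-in for an LSTM)."""
    n = len(values)
    ts = range(start, start + n)
    mean_t = sum(ts) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0:  # a single point: fall back to a constant model
        return 0.0, mean_v
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def build_tree(values: Sequence[float], start: int = 0,
               leaf_size: int = 4) -> ModelNode:
    """Offline phase: fit one model per window, halving windows down to leaf_size."""
    slope, intercept = fit_line(values, start)
    node = ModelNode(start, start + len(values), slope, intercept)
    if len(values) > leaf_size:
        mid = len(values) // 2
        node.children = [
            build_tree(values[:mid], start, leaf_size),
            build_tree(values[mid:], start + mid, leaf_size),
        ]
    return node


def postdict(node: ModelNode, t: int, max_depth: Optional[int] = None) -> float:
    """Online phase: descend to the finest allowed model covering t and evaluate it."""
    depth = 0
    while node.children and (max_depth is None or depth < max_depth):
        node = next(c for c in node.children if c.start <= t < c.end)
        depth += 1
    return node.predict(t)


if __name__ == "__main__":
    raw = [0, 1, 2, 3, 10, 8, 6, 4]   # toy series; raw values may be deleted after fitting
    tree = build_tree(raw, leaf_size=4)
    print(postdict(tree, 5))               # estimate from the finest model
    print(postdict(tree, 5, max_depth=0))  # coarser estimate from the root model
```

In this sketch, `max_depth` stands in for the trade-off between accuracy and retained storage: stopping higher in the tree corresponds to keeping fewer, coarser models.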