Panthera: holistic memory management for big data processing over hybrid memories

Wang, Chenxi; Cui, Huimin; Cao, Ting; Zigman, John; Volos, Haris; Mutlu, Onur; Lv, Fang; Feng, Xiaobing; Xu, Guoqing Harry

doi:10.1145/3314221.3314650

dc.contributor.author	Wang, Chenxi	en
dc.contributor.author	Cui, Huimin	en
dc.contributor.author	Cao, Ting	en
dc.contributor.author	Zigman, John	en
dc.contributor.author	Volos, Haris	en
dc.contributor.author	Mutlu, Onur	en
dc.contributor.author	Lv, Fang	en
dc.contributor.author	Feng, Xiaobing	en
dc.contributor.author	Xu, Guoqing Harry	en
dc.coverage.spatial	Phoenix, AZ, USA	en
dc.creator	Wang, Chenxi	en
dc.creator	Cui, Huimin	en
dc.creator	Cao, Ting	en
dc.creator	Zigman, John	en
dc.creator	Volos, Haris	en
dc.creator	Mutlu, Onur	en
dc.creator	Lv, Fang	en
dc.creator	Feng, Xiaobing	en
dc.creator	Xu, Guoqing Harry	en
dc.date.accessioned	2021-01-22T10:47:34Z
dc.date.available	2021-01-22T10:47:34Z
dc.date.issued	2019
dc.identifier.isbn	978-1-4503-6712-7
dc.identifier.uri	http://gnosis.library.ucy.ac.cy/handle/7/62339
dc.description.abstract	Modern data-parallel systems such as Spark rely increasingly on in-memory computing that can significantly improve the efficiency of iterative algorithms. To process real-world datasets, modern data-parallel systems often require extremely large amounts of memory, which are both costly and energy-inefficient. Emerging non-volatile memory (NVM) technologies offer high capacity compared to DRAM and low energy compared to SSDs. Hence, NVMs have the potential to fundamentally change the dichotomy between DRAM and durable storage in Big Data processing. However, most Big Data applications are written in managed languages (e.g., Scala and Java) and executed on top of a managed runtime (e.g., the Java Virtual Machine) that already performs various dimensions of memory management. Supporting hybrid physical memories adds a new dimension, creating unique challenges in data replacement and migration. This paper proposes Panthera, a semantics-aware, fully automated memory management technique for Big Data processing over hybrid memories. Panthera analyzes user programs on a Big Data system to infer their coarse-grained access patterns, which are then passed down to the Panthera runtime for efficient data placement and migration. For Big Data applications, the coarse-grained data division is accurate enough to guide GC for data layout, which incurs hardly any data monitoring or movement overhead. We have implemented Panthera in OpenJDK and Apache Spark. An extensive evaluation with various datasets and applications demonstrates that Panthera reduces energy by 32 – 52% at only a 1 – 9% execution time overhead.	en
dc.publisher	Association for Computing Machinery	en
dc.source	Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation	en
dc.source.uri	https://doi.org/10.1145/3314221.3314650
dc.title	Panthera: holistic memory management for big data processing over hybrid memories	en
dc.type	info:eu-repo/semantics/conferenceObject
dc.identifier.doi	10.1145/3314221.3314650
dc.author.faculty	002 Σχολή Θετικών και Εφαρμοσμένων Επιστημών / Faculty of Pure and Applied Sciences
dc.author.department	Τμήμα Πληροφορικής / Department of Computer Science
dc.type.uhtype	Conference Object	en
