Minersoft: Software retrieval in grid and cloud computing infrastructures
Source: ACM Transactions on Internet Technology
One of the main goals of Cloud and Grid infrastructures is to make their services easily accessible and attractive to end-users. In this article we investigate the problem of supporting keyword-based searching for the discovery of software files that are installed on the nodes of large-scale, federated Grid and Cloud computing infrastructures. We address a number of challenges that arise from the unstructured nature of software and the unavailability of software-related metadata in large-scale networked environments. We present Minersoft, a harvester that visits Grid/Cloud infrastructures, crawls their file systems, identifies and classifies software files, and discovers implicit associations between them. The results of Minersoft harvesting are encoded in a weighted, typed graph, called the Software Graph. A number of information retrieval (IR) algorithms are used to enrich this graph with structural and content associations, to annotate software files with keywords, and to build inverted indexes that support keyword-based searching for software. We present an evaluation study of our approach on a real testbed, using data extracted from production-quality Grid and Cloud computing infrastructures. Experimental results show that Minersoft is a powerful tool for software search and discovery. © 2012 ACM.
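As an illustration of the general idea in the abstract (a weighted, typed graph of files whose keywords feed an inverted index), here is a minimal sketch. It is not Minersoft's actual algorithm: the class `SoftwareGraph`, the functions `build_inverted_index` and `search`, the edge type `"describes"`, the file paths, and the scoring rule (own keywords score 1.0, keywords inherited over an edge score the edge weight) are all assumptions made for this example.

```python
from collections import defaultdict


class SoftwareGraph:
    """Toy weighted, typed graph (hypothetical structure, for illustration only)."""

    def __init__(self):
        self.node_types = {}            # path -> node type, e.g. "binary"
        self.keywords = {}              # path -> set of lowercase keywords
        self.edges = defaultdict(list)  # src path -> [(dst path, edge type, weight)]

    def add_file(self, path, node_type, keywords):
        self.node_types[path] = node_type
        self.keywords[path] = {k.lower() for k in keywords}

    def add_edge(self, src, dst, edge_type, weight):
        self.edges[src].append((dst, edge_type, weight))


def build_inverted_index(graph):
    """Map each keyword to {path: score}.

    Assumed scoring: a file's own keyword counts 1.0; a keyword of a
    neighbouring file is added to the edge target with the edge weight.
    """
    index = defaultdict(dict)
    for path, words in graph.keywords.items():
        for word in words:
            index[word][path] = index[word].get(path, 0.0) + 1.0
    for src, outgoing in graph.edges.items():
        for dst, _edge_type, weight in outgoing:
            for word in graph.keywords.get(src, ()):
                index[word][dst] = index[word].get(dst, 0.0) + weight
    return index


def search(index, query):
    """Return paths ranked by summed keyword score (ties broken by path)."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for path, score in index.get(term, {}).items():
            scores[path] += score
    return sorted(scores, key=lambda p: (-scores[p], p))


# Usage with made-up files: a README that describes a binary.
graph = SoftwareGraph()
graph.add_file("/usr/bin/blastn", "binary", ["blast", "alignment"])
graph.add_file("/usr/share/doc/blast/README", "documentation",
               ["blast", "sequence", "search"])
graph.add_edge("/usr/share/doc/blast/README", "/usr/bin/blastn", "describes", 0.5)

index = build_inverted_index(graph)
print(search(index, "sequence"))
```

In this sketch the binary becomes findable under "sequence" even though that word appears only in its documentation, which is the kind of implicit association the abstract describes; the real system's enrichment and ranking are not specified here.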
Showing items related by title, author, creator and subject.
Kapitsaki, Georgia M.; Kramer, F.; Tselikas, N. D. (2017) Free and Open Source Software (FOSS) promotes software reuse and distribution at different levels for both creators and users, but at the same time imposes some challenges in terms of FOSS licenses that can be selected and ...
Paschalides, D.; Kapitsaki, Georgia M. (Association for Computing Machinery, 2016) Licensing decisions for new Open Source Software are not always straightforward. However, the license that accompanies the software is important as it largely affects its subsequent distribution and reuse. License information ...
Kapitsaki, Georgia M.; Kramer, F. (2014) The Open Source Software development model has gained a lot of momentum in recent years, providing organizations and software engineers with a variety of software, components and libraries that can be exploited in the ...