
Blahval: Cloud Based Big Data Analytics

Published in
2014
Volume: 3
   
Abstract

Producing the best results requires accumulating and analyzing very large quantities of data, whether structured, semi-structured, or unstructured. Data sets that exceed 100 petabytes in an RDBMS most often need specialized hardware and perform poorly on complicated computations. Data this large and varied cannot be managed and analyzed with ease using tools like Excel, Minitab, and SPSS, prompting the researcher to switch to a scalable distributed computing environment. Apache Hadoop, an open source cluster computing framework that runs on commodity hardware and uses HDFS for distributed storage and MapReduce for parallel processing, is an attractive solution because it scales to need. This project proposes the development of an analytical platform that uses open source Apache Hadoop to improve the management of large quantities of data and the open source R software for statistical analysis. To reduce cost, the platform can later be extended to a cloud-based cluster computing environment. The benefits of this cloud-based big data analytics service are that it is user friendly and built on open-source software. It allows users to browse the cloud environment, analyze only the required portions of their datasets, and store the data back to the cloud. An enterprise with a cloud-based environment saves by not having to invest in hardware, software upgrades, maintenance, or network configuration, which makes the approach economical.

Keywords: Dynamic RDBMS (Relational Database Management System), Minitab, SPSS (Statistical Package for the Social Sciences), Apache Hadoop

About the journal
Journal: Proceedings of Third Post Graduate Conference on “Computer Engineering” cPGCON 2014
Open Access: No