QBoard » Big Data » Big Data on Cloud

What is the minimum configuration required to do a Hadoop Proof of Concept in the Cloud?

  • I am looking for some guidance and tips on what it would take to do a reasonable Hadoop Proof of Concept in the cloud. I am a complete newbie to the Big Data analytics world, and I would be more than happy to receive any suggestions you might have based on your experience.

      October 9, 2021 1:26 PM IST
  • In our workbook, “How to Run a Big Data POC in Six Weeks,” we show you how to build a successful big data experiment that is a foundation from which to grow. Download the workbook today and discover:

    • The five imperatives for successful big data POCs
    • What a six-week big data POC schedule looks like
    • The role of big data management and security
      November 13, 2021 2:25 PM IST
  • Regarding your question, there are two important pieces:

    1. Setting up a Hadoop cluster requires installing some sort of Hadoop distribution (Apache open source, Cloudera, or IBM). This gives you the opportunity to learn about Hadoop configuration, space allocation, performance, and all types of administration for your cluster. To do this, you need your own hardware or cloud infrastructure and have to install one of the above-mentioned Hadoop distributions yourself.

    2. Secondly, you would want to learn about the MapReduce framework and the various Hadoop components such as Hive, HBase, etc. For this, you can go to IBM Bluemix; it is a good place to start with Hadoop as a cloud service.

    Both of these skills are important and required to work with Hadoop.
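To get a feel for the MapReduce model before touching a real cluster, here is a plain-Python sketch of the classic word count. The map/shuffle/reduce function names are purely illustrative, not part of any Hadoop API — Hadoop itself runs these phases across the cluster for you.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data on cloud", "hadoop proof of concept on cloud"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["cloud"])  # "cloud" appears once in each document -> 2
```

Once the model is clear, the same logic maps directly onto a real Hadoop job (e.g. via Hadoop Streaming or a Java MapReduce program).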

      October 21, 2021 2:22 PM IST
  • It is possible to deliver a Hadoop POC within 1 month. This can be carried out by following the steps below.


    Hadoop Distribution: Select a distribution from one of the enterprise providers – Cloudera, Hortonworks or MapR.


    Infrastructure: Deploy on one of the major cloud infrastructure providers – Azure or AWS – and use a virtualised environment for the POC. The BM Cloudburst product will deploy a fully Kerberised cluster on Azure in less than 1 hour, giving you a platform to develop on.


    Use Case: Focus all of your energies on developing the application to substantiate the use case.


    Data Ingest: Use BM Data Ingest to ingest data onto your cluster. It has multiple connectors for different data sources and converts the data into formats that work in Hadoop. It automatically generates the ingest code and has a drag-and-drop interface that can be easily understood and used by non-Hadoop experts. It is available to purchase on a monthly-use basis, and data can be ingested in less than 1 day.
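Conceptually, an ingest step like the one above boils down to converting source records into a line-oriented format that Hadoop tools can split and parse — tab-separated text being a common choice. A minimal hand-rolled sketch of that conversion (the sample data and field names are invented for illustration; a tool such as the one described generates this kind of code for you):

```python
import csv
import io

def csv_to_tsv(csv_text):
    # Convert CSV records into tab-separated lines, a common
    # line-oriented format for files landed on HDFS.
    reader = csv.reader(io.StringIO(csv_text))
    return "\n".join("\t".join(row) for row in reader)

# Hypothetical source data standing in for a real connector's output.
source = "id,city,amount\n1,London,20\n2,Paris,35\n"
print(csv_to_tsv(source))
```

In a real POC the ingest tool would also handle scheduling, schema mapping, and writing the result to HDFS, which is what makes the automated approach attractive.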


    Data Transformation: Use BM Data Transformer to combine and manipulate the data so that it is available in Hive for your use case. All transformations are carried out in Spark using an extensive library, with a simple, easy-to-use drag-and-drop interface requiring no Hadoop knowledge. All of the underlying code is generated automatically. Most data transformations can be created and deployed in minutes.
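The shape of a typical transformation such a tool emits — grouping records by one field and aggregating another before the result lands in a Hive table — can be sketched in plain Python. In a real POC this logic would run as generated Spark code; the data and field names here are invented for illustration.

```python
from collections import defaultdict

def total_by_key(records, key_field, value_field):
    # Group records by one field and sum another: the shape of a
    # typical combine-and-aggregate transformation step.
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key_field]] += rec[value_field]
    return dict(totals)

sales = [
    {"city": "London", "amount": 20.0},
    {"city": "Paris", "amount": 35.0},
    {"city": "London", "amount": 5.0},
]
print(total_by_key(sales, "city", "amount"))  # {'London': 25.0, 'Paris': 35.0}
```

Expressing the step this simply is the point of the drag-and-drop approach: the team specifies the grouping and aggregation, and the tool produces the equivalent Spark job.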


    Following the above 5 steps will get a cluster deployed and operational with data ingested and manipulated within a matter of days, allowing you to spend the rest of the month working on your use case application.


    Apart from being the fastest solution on the market for a Hadoop POC deployment, this approach delivers extraordinary cost savings. It uses low-cost tools to automate the process and removes the need for specialist Hadoop skills. Using this methodology, any data science team can prove the business case for a Hadoop Big Data project without ever having to become Hadoop experts.

      December 16, 2021 12:42 PM IST
  • Here is a step-by-step guide to creating an HDFS cluster and starting to use it. Hope it helps: http://www.edureka.co/blog/install-apache-hadoop-cluster/
      October 28, 2021 4:30 PM IST