A Hadoop proof of concept (POC) can be delivered within one month by following the five steps below.
Hadoop Distribution: Select a distribution from one of the enterprise providers – Cloudera, Hortonworks or MapR.
Infrastructure: Deploy on one of the major cloud infrastructure providers – Azure or AWS – and use a virtualised environment for the POC. The BM Cloudburst product will deploy a fully Kerberised cluster on Azure in less than 1 hour, giving you a platform to develop on.
Use Case: Focus all of your energies on developing the application to substantiate the use case.
Data Ingest: Use BM Data Ingest to ingest data onto your cluster. It has multiple connectors for different data sources and converts all of the data to work in Hadoop. It generates the ingest code automatically and has a drag-and-drop interface that can be easily understood and used by non-Hadoop experts. It is available to purchase on a monthly-use basis, and data can be ingested in less than 1 day.
Data Transformation: Use BM Data Transformer to combine and manipulate the data so that it is available in Hive for your use case. All transformations are carried out in Spark using an extensive library, through a simple, easy-to-use drag-and-drop interface that requires no Hadoop knowledge. All of the underlying code is generated automatically. Most data transformations can be created and deployed in minutes.
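To give a feel for the kind of step such a transformation performs, here is a minimal hand-written sketch in Python that builds a join statement whose result lands in a Hive table. It is not the code that BM Data Transformer generates; the helper `build_ctas` and the table and column names (`raw.customers`, `raw.orders`, `poc.customer_orders`, `customer_id`) are hypothetical and chosen only for illustration.

```python
def build_ctas(target, left, right, key, columns):
    """Build a CREATE TABLE ... AS SELECT statement that joins two tables on one key.

    All table and column names are supplied by the caller; nothing here is
    specific to any product. The left table is aliased as l, the right as r.
    """
    select_list = ", ".join(columns)
    return (
        f"CREATE TABLE {target} AS "
        f"SELECT {select_list} "
        f"FROM {left} l JOIN {right} r ON l.{key} = r.{key}"
    )


# Hypothetical example: combine ingested customers and orders into one table.
sql = build_ctas(
    target="poc.customer_orders",
    left="raw.customers",
    right="raw.orders",
    key="customer_id",
    columns=["l.customer_id", "l.region", "r.order_total"],
)
print(sql)
```

On a cluster, a statement like this could be passed to `spark.sql(...)` on a SparkSession created with Hive support enabled, so that the joined data becomes queryable as a Hive table.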
Following the five steps above will get a cluster deployed and operational, with data ingested and manipulated, within a matter of days, leaving the rest of the month for work on your use case application.
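The time budget can be sketched with a few lines of Python. The cluster and ingest figures are the upper bounds quoted above; the 30-day month and the one-day allowances for choosing a distribution and building transformations are assumptions made for illustration, not measured results.

```python
POC_DAYS = 30  # assumption: "1 month" taken as 30 calendar days

# Upper bound per setup step, in days.
setup_days = {
    "choose distribution": 1.0,  # assumption, not stated in the text
    "deploy cluster": 1 / 24,    # "less than 1 hour"
    "ingest data": 1.0,          # "less than 1 day"
    "transform data": 1.0,       # assumption: a day for several transformations
}


def remaining_days(total, steps):
    """Return the days left once every setup step has used its full allowance."""
    return total - sum(steps.values())


left = remaining_days(POC_DAYS, setup_days)
print(f"{left:.1f} days left for the use case application")
```

Under these assumptions the setup steps take roughly three days, so most of the month remains for the use case itself.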
Apart from being the fastest solution on the market for a Hadoop POC deployment, this approach delivers substantial cost savings. It uses low-cost tools to automate the process and removes the need for specialist Hadoop skills. Using this methodology, any Data Science team can prove the business case for a Hadoop Big Data project without ever having to become Hadoop experts.