Since big data first entered the tech scene, the concept, strategy, and use cases for it have evolved significantly across different industries.
Particularly with innovations like the cloud, edge computing, Internet of Things (IoT) devices, and streaming, big data has become more prevalent for organizations that want to better understand their customers and operational potential.
Big data flows into organizations from many directions. With the growth of technologies such as streaming data, observational data and non-transactional data, and a better understanding of how these disparate data types can be used strategically, storage capacity has become a pressing issue.
In most businesses, traditional on-premises data storage no longer suffices for the terabytes and petabytes of data flowing into the organization. Cloud and hybrid cloud solutions are increasingly being chosen for their simplified storage infrastructure and scalability.
Ben Gitenstein, VP of product at Qumulo, an unstructured data management platform, believes cloud migration brings storage capacity and other benefits to corporate big data efforts:
“Cloud solutions are now the name of the game, particularly hybrid cloud solutions for workloads that demand multiple storage environments,” Gitenstein said. “And as data continues to inevitably grow, enterprises require the flexibility and scalability only cloud services currently provide.”
To deal with the inexorable increase in data generation, organizations are spending more of their resources storing this data in a range of cloud-based and hybrid cloud systems optimized for all the V's of big data. In previous decades, organizations handled their own storage infrastructure, resulting in massive data centers that enterprises had to manage, secure and operate. The move to cloud computing changed that dynamic. By shifting the responsibility to cloud infrastructure providers -- such as AWS, Google, Microsoft and IBM -- organizations can deal with almost limitless amounts of new data and pay for storage and compute capability on demand without having to maintain their own large and complex data centers.
Some industries are challenged in their use of cloud infrastructure due to regulatory or technical limitations. For example, heavily regulated industries -- such as healthcare, financial services and government -- have restrictions that prevent the use of public cloud infrastructure. As such, in the past decade, cloud providers have developed ways to provide more regulatory-friendly infrastructure as well as hybrid approaches that combine aspects of third-party cloud systems with on-premises computing and storage to meet critical infrastructure needs. The evolution of both public cloud and hybrid cloud infrastructures will no doubt progress as organizations seek the economic and technical advantages of cloud computing.
In addition to innovations in cloud storage and processing, enterprises are shifting toward new data architecture approaches that allow them to handle the variety, veracity and volume challenges of big data. Rather than trying to centralize data storage in a data warehouse that requires complex and time-intensive data extraction, transformation and loading, enterprises are evolving the concept of the data lake. Data lakes store structured and unstructured data sets in their native format. This approach shifts the responsibility of transformation and processing to end points that have different data needs. The data lake can also provide shared services for data analysis and processing.
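The schema-on-read idea behind a data lake can be sketched in a few lines. This is a minimal illustration, not a production pattern: it uses a local directory as a stand-in for an object store bucket, and all paths, file names and fields are hypothetical.

```python
import csv
import json
from pathlib import Path

# Stand-in for an object store bucket: a local directory acting as the lake.
lake = Path("lake")
(lake / "raw" / "clickstream").mkdir(parents=True, exist_ok=True)
(lake / "raw" / "sales").mkdir(parents=True, exist_ok=True)

# Ingest data in its native format -- no upfront transformation (no ETL).
events = [{"user": "u1", "page": "/home"}, {"user": "u2", "page": "/pricing"}]
(lake / "raw" / "clickstream" / "2024-01-01.json").write_text(
    "\n".join(json.dumps(e) for e in events)
)
(lake / "raw" / "sales" / "2024-01-01.csv").write_text(
    "order_id,amount\n1001,19.99\n1002,5.00\n"
)

# Schema-on-read: each consuming end point applies its own structure
# to the raw files only at the moment it reads them.
def read_clickstream(path: Path) -> list[dict]:
    return [json.loads(line) for line in path.read_text().splitlines()]

def total_sales(path: Path) -> float:
    with path.open() as f:
        return sum(float(row["amount"]) for row in csv.DictReader(f))

pages = [e["page"] for e in read_clickstream(lake / "raw" / "clickstream" / "2024-01-01.json")]
print(pages)  # ['/home', '/pricing']
print(round(total_sales(lake / "raw" / "sales" / "2024-01-01.csv"), 2))  # 24.99
```

The key contrast with a warehouse is visible in the ingest step: the JSON and CSV files land untouched, and each consumer (the clickstream reader, the sales aggregator) imposes its own schema at read time rather than relying on a shared, upfront transformation.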