By Bob Gourley
November 7, 2015 07:00 PM EST
Data Lake Phenomenon Among Enterprises
Over the past few years, there has been an explosion in the volume of data. To tackle this big data explosion, there has been a rise in the number of successful Hadoop projects in enterprises. The large volumes of data, the emergence of Hadoop technology, and the need to store all siloed data in one place have prompted a phenomenon among enterprises called the Data Lake.
Is the Data Lake an effective catchment for all of the enterprise data?
Yes and no. Data lakes are well suited to housing current, interrelated data, but they do not address the need for an enterprise-wide data management system:
- Since the data lake holds raw data of many types, business users cannot get controlled access to risk-free, secure, governed, and curated data with semantic consistency, as they can with an enterprise data warehouse
- Enterprise data today is heterogeneous and locked in disparate data sources, and the data coming out of those systems is often in conflict
- A data lake is agnostic to the type of data it receives; without governance, descriptive metadata, and a mechanism to maintain that metadata, a data lake with too much data can easily turn into a data swamp
- Hadoop and related technologies are still nascent, even among early adopters, who are mostly conversant with SQL for data discovery and would require training in Pig and MapReduce for data access. This slows time-to-value for enterprises
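The last point is worth making concrete. Analysts already know SQL, so a SQL layer over lake data lets them explore it immediately, while Pig or MapReduce would require new skills. The toy sketch below uses Python's built-in sqlite3 as a stand-in for a SQL-on-Hadoop engine such as Hive; the table name and sample records are invented for illustration only.

```python
import sqlite3

# Toy stand-in for a SQL-on-Hadoop layer (e.g., Hive): heterogeneous raw
# records loaded into one table so an analyst can explore them with
# familiar SQL instead of writing Pig or MapReduce jobs.
rows = [
    ("supplier", "Acme Medical", "2015-09-01"),
    ("product", "IV Pump", "2015-09-03"),
    ("member", "General Hospital", "2015-09-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lake (record_type TEXT, name TEXT, ingested TEXT)")
conn.executemany("INSERT INTO lake VALUES (?, ?, ?)", rows)

# Familiar SQL discovery: how many records of each type landed in the lake?
counts = dict(conn.execute(
    "SELECT record_type, COUNT(*) FROM lake GROUP BY record_type"
).fetchall())
print(counts)  # {'member': 1, 'product': 1, 'supplier': 1}
```

The point is not the engine but the interface: a GROUP BY like this is a one-liner for a SQL-literate analyst, whereas the equivalent MapReduce job would need custom mapper and reducer code.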
Hortonworks has helped with the Data Lake phenomenon. One example is VHA, the largest member-owned healthcare company in the US, which delivers industry-leading supply chain management and clinical improvement services to its members. The company's product, supplier, and member information, along with other data, was spread across multiple sources and resided in silos. VHA used the Hortonworks Data Platform to enable business users to discover the related data and provide services to their members. Because of its previous success with data virtualization using the Denodo Platform, VHA decided to use data virtualization to let business users discover data with familiar SQL, abstracting their access directly to Hadoop.
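The data virtualization pattern VHA applied can be sketched in a few lines: one query facade registered over several disparate backends, so users query the facade rather than each source directly. This is a minimal conceptual sketch only; the names (`VirtualLayer`, `register`, `query`) are hypothetical and do not reflect Denodo's actual API.

```python
# Minimal sketch of the data-virtualization idea: a single query facade
# over multiple disparate sources, hiding backend details from the user.
class VirtualLayer:
    def __init__(self):
        self.sources = {}

    def register(self, name, fetch):
        # fetch: a zero-argument callable returning rows from one backend
        # (in practice, a connector to Hadoop, an RDBMS, a REST API, etc.)
        self.sources[name] = fetch

    def query(self, name, predicate=lambda row: True):
        # Pull rows from the named source and filter them; the caller
        # never touches the backend directly.
        return [row for row in self.sources[name]() if predicate(row)]


layer = VirtualLayer()
layer.register("hadoop_members", lambda: [{"member": "General Hospital"},
                                          {"member": "City Clinic"}])
hits = layer.query("hadoop_members",
                   lambda r: r["member"].startswith("City"))
print(hits)  # [{'member': 'City Clinic'}]
```

A production virtualization platform adds the pieces this sketch omits: SQL parsing, query pushdown to each source, security, and caching.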
Read more about Data Lake here.
Credit: Lisa Sensmeier.