Hadoop Data Lake Business Architecture
A data scientist can use EdrawMax or EdrawMax Online to create a Hadoop data lake diagram. A Hadoop data lake is a data management platform comprising one or more Hadoop clusters. As the architecture diagram below shows, it is used principally to process and store non-relational data, such as log files, internet clickstream records, sensor data, JSON objects, images, and social media posts. Although the data lake concept can be applied more broadly to other types of systems, it most frequently involves storing data in the Hadoop Distributed File System (HDFS) across a set of clustered compute nodes built on commodity server hardware. As the image below suggests, a Hadoop enterprise data lake can complement an enterprise data warehouse rather than replace it entirely.
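To make the idea of storing non-relational data as-is more concrete, here is a minimal Python sketch of a raw landing zone. It is an illustration under stated assumptions, not a description of any particular product: the local filesystem stands in for HDFS, and the directory layout (`raw/<source>/dt=<date>/`), the file name `part-0000.jsonl`, and the function names `landing_path`, `ingest`, and `read_back` are all hypothetical.

```python
from __future__ import annotations

import json
import tempfile
from datetime import date
from pathlib import Path


def landing_path(root: Path, source: str, day: date) -> Path:
    """Build a hypothetical landing path: <root>/raw/<source>/dt=<YYYY-MM-DD>/part-0000.jsonl."""
    return root / "raw" / source / f"dt={day.isoformat()}" / "part-0000.jsonl"


def ingest(root: Path, source: str, day: date, records: list[dict]) -> Path:
    """Append records unchanged, one JSON object per line.

    No schema is enforced at write time; readers interpret the fields later.
    """
    target = landing_path(root, source, day)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return target


def read_back(path: Path) -> list[dict]:
    """Load every non-empty line of a landed file as a JSON object."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    # A temporary local directory stands in for the cluster file system.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        today = date(2024, 1, 15)
        ingest(root, "clickstream", today, [{"user": "u1", "url": "/home"}])
        # A record with a different shape lands in the same file without error.
        path = ingest(root, "sensors", today, [{"device": "t-7", "temp_c": 21.4}])
        print(path.relative_to(root))
        print(read_back(path))
```

Records with different fields can land side by side because nothing is validated on write, which is one reason a data lake can sit alongside a schema-driven data warehouse rather than replace it.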