Deliver business-ready data with intelligent data cataloging and data lake governance

Solving data lake challenges with a DataOps approach

Ten years ago, the search began for a flexible, versatile approach to building a central data store where all enterprise data could reside. The solution was the data lake: a general-purpose data storage environment that would store practically any type of data. It would also allow business analysts and data scientists to apply the most appropriate analytics engines and tools to each data set, in its original location.

Typically, these data lakes were built using Apache Hadoop and Hadoop Distributed File System (HDFS), combined with engines such as Apache Hive and Apache Spark. As these data lakes began to grow, a set of problems became apparent. While the technology was physically capable of scaling to capture, store and analyze vast and varied collections of structured and unstructured data, too little attention was paid to the practicalities of how to embed these capabilities into business workflows.
