Installing the Cloudera CDP Private Cloud Base on IBM Cloud with Ansible – IBM Developer


In this second blog post in our series, we talk about Cloudera Data Platform for IBM Cloud Pak for Data. Much like IBM Cloud Pak for Data, the Cloudera Data Platform is a data and AI platform that can be installed on-premises. In fact, many IBM customers are also Cloudera customers. IBM Cloud Pak for Data is built on Red Hat OpenShift and breaks down silos to enable all of your data users to collaborate from a single, unified interface.

As with most modern platforms, installation is much more than just unzipping a file or clicking a “next” button on a wizard. Luckily, the Cloudera team recently announced it would open source its Ansible playbooks, which we leveraged to make this whole process easier for our own purposes.

This blog post is intended to share our experience in using Ansible to install Cloudera Data Platform on IBM Cloud. It’s worth mentioning that the automation used is open source and follows the best practices recommended by the Cloudera Professional Services team.

Our environment

We used Virtual Servers on IBM Cloud as the target for our Cloudera Data Platform installation. We chose a total of 8 VMs running CentOS, each with 32 vCPUs and 128 GB of RAM. We also had a separate Windows-based VM to run Active Directory, to best mimic what customers most often use in their environments. A single bastion node was provisioned to simplify the communication between the user and the hosts. IBM Cloud Pak for Data was also provisioned, but the details of that are out of scope for this post.

Figure 1. List of virtual servers on IBM Cloud

When put together, our environment resembled the architecture diagram below.

Figure 2. An architecture diagram of the environment used for integrating Cloudera Data Platform and IBM Cloud Pak for Data
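
To make the role of the bastion node more concrete, here is a minimal sketch of how an Ansible inventory for a layout like ours could be generated. It is an illustration only and not taken from the Cloudera playbooks, which define their own inventory structure. The host names, the `cluster` group name, the `root` SSH user, and the `build_inventory` helper are all hypothetical. The only Ansible features it relies on are the standard `ansible_user` and `ansible_ssh_common_args` connection variables, with OpenSSH's `ProxyJump` option routing connections through the bastion.

```python
def build_inventory(cluster_hosts, bastion_host, ssh_user="root"):
    """Render an INI-style Ansible inventory whose hosts are reached via a bastion.

    All names passed in are placeholders; substitute your own hosts and user.
    """
    # OpenSSH option that tunnels every connection through the bastion node.
    proxy = f"-o ProxyJump={ssh_user}@{bastion_host}"

    lines = ["[cluster]"]          # hypothetical group name
    lines.extend(cluster_hosts)    # one host per line
    lines.append("")
    lines.append("[cluster:vars]")
    lines.append(f"ansible_user={ssh_user}")
    lines.append(f"ansible_ssh_common_args='{proxy}'")
    return "\n".join(lines) + "\n"


# Eight cluster VMs, mirroring the environment described above (names are made up).
hosts = [f"cdp-node-{i}.example.com" for i in range(1, 9)]
inventory = build_inventory(hosts, "bastion.example.com")
print(inventory)
```

Saving the printed text to a file and passing it to `ansible` with `-i` would let a connectivity check such as the `ping` module run against all eight hosts from a workstation outside the private network, assuming SSH keys are already in place on the bastion and the cluster nodes.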

The Ansible playbooks

As mentioned earlier, to install Cloudera Data Platform on IBM Cloud, we leveraged existing Ansible playbooks that were open sourced.

The installation takes approximately 30-60 minutes to complete, depending on machine specs. The longest part is when the installer pulls down the necessary artifacts and pushes them to each host.

Figure 3. A screenshot of Cloudera Manager installing Cloudera Data Platform

Next steps

If you’re an IBMer looking to get your hands on Cloudera, or interested in learning more about using Ansible playbooks to install Cloudera, check out the GitHub repo. If you enjoyed this, check out A technical deep-dive on integrating Cloudera Data Platform and IBM Cloud Pak for Data. You can also learn more about the Cloudera Data Platform for IBM Cloud Pak for Data joint offering.
