Create secure Hadoop cluster

This section explains how to create a secure Hadoop cluster and how to manage user access control for HDFS, Hive, and HBase using Cluster Manager.

Prerequisites

Active Directory Server

An Active Directory Server must be configured in the network where the Hadoop cluster will be deployed. To learn more about configuring Active Directory Server, refer to the Configuring Active Directory Server section.
Also enable PowerShell remoting on the AD machine, because Cluster Manager requires it to generate the SSL certificates and keytabs used to form the Hadoop cluster.

Run the following PowerShell command on the AD machine:

Enable-PSRemoting -Force

Software Requirement

  • In addition to the software requirements listed in the create cluster section, PowerShell version 3 or later is required on the Active Directory server and on the machine where Cluster Manager is installed.
  • On Linux nodes, you also need to install the krb5-user package. To learn how to install and configure krb5-user on Linux nodes, refer to the Configure Krb5-user in Linux node section.

Create secure cluster

On the create cluster page, select the Secure Cluster check box as highlighted in the screenshot. Provide the Active Directory Server hostname, username, and password along with the cluster name, replication factor, and node details. Then follow the steps for creating a normal cluster to create the secure cluster.

NOTE

The AD username provided here should be a member of the “Administrators” group.

Secure-Cluster dialog

NOTE

The following operations will be performed in Active Directory:

  • A new group, “SdpPrincipals”, is created. Users created under this group act as super users with all privileges.

  • The credentials used for creating the secure cluster are added to the “SdpPrincipals” super user group, which has all privileges over the cluster.

Create secure cluster using Cluster Manager installed in Linux

As described in the Create secure cluster section, validate the cluster nodes and the Active Directory node.
After validation succeeds, cluster creation starts by creating users in Active Directory. Click Next to proceed to the ‘Kerberos Authentication Settings’ page.

AdObject-Creation dialog

On the Kerberos Authentication Settings page, download the PowerShell script using the DOWNLOAD FILE button.
Run the PowerShell script on the Active Directory machine with administrator rights. After it finishes, validate the result by clicking the VALIDATE AND PROCEED button.

Download Powershell Script dialog

After validation, follow the steps for creating a normal cluster to create the secure cluster in Linux.

Access Control List

The Syncfusion Big Data Cluster Manager uses Active Directory Server to manage user access control for HDFS, Hive, and HBase.

ACL-menu dialog

HDFS ACL

You can grant, edit, and revoke HDFS directory access permissions for users and groups as required.

HDFS-ACL dialog

Hive ACL

You can grant, edit, and revoke Hive table access permissions for users and roles as required.

Hive-ACL dialog

HBase ACL

You can grant, edit, and revoke HBase table access permissions for users and groups as required.

HBase-ACL dialog

IPython ACL

You can grant, edit, and revoke IPython web UI access permissions for users and groups as required.

IPython-ACL dialog

NOTE

Although IPython web UI access is restricted by the IPython ACL, the Spark shell associated with the IPython server runs as the super user.

Configuring Active Directory Server

To configure Active Directory Domain Service, refer to:

AD DS Installation and Removal Step-by-Step Guide

Windows Server 2012 Set Up your First Domain Controller

To configure Active Directory Certificate Service, refer to:

Setting Up Active Directory Certificate Services

Configure Krb5-user in Linux node

The krb5-user package must be installed on all Linux nodes. Follow the steps below to install and configure krb5-user on a Linux node.

Ubuntu 14.04 LTS / Ubuntu 16.04 LTS

  • On your Linux node, open a terminal and install the krb5 package by running the following command as the root user.

sudo apt install krb5-user

  • During installation, you are prompted for the realm. Provide your domain name.

Linux command shell dialog

  • Enter the FQDN of the Kerberos server.

Linux command shell dialog

  • Enter the FQDN of the administrative server.

Linux command shell dialog

CentOS7 / Red Hat

  • On your Linux node, open a terminal and install the krb5 package by running the following command as the root user.

sudo yum install krb5-workstation

CentOSKrb5Installation dialog
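
The Ubuntu and CentOS / Red Hat steps above differ only in the package manager and the package name. As a minimal sketch, that mapping can be captured in a small shell helper. `krb5_package_for` is a hypothetical name used for illustration only, and the ID values (`ubuntu`, `centos`, `rhel`) are assumed to match the `ID` field of `/etc/os-release` on those distributions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Cluster Manager): map an os-release ID
# to the Kerberos client package named in this section.
krb5_package_for() {
  case "$1" in
    ubuntu)      echo "krb5-user" ;;
    centos|rhel) echo "krb5-workstation" ;;
    *)           echo "unsupported distribution: $1" >&2
                 return 1 ;;
  esac
}

# Example usage on a node (assumes /etc/os-release exists):
#   . /etc/os-release && krb5_package_for "$ID"
```

The helper only prints the package name; the actual installation still uses the apt or yum command shown above.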

Troubleshooting steps

If you face any issues during cluster creation, check the following.

  • Verify the krb5.conf configuration. To open the krb5.conf file, run the following command in a terminal:

sudo nano /etc/krb5.conf

Krb-user-edit dialog

Make sure the following settings are configured properly in the krb5.conf file:

[libdefaults]
    default_realm = SYNCTEST.COM

[realms]
    SYNCTEST.COM = {
        kdc = syncserver21.synctest.com
        admin_server = syncserver21.synctest.com
        default_domain = syncserver21.synctest.com
    }

[domain_realm]
    .syncserver21.synctest.com = SYNCTEST.COM
    syncserver21.synctest.com = SYNCTEST.COM
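
Part of this check can be scripted. The following is a rough sketch, not something provided by Cluster Manager: `check_krb5_conf` is a hypothetical helper that confirms `default_realm` is set, that a realm block with the same name exists, and that a `kdc` entry is present. It relies on simple text matching and does not fully parse the file, so treat a passing result as a sanity check only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: basic sanity check of a krb5.conf file.
# Prints the default realm on success; returns non-zero on failure.
check_krb5_conf() {
  local file="$1" realm

  # Extract the value that follows "default_realm =".
  realm=$(sed -n 's/^[[:space:]]*default_realm[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' "$file" | head -n 1)
  if [ -z "$realm" ]; then
    echo "default_realm is not set in $file" >&2
    return 1
  fi

  # Look for the opening line of the realm block, e.g. "SYNCTEST.COM = {".
  if ! grep -Eq "^[[:space:]]*${realm}[[:space:]]*=[[:space:]]*[{]" "$file"; then
    echo "no realm block found for $realm" >&2
    return 1
  fi

  # Require at least one kdc entry.
  if ! grep -Eq '^[[:space:]]*kdc[[:space:]]*=' "$file"; then
    echo "no kdc entry found in $file" >&2
    return 1
  fi

  echo "$realm"
}

# Example usage:
#   check_krb5_conf /etc/krb5.conf
```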

  • Ensure that the client machine can ping the AD machine and vice versa.

  • Run the following command to generate a ticket:

kinit username

  • Use the klist command to view the ticket.

List-ticket dialog
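
To check the result of kinit from a script instead of reading the klist output, the MIT Kerberos klist command (the implementation shipped in the krb5-user and krb5-workstation packages) offers a silent mode: `klist -s` prints nothing and exits with status 0 only when the credential cache can be read and has not expired. A minimal sketch, with `ticket_status` as a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the current credential cache
# holds an unexpired ticket, based on the exit status of "klist -s".
ticket_status() {
  if klist -s; then
    echo "valid"
  else
    echo "missing or expired"
    return 1
  fi
}

# Example usage, after obtaining a ticket:
#   kinit username && ticket_status
```

If the helper reports a missing or expired ticket, run kinit again and recheck the krb5.conf settings and connectivity to the AD machine.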