Sqoop

Apache Sqoop is a tool for transferring data between HDFS and relational data stores, in either direction. For additional details on Sqoop, please refer to the Sqoop website. Cluster Manager offers a simple UI to submit Sqoop jobs.

Add Connection

Adding a connection to a server is straightforward. Click the create dropdown and select Connection. In the pop-up that appears, enter the server details, choose the appropriate JDBC driver from the dropdown, and click the add button. You can edit a saved connection later.
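
For orientation, the server details and JDBC driver entered in the connection pop-up correspond to a JDBC connection URL of the kind Sqoop accepts. The sketch below shows how such a URL is typically composed for two common drivers. The function name, the driver keys, and the example host are hypothetical, and how Cluster Manager stores or builds the URL internally is not documented here.

```python
def jdbc_url(driver: str, host: str, port: int, database: str) -> str:
    """Compose a JDBC URL from server details (illustrative helper only)."""
    # Standard URL prefixes for two common drivers; other drivers use
    # different formats and are deliberately not covered in this sketch.
    prefixes = {
        "mysql": "jdbc:mysql://",
        "postgresql": "jdbc:postgresql://",
    }
    if driver not in prefixes:
        raise ValueError(f"unsupported driver in this sketch: {driver}")
    return f"{prefixes[driver]}{host}:{port}/{database}"


# Example with a made-up host and database name.
print(jdbc_url("mysql", "db.example.com", 3306, "sales"))
```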

Submit Sqoop Job

To submit a Sqoop job, click the create dropdown and select Job. Provide a job name, choose the database server connection and the type of job (import / export), and click the next button.

Provide the database name, HDFS directory path, mapper count, and required delimiter, and specify additional arguments if needed. To import only up to a specific number of rows, use limit to specify it. Check all tables if you would like to import every table; otherwise, specify the table name explicitly. Click the submit button to run the job.
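
To make the form fields concrete, here is a minimal sketch of how they could map onto a Sqoop command line. It assumes the UI ultimately runs the standard sqoop import or sqoop import-all-tables tools, which this page does not confirm. The function name and parameters are hypothetical. The flags used (--connect, --table, --target-dir, --warehouse-dir, --num-mappers, --fields-terminated-by) are standard Sqoop options. The limit field is left out because its mapping to Sqoop options is not described here.

```python
from typing import List, Optional


def build_import_command(
    jdbc_url: str,
    hdfs_dir: str,
    table: Optional[str] = None,
    all_tables: bool = False,
    mappers: int = 1,
    delimiter: str = ",",
    extra_args: Optional[List[str]] = None,
) -> List[str]:
    """Return a Sqoop import argument list (illustrative mapping only)."""
    # Exactly one of "all tables" or an explicit table name must be chosen.
    if all_tables == (table is not None):
        raise ValueError("choose either all_tables or a table name")
    if mappers < 1:
        raise ValueError("mappers must be at least 1")

    if all_tables:
        # import-all-tables writes each table under a parent directory.
        cmd = ["sqoop", "import-all-tables", "--connect", jdbc_url,
               "--warehouse-dir", hdfs_dir]
    else:
        cmd = ["sqoop", "import", "--connect", jdbc_url,
               "--table", table, "--target-dir", hdfs_dir]

    cmd += ["--num-mappers", str(mappers),
            "--fields-terminated-by", delimiter]
    # Additional arguments (for example credentials) are appended as given.
    cmd += list(extra_args or [])
    return cmd


# Example with made-up values; prints the command as one line.
print(" ".join(build_import_command(
    "jdbc:mysql://db.example.com:3306/sales",
    "/data/sales/orders",
    table="orders",
    mappers=4,
)))
```

Returning a list rather than a single string keeps each argument separate, which avoids quoting problems if the list is later passed to a process launcher.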

You can use the predefined argument list for Sqoop import and export jobs and set values based on your use case.

Similarly, you can create and run export jobs. You can also edit jobs and rerun them multiple times.
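
As a rough illustration of the export direction, the sketch below builds an argument list for the standard sqoop export tool, which reads files from an HDFS directory and writes them into an existing database table. The function name and parameters are hypothetical, and it is an assumption that the export form in the UI maps onto these options. The flags themselves (--connect, --table, --export-dir, --num-mappers, --input-fields-terminated-by) are standard Sqoop options.

```python
from typing import List


def build_export_command(
    jdbc_url: str,
    table: str,
    export_dir: str,
    mappers: int = 1,
    delimiter: str = ",",
) -> List[str]:
    """Return a Sqoop export argument list (illustrative mapping only)."""
    if mappers < 1:
        raise ValueError("mappers must be at least 1")
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--table", table,            # target table, which must already exist
        "--export-dir", export_dir,  # HDFS directory holding the source files
        "--num-mappers", str(mappers),
        # Delimiter used to parse the input files in HDFS.
        "--input-fields-terminated-by", delimiter,
    ]


# Example with made-up values.
print(" ".join(build_export_command(
    "jdbc:mysql://db.example.com:3306/sales",
    "orders_archive",
    "/data/sales/orders",
    mappers=2,
)))
```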