Sqoop Job in a Production System

In a production system, a Sqoop job typically reads its arguments from an option file and its database password from a protected password file.
A Sqoop job can also be run from an Oozie workflow, which I will cover in a separate post. This post explains how to create the password file and the option file, and then how to run the Sqoop job.

Figure: Sqoop job in production (image courtesy: http://www.slideshare.net/wlangiewicz/2014-hadoop-wrocaw-jug)

Defining the Password File
A Sqoop command usually refers to the password file through the --password-file option instead of carrying the password in plain text.
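As a rough illustration of how the password file is used, here is a minimal sketch of an import that points Sqoop at it. The connection URL, user name, table name and the import_table helper are placeholders made up for this example; only the /sqoop.pwd path comes from this post.

```shell
#!/bin/sh
# Sketch only: DB_URL, DB_USER and import_table are hypothetical names.
DB_URL='jdbc:mysql://hostname-or-ip/dbname'
DB_USER='username'

import_table() {
    # $1 = name of the table to import
    # --password-file makes Sqoop read the password from the given file
    # instead of taking it on the command line.
    sqoop import \
        --connect "$DB_URL" \
        --username "$DB_USER" \
        --password-file /sqoop.pwd \
        --table "$1"
}
```

Calling import_table employees would then run the import for a table named employees, assuming such a table exists in your database.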

The password file is stored in HDFS, not on the local file system. It is given 400 permission (read-only, owner only) so that it is not exposed to other users. Here are the steps to create the password file.

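1. Run vi sqoop.pwd, enter the password and save the file. Sqoop reads the entire file content as the password, so the file must not end with a trailing newline or other extra whitespace.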

2. Transfer the sqoop.pwd file to HDFS with the command
hadoop fs -put sqoop.pwd /sqoop.pwd

3. Restrict the permissions of the sqoop.pwd file in HDFS so that only the owner can read it.
hadoop fs -chmod 400 /sqoop.pwd

4. Remove the local (non-HDFS) copy of sqoop.pwd from the current directory with
rm sqoop.pwd
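
The four steps above can be sketched as one small script. This is only an illustration under a few assumptions: the password comes from a hypothetical SQOOP_DB_PASSWORD variable (with a dummy default), printf is used instead of vi so that no trailing newline ends up in the file, and the HDFS steps are skipped when no hadoop client is on the PATH.

```shell
#!/bin/sh
# Sketch only: SQOOP_DB_PASSWORD and its default value are placeholders.
SQOOP_DB_PASSWORD=${SQOOP_DB_PASSWORD:-changeme}
PASSWORD_FILE=sqoop.pwd

# Step 1: write the password without a trailing newline.
# umask 077 keeps the local copy unreadable for other users.
umask 077
printf '%s' "$SQOOP_DB_PASSWORD" > "$PASSWORD_FILE"

# Steps 2-4 need a Hadoop client; each one runs only if the previous succeeded.
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -put "$PASSWORD_FILE" /sqoop.pwd &&   # step 2: copy to HDFS
    hadoop fs -chmod 400 /sqoop.pwd &&              # step 3: owner read-only
    rm -f "$PASSWORD_FILE"                          # step 4: drop local copy
fi
```

Chaining the commands with && means the local file is removed only after the upload and the permission change have both worked.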

Option File Creation
An option file holds the arguments that are common to the Sqoop commands you run from the command line, with one option or value per line. It can be created with the vi editor.
--------------------------
Option-file.txt
--------------------------
import
--connect
jdbc:mysql://hostname-or-ip/<dbname>
--username
<username>
--password-file
/sqoop.pwd

Save this file.

The option file can contain comments (lines starting with #) and empty lines; Sqoop ignores both.
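
If you prefer not to type the file in an editor, the same option file can be written from the shell. The sketch below uses a quoted heredoc so that the <dbname> and <username> placeholders are written literally; the comment lines are only there to show that Sqoop skips them, and their wording is my own.

```shell
#!/bin/sh
# Write the option file; 'EOF' is quoted so nothing inside is expanded.
cat > Option-file.txt <<'EOF'
# Tool to run
import

# Connection details (replace the placeholders)
--connect
jdbc:mysql://hostname-or-ip/<dbname>
--username
<username>

# Password is read from this HDFS file
--password-file
/sqoop.pwd
EOF
```

Each option and each value sits on its own line, and the comments are on lines of their own rather than mixed in with option text.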

Final Run
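Once the option file and the password file have been created and put in place, the Sqoop command is run by passing the option file with the --options-file argument and adding the job-specific arguments after it.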
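
A minimal sketch of such a run is shown here. The run_import helper, the table name and the target directory are assumptions for illustration and do not come from the original post; --options-file, --table and --target-dir are standard Sqoop import arguments.

```shell
#!/bin/sh
# Sketch only: run_import and its arguments are hypothetical.
OPTIONS_FILE=Option-file.txt

run_import() {
    # $1 = table to import, $2 = HDFS directory to write the data to
    # The tool name (import), connection string, user name and password file
    # all come from the option file; only job-specific arguments are added.
    sqoop --options-file "$OPTIONS_FILE" \
        --table "$1" \
        --target-dir "$2"
}
```

For example, run_import employees /data/employees would import a table named employees into the HDFS directory /data/employees.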

