Apache Sqoop Job – How to create and run?

What is a Sqoop Job?
Sqoop's incremental load mode is useful, but it has one drawback: you have to remember the last successfully imported record ID or modification time. The next incremental run must pick up from where the last successful import ended, so that last value has to be stored somewhere, which is not ideal when you want to automate the whole process. Sqoop has a good solution for this problem: the Sqoop job. A Sqoop job is a saved command, much like a small shell script, that can be run manually whenever required or scheduled with cron to run automatically at specified intervals.

The Sqoop metastore keeps track of all saved jobs, which is what allows a Sqoop job to preserve the last successfully retrieved value. By default, the metastore lives in the .sqoop directory under the user's home directory and is used only for that user's own jobs.
Creating Sqoop Job
A Sqoop job is created with the sqoop job --create command.
[Image: sqoop-job-creation]
Here import-employees is the Sqoop job name.
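
As a rough sketch of what a job creation command can look like, the script below saves an incremental import under the job name import-employees. Only the job name comes from this post; the JDBC URL, user name, table, check column, and target directory are hypothetical placeholders, and the create_job helper exists only so the command can be previewed on a machine without Sqoop.

```shell
#!/bin/sh
# Sketch only: host, database, user, table, column and directory names
# below are hypothetical; replace them with your own.
JOB_NAME="import-employees"

# create_job LAUNCHER...
# Pass `sqoop` to really create the job, or `echo sqoop` to preview the command.
create_job() {
  "$@" job --create "$JOB_NAME" \
    -- import \
    --connect "jdbc:mysql://dbhost/company" \
    --username sqoop_user \
    --table employees \
    --target-dir /user/hadoop/employees \
    --incremental append \
    --check-column id \
    --last-value 0
}

if command -v sqoop >/dev/null 2>&1; then
  create_job sqoop        # saves the job definition in the metastore
else
  create_job echo sqoop   # Sqoop not installed: just print the command
fi
```

Note the standalone -- followed by a space before import: it separates the job options from the tool and arguments being saved. No password is given here, which matches the behaviour described in this post of being prompted when the job is executed.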

The sqoop job --create command creates the Sqoop job and adds it to the metastore. Other Sqoop job commands are listed below.

  1. List all jobs in the metastore
    $ sqoop job --list
  2. Execute a Sqoop job
    $ sqoop job --exec <sqoop-job-name>
    During job execution, Sqoop will prompt for the password.
  3. Show metadata information about a job
    $ sqoop job --show <sqoop-job-name>
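
The commands above can be combined into a small wrapper that runs a job only if it is actually saved in the metastore. This is a minimal sketch: job_exists is a hypothetical helper, and it assumes that sqoop job --list prints each job name on its own line (possibly indented).

```shell
#!/bin/sh
JOB_NAME="import-employees"

# job_exists NAME
# Reads `sqoop job --list` style output on stdin and succeeds if some line,
# after trimming surrounding whitespace, is exactly NAME.
job_exists() {
  awk -v name="$1" '
    { gsub(/^[ \t]+|[ \t]+$/, "") }   # trim leading/trailing blanks
    $0 == name { found = 1 }
    END { exit !found }
  '
}

if command -v sqoop >/dev/null 2>&1; then
  if sqoop job --list | job_exists "$JOB_NAME"; then
    sqoop job --exec "$JOB_NAME"      # prompts for the password
  else
    echo "Job '$JOB_NAME' is not in the metastore" >&2
  fi
else
  echo "sqoop not found on PATH; nothing to run" >&2
fi
```

Because the job still prompts for a password, a wrapper like this suits manual runs; unattended cron runs need the password handled separately.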

[Image: sqoop-job-properties]
In the metadata above you can see incremental.last.value=1. This value is actually the time that the command was executed, not the last value seen in the RDBMS table. In real-world use, a Sqoop job is run with an options file so that the password stays hidden. I will cover this in the next post.
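
To check the stored value without reading through the whole property list, the output of sqoop job --show can be filtered. The sketch below assumes the property appears on its own line as key and value separated by = (with or without surrounding spaces); extract_last_value is a hypothetical helper name.

```shell
#!/bin/sh
JOB_NAME="import-employees"

# extract_last_value
# Reads `sqoop job --show` style output on stdin and prints the value of
# incremental.last.value, if present.
extract_last_value() {
  awk '
    tolower($0) ~ /^[ \t]*incremental\.last\.value[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")   # drop the key and the "=" sign
      sub(/[ \t]+$/, "")         # drop trailing blanks
      print
    }
  '
}

if command -v sqoop >/dev/null 2>&1; then
  sqoop job --show "$JOB_NAME" | extract_last_value
else
  echo "sqoop not found on PATH; nothing to show" >&2
fi
```

Running this before and after sqoop job --exec is a simple way to confirm that the metastore has recorded a new value.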
