Thursday, February 23, 2017

Sqoop 1.99.7 Installation


Expecting Sqoop to work out of the box, I downloaded sqoop-1.99.7-bin-hadoop200.tar.gz and followed the installation instructions for Sqoop 1.99.1 and other earlier releases. They were the first search results, and the instructions for other versions could be reached just by changing the version number in the URL. Installation documentation was available up to version 1.99.6 at https://sqoop.apache.org/docs/1.99.6/Installation.html, but I could not find the 1.99.7 documentation at a similar URL. Following the available documentation, I ran into ClassNotFoundExceptions related to Hadoop configuration when I tried to start the Sqoop server.

The documentation up to 1.99.6 refers to a catalina.properties file that does not exist in the 1.99.7 distribution. A web search led to several solutions that did not work. Some of the working ones suggested copying the required Hadoop libraries into the Sqoop lib directory. Finally I stumbled across the 1.99.7 installation instructions at https://sqoop.apache.org/docs/1.99.7/admin/Installation.html. That documentation is very clear that the Hadoop-related environment variables have to be set. Based on these instructions I added HADOOP_HOME to my user profile, and then encountered the following error:

Caused by: org.apache.sqoop.common.SqoopException: MAPREDUCE_0002:Failure on submission engine initialization - Invalid Hadoop configuration directory (not a directory or permission issues): /etc/hadoop/conf/
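The HADOOP_HOME step that preceded this error can be sanity-checked with a small shell function. This is only a sketch: it assumes, as I read the 1.99.7 instructions, that Sqoop derives its Hadoop library locations from the share/hadoop/common, hdfs, mapreduce and yarn directories under HADOOP_HOME. The function name check_hadoop_home and the /opt/hadoop path are my own placeholders, not anything shipped with Sqoop.

```shell
#!/bin/sh
# Sketch: report which of the library directories that Sqoop 1.99.7 is
# assumed to derive from HADOOP_HOME are missing under a candidate path.
# check_hadoop_home is a hypothetical helper, not part of Sqoop or Hadoop.
check_hadoop_home() {
    hadoop_home="$1"
    missing=0
    for sub in common hdfs mapreduce yarn; do
        if [ ! -d "$hadoop_home/share/hadoop/$sub" ]; then
            echo "missing: $hadoop_home/share/hadoop/$sub"
            missing=1
        fi
    done
    # 0 when all four directories exist, 1 otherwise
    return "$missing"
}

# Example lines for the user profile (the path is a placeholder):
#   export HADOOP_HOME=/opt/hadoop
#   check_hadoop_home "$HADOOP_HOME"
```

If the function prints nothing and returns 0, the directory layout matches what the instructions describe. It does not prove that the jars inside are the right versions.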

Based on the solution presented at http://brianoneill.blogspot.com/2014/10/sqoop-1993-w-hadoop-2-installation.html, I modified the property org.apache.sqoop.submission.engine.mapreduce.configuration.directory in the <SQOOP_HOME>/conf/sqoop.properties file to point to the correct Hadoop configuration directory:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=<location of the Hadoop configuration, e.g. /etc/hadoop/conf>
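The same edit can be scripted. The sketch below first checks the two conditions the MAPREDUCE_0002 message names (not a directory, or not readable) and only then rewrites the property line. The function name set_hadoop_conf_dir is hypothetical, and the script assumes the property already appears uncommented in sqoop.properties and that the directory path contains no "|" or "&" characters.

```shell
#!/bin/sh
# Property named in this post; everything else here is an illustrative sketch.
PROP_KEY="org.apache.sqoop.submission.engine.mapreduce.configuration.directory"

# set_hadoop_conf_dir <path to sqoop.properties> <hadoop configuration dir>
# Hypothetical helper: points the property at the given directory.
set_hadoop_conf_dir() {
    props="$1"
    conf_dir="$2"
    # Refuse anything that is not a readable directory.
    if [ ! -d "$conf_dir" ] || [ ! -r "$conf_dir" ]; then
        echo "not a readable directory: $conf_dir" >&2
        return 1
    fi
    tmp_file="$props.tmp.$$"
    # Rewrite only the line that starts with the key; other lines pass
    # through unchanged. If the key is absent, the file is left as it was.
    sed "s|^${PROP_KEY}=.*|${PROP_KEY}=${conf_dir}|" "$props" > "$tmp_file" &&
        mv "$tmp_file" "$props"
}

# Example (both paths are placeholders):
#   set_hadoop_conf_dir "$SQOOP_HOME/conf/sqoop.properties" "$HADOOP_HOME/etc/hadoop"
```

Writing to a temporary file and moving it into place avoids relying on sed -i, which behaves differently between GNU and BSD sed.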

Finally, <SQOOP_HOME>/sqoop.sh server start resulted in a successful start of the Sqoop server, with the following output:

Setting conf dir: bin/../conf
Sqoop home directory: <sqoop home>/sqoop
Starting the Sqoop2 server...
0    [main] INFO  org.apache.sqoop.core.SqoopServer  - Initializing Sqoop server.
7    [main] INFO  org.apache.sqoop.core.PropertiesConfigurationProvider  - Starting config file poller thread

Sqoop2 server started.