Sqoop

Sqoop lets you easily import data from relational databases into HDFS.

We have already deployed the Sqoop connectors for the following databases:

  • MySQL / MariaDB

  • PostgreSQL

  • Microsoft SQL Server

  • Oracle 18c

You can therefore use Sqoop out of the box to import data from any of these databases, for example from PostgreSQL:

sqoop import \
    --username ${USER} --password ${PASSWORD} \
    --connect jdbc:postgresql://${SERVER}/${DB} \
    --table mytable \
    --target-dir /user/username/mytable \
    --num-mappers 1
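The command above targets PostgreSQL. For the other supported databases, the main thing that changes is the JDBC URL passed to --connect. The following sketch shows the usual URL shapes. The jdbc_url function is a hypothetical helper, not something shipped with Sqoop. It assumes that MariaDB is reached through the MySQL URL scheme and that the Oracle database is addressed by service name, so check the exact form against the driver documentation.

```shell
# Hypothetical helper: print the JDBC URL for one of the supported databases.
# Arguments: database type, server (host or host:port), database or service name.
jdbc_url() {
    db_type="$1"
    server="$2"
    db="$3"
    case "${db_type}" in
        # Assumption: MariaDB is accessed through the MySQL connector.
        mysql|mariadb) echo "jdbc:mysql://${server}/${db}" ;;
        postgresql)    echo "jdbc:postgresql://${server}/${db}" ;;
        sqlserver)     echo "jdbc:sqlserver://${server};databaseName=${db}" ;;
        # Assumption: the third argument is the Oracle service name.
        oracle)        echo "jdbc:oracle:thin:@//${server}/${db}" ;;
        *)
            echo "unsupported database type: ${db_type}" >&2
            return 1
            ;;
    esac
}
```

The result can be passed straight to Sqoop, for example with --connect "$(jdbc_url mysql "${SERVER}" "${DB}")". The double quotes matter for SQL Server, because its URL contains a semicolon.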

Note

We recommend that you use only one mapper process to avoid overloading your database.
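One way to make this recommendation hard to forget is to wrap the import in a small shell function that always passes --num-mappers 1. The sketch below is only an illustration: import_table and SQOOP_CMD are hypothetical names, not part of Sqoop. It uses Sqoop's -P option, which prompts for the password on the console instead of taking it on the command line.

```shell
# Hypothetical wrapper: run a Sqoop import that is pinned to a single mapper.
# Arguments: JDBC URL, database user, table name, HDFS target directory.
# Set SQOOP_CMD=echo to print the arguments instead of running Sqoop.
import_table() {
    jdbc_url="$1"
    db_user="$2"
    table="$3"
    target_dir="$4"
    # -P makes Sqoop prompt for the password on the console.
    "${SQOOP_CMD:-sqoop}" import \
        --connect "${jdbc_url}" \
        --username "${db_user}" -P \
        --table "${table}" \
        --target-dir "${target_dir}" \
        --num-mappers 1
}
```

A call could then look like import_table "jdbc:postgresql://${SERVER}/${DB}" "${USER}" mytable /user/username/mytable.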

If you need to import data from a different database, don’t hesitate to contact us.

For more information on using Sqoop, see the Sqoop Tutorial that we have prepared to get you started, as well as the Sqoop Guide in the CDH documentation.