Apache Sqoop Cookbook

Jarek Jarcec Cecho


Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop.

Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.

  • Transfer data from a single database table into your Hadoop ecosystem
  • Keep table data and Hadoop in sync by importing data incrementally
  • Import data from more than one database table
  • Customize transferred data by calling various database functions
  • Export generated, processed, or backed-up data from Hadoop to your database
  • Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler
  • Load data into Hadoop’s data warehouse (Hive) or database (HBase)
  • Handle installation, connection, and syntax issues common to specific database vendors
