MapReduce execution with Oozie

In this example, we enhance the "Yahoo! Finance – Map Reduce" example so that the job is executed with the help of Oozie.

To run a MapReduce job through Oozie, the jar file should be placed in a structured directory together with the supporting files that Oozie needs. Below is the folder structure:

[Image: Oozie Co-Ordinator folder structure]

All Oozie jobs should have the above directory structure, where:
lib is a directory that contains the jar file required to run the map and reduce jobs
a properties file (conventionally named job.properties in Oozie) contains the parameters required to run the job
workflow.xml is a file that describes the complete procedural flow of how the job needs to be run, including the input and output paths
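As a rough illustration of the layout described above, the following sketch builds a skeleton project directory locally. It is not the author's actual configuration: the file name job.properties follows the usual Oozie convention, and the host names, ports, input and output paths, and the mapper and reducer class names are all placeholder assumptions that would need to match your cluster and your jar.

```shell
#!/bin/sh
# Sketch only: create a skeleton Oozie project directory.
# All hosts, ports, paths, and class names below are placeholders.
set -eu

mkdir -p yahoofinance/lib   # the MapReduce jar would be copied into lib/

# Parameters for the job. The quoted heredoc keeps ${...} literal so that
# Oozie, not the shell, resolves the variables.
cat > yahoofinance/job.properties <<'EOF'
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
inputDir=${nameNode}/yahoofinance/input
outputDir=${nameNode}/yahoofinance/output
oozie.wf.application.path=${nameNode}/yahoofinance
EOF

# Workflow definition with a single map-reduce action.
cat > yahoofinance/workflow.xml <<'EOF'
<workflow-app xmlns="uri:oozie:workflow:0.2" name="yahoofinance-wf">
    <start to="mr-node"/>
    <action name="mr-node">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapred.mapper.class</name>
                    <value>com.example.YahooFinanceMapper</value>
                </property>
                <property>
                    <name>mapred.reducer.class</name>
                    <value>com.example.YahooFinanceReducer</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>${inputDir}</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>${outputDir}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>MapReduce failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
EOF
```

The workflow has one action node; on success it transitions to the end node, and on error to a kill node that reports the failure message.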

Execution Procedure
1) Transfer the project directory “yahoofinance” to HDFS using the command below:

$ hadoop fs -put yahoofinance /

2) From the prompt, go to the Oozie directory that contains the oozie executable:

$ cd /usr/lib/oozie/bin

3) Execute the command below to run the program through Oozie:

$ oozie job -oozie http://localhost:11000/oozie -config /home/cloudera/yahoofinance/ -run

4) To check the information regarding the workflow:

$ oozie job -oozie http://localhost:11000/oozie -info <job_number>

5) To check the logs of the workflow:

$ oozie job -oozie http://localhost:11000/oozie -log <job_number>
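The submit, info, and log steps can also be wrapped in a few helper functions. This is a minimal sketch, not part of the original project: the function names are made up for illustration, and it assumes the Oozie server listens on the default port 11000. It relies on two documented behaviours of the oozie client: it reads the OOZIE_URL environment variable when -oozie is omitted, and a successful -run prints a line of the form "job: <id>".

```shell
#!/bin/sh
# Sketch only: helper functions around the oozie client.
# The oozie client falls back to OOZIE_URL when -oozie is not given.
export OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Read "oozie job -run" output on stdin and print only the job ID
# from the "job: <id>" line.
extract_job_id() {
    sed -n 's/^job: *//p' | head -n 1
}

# Submit the workflow and print its job ID.
# $1 is the path to the properties file (hypothetical argument).
submit_workflow() {
    oozie job -config "$1" -run | extract_job_id
}

# Show the status and then the logs of a job.
# $1 is the job ID returned by submit_workflow.
show_job() {
    oozie job -info "$1"
    oozie job -log "$1"
}
```

With these defined, JOB_ID=$(submit_workflow <properties_file>) followed by show_job "$JOB_ID" covers steps 3 to 5 without copying the job ID by hand.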

Running the workflow through Oozie will generate the below files:
1) Output file
2) Success file
3) Log file
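One way to confirm these files from the command line is sketched below. The helper names are hypothetical, and the output directory is whatever workflow.xml configures, which is not shown here. The sketch assumes the success file is the _SUCCESS marker that Hadoop writes into the output directory when a job completes, and that the results are in the usual part-* files.

```shell
#!/bin/sh
# Sketch only: inspect the job's output directory in HDFS.
# $1 is the output directory configured in workflow.xml (placeholder).

# Succeeds only if the _SUCCESS marker exists in the output directory.
check_success() {
    hadoop fs -test -e "$1/_SUCCESS"
}

# Print the job results; the glob is quoted so that Hadoop, not the
# local shell, expands it against HDFS.
print_output() {
    hadoop fs -cat "$1/part-*"
}
```

For example, check_success <output_dir> && print_output <output_dir> prints the results only when the job finished cleanly.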

All the source code, along with the output, is available in the "Map Reduce execution with Oozie" project on GitHub.
