Monday, November 23, 2015

Reduce Side Join - Hadoop MapReduce



Design Pattern - REDUCE Side Join 


Use a reduce side join when you are joining two or more datasets and all of them are large, so that none is small enough to be loaded into memory for a map side join.

Datasets to be used

File Name - customerDetails.txt

Name,CustomerId

Example -

Aaron Hawkins,296334
Aaron Smayling,814503
Adam Bellavance,960803
Adam Hart,157942
Adam Shillingsburg,713629
Adrian Barton,525624
Adrian Hane,434995
Adrian Shami,813495


File Name - customerTransaction.txt

transaction details ..., Name, ... (in the sample rows, Name is the 12th comma-separated field)

Example - 

1,3,13/10/2010,Low,6,261.54,0.04,Regular Air,-213.25,38.94,35,Muhammed MacIntyre,Nunavut,Nunavut,Small Business,Office Supplies,Storage & Organization,"Eldon Base for stackable storage shelf, platinum",Large Box,0.8,20/10/2010
49,293,01/10/2012,High,49,10123.02,0.07,Delivery Truck,457.81,208.16,68.02,Barry French,Nunavut,Nunavut,Consumer,Office Supplies,Appliances,"1.7 Cubic Foot Compact ""Cube"" Office Refrigerators",Jumbo Drum,0.58,02/10/2012
50,293,01/10/2012,High,27,244.57,0.01,Regular Air,46.71,8.69,2.99,Barry French,Nunavut,Nunavut,Consumer,Office Supplies,Binders and Binder Accessories,"Cardinal Slant-D® Ring Binder, Heavy Gauge Vinyl",Small Box,0.39,03/10/2012
80,483,10/07/2011,High,30,4965.7595,0.08,Regular Air,1198.97,195.99,3.99,Clay Rozendal,Nunavut,Nunavut,Corporate,Technology,Telephones and Communication,R380,Small Box,0.58,12/07/2011
3866,27559,30/10/2011,High,38,465.9,0.05,Regular Air,79.34,12.28,4.86,Aaron Hawkins,Nova Scotia,Atlantic,Home Office,Office Supplies,Paper,Xerox 1933,Small Box,0.38,31/10/2011
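
To join on the customer name, each transaction line first has to be reduced to its join key. The following is a minimal sketch, assuming the name is always the 12th comma-separated field (index 11), as in the sample rows above, and that no field before it contains a quoted comma. The class and method names are hypothetical.

```java
class TransactionKey {
    // Position of the customer name in a transaction line
    // (assumption taken from the sample rows, not from a documented schema).
    static final int NAME_INDEX = 11;

    // Returns the customer name, or null if the line has too few fields.
    static String extractName(String line) {
        // Limit the split so the quoted product description that follows
        // the name is left in one piece and never shifts the indexes.
        String[] fields = line.split(",", NAME_INDEX + 2);
        if (fields.length <= NAME_INDEX) {
            return null;
        }
        return fields[NAME_INDEX].trim();
    }

    public static void main(String[] args) {
        String line = "3866,27559,30/10/2011,High,38,465.9,0.05,Regular Air,79.34,12.28,4.86,"
                + "Aaron Hawkins,Nova Scotia,Atlantic,Home Office,Office Supplies,Paper,"
                + "Xerox 1933,Small Box,0.38,31/10/2011";
        System.out.println(extractName(line)); // prints "Aaron Hawkins"
    }
}
```

A plain split is enough here only because the quoted fields come after the name; a real CSV parser would be the safer choice if that ever changes.
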

Here we attach a marker (tag) to the output values of each mapper so that the reducer can identify which mapper emitted each value.

MapReduce Program :
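
A minimal sketch of the tagging approach, written in plain Java so that it runs without a Hadoop cluster. It simulates the two mappers, the shuffle (grouping by key) and the reducer in memory; it is an illustration of the pattern, not a Hadoop job. The tag strings, class name, method names and the output format (CustomerId, Name, number of transactions) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceSideJoinSketch {
    // Markers that tell the reducer which mapper produced a value (hypothetical names).
    static final String CUST_TAG = "CUST";
    static final String TXN_TAG = "TXN";

    // Mapper for customerDetails.txt: "Name,CustomerId" -> (Name, "CUST\tCustomerId").
    static String[] mapCustomer(String line) {
        String[] f = line.split(",");
        return new String[] {f[0].trim(), CUST_TAG + "\t" + f[1].trim()};
    }

    // Mapper for customerTransaction.txt: the key is the name (index 11, assumed),
    // the value is the tagged first field, treated here as an opaque record id.
    static String[] mapTransaction(String line) {
        String[] f = line.split(",", 13);
        return new String[] {f[11].trim(), TXN_TAG + "\t" + f[0].trim()};
    }

    // Reducer: receives every tagged value for one name and uses the tag
    // to separate the customer id from the transaction records.
    static String reduce(String name, List<String> values) {
        String customerId = null;
        int transactionCount = 0;
        for (String value : values) {
            String[] parts = value.split("\t", 2);
            if (CUST_TAG.equals(parts[0])) {
                customerId = parts[1];
            } else if (TXN_TAG.equals(parts[0])) {
                transactionCount++;
            }
        }
        // Inner join: drop names that appear in only one of the two files.
        if (customerId == null || transactionCount == 0) {
            return null;
        }
        return customerId + "," + name + "," + transactionCount;
    }

    // Simulates the shuffle phase: group mapper output by key, then reduce each group.
    static List<String> join(List<String> customers, List<String> transactions) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : customers) {
            String[] kv = mapCustomer(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        for (String line : transactions) {
            String[] kv = mapTransaction(line);
            grouped.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            String joined = reduce(entry.getKey(), entry.getValue());
            if (joined != null) {
                output.add(joined);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> customers = Arrays.asList("Aaron Hawkins,296334", "Adam Hart,157942");
        List<String> transactions = Arrays.asList(
                "3866,27559,30/10/2011,High,38,465.9,0.05,Regular Air,79.34,12.28,4.86,"
                        + "Aaron Hawkins,Nova Scotia,Atlantic,Home Office,Office Supplies,Paper,"
                        + "Xerox 1933,Small Box,0.38,31/10/2011");
        for (String row : join(customers, transactions)) {
            System.out.println(row); // prints "296334,Aaron Hawkins,1"
        }
    }
}
```

In an actual Hadoop job the two map methods would typically live in separate Mapper classes, one per input file, and the framework would perform the grouping that the `join` method simulates here.
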
