I've been learning Hadoop from Tom White's book, Hadoop: The Definitive Guide. Chapter 2 presents the first sample MapReduce program, MaxTemperature.
First off, to get the NCDC data set, I found the script on this page immensely useful:
https://gist.github.com/Alexander-Ignatyev/6478289
Note the updated FTP server listed in the comments section there.
I downloaded a few of the year.gz files, then unpacked them and concatenated them into a single file with
- gunzip *
- cat * > sample.txt
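If you'd rather do the unpack-and-concatenate step from Java (say, on a machine without gunzip), here is a minimal sketch. The class name NcdcConcat and its concat method are my own invention, not something from the book, and it assumes you pass the .gz files in the order you want them joined.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: decompresses several .gz files into one output file.
final class NcdcConcat {

    /** Decompresses each .gz file in list order and appends its bytes to target. */
    static void concat(List<Path> gzFiles, Path target) throws IOException {
        byte[] buffer = new byte[8192];
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path gz : gzFiles) {
                try (InputStream in = new GZIPInputStream(Files.newInputStream(gz))) {
                    int n;
                    // Copy the decompressed bytes of this file to the end of target.
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
            }
        }
    }
}
```

Unlike gunzip, this leaves the original .gz files in place, so the downloads can be reused.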
Anyhow, I am using Eclipse to code the MaxTemperature example, so I created a project for it. To add the JAR files so that Eclipse and Java will recognize the org.apache.hadoop.* packages:
Right-click the project > Properties > Java Build Path > Libraries
Click the "Add External JARs..." option, and add the following (paths are relative to the Hadoop installation directory):
- share/hadoop/common/hadoop-common-2.X.Y.jar
- share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.X.Y.jar
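To confirm the JARs actually made it onto the build path, a small throwaway class like the one below can help. This is only a sketch: ClasspathCheck and isOnClasspath are hypothetical names of mine, and I picked org.apache.hadoop.io.Text (from hadoop-common) and org.apache.hadoop.mapreduce.Mapper (from hadoop-mapreduce-client-core) as representative classes to probe.

```java
// Hypothetical helper: reports whether the Hadoop classes are visible to this project.
final class ClasspathCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] expected = {
            "org.apache.hadoop.io.Text",          // lives in hadoop-common
            "org.apache.hadoop.mapreduce.Mapper"  // lives in hadoop-mapreduce-client-core
        };
        for (String name : expected) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it from inside the Eclipse project (Run As > Java Application). A MISSING line suggests the corresponding JAR was not added under Libraries.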