Thanks to Sameer Farooqui from LinkedIn.
++ MapReduce Framework ++
Great 1-hour video introduction: http://nosqltapes.com/video/understanding-mapreduce-with-mike-miller
Read the famous 2004 paper from Google that kicked off the MapReduce revolution. It is very readable and can be digested in about 2-3 hours: http://research.google.com/archive/mapreduce.html
Here's a 33-minute video on what kinds of simple things you can do with MapReduce:
http://www.cloudera.com/videos/mapreduce_algorithms
Google's MapReduce course:
http://code.google.com/edu/parallel/mapreduce-tutorial.html
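To make the map, shuffle and reduce phases described in these resources a bit more concrete, here is a minimal in-memory sketch of the classic word-count job in plain Python. It is only an illustration of the programming model, not how Hadoop or Google's system is implemented; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(docs))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; here everything runs in one process so the three phases are easy to follow.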
++ Beginner Hadoop ++
Excellent beginner's video on understanding Hadoop, MapReduce and HDFS:
http://www.cloudera.com/protected/?resource=introduction-to-apache-mapreduce-and-hdfs
Understanding the Hadoop ecosystem:
http://www.cloudera.com/protected/?resource=apache-hadoop-ecosystem
++ HDFS ++
An easy 2-3 hour read about the Hadoop Distributed File System (HDFS):
http://www.aosabook.org/en/hdfs.html
++ Labs ++
Install VirtualBox on your laptop, get an Ubuntu virtual machine running, and follow this excellent tutorial to install your first Hadoop node:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
Then use this to scale your cluster to multiple nodes:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
Run a MapReduce job in Python on your cluster:
http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
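As a rough idea of what such a job looks like, here is a small word-count sketch in the style Hadoop Streaming expects: the mapper reads lines and writes tab-separated key/value lines, and the reducer reads those lines sorted by key. This is a sketch under assumptions, not the code from the tutorial; the file name wordcount_streaming.py and the "map"/"reduce" command-line argument are invented for this example.

```python
import sys
from itertools import groupby


def mapper(lines):
    """Turn each input line into one 'word<TAB>1' line per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts per word; input must already be sorted by word."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


# Hypothetical entry point: run as "python wordcount_streaming.py map"
# or "python wordcount_streaming.py reduce", reading from standard input.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = mapper if sys.argv[1] == "map" else reducer
    for output_line in step(sys.stdin):
        print(output_line)
```

Before submitting anything to the cluster, you can try the same pipeline locally with a shell pipe, where sort stands in for Hadoop's shuffle: cat input.txt | python wordcount_streaming.py map | sort | python wordcount_streaming.py reduce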
++ Bonus ++
Dynamo/Cassandra is a good alternative to Hadoop/HBase and is worth at least being familiar with: http://nosqltapes.com/video/understanding-dynamo-with-andy-gross