Abstract

Hadoop is a hot topic right now, and Microsoft is betting that Hadoop and SQL Server will go together like peanut butter and chocolate. This presentation will spend a few minutes on the theory behind Hadoop, but the rest will be a demo-driven look at how to get a handle on Hadoop before somebody introduces it into your enterprise.


Slides

The slides are available in HTML5 format. All modern browsers, including those on tablets and phones, should be able to navigate the slides successfully.

The slides are licensed under Creative Commons Attribution-ShareAlike.


Demo Code

The demonstration code is available in my GitHub repository. The solution and scripts give you an opportunity to play with Hadoop. You should first install the HDInsight Emulator.

The source code is licensed under the terms offered by the GPL.


Links And Further Information

Hadoop Implementations

If you want to get started with Hadoop, there are a number of pre-built sources available to you. Here are some that I can recommend through personal experience; I'm sure there are a lot of others out there (especially in the cloud services space), but these are ones I've at least played with.

Local sandboxes:

Online services:

Interesting Links

In my demo, I mention a tool by Darrin Ward that plots points by latitude and longitude on Google Maps. It also saves your work, so you can investigate the results from my demo more closely.

I make use of an excellent image from the MapReduceFoundation at Universität Passau. They have a few academic papers that I admittedly haven't read but which do sound interesting.

I make an off-hand mention of a Serializer/Deserializer (SerDe) for Hive which processes CSVs better than Hive's default. You can get csv-serde from GitHub and plug it into your Hadoop cluster.
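To see why a CSV-aware SerDe can matter, here is a minimal Python sketch. It is not part of the demo code and does not use Hive or csv-serde at all; it only contrasts a naive split on commas with quote-aware CSV parsing. The sample row is made up for illustration, and the assumption that a plain delimited reader behaves like the naive split is mine.

```python
import csv
import io

# Hypothetical sample row: the second field contains a comma inside quotes.
line = '42,"Raleigh, NC",35.78,-78.64'

# Naive delimiter split: breaks the quoted field in two (assumed to be
# roughly what a plain delimited-text reader does with this row).
naive_fields = line.split(",")

# Quote-aware parsing: keeps the quoted field intact, which is the kind of
# behavior a CSV-specific SerDe is meant to provide.
quoted_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
print(len(quoted_fields), quoted_fields)
```

The naive split yields five fields where the file really has four, so every column after the quoted one shifts over by one. That kind of misalignment is the problem quote-aware parsing avoids.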

Books and Other Resources

At this time, I can recommend the following books on Hadoop. Please note that with a fast-evolving technology, a book which is even a couple of years old can feel dated.

At this time, I have not read the following books, but I do own copies of them. As I get through these books, I'll move the ones I like up to the recommended section.

I learned a good deal from the Hortonworks tutorials, which include both written and video tutorials. They are a good place to start.