Monday, February 11, 2013

Install Hadoop on Windows 7

A Hadoop distribution is not available for Windows, so we have to tweak a few things to get it working.

Moreover, this was a personal learning exercise, so we don't have the luxury of multiple machines. We'll set up a single-node cluster (though it is hardly fair to call it a cluster).

If everything goes fine, which is not always straightforward, we should see the UI below, and that is the aha moment!

Name Node UI once successfully installed





Prerequisites:

Eclipse 3.3.2 Europa
Hadoop 0.19.1
JDK 1.6
Cygwin
Dell Latitude, Windows 7, i586, 2 GB
A .sh file editor such as EditRocket

Please note that many people run into problems with other versions (of Eclipse, Hadoop, etc.). The combination above has been tested successfully, so if you want to try the latest versions, do so at your own risk.
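
Before going further, it can help to confirm that the command-line prerequisites listed above are actually reachable from your shell. The snippet below is only a minimal sketch, not part of Hadoop or Cygwin: it assumes Python is available on the machine (it is not one of the prerequisites), and the helper name missing_tools and the command names "java" and "bash" are my own choices for illustration.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the command names from `tools` that cannot be found on the PATH.

    `which` is injectable so the lookup can be faked when testing;
    by default it is the standard library's shutil.which.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Assumed command names: "java" for the JDK, "bash" for Cygwin.
    missing = missing_tools(["java", "bash"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

This only checks that the commands can be found. It does not verify versions, so the JDK and Hadoop versions still need to be checked by hand against the list above.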

Tuesday, January 29, 2013

Introduction to Big Data

Big data are high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.


It has three main features that differentiate it from traditional data:
1. Velocity
2. Volume
3. Variety

Examples of big data include web logs, sensor networks (RFID tags), social networks, big social data analysis, Internet text and documents, Internet search indexing, call detail records, astronomy, atmospheric science, genomics, biogeochemical, biological, and other complex and often interdisciplinary scientific research, military surveillance, forecasting drive times for new home buyers, medical records, photography archives, video archives, and large-scale e-commerce.


To be continued...