Hadoop Platforms: The Elephants in the Room

"When there’s an elephant in the room, introduce him." -Randy Pausch

When people talk about Big Data, two major assumptions often come up. One: Hadoop comes to mind right alongside it, and the two are often treated as synonyms, which they are not. Big Data is the umbrella concept for handling enormous amounts of data arriving in different forms (structured and unstructured), independent of any particular technology or tool. Hadoop, in contrast, is a specific open source technology for dealing with these voluminous data sets.

But before we continue, and as a refresher, let’s remind ourselves what Hadoop is, in the project's own words: The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering ...