
Spark Word Count program in Python
Here is the word count program in Python using Spark (PySpark) and Hadoop (HDFS). In this tutorial, you will get to know how to process data in Spark using...
No Technical, No TechMax, Refer only TechBlog
Hadoop is a framework that deals with Big Data, but unlike other frameworks it is not a simple one: it has its own family for processing different things, which is tied...
Page Rank is a function that assigns a real number to each page in the Web. The intent is that the higher the Page Rank of a page, the more...
Algorithm for Natural Join: to compute the natural join of the relation R(A, B) with S(B, C), it is required to find tuples that agree on their B components, i.e., the second...
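A minimal sketch of the idea in plain Python, assuming both relations fit in memory as lists of tuples. The function name `natural_join` and the sample relations are illustrative only and do not come from the original post.

```python
from collections import defaultdict


def natural_join(r, s):
    """Join R(A, B) with S(B, C) on the shared attribute B."""
    # Group the C values of S by their B component.
    s_by_b = defaultdict(list)
    for b, c in s:
        s_by_b[b].append(c)
    # Pair every R tuple with each S tuple that agrees on B.
    return [(a, b, c) for a, b in r for c in s_by_b.get(b, [])]


# Hypothetical sample data: R(A, B) and S(B, C).
R = [(1, "x"), (2, "y"), (3, "x")]
S = [("x", 10), ("y", 20), ("z", 30)]
joined = natural_join(R, S)
```

Grouping S by B first plays the same role as using B as the key in a map step: only tuples sharing a key are ever compared.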
A dead end is a Web Page with no links out. The presence of dead ends will cause the Page Rank of some or all the pages to go to...
M is a matrix with element $m_{ij}$ in row i and column j. N is a matrix with element $n_{jk}$ in row j and column k. P is a matrix = MN...
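As a rough illustration, the product can be computed directly from the standard definition, where the element of P in row i and column k is the sum over j of the products of the elements of M and N. The helper name `mat_mul` and the sample matrices are assumptions made for this sketch.

```python
def mat_mul(m, n):
    """Multiply matrices given as lists of rows."""
    inner, cols = len(n), len(n[0])
    # Every row of M must have as many columns as N has rows.
    if any(len(row) != inner for row in m):
        raise ValueError("column count of M must equal row count of N")
    # Each element of P is a sum over the shared index j.
    return [
        [sum(row[j] * n[j][k] for j in range(inner)) for k in range(cols)]
        for row in m
    ]


# Hypothetical 2x2 example.
M = [[1, 2], [3, 4]]
N = [[5, 6], [7, 8]]
P = mat_mul(M, N)
```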
To estimate the number of different elements appearing in a stream, we can hash elements to integers interpreted as binary numbers. 2 raised to the power that is the longest...
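A minimal sketch of this kind of estimator, assuming the truncated sentence refers to the longest run of trailing zeros seen in any hash value, as in the Flajolet-Martin approach. The choice of MD5 as the hash, the 32-bit truncation, and the function names are assumptions for illustration; a practical version would combine many hash functions to reduce variance.

```python
import hashlib


def trailing_zeros(x):
    """Number of trailing zero bits in x (0 is treated as having none)."""
    if x == 0:
        return 0
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def fm_estimate(stream):
    """Estimate the number of distinct elements with a single hash function."""
    max_tail = 0
    for item in stream:
        digest = hashlib.md5(str(item).encode("utf-8")).digest()
        # Interpret the first 4 bytes of the digest as a 32-bit integer.
        value = int.from_bytes(digest[:4], "big")
        max_tail = max(max_tail, trailing_zeros(value))
    return 2 ** max_tail


# Repeated elements hash to the same value, so they do not change the estimate.
estimate = fm_estimate(["a", "b", "c", "a", "b", "a"])
```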
1) Manhattan Distance (L1) is given as $d(x, y) = \sum_i |x_i - y_i|$. Here, $d(x, y)$ = | 1 - 2 | + | 2 - 5 | + | 2 - 3 | ...
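The Manhattan distance is the sum of absolute coordinate differences, which can be sketched in a few lines of Python. The points below use only the three terms visible in the excerpt; the original example is truncated and may contain more coordinates, so the value computed here is for these three terms alone.

```python
def manhattan_distance(p, q):
    """Sum of absolute differences between corresponding coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))


# Only the three visible terms: |1 - 2| + |2 - 5| + |2 - 3|.
p = (1, 2, 2)
q = (2, 5, 3)
distance = manhattan_distance(p, q)  # 1 + 3 + 1 = 5
```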
Big Data is characterized by 3 V's, which are as follows: 1) Volume, 2) Velocity, 3) Variety. 1) Volume (Data at Rest) -> The name 'Big Data' itself is related...