
The Big Data Blog


Spark RDD for a Small Case Study
While working with the DataFrame and Dataset APIs, it is difficult to come back to RDDs. As everybody is focused on the Spark DataFrame API or Dataset API,...
May 28, 2020 · 2 min read
MapReduce Job Chaining with Bigram Count and Sorting Data Elements
This blog shows a Bigram Count program that sorts data using a Comparator, with a detailed explanation. As of now we have seen...
May 9, 2020 · 4 min read
Complex Data Types in Hive
Hive: a data warehouse tool on top of Hadoop HDFS (Hadoop Distributed File System) that supports most of the useful or, I should say, SQL-like...
May 1, 2020 · 2 min read
Mainframe (COBOL data) source in Spark
As Apache Spark has become the first choice for data processing and analytics because it is scalable, easy to code, supports a lot of file...
Oct 24, 2019 · 1 min read


A Simple Big Data Pipeline with MySQL
An efficient implementation of a data pipeline increases the performance of the data warehouse and makes it easier to generate quality KPIs. As you already know...
Oct 19, 2019 · 1 min read


ThingsBoard: Open-source IoT Platform
ThingsBoard is a game changer in the IoT space, with both open-source and enterprise versions. It's easy to use the customized open-source version as...
Oct 19, 2019 · 1 min read


Security on Hadoop Cluster
Best practices and points to keep in mind while setting up a secure big data Hadoop cluster. Welcome to #anydataflow. As the internet world...
Oct 19, 2019 · 1 min read


HDP 2.6 on Google Cloud Platform in 5 minutes
Launch a Hadoop cluster on Google Cloud Platform in a few minutes with less configuration and installation of tools. As you are all aware that...
Oct 19, 2019 · 1 min read