• Home
  • About Me

Software Theory and Practice

Analytics Archive

01 August 2018
Analytics

How to index geospatial data with Apache Spark Solr Connector and query with Solr Client

Alvin Henrick Leave a Comment

This post describes how to ingest geospatial data into Apache Solr for search and query. The pipeline is built with Apache Spark and the Apache Spark Solr connector. The purpose of this project is to ingest and index

16 May 2017
Analytics

Apache Spark Analytical Window Functions

Alvin Henrick 1 Comment

It’s been a while since I wrote a post, so here is an interesting one that will help you do some cool stuff with Spark and windowing functions. I would also like to thank my colleague Suresh for helping me

10 July 2016
Analytics

Apache Spark User Defined Functions

Alvin Henrick 1 Comment

I have been working with Apache Spark for a while now and would like to share some UDF tips and tricks I have learned over the past year. Below is the sample data (i.e. people.json) used to demonstrate examples of UDF

26 November 2015
Analytics

Query Nested JSON via Spark SQL

Alvin Henrick Leave a Comment

It’s been a while since I wrote a blog post, so here you go. I have been working with Apache Spark recently and had to query a complex nested JSON data set. I encountered some challenges and ended up learning currently the best

18 August 2014
Analytics

Apache Storm and Kafka Cluster with Docker

Alvin Henrick 18 Comments

This post is all about real-time analytics on large data sets. I am sure everyone has heard about Apache Kafka (a distributed publish-subscribe messaging broker) and Apache Storm (a distributed real-time computation system), and if you were disappointed


© Copyright 2015. Alvin Henrick