The Big Data Strategy 2015 conference took place in Vilnius on October 5th, 2015. Vinted engineers gave two talks about our Hadoop-based data warehouse, metric computation, and the lessons we learnt while implementing it.

Rocky road to Big Data analytics

Saulius and Jonas gave a talk about rewriting our initial, MySQL-based analytics solution to cope with an ever-growing amount of data. This was a ‘lessons learnt’ talk, in which we described how we:

  • Designed our Apache Kafka-based data ingestion pipeline,
  • Evaluated several SQL-on-Hadoop engines in order to compute business metrics,
  • Arrived at the Apache Spark and Apache HBase duo as our final stack for computing and serving the metrics,
  • Use notebooks such as Apache Zeppelin for interactive data analysis.

Check out the slide deck if you want to learn more:

Building a Simple, Flexible and Scalable Data-cubing Solution with Spark, Algebird and HBase

In the second talk, Vidmantas described how we implemented our own data cubing solution. It now allows our users to interactively ‘slice and dice’ business metrics by various dimensions. The talk detailed how we:

  • Perform metric computations using the brilliant Algebird library from Twitter,
  • Store and serve precomputed metrics to our users using Apache HBase,
  • Implemented simple data cubing using Apache Spark and made it fast.
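
To give a rough idea of what data cubing means here, below is a minimal sketch in plain Python. It is not the code from the talk and uses neither Spark nor Algebird. It only illustrates the general idea: every record contributes to each roll-up of its dimensions, and partial aggregates are combined with an associative merge, which is the property that lets this kind of computation be distributed. The dimension values (country, platform), the amounts, and all function names are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # wildcard marking a dimension that has been rolled up


def cube_keys(dims):
    """Yield every roll-up of a dimension tuple, replacing subsets with ALL."""
    n = len(dims)
    for r in range(n + 1):
        for kept in combinations(range(n), r):
            yield tuple(dims[i] if i in kept else ALL for i in range(n))


def merge(a, b):
    """Associative, commutative merge of two (count, total) aggregates."""
    return (a[0] + b[0], a[1] + b[1])


def build_cube(records):
    """Aggregate (dims, amount) records into a dict keyed by roll-up."""
    cube = defaultdict(lambda: (0, 0.0))
    for dims, amount in records:
        for key in cube_keys(dims):
            cube[key] = merge(cube[key], (1, amount))
    return dict(cube)


# Hypothetical records: ((country, platform), amount)
records = [
    (("LT", "ios"), 10.0),
    (("LT", "android"), 5.0),
    (("DE", "ios"), 7.5),
]
cube = build_cube(records)
```

With the cube precomputed, a ‘slice’ such as all iOS activity across countries becomes a single key lookup, `cube[("*", "ios")]`, rather than a scan over the raw records. In this sketch the cube is an in-memory dictionary; a key-value store could play that role in a real system.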

Check out the slide deck if you want to learn more:

We are grateful to our friends at Adform for inviting us to speak. See you at Big Data Strategy 2016!