What you’ll learn from this workshop:
- A basic understanding of the blockchain: how transactions are structured on the chain and the concept of an unspent transaction output (UTXO).
- How to subscribe to a high-throughput stream of messages from an Apache Kafka topic, a technology often used in ETL pipelines.
- The concepts behind Bigtable-style storage for petabyte-scale databases with efficient random access; we’ll utilise Apache Accumulo to store the blockchain in the workshop.
- How to perform analytics over a massive data warehouse, in our case leveraging Apache Spark to analyse the monetary transactions.

What you’ll need:
- A Linux-based laptop or a MacBook.
- Coding will be in Java, so you’ll need the latest JDK installed and an IDE such as IntelliJ or Eclipse.
- Comfort programming in Java to at least a basic level.
Dan is a Big Data consultant operating predominantly as a Technical Architect. He has led the development of the UK Hydrographic Office’s Hadoop-as-a-Service offering and built streaming software frameworks long before the rise of Apache Spark and Storm. More recently he has helped build a real-time alerting pipeline that processed in excess of 150,000 events per second utilising Apache Kafka and Spark Streaming.
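As a small taste of the first learning outcome, here is a minimal Java sketch of the UTXO idea: a transaction spends the outputs of earlier transactions and creates new outputs, and replaying the chain yields the set of outputs that remain unspent. All names, values, and types below are illustrative only, not the workshop’s actual code or the real Bitcoin data structures.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A toy model of unspent-transaction-output (UTXO) accounting.
// Everything here is illustrative, not the workshop's real code.
public class UtxoSketch {

    // An output locks some value (in satoshis) to an address.
    public record Output(String address, long satoshis) {}

    // An outpoint identifies one specific output of one specific transaction.
    public record Outpoint(String txId, int index) {}

    // A transaction spends existing outpoints and creates fresh outputs.
    public record Transaction(String id, List<Outpoint> inputs, List<Output> outputs) {}

    // Replay a chain of transactions: every output joins the UTXO set,
    // and every input removes the output it spends.
    public static Map<Outpoint, Output> applyChain(List<Transaction> chain) {
        Map<Outpoint, Output> utxo = new LinkedHashMap<>();
        for (Transaction tx : chain) {
            tx.inputs().forEach(utxo::remove);
            for (int i = 0; i < tx.outputs().size(); i++) {
                utxo.put(new Outpoint(tx.id(), i), tx.outputs().get(i));
            }
        }
        return utxo;
    }

    // A two-transaction example: a coinbase mints 50 BTC to Alice, then
    // Alice spends it, sending 30 BTC to Bob and 20 BTC back as change.
    public static List<Transaction> demoChain() {
        Transaction coinbase = new Transaction("tx1", List.of(),
                List.of(new Output("alice", 50_0000_0000L)));
        Transaction spend = new Transaction("tx2",
                List.of(new Outpoint("tx1", 0)),
                List.of(new Output("bob", 30_0000_0000L),
                        new Output("alice", 20_0000_0000L)));
        return List.of(coinbase, spend);
    }

    public static void main(String[] args) {
        // tx1's only output is spent by tx2, so just tx2's outputs survive.
        applyChain(demoChain()).forEach((op, out) ->
                System.out.println(op.txId() + ":" + op.index()
                        + " -> " + out.address() + " " + out.satoshis()));
    }
}
```

Running this prints only tx2’s two outputs: the coinbase output was consumed when Alice spent it, which is exactly the bookkeeping we’ll scale up with Accumulo and Spark in the workshop.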