Making Sense of Big Data File Formats

Modern applications generate and manipulate a lot of data.
The growth rate of this data is staggering. Unfortunately, large datasets can be expensive to store and slow to process. In fact, memory speed has been improving at a much slower rate than CPU speed. Thankfully, various file formats designed for big data systems can help. In this webinar, you will learn about popular file formats suitable for big data systems, with a focus on Parquet. Through live-coded examples in Python, you will learn the good, the bad, and the ugly, and how you can make use of Parquet in practice.
Raoul-Gabriel Urma is the director of Cambridge Spark, a leading learning community for data scientists and developers in the UK. He is also Chairman and co-founder of Cambridge Coding Academy, a growing community of young coders and pre-university students. Raoul is the author of the bestselling programming book “Java 8 in Action”, which sold over 25,000 copies globally. He completed a PhD in Computer Science at the University of Cambridge and has delivered over 100 technical talks at international conferences. He has worked for Google, eBay, Oracle, and Goldman Sachs, and is a Fellow of the Royal Society of Arts.