Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud
J**W
Only 1 chapter in, but so far so good.
Since this book is brand new and has no reviews, I wanted to give my first impression. I have only finished the first chapter, but so far so good. This is professionally written and there is a bit of humor, so the read isn’t dry and boring. He covers background, history and related topics to give you a good understanding, but not in so much detail that you want to skip it. I will try to update my review as I progress through the book. But my initial feeling says to go for it if you are interested in “big data.”
D**B
Nicely done
Wishing to learn Spark, I signed up for the Databricks Associate Spark Developer certification exam - Python flavor - and ordered a number of Spark books off Amazon, avoiding Scala-based titles and older titles pre-dating the DataFrame API. I ended up with the following list:
"Learning PySpark" by Drabas and Lee, published by Packt in 2017
"Frank Kane's Taming Big Data with Apache Spark and Python" by (no surprise) Kane, Packt, 2017
"Data Analytics with Spark Using Python" by Aven, Addison Wesley, 2018
"PySpark Cookbook" by (once again) Drabas and Lee, Packt, 2018
"Developing Spark Applications with Python" by Morera and Campos, self-published in 2019
"PySpark Recipes" by Mishra, Apress, 2017
"Learning Spark" by Damji et al., O'Reilly, 2020
"Beginning Apache Spark Using Azure Databricks" by Ilijason, Apress, 2020
"Spark: The Definitive Guide" by Chambers and Zaharia, O'Reilly, 2018
Databricks themselves point to "Learning Spark" and "Spark: The Definitive Guide" as preparation aids, so I started with these, skimming both books - and strongly preferring "The Definitive Guide" - and then took a look at the others.
Ilijason's book is a pleasant surprise. In my eyes, Apress used to publish decent technical books but is now competing with Packt in the race to the bottom. Mishra's book in my list is an example of that trend. Ilijason's, however, is an exception. Far from being the value-reducing copy-paste of Spark docs seen in many other books, here I see an original, well-constructed, accessible narrative, explaining both Spark and the practicalities involved in actually getting started with Spark. (I agree that Databricks is the easiest route.) The book is undermined by low production values - a few screenshots would be useful, and occasionally you see an odd line break confuse a code snippet - but I am not going to dock a star because of that. "Beginning Apache Spark Using Azure Databricks" is the best available "lite", hands-on introduction to Spark.
Get the Chambers-Zaharia book as well.
M**A
Hard to follow with only text
The book has almost no screenshots, making the read annoyingly hard to follow. The examples should have both the commands and screenshots of the results. Even the Spark UI section has no screenshot, only text!
K**E
Excellent introduction
Normally I don't bother reviewing study books, but this one was really great: it introduces the key concepts and use cases, and allows you to be flexible in the languages or tools you choose to use with Databricks.
C**S
It’s a bit poor
I don’t think this book is well organised, nor does it have enough depth on the subject matter, especially on Spark and using Databricks. Read it in all of 2 days and threw it in the bin.
D**E
Excellent!!
The author has taken care to teach the hard stuff in a really simple manner. If you know the basics of SQL and are trying to make a career switch to Data Engineering or Analytics, this book will be the best start to learn Databricks and explore the immense world of data.