Online Tools
Databricks is an analytics platform that makes it easier to analyse big data by providing a unified framework for building data pipelines on Apache Spark. Notebooks support Python, Scala, SQL or R as their primary language. You can also integrate HTML, other libraries and graphics packages such as ggplot, d3 and matplotlib into Databricks.
There are extensive online resources covering these topics, including courses, online communities and training videos. We have listed our favourite online tutorials and websites below to help you get started with coding and gain a broader understanding of Databricks. This is a fast-changing environment, so keep an eye out for new uploads and materials.
Links to Tutorials for Beginners
(Note that these are generic examples. Databricks implements its own versions of these languages, which may differ from what is covered in these tutorials.)
Structured Query Language (SQL)
This YouTube link is aimed at beginner and intermediate researchers looking to learn the basics of Structured Query Language (SQL).
R Programming
This YouTube link is an online course offering an introduction to R Programming. It is ideal for beginners or those looking for a refresher.
Python
This is SIRCA’s favourite Python self-learning site. Use it as an educational tool to build on your current coding knowledge, practise, and develop further Python skills.
SIRCA’s Favourite Databricks Websites & Videos
Writing Scala in Spark Notebook
The Spark Notebook provides data scientists with an interactive web-based editor that combines Scala code, SQL queries, Markup and JavaScript in a collaborative manner to explore, analyse and learn from massive data sets.
Working with Notebooks
Notebooks are one interface for interacting with Databricks. Watch the video linked above to learn about the benefits of Notebooks and pick up useful tips on navigating them.
Visualisation
Databricks can create a number of charts with a single click and also supports popular third-party libraries such as ggplot, d3 and matplotlib, so that you can create your own custom visualisations. Learn how to visualise your data on the Databricks platform by watching the video linked above.
Data Exploration on Databricks
Learn how to go from data source to visualisation in a few easy steps. This video shows how to take semi-structured logs, extract and transform them, and then analyse and visualise the data using Spark SQL, so that you can quickly understand your data.
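The extract, transform and analyse steps can be sketched in plain Python. This is only an illustration and is not taken from the video: it uses the standard-library sqlite3 module as a stand-in for Spark SQL, and the log format, the table name logs, the column names and the helper functions are all hypothetical.

```python
import re
import sqlite3

# Hypothetical semi-structured log lines (web-server style), for illustration only.
LOG_LINES = [
    '10.0.0.1 - - [01/Jan/2020:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Jan/2020:10:00:05 +0000] "GET /missing HTTP/1.1" 404 512',
    '10.0.0.1 - - [01/Jan/2020:10:00:09 +0000] "POST /login HTTP/1.1" 200 256',
    'this line is malformed and will be skipped',
]

# Extract: a regular expression that pulls named fields out of each line.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)$'
)


def parse_line(line):
    """Return (host, method, path, status, size), or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    # Transform: convert the numeric fields from text to integers.
    return (
        match["host"],
        match["method"],
        match["path"],
        int(match["status"]),
        int(match["size"]),
    )


def status_counts(lines):
    """Load parsed lines into an in-memory SQL table and count hits per status code."""
    rows = [row for row in map(parse_line, lines) if row is not None]
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE logs "
            "(host TEXT, method TEXT, path TEXT, status INTEGER, size INTEGER)"
        )
        conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?, ?)", rows)
        # Analyse: an ordinary SQL aggregation over the structured rows.
        query = (
            "SELECT status, COUNT(*) AS hits "
            "FROM logs GROUP BY status ORDER BY status"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for status, hits in status_counts(LOG_LINES):
        print(status, hits)
```

In a Databricks notebook the same idea would typically be expressed against a Spark DataFrame or table rather than SQLite, and the aggregated result could then be charted.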
Apache Spark
The link above takes you to a gentle introduction to Apache Spark on Databricks. It covers Spark terminology, the environments in which Spark is used, how to access and use the collaborative tools, and how to write code in it.
Databricks YouTube Channel
The Databricks YouTube Channel offers a large collection of training and related videos that will give you a thorough understanding of Databricks and the products and services it offers. For the latest developments, we recommend exploring the channel.
FAQs
Databricks FAQs
For frequently asked questions that focus solely on the technology platform, click here.