To become more familiar with distributed compute technologies such as Hadoop and Spark, I decided a hands-on approach was best. I found that Raspberry Pi computers are powerful enough to run both, so I purchased the required hardware and got started.
Combining several walkthroughs, I set up Hadoop and Spark in a cluster configuration on four Raspberry Pi 4Bs running Ubuntu in headless mode.
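For a headless cluster like this, each node typically needs a static IP and a consistent hosts file so the Pis can reach one another by name for SSH and Hadoop configuration. A minimal sketch of what that might look like (the hostnames and addresses below are illustrative assumptions, not taken from my actual setup):

```
# /etc/hosts (identical on every node)
# Hostnames and IPs are hypothetical -- substitute your own static addresses
192.168.0.101   pi1    # master (NameNode / ResourceManager)
192.168.0.102   pi2    # worker
192.168.0.103   pi3    # worker
192.168.0.104   pi4    # worker
```

With name resolution in place, the remaining per-node steps (passwordless SSH from the master, Java, and the Hadoop/Spark config files) can reference `pi1` through `pi4` instead of raw IPs.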