The New & Improved Spark UI

Delight is a free, hosted, Spark UI alternative with new metrics and visualizations that will delight you!

Installation instructions

How It Works

Delight consists of an open-source agent running inside your Spark applications, streaming metrics to our backend. Just install our agent and you're ready to go!

1. Sign Up

Create an account on our website, then head to Settings to create your access token.

2. Install the open-source agent

Follow the installation instructions specific to your Spark platform on our GitHub page.

3. You’re all set

Your Spark applications will show up on our dashboard shortly after they complete.
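
As a rough illustration of step 2: Spark attaches listeners through its standard `spark.extraListeners` setting, so installing a listener-based agent on a plain spark-submit setup usually comes down to a few extra `--conf` flags. The sketch below only assembles such a command. The listener class name and the access-token property are hypothetical placeholders, not the agent's real identifiers; the exact values for your platform are in the installation instructions on our GitHub page.

```python
import shlex


def spark_submit_command(app_path, listener_class, extra_conf=None):
    """Build a spark-submit argument list that registers a SparkListener."""
    # spark.extraListeners is a standard Spark property: a comma-separated
    # list of SparkListener classes to attach to the application.
    conf = {"spark.extraListeners": listener_class}
    conf.update(extra_conf or {})

    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


cmd = spark_submit_command(
    "my_app.py",
    # Hypothetical class name, for illustration only.
    listener_class="com.example.AgentListener",
    # Hypothetical property name; the real one is in the install guide.
    extra_conf={"spark.example.accessToken": "<your-access-token>"},
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids quoting mistakes if you later pass it to a process launcher instead of a shell.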

Frequently Asked Questions

Is Delight really free?

Yes, Delight is entirely free of charge.

Is Delight open-source?

Data Mechanics Delight consists of two main components:

1. An agent that runs within your Spark applications (as a SparkListener) and streams metrics in real time to our backend. This agent is open source. This is a matter of trust: we want the Spark community to be able to audit the information collected by the agent.

2. A backend system responsible for collecting, storing, and serving the metrics that Delight needs, as well as for authentication. We don't have plans to open-source the backend yet.

What data does Delight collect? Is Delight secure?

The open-source agent running inside your Spark application collects Spark event logs. These logs contain non-sensitive metadata about your Spark application (for example, memory usage, CPU usage, and network traffic for each Spark task). Delight does not record any sensitive information (like the data that Spark operates on).

This data is encrypted with your access token and sent over HTTPS to the Data Mechanics control panel. Your access token guarantees that the metrics collected will only be visible to you, and to your colleagues from your Google organization if you signed up with your company's Google account.

This data is automatically deleted 30 days after its collection, and it is not shared with any third party.

What is the efficiency score visible in Delight?

The efficiency score is calculated as the sum of the durations of all the Spark tasks, divided by the sum of the core uptime of your Spark executors.

An efficiency score of 75% means that, on average, your Spark executor cores are running Spark tasks three quarters of the time. A low efficiency score means that you are wasting a lot of your compute resources. The Data Mechanics platform automatically tunes your Spark application configurations to make them more efficient!
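
To make the definition concrete, here is a minimal sketch of the calculation. It assumes that the core uptime of an executor is its uptime multiplied by its number of cores, and that every executor has the same core count. The function and parameter names are illustrative and are not taken from Delight itself.

```python
def efficiency_score(task_durations_s, executor_uptimes_s, cores_per_executor):
    """Total task time divided by total executor core uptime (both in seconds)."""
    total_task_time = sum(task_durations_s)
    # Assumption: each executor contributes uptime * cores of core uptime.
    total_core_uptime = sum(u * cores_per_executor for u in executor_uptimes_s)
    if total_core_uptime <= 0:
        raise ValueError("executor core uptime must be positive")
    return total_task_time / total_core_uptime


# Two executors with 4 cores each, both up for 100 s: 800 core-seconds.
# The tasks ran for 600 s in total, so cores were busy 600 / 800 of the time.
score = efficiency_score(
    task_durations_s=[150.0, 150.0, 200.0, 100.0],
    executor_uptimes_s=[100.0, 100.0],
    cores_per_executor=4,
)
print(f"{score:.0%}")  # prints 75%
```

In this example, a quarter of the core time paid for went unused, which is the kind of waste a low score points to.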

Can I run Delight over other platforms?

Yes, Delight works on top of any Spark platform, whether it's on premises or in the cloud, a commercial platform or your own open-source setup. All it needs is the ability to make outbound internet calls to stream metrics to our backend. Our GitHub page has instructions for installing Delight on top of Databricks, EMR, Dataproc, Spark-Submit, Spark-on-Kubernetes operator, Apache Livy, and more!

Is Delight accessible while the Spark app is running?

No, at this time Delight is only accessible about a minute after a Spark application has completed. Making Delight accessible in real-time for live applications is on our roadmap.

How can I report a bug or request a feature?

The timeline and roadmap of the project are on GitHub. Please submit a GitHub issue to report bugs or request improvements. We'd love to have your feedback!

Which Google account should I use to sign up? What if I don't have a Google account?

At this time, the only way to sign up is using a Google account. We will add more authentication mechanisms in the future.

You can sign up with a personal Google account or your company's Google account. We recommend the latter, as it means the Delight dashboard will be shared with your colleagues and give you a global view of your company's applications.

