Introducing Dune SQL
DuneSQL is our new engine for querying Dune’s V2 data in a faster, more powerful and user-friendly way.
TLDR:
- We are investing heavily in DuneSQL to make the fastest, most powerful and convenient crypto data querying experience on the planet.
- DuneSQL will eventually be the only query engine on Dune.
- We have shipped a tool to help Wizards migrate their queries to DuneSQL. We’re committed to helping make the transition to DuneSQL as smooth as possible for Wizards.
- We acknowledge that we have not communicated enough and well enough to the community about the new query engine. We are sorry about that and promise to do better from now on.
We first announced Dune SQL at DuneCon in Berlin, September 2022. Dune SQL is a query engine and SQL flavor tailored to query experience on Dune. After having launched it in alpha in December, it quickly became the most popular option for querying on Dune, and this week we’re incredibly proud to promote Dune SQL into the default querying experience on Dune. A proper announcement will follow, but I wanted to talk through the changes and the direction Dune is taking, to give you all context.
Writing and executing SQL is at the core of the user experience at Dune.We have a deep understanding of blockchain datasets and the queries the community runs on top. Instead of putting crypto data on off the shelf enterprise query providers we can provide a better experience by investing in our own query engine. Running and modifying not just the infrastructure, but also the software that runs queries means that we can innovate and deliver value faster.
Dune SQL is mostly based on Trino, an open source query engine, and accesses exactly the same data as the Spark SQL engine. Ergonomically, querying in Dune SQL is much like querying in Spark SQL, the main difference being that some functions have different names! More details in our docs.
Dune SQL features many Dune specific enhancements and improvements. Since launching in Alpha in December we’ve enabled
- Using other queries as views
- Querying byte arrays directly using 0x-encoding: `WHERE from=0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045` (notice the lack of quoting or lowercasing)
- Full wei-level precision calculations via UINT256 and INT256 data types.
- ~30% speedup over the alpha launch configuration through better data types.
We’re also incredibly excited about our immediate roadmap for Dune SQL:
- Using other queries as materialized views to unlock incredible speedups and new use cases
- Enabling the wizard community to contribute user-defined functions to Dune SQL.
- Query your own data sources directly from Dune.
- Upload data to Dune SQL
- Ability to select performance level for your query
Dune SQL has been incredibly well received by our community of wizards, and we’re excited to work with everyone to make it even better. However, rolling out Spark SQL and now Dune SQL has admittedly caused some confusion and frustration. We’ve not done a great job at communicating the long term direction, and that’s on me. Expect more and clearer communication from us going forward.
By now, you are probably aware that our Postgres based v1 platform is almost fully sunset, with only the Ethereum dataset still available. This dataset will sunset by July 2023. I’ve included way more context for why Postgres is being sunset further down in the post, if you are interested in the details.
Today, you have two choices when it comes to querying the growing list of datasets on our v2 data platform: Spark SQL and Dune SQL. Going forward, all of Dune’s efforts will be put into Dune SQL, and we are also aiming to sunset Spark SQL by July. This is because Dune SQL is the fastest and most ergonomic engine that we believe will completely change how you work with blockchain data.
Moving to Dune SQL
While there are syntax changes needed to make Postgres and Spark SQL queries run on Dune SQL, you will find that Dune SQL is faster, and has better ergonomics. In Dune SQL, you no longer need to copy common logic around as CTEs, as you can refer to common logic in other queries using our Queries as Views functionality.
That said, migrating queries from one engine to another while learning a slightly new syntax can be painful, and we know this. This is why we have built a tool to automatically help make query changes for you. This is now available in the query editor with some limited functionality, and we’ll iterate on making it better. The tool will be open sourced, so that wizards can help contribute fixes that in turn help other wizards.
An automated tool will likely not be able to fully translate all queries, but we aim to take away the majority of the complexities you face when moving over to Dune SQL. Our team will also be standing by to help users who are having issues moving over.
Here’s a table to summarize the status
* we are aiming to make Spellbook Dune SQL native before Spark SQL is sunset.
V1
- Dune V1 is in the middle of being phased out as Postgres is not a suitable system for the scale and complexity of our data and pipelines
- Support for everything besides Ethereum on V1 has been deprecated already and only supports editing queries. Many of these datasets are so big that our v1 data platform is struggling to fully support them.
- We’re working with teams to make sure that they can use V2, and we’re very happy to have a dialogue with you.
Dune Engine V2
- Spark SQL will be around for another 3-5 months. There’s no immediate action needed from your side if you are using Dune Engine V2 (Spark SQL), but we highly recommend not creating further queries on it.
- There will be prompts for you to migrate to Dune SQL. Again, Dune SQL has the same data, minus some spells that we are actively working to make available as soon as possible.
- Dune SQL is now the default experience. We are devoting a lot of resources to improve it.
FAQ
Q: I am using V1, but support for my dataset is disappearing:
- Start using Dune SQL! If it’s not possible for you to use Dune SQL, reach out to a Dune team member to tell us why
I am using v2 (Spark SQL), and I am confused:
- You can still use Spark SQL for 3-5 months, but Dune SQL is the preferred query engine and eventually you will have to use it.
What are the main differences between Spark SQL and Dune SQL:
- Spark SQL on dune.com is not ANSI compliant, which in effect means that it implicitly converts a lot of types for you. When using Dune SQL you may have to explicitly cast types for them to be comparable.
- Spark SQL and Trino SQL (which is what Dune SQL is built on) supports a lot of the same functionality, but their functions and keywords are a little bit different. E.g. `array_contains()` in Spark is just `contains()` in Trino.
- In our experience, the changes required to migrate an existing Spark query to Trino SQL should be relatively straightforward. And because Trino is generally faster for many of Dune’s queries, it should be faster to iterate.
- Some types are different on Dune SQL than on Spark SQL. See here
Which engine and SQL dialect should I use for Spellbook contributions?
- For the time being, Spellbook will continue running on Spark SQL. Our Spellbook CI/CD pipeline will validate all new spells or changes are compatible with Dune SQL
- We expect to migrate Spellbook to run Dune SQL in the next 3 months.
Join the SQL revolution
As you come across issues, or areas of improvement, shoot us an email at dunesql-feedback@dune.com and we will gladly take a look.