Polymarket Markets Data: A Success Story for LiveFetch Functions
Discover how Dune’s LiveFetch tool enhances dashboard analysis by integrating real-time offchain data.
Let’s say you are working on a beautiful dashboard and you've spent a lot of time on it. You are almost done: all your queries are optimized, all the graphs are in place, everything looks great, but you are missing something. It could be the volume traded from a centralized exchange, the ETH fee history directly from the QuickNode RPC, the list of tokens on CoinGecko, or stablecoin prices from Defillama. The thing is, it’s always a good idea to add external and offchain data to your analysis.
The question now is, how? You can upload a CSV with some data, but what if the data is live and constantly changing? That won’t work. You want to add external data to a query… it seems impossible, right?
No worries! The Dune team has already solved this. Say hi to the shiny new tool: LiveFetch.
LiveFetch
At its core, LiveFetch refers to a set of SQL functions that can send HTTP requests to a specified URL, effectively interacting with APIs and remote servers directly from a query. These functions allow you to fetch data in real-time from any public or private API, greatly expanding the kind of data available to Dune creators for analysis and dashboard creation.
SELECT *
FROM UNNEST(
    CAST(
        json_parse(http_get('https://api.coingecko.com/api/v3/coins/list'))
        AS array(json)
    )
) AS t(coin_data)
For Dune creators, the ability to pull in external data in real time opens up a world of possibilities. Previously, creators were restricted to analyzing the onchain data available in Dune’s databases. Now, using LiveFetch, they can augment their analysis with offchain data, enriching the value of their insights.
Find more information on how the LiveFetch function works on Dune’s documentation.
In this blog post, I’d like to share a success story from a real scenario using this new functionality.
Polymarket
Polymarket is a decentralized prediction market platform where users can bet on the outcomes of real-world events, such as politics, sports, or cryptocurrency trends. By leveraging blockchain technology, it ensures transparent and secure betting, allowing participants to trade shares based on their beliefs about the future. Users can profit from correct predictions, as the market prices adjust to reflect collective expectations. If you are looking for a full explanation about how Polymarket works, check this article.
Polymarket operates on the Polygon blockchain, ensuring low-cost and efficient transactions for users who place bets on various prediction markets. However, while the core trading and settlement processes occur onchain, some of the metadata — such as market descriptions, rules, and certain operational details — is stored offchain. This hybrid approach allows for flexibility and scalability, ensuring that the platform can handle large amounts of data without overwhelming the blockchain, while still maintaining transparency and decentralization for the critical aspects of the market.
One approach to get data from Polymarket is to check only onchain data related to markets. Tracking the OrderFilled event will give us the volume of each bet, and it can be used to create different charts, such as Daily and Monthly Volume or Daily and Monthly users. You can check the full dashboard here: https://dune.com/fergmolina/polymarket
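To make that concrete, a minimal sketch of a daily volume query over those fills might look like the following. The table name polymarket_polygon.CTFExchange_evt_OrderFilled and the column types are assumptions based on Dune’s usual decoded-event naming; the dashboard’s actual queries may differ.
-- Daily Polymarket volume from OrderFilled events (a sketch, not the
-- dashboard's exact query). Asset id 0 marks the USDC collateral side
-- of a fill, so that leg carries the USD value of the trade.
SELECT
    date_trunc('day', evt_block_time) AS day
    , SUM(
        CAST(CASE WHEN makerAssetId = 0 THEN makerAmountFilled
                  ELSE takerAmountFilled END AS double)
      ) / 1e6 AS volume_usd  -- USDC has 6 decimals
FROM polymarket_polygon.CTFExchange_evt_OrderFilled
GROUP BY 1
ORDER BY 1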
In an attempt to segment onchain data and identify which market the volume was coming from, we can create a query that maps the Market ID (stored both onchain and offchain) to the market slug. We can also add a flag indicating whether that market was specifically related to the US Elections.
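For illustration, such a manual mapping could be as simple as a hardcoded VALUES list; the IDs and slugs below are made-up placeholders, not the values from the original query.
-- Hypothetical hand-maintained mapping of market IDs to slugs,
-- with a flag for US Election related markets (placeholder values only)
WITH market_mapping (market_id, market_slug, is_us_election) AS (
    VALUES
        ('111111', 'presidential-election-winner-2024', true)
        , ('222222', 'will-btc-reach-100k-in-2024', false)
)
SELECT * FROM market_mapping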
This approach works well, as you can see in the graphs created to differentiate volume between election-related and non-election-related markets. However, it has limitations in scalability: it likely requires running an external script and manually updating the query, and it focuses only on specific markets, potentially overlooking new markets as they open.
After a Polymarket market ends, the resolution data necessary to determine the outcome is stored onchain via UMA’s Optimistic Oracle (OO) using ancillaryData. However, to enhance usability and provide more detailed information, Polymarket offers the Gamma Markets API. This API indexes onchain data and provides additional market metadata, such as categorization, question, answers, ids, timestamps, market rules, and other insights, in a more accessible format. The API contains enriched data for market resolution, allowing public users to access it for research, trading interfaces, and automated trading systems through a read-only REST interface. Full documentation is available on Polymarket’s documentation site.
In summary, the Gamma Markets API provides the most crucial element we need: the relationship between a Market and its Market ID. But that’s not all — we also get the relationship between the CLOB Token IDs (stored both onchain and offchain) and the possible outcomes for each market. By combining this offchain data from the Gamma API with onchain data, we can now determine not only the exact market that generated the volume, but also the specific outcome that was bet on.
Polymarket Markets Data
We are going to fork this dashboard and, using the Gamma API, create a new one with the ability to differentiate between various markets and outcomes.
https://dune.com/fergmolina/polymarket-markets-data
To create this new dashboard, we will follow these steps:
- Use the Gamma API to fetch offchain data
- Parse the data
- Join offchain data with onchain data to obtain our metrics
Use the Gamma API to fetch offchain data
First, we need to understand the different parameters required for the Gamma API:
- limit: The number of markets shown per call. The maximum is 100.
- volume_num_min: Many markets have very low volume, which would add a lot of noise to our analysis and could cause the query to fail due to the number of API calls (there are limits per execution; check Dune’s documentation). We set the minimum market volume to $50,000 USD.
- offset: Since Polymarket only shows 100 markets per call, we will need to make multiple calls and increment the offset by 100 to fetch a new batch of markets with each call.
An elegant solution I found for handling the different offsets was to create a CTE table called "numbers," which has a single column called "num" with rows that increment by 100. I then used only the rows from 1 to 2,501, which works out to 26 API calls.
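Here is a sketch of how such a CTE can be generated in DuneSQL (Trino) with the sequence() function instead of hardcoding every value; the published query may build it differently.
-- Offsets 1, 101, 201, ..., 2501: one row per Gamma API call
WITH numbers AS (
    SELECT num
    FROM UNNEST(sequence(1, 2501, 100)) AS t(num)
)
SELECT count(*) AS api_calls FROM numbers  -- 26
The fetch query below then simply reads FROM this CTE and issues one http_get per offset.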
SELECT num
, current_date as execution_date
, http_get(concat('https://gamma-api.polymarket.com/markets?limit=100&volume_num_min=50000&offset=', cast(num as varchar))) AS json_data
FROM numbers
WHERE num BETWEEN 1 AND 2501
At the end of the query, I’ve added two more API calls. It’s likely that a newly created market will not reach the minimum $50,000 volume right away, but it’s still useful to display it in the dashboard. By using the start_date_min parameter, markets created on the day of execution will be included in the query. If the market doesn’t reach the $50,000 minimum volume by the second day, it won’t appear again in the query result.
SELECT
0 as num
, current_date as execution_date
, http_get(concat('https://gamma-api.polymarket.com/markets?limit=100&offset=1&start_date_min=', CAST(current_date as varchar), '&start_date_max=', cast(current_date + interval '1' day as varchar))) AS json_data
UNION
SELECT
0 as num
, current_date as execution_date
, http_get(concat('https://gamma-api.polymarket.com/markets?limit=100&offset=100&start_date_min=', CAST(current_date as varchar), '&start_date_max=', cast(current_date + interval '1' day as varchar))) AS json_data
A great tip I can give you for queries using the LiveFetch function is that if your data doesn’t change too frequently, it’s a good idea to create a Materialized View. This approach allows you to set the update frequency, and if someone refreshes your dashboard, it won’t trigger any API calls. There’s always a risk that the API could be down when someone refreshes the dashboard, causing the data not to display. By using a Materialized View, we minimize that risk.
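For example, once the LiveFetch query is saved as a materialized view, downstream queries can read the cached table instead of calling the Gamma API on every run. The owner handle and view name below are placeholders; check Dune’s documentation for the exact naming convention.
-- Read the cached API response from a materialized view instead of
-- re-issuing http_get calls ("fergmolina" / "polymarket_gamma_markets"
-- are placeholder names)
SELECT *
FROM dune.fergmolina.result_polymarket_gamma_markets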
You can find the query for this step here: https://dune.com/queries/4001034
Parse the data
As you can see, we now have 28 rows that are lists of objects. We cannot use this response in a normal query; we need to parse the data to make it usable.
The query is structured into multiple steps, with the main focus on handling unparsed JSON data and transforming it into a readable table format.
Key Components (a condensed sketch combining them follows this list):
- UNNEST(TRY_CAST(json_parse(json_data) AS array(json))) AS json_each(value):
- This part of the query takes the json_data column, which contains a string of raw JSON data, and parses it into an array of JSON objects using json_parse(). The TRY_CAST() function attempts to cast this parsed data into an array of JSON objects, and UNNEST() is used to expand this array into individual rows. Each row represents one JSON object from the array.
- In simple terms, this step breaks down the JSON data into its individual components, making it easier to work with each object separately.
- JSON_EXTRACT_SCALAR(market, '$.question'):
- This function extracts a specific value from a JSON object (in this case, the market object) as a string. The $.question refers to the path of the JSON key being extracted (in this case, question).
- JSON_EXTRACT_SCALAR returns the value as a scalar, meaning it returns a single value (as opposed to a nested object or array), ensuring that the result is in a format that can be directly used in SQL queries.
- TRY_CAST(JSON_EXTRACT(market, '$.outcomes') AS varchar) and similar expressions:
- The JSON_EXTRACT() function extracts data from the JSON object (like a list of outcomes or token IDs). This extracted data is then processed by TRY_CAST() to convert it into a specific type (in this case, varchar), ensuring that the data is usable in SQL.
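Putting these pieces together, a condensed sketch of the parsing step could look like the one below. It fetches a single page inline to stay self-contained; the JSON paths (id, slug, question, outcomes, clobTokenIds) follow the Gamma API response, but the published query extracts more fields and reads from the output of step one instead.
-- Parse the raw Gamma API response into one row per market (a sketch)
WITH raw_markets AS (
    SELECT http_get('https://gamma-api.polymarket.com/markets?limit=100&volume_num_min=50000&offset=1') AS json_data
)
SELECT
    JSON_EXTRACT_SCALAR(market, '$.id') AS market_id
    , JSON_EXTRACT_SCALAR(market, '$.slug') AS market_slug
    , JSON_EXTRACT_SCALAR(market, '$.question') AS question
    -- outcomes and clobTokenIds arrive as JSON-encoded strings, so they are
    -- kept as varchar here and unpacked into individual outcomes later
    , TRY_CAST(JSON_EXTRACT(market, '$.outcomes') AS varchar) AS outcomes
    , TRY_CAST(JSON_EXTRACT(market, '$.clobTokenIds') AS varchar) AS clob_token_ids
FROM raw_markets
CROSS JOIN UNNEST(TRY_CAST(json_parse(json_data) AS array(json))) AS json_each(market)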
For this step, I’ve created a completely new query using the output from the first step, but you don’t have to. You can use the LiveFetch function and parse the response in the same query. You can find the query for this step here: https://dune.com/queries/3999457
Join offchain data with onchain data to obtain our metrics
We already have a query with all the offchain data parsed, and from the previous dashboard, we have all the necessary queries to fetch Polymarket onchain data.
Different queries were created for various metrics: daily and monthly volume, daily users, etc. However, they all share one thing: the connection between the output of the second step and the onchain queries. This connection is made by matching the clob_token_id from the Gamma API against the takerAssetId and makerAssetId fields of the onchain OrderFilled events. All of these IDs refer to the same thing: the outcome that was bet on. Using this information, we can also determine which market the bet was made in.
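A rough sketch of that join is shown below. The onchain table and column names (polymarket_polygon.CTFExchange_evt_OrderFilled, makerAssetId, takerAssetId) are assumptions based on Dune’s decoded-event naming, and gamma_markets stands in for the parsed step-two output with one row per outcome and its clob_token_id; the actual query linked next differs in the details.
-- Monthly USD volume per market and outcome (a sketch of the offchain/onchain join)
SELECT
    date_trunc('month', f.evt_block_time) AS month
    , g.question
    , g.outcome
    , SUM(
        CAST(CASE WHEN f.makerAssetId = 0 THEN f.makerAmountFilled
                  ELSE f.takerAmountFilled END AS double)
      ) / 1e6 AS volume_usd
FROM polymarket_polygon.CTFExchange_evt_OrderFilled f
JOIN gamma_markets g
    -- a fill touches an outcome token on either the maker or the taker side
    ON g.clob_token_id IN (CAST(f.makerAssetId AS varchar), CAST(f.takerAssetId AS varchar))
GROUP BY 1, 2, 3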
An example of this step is the query https://dune.com/queries/4003182/6739222, which is used to create a table of USD volume per month for each event.
Conclusion
Incorporating offchain data into onchain analysis can be the missing piece to truly enrich your dashboard and provide deeper insights. With Dune's LiveFetch functionality, creators now have the power to bring real-time, external data directly into their queries. Whether you're working with APIs like CoinGecko or Polymarket's Gamma Markets API, this tool allows you to enhance your analyses with live data that was previously inaccessible within Dune. Tools like Materialized Views offer additional optimization, reducing API call frequency and ensuring your dashboard's stability even if external services go down.
By combining onchain and offchain data, such as the market and outcome information from Gamma's API with onchain transaction data, you can create more detailed and dynamic dashboards. A great example is September 18, 2024, when the Fed held its meeting to cut rates for the first time since 2020. A lot of betting took place around this event, surpassing the US Elections in daily volume. With this dashboard that combines offchain and onchain data, creators were able to analyze the real-time behavior of users related to this event.
In the end, leveraging LiveFetch enables you to build dashboards that are more informative, scalable, and resilient, while giving you the flexibility to analyze both onchain and offchain events in one cohesive place.
About me
Hi there! 👋 I’m Fernando Molina, an Argentinian Data Engineer with years of experience across various industries including Telecommunications, Oil & Gas, Finance, and Crypto.
I’m passionate about crypto and blockchain data. I also write for my Spanish blog, "Bloque X", where I share tutorials (Python, Dune, SQL, and more) and dive into crypto, blockchain, and data analysis.
Previously, I was the Lead Data Engineer at The Defiant, working on The Terminal, a platform for crypto traders. I’ve built numerous dashboards and queries on Dune, covering projects like Polymarket, the Ethereum Foundation, DAOs (Exactly DAO, GoatDAO), micro-payment apps (Amplify), and memecoins analysis (MAGAIBA, Emilio).
Feel free to connect with me on Twitter/X or Telegram, and check out my other profiles here!
This article was created by Dune community Wizard, Fernando Molina. The views and opinions expressed in this article are solely those of the author(s) and do not necessarily reflect the official position of Dune.
If you would like to contribute to Dune blog, please contact alsie@dune.com.