list_public_datasets

Call or Deploy list_public_datasets?

✅ You can call this list_public_datasets bigfunction directly from your Google Cloud Project (no install required).

  • This list_public_datasets function is deployed in the bigfunctions GCP project, in 39 datasets covering all 39 BigQuery regions. Use the dataset located in the same region as your own datasets; otherwise you may get a "function not found" error.
  • The function is public, so anyone can call it. Just copy and paste the examples below into your BigQuery console. It just works!
  • You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example, functions that call your internal APIs). Discover the framework.

Public BigFunctions Datasets:

Region        Dataset
eu            bigfunctions.eu
us            bigfunctions.us
europe-west1  bigfunctions.europe_west1
asia-east1    bigfunctions.asia_east1
...           ...
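
The rows above suggest a mechanical naming rule: the dataset is the region name with hyphens replaced by underscores. Here is a minimal Python sketch of that rule, assuming it holds for every region (only the four rows shown confirm it). The helper names bigfunctions_dataset and list_public_datasets_query are hypothetical and not part of BigFunctions.

```python
def bigfunctions_dataset(region: str) -> str:
    """Return the public BigFunctions dataset for a BigQuery region.

    Assumption: the dataset name is the lower-cased region with hyphens
    replaced by underscores, as in the table rows above.
    """
    return "bigfunctions." + region.strip().lower().replace("-", "_")


def list_public_datasets_query(region: str) -> str:
    """Build the SQL statement that calls list_public_datasets in a region."""
    return f"select {bigfunctions_dataset(region)}.list_public_datasets()"


print(list_public_datasets_query("europe-west1"))
```

Building the statement from the region of your own datasets avoids the "function not found" error that a mismatched region can cause.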

Description

Signature

list_public_datasets()

Description

Returns the list of BigQuery public datasets.

Examples

select bigfunctions.eu.list_public_datasets()
select bigfunctions.us.list_public_datasets()
select bigfunctions.europe_west1.list_public_datasets()
+---------------------------------------------------+
| public_datasets                                   |
+---------------------------------------------------+
| [                                                 |
|   "bigquery-public-data.america_health_rankings", |
|   "bigquery-public-data.austin_311",              |
|   ...                                             |
| ]                                                 |
+---------------------------------------------------+
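
If the public_datasets value reaches client code as JSON text (an assumption, since the exact type depends on how the result is fetched), it can be decoded into a plain list. The sketch below uses only the Python standard library. parse_public_datasets is a hypothetical helper, and the sample contains just the two IDs visible in the output above.

```python
import json


def parse_public_datasets(raw: str) -> list[str]:
    """Decode the public_datasets cell into a list of dataset IDs.

    Assumption: `raw` is the cell value serialized as JSON text.
    """
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError("expected a JSON array of strings")
    return value


# Sample built from the two IDs shown in the output above; the real list is longer.
sample = '["bigquery-public-data.america_health_rankings", "bigquery-public-data.austin_311"]'
datasets = parse_public_datasets(sample)
print(len(datasets))
```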

Need help using list_public_datasets?

The community can help! Join the conversation on Slack.

For professional support, don't hesitate to chat with us.

Found a bug using list_public_datasets?

If the function does not work as expected, please:

  • report a bug so that it can be improved,
  • or open a discussion with the community on Slack.

For professional support, don't hesitate to chat with us.

Use cases

A use case for the list_public_datasets BigQuery function is to dynamically discover and explore the available public datasets in BigQuery. This can be useful for several scenarios:

  1. Data Discovery and Exploration: A data analyst or scientist might want to explore what public datasets are available for research or analysis without manually browsing the BigQuery UI or relying on outdated documentation. This function provides a quick and programmatic way to get a list of all public datasets.

  2. Automated Data Pipelines: In an automated data pipeline, you could use this function to check for the existence of a specific public dataset before attempting to query it. This adds robustness to your pipeline, handling cases where a dataset might be temporarily unavailable or renamed.

  3. Building a Data Catalog: You can use the output of this function to populate a custom data catalog or metadata store. This allows you to maintain an internal index of available public datasets with additional metadata, such as descriptions or tags.

  4. Interactive Data Exploration Tools: A web application or interactive notebook could use this function to present users with a list of available public datasets to choose from for analysis.

  5. Training and Education: In a training environment, this function can be used to quickly demonstrate the breadth of publicly available data in BigQuery, allowing students to explore different datasets.
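
Use cases 2 and 3 can be sketched in a few lines of Python once the list of dataset IDs is available client-side. This is an illustration, not part of BigFunctions: the function names and the record layout (id, project, dataset, tags) are hypothetical, and the IDs are assumed to have the form project.dataset with no dot in the project part, as in the sample output.

```python
def split_dataset_id(dataset_id: str) -> tuple[str, str]:
    """Split 'project.dataset' into (project, dataset) at the first dot."""
    project, sep, dataset = dataset_id.partition(".")
    if not sep or not project or not dataset:
        raise ValueError(f"not a 'project.dataset' ID: {dataset_id!r}")
    return project, dataset


def dataset_is_listed(dataset_id: str, public_datasets: list[str]) -> bool:
    """Use case 2: guard a pipeline step on the dataset being listed."""
    return dataset_id in set(public_datasets)


def build_catalog(public_datasets: list[str]) -> list[dict]:
    """Use case 3: one record per dataset, sorted by ID, with room for metadata."""
    records = []
    for dataset_id in sorted(public_datasets):
        project, dataset = split_dataset_id(dataset_id)
        records.append(
            {"id": dataset_id, "project": project, "dataset": dataset, "tags": []}
        )
    return records


# The two IDs shown in the example output; the real list is longer.
listed = [
    "bigquery-public-data.austin_311",
    "bigquery-public-data.america_health_rankings",
]

if dataset_is_listed("bigquery-public-data.austin_311", listed):
    print("dataset is listed, safe to query")
print(build_catalog(listed)[0]["dataset"])
```

A pipeline would call dataset_is_listed before querying and skip or alert when it returns False, while build_catalog gives a starting point that descriptions or tags can be attached to.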

Example Scenario:

Let's say a data analyst wants to build a dashboard showing trends in cryptocurrency prices. They know there are several public datasets related to cryptocurrency, but they're not sure of the exact names or what data is available. They can use the list_public_datasets function to get a list of all public datasets. Then, they can filter that list (perhaps using a regular expression) to find datasets related to cryptocurrency and explore their schemas to determine which datasets are suitable for their dashboard.

Code Example (Illustrative):

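SELECT dataset_id
FROM UNNEST(bigfunctions.us.list_public_datasets()) AS dataset_id
WHERE REGEXP_CONTAINS(dataset_id, r'crypto');

This query returns every public dataset whose ID contains "crypto", which lets the analyst quickly identify relevant datasets. The short pattern is deliberate: it also matches abbreviated names such as crypto_bitcoin, which r'cryptocurrency' would miss. The query assumes the function returns an ARRAY<STRING>; if it returns a JSON value instead, wrap the call in JSON_VALUE_ARRAY(...) before unnesting.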

Spread the word

BigFunctions is fully open-source. Help make it a success by spreading the word!
