list_public_datasets
list_public_datasets()
Description
Returns the list of BigQuery public datasets (output column: public_datasets).
Usage
Call or Deploy list_public_datasets?
Call list_public_datasets directly
The easiest way to use bigfunctions
- The list_public_datasets function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
- (You need to use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.)
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
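The rows listed above suggest a naming pattern: the dataset is the region name with hyphens replaced by underscores, prefixed with bigfunctions. A minimal Python sketch of that mapping follows. The helper name bigfunctions_dataset is hypothetical, and the pattern (including the lowercasing) is only inferred from the four rows shown, so check the full table for your region before relying on it.

```python
def bigfunctions_dataset(region: str) -> str:
    """Guess the public BigFunctions dataset name for a BigQuery region.

    Assumption: the dataset is the lowercased region name with hyphens
    replaced by underscores, as in the table rows shown above.
    """
    return "bigfunctions." + region.strip().lower().replace("-", "_")


if __name__ == "__main__":
    # Regions taken from the table above.
    for region in ["eu", "us", "europe-west1", "asia-east1"]:
        print(region, "->", bigfunctions_dataset(region))
```

The returned name can then be used as the prefix of the function call, for example bigfunctions.europe_west1.list_public_datasets().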
Deploy list_public_datasets in your project
Why deploy?
- You may prefer to deploy list_public_datasets in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, calling your internal APIs).
- Get started by reading the framework page.
Deployment
The list_public_datasets function can be deployed with:
pip install bigfunctions
bigfun get list_public_datasets
bigfun deploy list_public_datasets
Examples
select bigfunctions.eu.list_public_datasets()
select bigfunctions.us.list_public_datasets()
select bigfunctions.europe_west1.list_public_datasets()
+---------------------------------------------------------------------------------------------------+
| public_datasets |
+---------------------------------------------------------------------------------------------------+
| [
"bigquery-public-data.america_health_rankings",
"bigquery-public-data.austin_311",
...
]
|
+---------------------------------------------------------------------------------------------------+
Use cases
A use case for the list_public_datasets BigQuery function is to dynamically discover and explore the public datasets available in BigQuery. This is useful in several scenarios:
- Data Discovery and Exploration: A data analyst or scientist might want to explore what public datasets are available for research or analysis without manually browsing the BigQuery UI or relying on outdated documentation. This function provides a quick, programmatic way to get a list of all public datasets.
- Automated Data Pipelines: In an automated data pipeline, you could use this function to check that a specific public dataset exists before attempting to query it. This makes the pipeline more robust when a dataset is temporarily unavailable or has been renamed.
- Building a Data Catalog: You can use the output of this function to populate a custom data catalog or metadata store. This lets you maintain an internal index of available public datasets with additional metadata, such as descriptions or tags.
- Interactive Data Exploration Tools: A web application or interactive notebook could use this function to present users with a list of available public datasets to choose from for analysis.
- Training and Education: In a training environment, this function can be used to quickly demonstrate the breadth of publicly available data in BigQuery, allowing students to explore different datasets.
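As a rough illustration of the pipeline and catalog scenarios above, the Python sketch below works on a list of dataset IDs that has already been fetched by running the function in BigQuery. The helper names (is_listed, to_catalog_entries) and the catalog fields are hypothetical, and the sample IDs are the two shown in the example output.

```python
from typing import Dict, Iterable, List


def is_listed(public_datasets: Iterable[str], dataset_id: str) -> bool:
    """Return True if dataset_id appears in the fetched list (exact match)."""
    return dataset_id in set(public_datasets)


def to_catalog_entries(public_datasets: Iterable[str]) -> List[Dict[str, str]]:
    """Split each 'project.dataset' ID into fields for a custom catalog.

    The description field is left empty so it can be filled in later.
    """
    entries = []
    for full_id in sorted(public_datasets):
        project, _, dataset = full_id.partition(".")
        entries.append(
            {"id": full_id, "project": project, "dataset": dataset, "description": ""}
        )
    return entries


if __name__ == "__main__":
    # Sample IDs copied from the example output; a real list is much longer.
    fetched = [
        "bigquery-public-data.austin_311",
        "bigquery-public-data.america_health_rankings",
    ]
    # Pipeline-style guard: only proceed if the dataset is in the list.
    if is_listed(fetched, "bigquery-public-data.austin_311"):
        print("austin_311 is listed")
    for entry in to_catalog_entries(fetched):
        print(entry)
```

An exact-match check is used here on purpose: a pipeline guard should not pass because a similarly named dataset happens to exist.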
Example Scenario:
Let's say a data analyst wants to build a dashboard showing trends in cryptocurrency prices. They know there are several public datasets related to cryptocurrency, but they're not sure of the exact names or what data is available. They can use the list_public_datasets function to get a list of all public datasets. Then, they can filter that list (perhaps using a regular expression) to find datasets related to cryptocurrency and explore their schemas to determine which datasets are suitable for their dashboard.
Code Example (Illustrative):
SELECT dataset_id
FROM UNNEST(bigfunctions.us.list_public_datasets()) AS dataset_id
WHERE REGEXP_CONTAINS(dataset_id, r'cryptocurrency');
This query would return all public datasets containing the term "cryptocurrency" in their ID, allowing the analyst to quickly identify relevant datasets.
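The same filter can also be applied client-side once the list has been fetched. The Python sketch below is illustrative only: the function name filter_datasets is hypothetical, and the sample list holds the two IDs from the example output plus a made-up placeholder ID that exists only to show a match.

```python
import re
from typing import Iterable, List


def filter_datasets(public_datasets: Iterable[str], pattern: str) -> List[str]:
    """Keep the IDs in which the regex matches anywhere in the string.

    re.search looks for a partial match, which mirrors how REGEXP_CONTAINS
    is used in the query above. Note that Python's re syntax and BigQuery's
    RE2 syntax differ for some advanced patterns.
    """
    regex = re.compile(pattern)
    return [dataset_id for dataset_id in public_datasets if regex.search(dataset_id)]


if __name__ == "__main__":
    fetched = [
        "bigquery-public-data.america_health_rankings",
        "bigquery-public-data.austin_311",
        "example-project.cryptocurrency_demo",  # hypothetical ID, for illustration only
    ]
    print(filter_datasets(fetched, r"cryptocurrency"))
```

If the dataset names use a shorter form of the term, a broader pattern will be needed to catch them.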
Need help or Found a bug?
Get help using list_public_datasets
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about list_public_datasets
If the function does not work as expected, please
- report a bug so that it can be improved.
- or open the discussion with the community on Slack.
We also provide professional support.