Skip to content

bigfunctions > get_json

get_json

Call or Deploy get_json ?

✅ You can call this get_json bigfunction directly from your Google Cloud Project (no install required).

  • This get_json function is deployed in bigfunctions GCP project in 39 datasets for all of the 39 BigQuery regions. You need to use the dataset in the same region as your datasets (otherwise you may have a function not found error).
  • Function is public, so it can be called by anyone. Just copy / paste examples below in your BigQuery console. It just works!
  • You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions --> Read Getting Started. This is particularly useful if you want to create private functions (for example calling your internal APIs).
  • For any question or difficulties, please read Getting Started.
  • Found a bug? Please raise an issue here

Public BigFunctions Datasets are like:

Region Dataset
eu bigfunctions.eu
us bigfunctions.us
europe-west1 bigfunctions.europe_west1
asia-east1 bigfunctions.asia_east1
... ...

Description

Signature

get_json(url, headers)

Description

GET json data from url

Examples

select bigfunctions.eu.get_json('https://api.github.com/repos/unytics/bigfunctions', null)
select bigfunctions.us.get_json('https://api.github.com/repos/unytics/bigfunctions', null)
select bigfunctions.europe_west1.get_json('https://api.github.com/repos/unytics/bigfunctions', null)
+-------+
| data  |
+-------+
| {...} |
+-------+

Use cases

A use case for the get_json function is enriching data in BigQuery with information from an external API.

Scenario: You have a table in BigQuery containing information about GitHub repositories, including their names. You want to enrich this data with details like the number of stars, forks, and open issues for each repository. The GitHub API provides this information in JSON format.

Implementation:

  1. BigQuery Table: Let's assume your BigQuery table is named github_repos and has a column named repo_name containing the names of the repositories (e.g., "unytics/bigfunctions").

  2. BigQuery Query using get_json: You can use the following query to fetch data from the GitHub API and extract the desired information:

SELECT
    repo_name,
    JSON_EXTRACT_SCALAR(get_json(CONCAT('https://api.github.com/repos/', repo_name), NULL), '$.stargazers_count') AS stars,
    JSON_EXTRACT_SCALAR(get_json(CONCAT('https://api.github.com/repos/', repo_name), NULL), '$.forks_count') AS forks,
    JSON_EXTRACT_SCALAR(get_json(CONCAT('https://api.github.com/repos/', repo_name), NULL), '$.open_issues_count') AS open_issues
FROM
    `your-project.your-dataset.github_repos`;

Explanation:

  • CONCAT('https://api.github.com/repos/', repo_name) dynamically constructs the URL for each repository's API endpoint.
  • get_json(..., NULL) fetches the JSON data from the constructed URL. The second argument NULL indicates that no custom headers are needed for this request.
  • JSON_EXTRACT_SCALAR(..., '$.stargazers_count') extracts the value associated with the key stargazers_count from the JSON response. Similarly, we extract forks_count and open_issues_count.
  • Remember to replace your-project.your-dataset.github_repos with the actual path to your BigQuery table and select the correct regional dataset for bigfunctions (e.g., bigfunctions.us, bigfunctions.eu).

Result: This query will produce a new table with the original repo_name and the newly fetched stars, forks, and open_issues columns.

This example demonstrates how get_json can be used to integrate external API data directly into BigQuery for analysis and reporting, avoiding the need for intermediate data extraction and loading steps. You can adapt this pattern to interact with other APIs that provide JSON data.