Skip to content

bigfunctions > json_column_schema

json_column_schema

Call or Deploy json_column_schema ?

✅ You can call this json_column_schema bigfunction directly from your Google Cloud Project (no install required).

  • This json_column_schema function is deployed in bigfunctions GCP project in 39 datasets for all of the 39 BigQuery regions. You need to use the dataset in the same region as your datasets (otherwise you may have a function not found error).
  • Function is public, so it can be called by anyone. Just copy / paste examples below in your BigQuery console. It just works!
  • You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example calling your internal APIs). Discover the framework

Public BigFunctions Datasets:

Region Dataset
eu bigfunctions.eu
us bigfunctions.us
europe-west1 bigfunctions.europe_west1
asia-east1 bigfunctions.asia_east1
... ...

Description

Signature

json_column_schema(data)

Description

Returns the schema of a json column .

  • The schema of the json column is the union of schemas of the json objects of each cell.
  • schema is a json object like {path: type} with
  • path the path of the nested field
  • type among (string, numeric, bool, date, timestamp, array)

Examples

with sample_data as (

  select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
  union all
  select json '{"user": {"friends": ["Jack", "Peter"]}}' as data

)


select bigfunctions.eu.json_column_schema(data)
from sample_data
with sample_data as (

  select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
  union all
  select json '{"user": {"friends": ["Jack", "Peter"]}}' as data

)


select bigfunctions.us.json_column_schema(data)
from sample_data
with sample_data as (

  select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
  union all
  select json '{"user": {"friends": ["Jack", "Peter"]}}' as data

)


select bigfunctions.europe_west1.json_column_schema(data)
from sample_data
+------------------------------------------------------------------------+
| schema                                                                 |
+------------------------------------------------------------------------+
| {"created_at": "date", "user.name": "string", "user.friends": "array"} |
+------------------------------------------------------------------------+

Need help using json_column_schema?

The community can help! Engage the conversation on Slack

For professional suppport, don't hesitate to chat with us.

Found a bug using json_column_schema?

If the function does not work as expected, please

  • report a bug so that it can be improved.
  • or open the discussion with the community on Slack.

For professional suppport, don't hesitate to chat with us.

Use cases

You have a BigQuery table containing a JSON column, and you want to understand the structure and data types within those JSON objects. However, the JSON data isn't entirely consistent across all rows; some rows might have additional fields or different data types for the same field. json_column_schema helps you discover all the possible fields and their respective types present in the entire column.

Scenario: You're analyzing event data stored as JSON in BigQuery. Each event might have slightly different properties. For instance:

{"event_type": "page_view", "url": "/home", "user_id": 123}
{"event_type": "purchase", "item_id": 456, "user_id": 456, "value": 10.99}
{"event_type": "sign_up", "user_id": 789, "referral_code": "ABC"}

Use Case with json_column_schema:

WITH event_data AS (
    SELECT JSON '{"event_type": "page_view", "url": "/home", "user_id": 123}' AS event_json
    UNION ALL
    SELECT JSON '{"event_type": "purchase", "item_id": 456, "user_id": 456, "value": 10.99}' AS event_json
    UNION ALL
    SELECT JSON '{"event_type": "sign_up", "user_id": 789, "referral_code": "ABC"}' AS event_json
)
SELECT bigfunctions.us.json_column_schema(event_json) AS schema
FROM event_data;

Result:

{"event_type": "string", "url": "string", "user_id": "numeric", "item_id": "numeric", "value": "numeric", "referral_code": "string"}

Benefits:

  • Schema Discovery: You automatically identify all potential fields and their most general types within the JSON column without manual inspection.
  • Data Validation: You can use the generated schema to validate incoming data or build data quality checks.
  • Downstream Processing: Knowing the complete schema helps you design efficient queries and transformations for further analysis. For example, you could use this information to create a materialized view with extracted JSON fields.

This function is particularly useful for exploring and understanding semi-structured JSON data where the schema might not be strictly enforced or evolves over time. It allows you to programmatically handle variations in JSON structure.

Spread the word

BigFunctions is fully open-source. Help make it a success by spreading the word!

Share on Add a on