bigfunctions > json_column_schema
json_column_schema¶
Call or Deploy json_column_schema
?
✅ You can call this json_column_schema
bigfunction directly from your Google Cloud Project (no install required).
- This
json_column_schema
function is deployed inbigfunctions
GCP project in 39 datasets for all of the 39 BigQuery regions. You need to use the dataset in the same region as your datasets (otherwise you may have a function not found error). - Function is public, so it can be called by anyone. Just copy / paste examples below in your BigQuery console. It just works!
- You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example calling your internal APIs). Discover the framework
Public BigFunctions Datasets:
Region | Dataset |
---|---|
eu |
bigfunctions.eu |
us |
bigfunctions.us |
europe-west1 |
bigfunctions.europe_west1 |
asia-east1 |
bigfunctions.asia_east1 |
... | ... |
Description¶
Signature
json_column_schema(data)
Description
Returns the schema of a json column .
- The schema of the json column is the union of schemas of the json objects of each cell.
schema
is a json object like{path: type}
withpath
the path of the nested fieldtype
among (string
,numeric
,bool
,date
,timestamp
,array
)
Examples¶
with sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
)
select bigfunctions.eu.json_column_schema(data)
from sample_data
with sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
)
select bigfunctions.us.json_column_schema(data)
from sample_data
with sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
)
select bigfunctions.europe_west1.json_column_schema(data)
from sample_data
+------------------------------------------------------------------------+
| schema |
+------------------------------------------------------------------------+
| {"created_at": "date", "user.name": "string", "user.friends": "array"} |
+------------------------------------------------------------------------+
Need help using json_column_schema
?
The community can help! Engage the conversation on Slack
For professional suppport, don't hesitate to chat with us.
Found a bug using json_column_schema
?
If the function does not work as expected, please
- report a bug so that it can be improved.
- or open the discussion with the community on Slack.
For professional suppport, don't hesitate to chat with us.
Use cases¶
You have a BigQuery table containing a JSON column, and you want to understand the structure and data types within those JSON objects. However, the JSON data isn't entirely consistent across all rows; some rows might have additional fields or different data types for the same field. json_column_schema
helps you discover all the possible fields and their respective types present in the entire column.
Scenario: You're analyzing event data stored as JSON in BigQuery. Each event might have slightly different properties. For instance:
{"event_type": "page_view", "url": "/home", "user_id": 123}
{"event_type": "purchase", "item_id": 456, "user_id": 456, "value": 10.99}
{"event_type": "sign_up", "user_id": 789, "referral_code": "ABC"}
Use Case with json_column_schema
:
WITH event_data AS (
SELECT JSON '{"event_type": "page_view", "url": "/home", "user_id": 123}' AS event_json
UNION ALL
SELECT JSON '{"event_type": "purchase", "item_id": 456, "user_id": 456, "value": 10.99}' AS event_json
UNION ALL
SELECT JSON '{"event_type": "sign_up", "user_id": 789, "referral_code": "ABC"}' AS event_json
)
SELECT bigfunctions.us.json_column_schema(event_json) AS schema
FROM event_data;
Result:
{"event_type": "string", "url": "string", "user_id": "numeric", "item_id": "numeric", "value": "numeric", "referral_code": "string"}
Benefits:
- Schema Discovery: You automatically identify all potential fields and their most general types within the JSON column without manual inspection.
- Data Validation: You can use the generated schema to validate incoming data or build data quality checks.
- Downstream Processing: Knowing the complete schema helps you design efficient queries and transformations for further analysis. For example, you could use this information to create a materialized view with extracted JSON fields.
This function is particularly useful for exploring and understanding semi-structured JSON data where the schema might not be strictly enforced or evolves over time. It allows you to programmatically handle variations in JSON structure.
Spread the word¶
BigFunctions is fully open-source. Help make it a success by spreading the word!