sql_to_flatten_json_column¶
sql_to_flatten_json_column(data, fully_qualified_column)
Description¶
Generate the SQL to flatten a json column
into multiple flat columns.
Usage¶
Call or Deploy sql_to_flatten_json_column
?
Call sql_to_flatten_json_column
directly
The easiest way to use bigfunctions
sql_to_flatten_json_column
function is deployed in 39 public datasets for all of the 39 BigQuery regions.- It can be called by anyone. Just copy / paste examples below in your BigQuery console. It just works!
- (You need to use the dataset in the same region as your datasets otherwise you may have a function not found error)
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu |
bigfunctions.eu |
us |
bigfunctions.us |
europe-west1 |
bigfunctions.europe_west1 |
asia-east1 |
bigfunctions.asia_east1 |
... | ... |
Deploy sql_to_flatten_json_column
in your project
Why deploy?
- You may prefer to deploy
sql_to_flatten_json_column
in your own project to build and manage your own catalog of functions. - This is particularly useful if you want to create private functions (for example calling your internal APIs).
- Get started by reading the framework page
Deployment
sql_to_flatten_json_column
function can be deployed with:
pip install bigfunctions
bigfun get sql_to_flatten_json_column
bigfun deploy sql_to_flatten_json_column
Examples¶
1. Example with a temp table. Copy the code in your BigQuery console & run it!
create temp table sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
);
select bigfunctions.eu.sql_to_flatten_json_column(data, sample_data.data)
from sample_data
create temp table sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
);
select bigfunctions.us.sql_to_flatten_json_column(data, sample_data.data)
from sample_data
create temp table sample_data as (
select json '{"created_at": "2022-01-01", "user": {"name": "James"}}' as data
union all
select json '{"user": {"friends": ["Jack", "Peter"]}}' as data
);
select bigfunctions.europe_west1.sql_to_flatten_json_column(data, sample_data.data)
from sample_data
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| sql |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| select
*,
cast(json_value(`data`, '$.created_at') as date) as created_at,
json_extract(`data`, '$.user.friends') as user__friends,
cast(json_value(`data`, '$.user.name') as string) as user__name,
from `sample_data`
|
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2. Get SQL to flatten column data
of your table project.dataset.table
select bigfunctions.eu.sql_to_flatten_json_column(data, project.dataset.table.data)
select bigfunctions.us.sql_to_flatten_json_column(data, project.dataset.table.data)
select bigfunctions.europe_west1.sql_to_flatten_json_column(data, project.dataset.table.data)
+-------------------------------------------------------+
| sql |
+-------------------------------------------------------+
| select
*,
...
from `project.dataset.table`
|
+-------------------------------------------------------+
Use cases¶
You have a BigQuery table containing a JSON column called data
, and you want to analyze specific fields within these JSON objects. Instead of repeatedly using JSON_EXTRACT
or JSON_VALUE
in your queries, you can use sql_to_flatten_json_column
to generate a SQL query that extracts all the JSON fields into separate columns. This makes subsequent analysis easier and potentially more performant.
Use Case Example:
Let's say you have a table called website_events
with a JSON column named event_details
:
Table: website_events
Columns: event_id (INT64), event_timestamp (TIMESTAMP), event_details (STRING)
Sample Data:
1, 2024-10-26 10:00:00, '{"eventType": "pageview", "pageUrl": "/home", "userId": "123"}'
2, 2024-10-26 10:01:00, '{"eventType": "click", "elementId": "button1", "userId": "456"}'
You want to analyze the eventType
, pageUrl
(when available), and userId
for all events.
Steps:
- Generate the flattening SQL:
In the BigQuery console, run the following query, replacing <your-project-id>.<your-dataset>.website_events
with the fully qualified table name and choosing the appropriate BigFunctions dataset for your region (e.g., bigfunctions.us
for US, bigfunctions.eu
for EU, etc.):
SELECT bigfunctions.<your-region>.sql_to_flatten_json_column(event_details, '<your-project-id>.<your-dataset>.website_events.event_details');
- Execute the generated SQL: The output of the above query will be a new SQL query that flattens the JSON. It will look something like this:
SELECT
*,
CAST(JSON_VALUE(`event_details`, '$.eventType') AS STRING) AS eventType,
JSON_VALUE(`event_details`, '$.pageUrl') AS pageUrl,
CAST(JSON_VALUE(`event_details`, '$.userId') AS STRING) AS userId
FROM
`<your-project-id>.<your-dataset>.website_events`
- Copy and run the generated SQL: This final query will give you a table with individual columns for
eventType
,pageUrl
, anduserId
.
Benefits:
- Simplified Queries: Instead of constantly extracting JSON fields in every query, you have dedicated columns, making your queries cleaner and easier to read.
- Potential Performance Improvement: BigQuery can sometimes optimize queries against flattened data better than queries with repeated JSON extractions.
- Data Exploration: Flattening the JSON makes it easier to explore the data in the BigQuery UI and identify all the fields present in the JSON data.
This approach is especially useful when you need to analyze the JSON data repeatedly or when the JSON structure is complex and contains numerous nested fields.
Need help or Found a bug?
Get help using sql_to_flatten_json_column
The community can help! Engage the conversation on Slack
We also provide professional suppport.
Report a bug about sql_to_flatten_json_column
If the function does not work as expected, please
- report a bug so that it can be improved.
- or open the discussion with the community on Slack.
We also provide professional suppport.