json_query
json_query(json_string, query)
Description
Extract data from json_string using the advanced JSON querying offered by JMESPath.
JMESPath Links:
- See the JMESPath Tutorial for exhaustive query possibilities
- GitHub of jmespath.js
Usage
Call or Deploy json_query?
Call json_query directly
The easiest way to use bigfunctions
- The json_query function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
- You need to use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy json_query in your project
Why deploy?
- You may prefer to deploy json_query in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions calling your internal APIs).
- Get started by reading the framework page
Deployment
The json_query function can be deployed with:
pip install bigfunctions
bigfun get json_query
bigfun deploy json_query
Examples
1. Basic Query
select bigfunctions.eu.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo")
select bigfunctions.us.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo")
select bigfunctions.europe_west1.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo")
+----------------------------------+
| result |
+----------------------------------+
| [{"first": "a"}, {"first": "c"}] |
+----------------------------------+
2. Getting array sub-items
select bigfunctions.eu.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].first")
select bigfunctions.us.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].first")
select bigfunctions.europe_west1.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].first")
+------------+
| result |
+------------+
| ['a', 'c'] |
+------------+
3. Slicing
select bigfunctions.eu.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[:1].first")
select bigfunctions.us.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[:1].first")
select bigfunctions.europe_west1.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[:1].first")
+--------+
| result |
+--------+
| ['a'] |
+--------+
4. Projecting
select bigfunctions.eu.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].{name: first}")
select bigfunctions.us.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].{name: first}")
select bigfunctions.europe_west1.json_query("{\"foo\": [{\"first\": \"a\"}, {\"first\": \"c\"}]}", "foo[*].{name: first}")
+--------------------------------+
| result |
+--------------------------------+
| [{"name": "a"}, {"name": "c"}] |
+--------------------------------+
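To make the four expressions above concrete, the following sketch reproduces what each one selects using only Python's standard library. It is not how json_query is implemented (the function relies on JMESPath); the variable names are illustrative, and the list comprehensions are hand-written equivalents that hold for this particular input.

```python
import json

# Same JSON string as in the examples above.
json_string = '{"foo": [{"first": "a"}, {"first": "c"}]}'
doc = json.loads(json_string)

# 1. "foo": select the value stored under the key "foo".
basic = doc["foo"]

# 2. "foo[*].first": project "first" out of every element of "foo".
sub_items = [item["first"] for item in doc["foo"]]

# 3. "foo[:1].first": keep only the first element, then project "first".
sliced = [item["first"] for item in doc["foo"][:1]]

# 4. "foo[*].{name: first}": build a new object per element,
#    exposing "first" under the key "name".
projected = [{"name": item["first"]} for item in doc["foo"]]

print(json.dumps(basic))      # [{"first": "a"}, {"first": "c"}]
print(json.dumps(sub_items))  # ["a", "c"]
print(json.dumps(sliced))     # ["a"]
print(json.dumps(projected))  # [{"name": "a"}, {"name": "c"}]
```

Comparing each printed value with the corresponding result table is a quick way to confirm that a query expresses the intended selection.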
Use cases
Let's imagine you have a BigQuery table storing user activity logs, where each row contains a JSON string representing various actions a user took within a session. The JSON structure might look like this:
{
"userId": "12345",
"sessionId": "abcde",
"actions": [
{"type": "pageview", "url": "/home"},
{"type": "click", "element": "button1"},
{"type": "form_submit", "data": {"name": "John", "email": "john@example.com"}},
{"type": "pageview", "url": "/products"},
{"type": "click", "element": "addtocart"}
]
}
Here are a few use cases for the json_query function with this data:
- Extracting all URLs visited during a session:
SELECT bigfunctions.YOUR_REGION.json_query(activity_json, 'actions[*].url') AS visited_urls
FROM your_table
WHERE userId = '12345' AND sessionId = 'abcde';
This query would return an array like ["/home", "/products"].
- Finding all "click" actions and the elements clicked:
SELECT bigfunctions.YOUR_REGION.json_query(activity_json, 'actions[?type==`click`].element') AS clicked_elements
FROM your_table
WHERE userId = '12345' AND sessionId = 'abcde';
This would return ["button1", "addtocart"].
- Getting the data submitted in a form:
SELECT bigfunctions.YOUR_REGION.json_query(activity_json, 'actions[?type==`form_submit`].data') AS form_data
FROM your_table
WHERE userId = '12345' AND sessionId = 'abcde';
This would return an array containing a single object: [{"name": "John", "email": "john@example.com"}]. You could further refine this to get specific fields within the data object; for example, the query actions[?type==`form_submit`].data.email would return ["john@example.com"].
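- Checking if a specific action type occurred:
SELECT bigfunctions.YOUR_REGION.json_query(activity_json, 'length(actions[?type==`purchase`]) > `0`') AS purchased
FROM your_table
WHERE userId = '12345' AND sessionId = 'abcde';
This query evaluates to true if a "purchase" action exists in the actions array and false otherwise, assuming every row has an actions array. The length check is needed because a JMESPath filter that matches nothing yields an empty array rather than null, so testing the filter result with IS NOT NULL would be true even when no purchase occurred.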
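The selections in the use cases above can also be checked locally before running them in BigQuery. The sketch below is a hand-written, standard-library Python equivalent of each JMESPath expression applied to the sample session JSON. It is not how json_query is implemented, and names such as visited_urls are illustrative. It assumes JMESPath's documented behavior that projections skip null results and that a filter matching nothing yields an empty array.

```python
import json

# Sample session document from the use case above.
activity_json = """
{
  "userId": "12345",
  "sessionId": "abcde",
  "actions": [
    {"type": "pageview", "url": "/home"},
    {"type": "click", "element": "button1"},
    {"type": "form_submit", "data": {"name": "John", "email": "john@example.com"}},
    {"type": "pageview", "url": "/products"},
    {"type": "click", "element": "addtocart"}
  ]
}
"""
actions = json.loads(activity_json)["actions"]

# actions[*].url : actions without a "url" are skipped, not returned as null.
visited_urls = [a["url"] for a in actions if a.get("url") is not None]

# actions[?type==`click`].element : filter first, then project "element".
clicked_elements = [
    a["element"]
    for a in actions
    if a.get("type") == "click" and a.get("element") is not None
]

# actions[?type==`form_submit`].data : filter first, then project "data".
form_data = [
    a["data"]
    for a in actions
    if a.get("type") == "form_submit" and a.get("data") is not None
]

# actions[?type==`purchase`] : no action matches, so the result is an
# empty list (not None); an existence check should test for emptiness.
purchases = [a for a in actions if a.get("type") == "purchase"]
purchased = len(purchases) > 0

print(visited_urls)      # ['/home', '/products']
print(clicked_elements)  # ['button1', 'addtocart']
print(form_data)         # [{'name': 'John', 'email': 'john@example.com'}]
print(purchased)         # False
```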
These examples demonstrate the flexibility of json_query for extracting and analyzing data from complex JSON structures within BigQuery. The function's use of JMESPath allows for complex filtering and projections, simplifying tasks that would otherwise require more complicated SQL or User-Defined Functions (UDFs).
Need help or Found a bug?
Get help using json_query
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about json_query
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
We also provide professional support.