get¶
get(url, headers)
Description¶
Sends an HTTP GET request to url, with optional headers, and returns the response.
Usage¶
Call or Deploy get?
Call get directly
The easiest way to use bigfunctions
- The get function is deployed in 39 public datasets, one for each of the 39 BigQuery regions.
- It can be called by anyone. Just copy and paste the examples below into your BigQuery console. It just works!
- Use the dataset located in the same region as your own datasets; otherwise you may get a "function not found" error.
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Deploy get in your project
Why deploy?
- You may prefer to deploy get in your own project to build and manage your own catalog of functions.
- This is particularly useful if you want to create private functions (for example, functions that call your internal APIs).
- Get started by reading the framework page.
Deployment
The get function can be deployed with:
pip install bigfunctions
bigfun get get
bigfun deploy get
Keep the secrets safe!
Do NOT write secrets in plain text in your SQL queries!
Otherwise, anyone with access to your BigQuery logs can read and use them.
Instead, generate an encrypted version that you can safely share:
- Enter a secret value below along with the emails of the users who are authorized to use it (separated by commas).
- Click on Encrypt Secret.
- The browser (no server is called) will generate an encrypted version and copy it to the clipboard.
- Paste the encrypted secret into the arguments of your function exactly as you would pass the plain-text version.
- The bigfunction will decrypt it and check that the calling user is authorized.
More on secret encryption
Technically, this encryption system uses the same mechanism that is used to transfer data securely over the internet: a public/private key pair.
The public key (contained in this web page) is used to encrypt a text. Only the corresponding private key can decrypt that text. The private key is stored in a secret manager and is only accessible to this function. Thus, this function (and this function only) can decrypt it.
Moreover, the function checks that its caller belongs to the list of authorized users that you gave at encryption time.
Thanks to this:
- Nobody but this function will be able to decrypt it.
- Nobody but authorized users can use the encrypted version in a function.
- No function but the function get can decrypt it.
Examples¶
1. Without headers
select bigfunctions.eu.get("https://unytics.io/bigfunctions/", null)
select bigfunctions.us.get("https://unytics.io/bigfunctions/", null)
select bigfunctions.europe_west1.get("https://unytics.io/bigfunctions/", null)
+------------------------+
| response |
+------------------------+
| <html>...</html> |
+------------------------+
2. With Content-Type = application/json headers
select bigfunctions.eu.get("https://api.github.com/repos/unytics/bigfunctions", json_object('Content-Type', 'application/json'))
select bigfunctions.us.get("https://api.github.com/repos/unytics/bigfunctions", json_object('Content-Type', 'application/json'))
select bigfunctions.europe_west1.get("https://api.github.com/repos/unytics/bigfunctions", json_object('Content-Type', 'application/json'))
+----------+
| response |
+----------+
| {...} |
+----------+
3. With encrypted bearer token
select bigfunctions.eu.get("https://api.github.com/repos/unytics/bigfunctions_terraform", json_object(
'Content-Type', 'application/json',
'Authorization', 'Bearer ENCRYPTED_SECRET(ioLZsCtEu5ZKu...)'
)
)
select bigfunctions.us.get("https://api.github.com/repos/unytics/bigfunctions_terraform", json_object(
'Content-Type', 'application/json',
'Authorization', 'Bearer ENCRYPTED_SECRET(ioLZsCtEu5ZKu...)'
)
)
select bigfunctions.europe_west1.get("https://api.github.com/repos/unytics/bigfunctions_terraform", json_object(
'Content-Type', 'application/json',
'Authorization', 'Bearer ENCRYPTED_SECRET(ioLZsCtEu5ZKu...)'
)
)
+----------+
| response |
+----------+
| {...} |
+----------+
Use cases¶
The get BigQuery function lets you make HTTP GET requests directly from your BigQuery SQL queries. Here are a few use cases:
1. Enriching Data:
Imagine you have a table of customer orders with country codes. You can use get to call a third-party geocoding API to get more detailed location information (like city and latitude/longitude) based on the country code, enriching your order data without leaving BigQuery.
SELECT
order_id,
bigfunctions.<your-region>.get(CONCAT('https://geocoding-api.example.com/?country=', country_code), JSON_OBJECT('Content-Type', 'application/json')) as geo_data
FROM
`your_project.your_dataset.orders`;
2. Monitoring External Services:
You can periodically call a health check endpoint of your services using get within a scheduled query. This lets you track the availability of your services over time directly from BigQuery and potentially trigger alerts based on the returned status.
SELECT
CURRENT_TIMESTAMP() as check_time,
bigfunctions.<your-region>.get('https://your-service.example.com/healthcheck', null) as health_status;
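A minimal sketch of how a status could be derived from the response for alerting. It assumes that get returns the response as a STRING (as the other examples on this page suggest), that the health endpoint returns a JSON body, and that a failed request shows up as a NULL or empty response rather than as a query error; none of this is confirmed here. The eu dataset and the no_response / not_json / ok labels are illustrative choices.

```sql
-- Hypothetical health check: classify the raw response before alerting on it.
with health_check as (
  select
    current_timestamp() as check_time,
    -- Swap bigfunctions.eu for the dataset of your own region.
    bigfunctions.eu.get('https://your-service.example.com/healthcheck', null) as response
)
select
  check_time,
  response,
  case
    -- Assumption: an unreachable service yields a NULL or empty response.
    when response is null or trim(response) = '' then 'no_response'
    -- SAFE.PARSE_JSON returns NULL instead of failing on invalid JSON.
    when safe.parse_json(response) is null then 'not_json'
    else 'ok'
  end as health_status
from health_check;
```

A scheduled query could append these rows to a table, and an alert could then fire on any health_status other than ok.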
3. Retrieving Current Data:
Suppose you need up-to-the-minute exchange rates for currency conversions. You could use get to fetch the latest rates from a financial API within your query, ensuring your conversions are always based on the most current data.
SELECT
transaction_amount,
JSON_VALUE(bigfunctions.<your-region>.get('https://financial-api.example.com/exchange_rates', JSON_OBJECT('Content-Type', 'application/json')), '$.USD_to_EUR') AS exchange_rate
FROM
`your_project.your_dataset.transactions`;
4. Simple Web Scraping (Caution):
While not its primary purpose, get can be used for basic web scraping tasks, for example retrieving the current price of a product from a publicly accessible website. However, be mindful of the website's terms of service and rate limiting policies. Dedicated web scraping tools are generally more robust and suitable for complex scraping tasks.
SELECT
REGEXP_EXTRACT(bigfunctions.<your-region>.get('https://example.com/product-page', null), '<price>(.*?)</price>') AS product_price;
Key Considerations:
- Rate Limiting: Be aware of potential rate limits imposed by the APIs or websites you are calling. Implement appropriate retry mechanisms and backoff strategies to avoid overloading external services.
- Error Handling: Handle potential errors gracefully. The get function might return error codes or empty responses if the external service is unavailable or there are network issues. Include error handling in your SQL to manage such scenarios.
- Data Volume and Cost: Making a large number of external requests can impact query performance and incur costs, especially if the responses are substantial. Consider caching responses where appropriate to reduce the number of calls.
- Security: Avoid exposing sensitive information (like API keys) directly in your SQL queries. Pass secrets in their encrypted form, as described in the "Keep the secrets safe!" section above.
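One way to reduce the number of calls, sketched here under assumptions, is to request each distinct URL once and join the stored responses back to the full table. The sketch reuses the hypothetical orders table and geocoding endpoint from the first use case; geo_cache is a made-up temporary table name, and the eu dataset stands in for your own region.

```sql
-- Step 1: call the API once per distinct country code and keep the responses.
-- A temporary table makes it explicit that the calls happen only here.
create temp table geo_cache as
select
  country_code,
  bigfunctions.eu.get(
    concat('https://geocoding-api.example.com/?country=', country_code),
    json_object('Content-Type', 'application/json')
  ) as geo_data
from (
  select distinct country_code
  from `your_project.your_dataset.orders`
  where country_code is not null
);

-- Step 2: enrich every order from the cached responses, with no further HTTP calls.
select
  o.order_id,
  c.geo_data
from `your_project.your_dataset.orders` as o
left join geo_cache as c using (country_code);
```

With this shape, the number of requests depends on the number of distinct country codes rather than on the number of orders, which also helps with rate limits.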
By carefully considering these factors, you can leverage the get BigQuery function to effectively integrate external data and services into your data analysis workflows.
Need help or Found a bug?
Get help using get
The community can help! Join the conversation on Slack.
We also provide professional support.
Report a bug about get
If the function does not work as expected, please
- report a bug so that it can be improved.
- or open the discussion with the community on Slack.
We also provide professional support.