parse_date¶
parse_date(date_string)
Description¶
Parse date with automatic format detection (inspired from Sebabrata BigQuery tutorial)
Usage¶
Call or Deploy parse_date
?
Call parse_date
directly
The easiest way to use bigfunctions
parse_date
function is deployed in 39 public datasets for all of the 39 BigQuery regions.- It can be called by anyone. Just copy / paste examples below in your BigQuery console. It just works!
- (You need to use the dataset in the same region as your datasets otherwise you may have a function not found error)
Public BigFunctions Datasets
Region | Dataset |
---|---|
eu |
bigfunctions.eu |
us |
bigfunctions.us |
europe-west1 |
bigfunctions.europe_west1 |
asia-east1 |
bigfunctions.asia_east1 |
... | ... |
Deploy parse_date
in your project
Why deploy?
- You may prefer to deploy
parse_date
in your own project to build and manage your own catalog of functions. - This is particularly useful if you want to create private functions (for example calling your internal APIs).
- Get started by reading the framework page
Deployment
parse_date
function can be deployed with:
pip install bigfunctions
bigfun get parse_date
bigfun deploy parse_date
Examples¶
select bigfunctions.eu.parse_date("2021-01-20 ")
select bigfunctions.us.parse_date("2021-01-20 ")
select bigfunctions.europe_west1.parse_date("2021-01-20 ")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("2021-1-20 ")
select bigfunctions.us.parse_date("2021-1-20 ")
select bigfunctions.europe_west1.parse_date("2021-1-20 ")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("2021/01/20 ")
select bigfunctions.us.parse_date("2021/01/20 ")
select bigfunctions.europe_west1.parse_date("2021/01/20 ")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("2021/1/20 ")
select bigfunctions.us.parse_date("2021/1/20 ")
select bigfunctions.europe_west1.parse_date("2021/1/20 ")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("01/20/21")
select bigfunctions.us.parse_date("01/20/21")
select bigfunctions.europe_west1.parse_date("01/20/21")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("1/20/21")
select bigfunctions.us.parse_date("1/20/21")
select bigfunctions.europe_west1.parse_date("1/20/21")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
select bigfunctions.eu.parse_date("Wed Jan 20 21:47:00 2021")
select bigfunctions.us.parse_date("Wed Jan 20 21:47:00 2021")
select bigfunctions.europe_west1.parse_date("Wed Jan 20 21:47:00 2021")
+--------------------+
| cleaned_date |
+--------------------+
| date('2021-01-20') |
+--------------------+
Use cases¶
You have a table containing date strings in various formats, and you need to standardize them into a consistent DATE type in BigQuery for analysis. The parse_date
function can automatically detect and convert these different formats.
Scenario:
You're analyzing customer orders, and the order_date
column contains date values, but they were entered using different formats due to various data sources or input methods:
order_id | order_date |
---|---|
1 | 2023-10-26 |
2 | 10/27/2023 |
3 | Oct 28, 2023 |
4 | 28/10/23 |
5 | Fri Oct 29 08:00:00 2023 |
Query using parse_date
:
SELECT
order_id,
bigfunctions.us.parse_date(order_date) AS standardized_order_date
FROM
your_project.your_dataset.your_table;
(Replace bigfunctions.us
with the appropriate dataset for your region.)
Result:
order_id | standardized_order_date |
---|---|
1 | 2023-10-26 |
2 | 2023-10-27 |
3 | 2023-10-28 |
4 | 2023-10-28 |
5 | 2023-10-29 |
Now all your dates are in a standard DATE
format, allowing you to perform date-based calculations, filtering, and aggregations consistently without having to manually handle the different formats. For example, you could then easily query for all orders placed in October:
SELECT
*
FROM
your_project.your_dataset.your_table
WHERE
standardized_order_date BETWEEN '2023-10-01' AND '2023-10-31';
Need help or Found a bug?
Get help using parse_date
The community can help! Engage the conversation on Slack
We also provide professional suppport.
Report a bug about parse_date
If the function does not work as expected, please
- report a bug so that it can be improved.
- or open the discussion with the community on Slack.
We also provide professional suppport.