precision_recall_auc
Call or deploy precision_recall_auc?
✅ You can call the precision_recall_auc bigfunction directly from your Google Cloud project (no install required).
- This precision_recall_auc function is deployed in the bigfunctions GCP project, in 39 datasets covering all 39 BigQuery regions. Use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
- The function is public, so anyone can call it. Copy and paste the examples below into your BigQuery console.
- You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example, functions that call your internal APIs). Discover the framework.
Public BigFunctions Datasets:
Region | Dataset |
---|---|
eu | bigfunctions.eu |
us | bigfunctions.us |
europe-west1 | bigfunctions.europe_west1 |
asia-east1 | bigfunctions.asia_east1 |
... | ... |
Description
Signature
precision_recall_auc(predictions)
Description
Returns the area under the precision-recall curve (a.k.a. AUC PR), computed with the trapezoidal rule, given a set of predicted scores and ground truth labels.
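To make the trapezoidal rule concrete, here is a minimal Python sketch of how such an area can be computed outside BigQuery. It is not the BigFunctions implementation: the name precision_recall_auc_sketch, the grouping of tied scores, and the starting point (recall 0, precision 1) are assumptions made for illustration, and the deployed function may handle these details differently.

```python
from typing import List, Tuple


def precision_recall_auc_sketch(predictions: List[Tuple[float, bool]]) -> float:
    """Illustrative trapezoidal area under the precision-recall curve.

    `predictions` is a list of (predicted_score, label) pairs.
    Hypothetical helper, not the BigFunctions source code.
    """
    total_pos = sum(1 for _, label in predictions if label)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank by score, highest first: each distinct score acts as a threshold.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)

    # Assumed convention: the curve starts at recall 0 with precision 1.
    points = [(0.0, 1.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Consume every prediction tied at this score before adding a point.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))

    # Trapezoidal rule over consecutive (recall, precision) points.
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2
    return area


if __name__ == "__main__":
    # Mirrors the "Good classifier" example: labels are fully separated by score.
    good = [(float(s), s > 500) for s in range(1, 1001)]
    print(precision_recall_auc_sketch(good))
```

On perfectly separated data like the "Good classifier" example, this sketch returns a value equal to 1.0 up to floating-point rounding, which is a quick way to sanity-check a result from the SQL function.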
Examples
1. Random classifier (labels are drawn with rand(), so the result is approximately 0.5 and varies between runs)
select bigfunctions.eu.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), rand() > 0.5)) from unnest(generate_array(1, 1000)) as predicted_score))
select bigfunctions.us.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), rand() > 0.5)) from unnest(generate_array(1, 1000)) as predicted_score))
select bigfunctions.europe_west1.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), rand() > 0.5)) from unnest(generate_array(1, 1000)) as predicted_score))
+--------+
| auc_pr |
+--------+
| 0.5 |
+--------+
2. Good classifier
select bigfunctions.eu.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), predicted_score > 500)) from unnest(generate_array(1, 1000)) as predicted_score))
select bigfunctions.us.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), predicted_score > 500)) from unnest(generate_array(1, 1000)) as predicted_score))
select bigfunctions.europe_west1.precision_recall_auc((select array_agg(struct(cast(predicted_score as float64), predicted_score > 500)) from unnest(generate_array(1, 1000)) as predicted_score))
+--------+
| auc_pr |
+--------+
| 1.0 |
+--------+
Need help using precision_recall_auc?
The community can help! Join the conversation on Slack.
For professional support, don't hesitate to chat with us.
Found a bug using precision_recall_auc?
If the function does not work as expected, please
- report a bug so that it can be improved,
- or open a discussion with the community on Slack.
For professional support, don't hesitate to chat with us.
Use cases
You're evaluating a machine learning model designed to predict customer churn for a telecommunications company. You have a dataset with customer features and a label indicating whether they churned (1) or not (0). Your model outputs a churn probability score for each customer.
Here's how you would use the precision_recall_auc function in BigQuery to evaluate your model:
SELECT bigfunctions.YOUR_REGION.precision_recall_auc(
  (
    SELECT
      ARRAY_AGG(
        STRUCT(
          predicted_churn_probability AS predicted_score,
          churned AS label
        )
      )
    FROM
      `your_project.your_dataset.customer_churn_predictions`
  )
) AS auc_pr;
Explanation:
- your_project.your_dataset.customer_churn_predictions: Replace this with the actual location of your BigQuery table containing the predictions. This table should have at least two columns:
  - predicted_churn_probability: The predicted probability of churn (a floating-point number between 0 and 1).
  - churned: The ground truth label (1 for churn, 0 for no churn).
- ARRAY_AGG(STRUCT(...)): This constructs an array of structs, where each struct contains the predicted score and the true label for a single customer. This is the required input format for the precision_recall_auc function.
- bigfunctions.YOUR_REGION.precision_recall_auc: Replace YOUR_REGION with the dataset name for the BigQuery region where your data resides (e.g., us, eu, us_central1). Dataset names use underscores where region names use hyphens, as in bigfunctions.europe_west1 for europe-west1. This function calculates the area under the precision-recall curve.
- AS auc_pr: This assigns the resulting AUC-PR value to a column named auc_pr.
Why use AUC-PR in this case?
Churn prediction is often an imbalanced classification problem, meaning there are significantly more non-churners than churners. AUC-PR is often more informative than AUC-ROC on imbalanced datasets because precision and recall both focus on the positive class (churners in this case) and ignore true negatives, so a large majority class cannot inflate the score. A higher AUC-PR indicates a model that is better at identifying churners, even if they are a small portion of the overall customer base.
By calculating the AUC-PR, you get a single number summarizing your model's performance, making it easier to compare different models or track the performance of a single model over time.
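The imbalance argument can be illustrated with a small, fully synthetic Python sketch. The data, the helper names roc_auc_sketch and pr_auc_sketch, and the curve's starting point (recall 0, precision 1) are all assumptions made for illustration; they are not part of BigFunctions. The dataset has 10 positives among 1000 rows, and the hypothetical model ranks 50 negatives above every positive.

```python
from typing import List, Tuple

Prediction = Tuple[float, bool]


def roc_auc_sketch(data: List[Prediction]) -> float:
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in data if y]
    neg = [s for s, y in data if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_auc_sketch(data: List[Prediction]) -> float:
    """Trapezoidal area under the precision-recall curve.

    Assumes all scores are distinct and at least one label is positive.
    """
    ranked = sorted(data, key=lambda x: x[0], reverse=True)
    total_pos = sum(1 for _, y in ranked if y)
    tp = 0
    area = 0.0
    prev_recall, prev_precision = 0.0, 1.0  # assumed starting point
    for rank, (_, y) in enumerate(ranked, start=1):
        tp += y
        recall, precision = tp / total_pos, tp / rank
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Synthetic data: scores 1..1000, positives only at scores 941..950,
# so the 50 highest-scored rows are all false alarms.
data = [(float(s), 941 <= s <= 950) for s in range(1, 1001)]

roc = roc_auc_sketch(data)
pr = pr_auc_sketch(data)
print(f"ROC AUC: {roc:.3f}")  # about 0.95
print(f"AUC-PR:  {pr:.3f}")   # about 0.09
```

In this constructed case the ROC AUC looks strong because the 940 correctly ranked negatives dominate it, while the AUC-PR stays low because the top of the ranking, where churn outreach would actually be targeted, contains only false positives.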
Spread the word
BigFunctions is fully open-source. Help make it a success by spreading the word!