get_webpage_metadata

Call or Deploy get_webpage_metadata?

✅ You can call this get_webpage_metadata bigfunction directly from your Google Cloud Project (no install required).

  • This get_webpage_metadata function is deployed in the bigfunctions GCP project in 39 datasets, one for each of the 39 BigQuery regions. Use the dataset in the same region as your own datasets; otherwise you may get a "function not found" error.
  • The function is public, so anyone can call it. Copy and paste the examples below into your BigQuery console. It just works!
  • You may prefer to deploy the BigFunction in your own project if you want to build and manage your own catalog of functions. This is particularly useful if you want to create private functions (for example, functions that call your internal APIs). Discover the framework

Public BigFunctions Datasets:

Region Dataset
eu bigfunctions.eu
us bigfunctions.us
europe-west1 bigfunctions.europe_west1
asia-east1 bigfunctions.asia_east1
... ...

Description

Signature

get_webpage_metadata(url)

Description

Get webpage metadata (using metadata_parser python library)

Examples

select bigfunctions.eu.get_webpage_metadata('https://apps.apple.com/fr/app/nickel-compte-pour-tous/id1119225763')
select bigfunctions.us.get_webpage_metadata('https://apps.apple.com/fr/app/nickel-compte-pour-tous/id1119225763')
select bigfunctions.europe_west1.get_webpage_metadata('https://apps.apple.com/fr/app/nickel-compte-pour-tous/id1119225763')
+----------+
| metadata |
+----------+
| {...}    |
+----------+

Need help using get_webpage_metadata?

The community can help! Join the conversation on Slack.

For professional support, don't hesitate to chat with us.

Found a bug using get_webpage_metadata?

If the function does not work as expected, please

  • report a bug so that it can be improved.
  • or open the discussion with the community on Slack.

For professional support, don't hesitate to chat with us.

Use cases

You could use this function in BigQuery to analyze a dataset of URLs and extract metadata from each URL. Here are a few concrete use cases:

  • SEO Analysis: Imagine you have a table of competitor websites. You could use get_webpage_metadata to extract title tags, descriptions, and other metadata to understand their SEO strategies and identify opportunities. You could analyze trends in keywords used in titles and descriptions.

  • Content Auditing: For a large website, you might have a table of all your pages. This function could help you audit your content by extracting metadata and looking for missing or inconsistent information, like missing title tags or descriptions that are too short.

  • Social Media Analysis: If you have a table of URLs shared on social media, you could use this function to understand the type of content being shared. Extracting titles and descriptions can give you insights into the topics and themes that resonate with your audience.

  • Data Enrichment: Suppose you have a table of news articles with only URLs. You can enrich this data by extracting metadata such as the publisher, publication date, and author, if available, using this function.

  • Classifying Web Pages: Based on the extracted metadata like title and description, you can train a machine learning model to categorize web pages into different topics or industries.

Here's a simplified example in BigQuery (assuming your dataset is in the us region and your table is named urls with a column named url):

SELECT
    url,
    bigfunctions.us.get_webpage_metadata(url) AS metadata
FROM
    `your_project.your_dataset.urls`;

This query returns each url alongside a new metadata column containing the extracted metadata for that URL; it does not modify the source table. You can then process this JSON metadata within BigQuery to extract specific fields. For instance, to extract the title:

SELECT
    url,
    JSON_EXTRACT_SCALAR(bigfunctions.us.get_webpage_metadata(url), '$.title') AS title
FROM
    `your_project.your_dataset.urls`;
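
Both queries above call the function at query time, so running them repeatedly is likely to request the same pages again. A minimal sketch of one way around this, assuming the same placeholder `your_project.your_dataset.urls` table, is to store the metadata once and run follow-up checks against the stored copy. The target table name `url_metadata` is hypothetical, and the audit query assumes the returned JSON exposes the top-level `title` key used in the example above.

```sql
-- Hypothetical target table: store the metadata once so that later
-- queries read the saved result instead of calling the function again.
CREATE OR REPLACE TABLE `your_project.your_dataset.url_metadata` AS
SELECT
    url,
    bigfunctions.us.get_webpage_metadata(url) AS metadata
FROM (
    -- Deduplicate so that each distinct URL is requested only once.
    SELECT DISTINCT url
    FROM `your_project.your_dataset.urls`
);

-- Content-audit style check: list pages whose title is missing or empty.
-- Assumes the metadata JSON has a top-level `title` key.
SELECT
    url
FROM
    `your_project.your_dataset.url_metadata`
WHERE
    COALESCE(JSON_EXTRACT_SCALAR(metadata, '$.title'), '') = '';
```

As in the earlier examples, replace `bigfunctions.us` with the public dataset that matches the region of your own dataset.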

Spread the word

BigFunctions is fully open-source. Help make it a success by spreading the word!
