Safe Browsing

The Safe Browsing dataset provides hash-based threat intelligence from the Google Safe Browsing API, enabling URL safety checks directly in BigQuery.

Overview

The data is sourced from the Google Safe Browsing API v5. Analysts can check URLs against Google’s global threat lists directly in BigQuery, with no API key required.

The dataset is refreshed regularly and covers four threat categories: malware, phishing, unwanted software, and potentially harmful applications.

Available at:

Public Dataset on BigQuery Analytics Hub

Provider: Google

Usage

After subscribing to the dataset, check URLs for threats using the bundled stored procedure:

CALL `safebrowsing.check_urls`([
  'https://example.com',
  'http://suspicious-site.example/'
]);

Results include the matched URL, hash prefix, list name, threat types, and a description of the threat category.

Schema

hash_lists

Column	Description
name	Threat list identifier, e.g. `mw-4b`
metadata.threat_types	Array of threat type strings, e.g. `["MALWARE"]`
metadata.description	Human-readable description of the threat list
ingested_at	Timestamp of last ingestion
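
As a rough illustration of how the nested `metadata` columns can be read, the query below lists each threat list with one row per threat type. It is a sketch, not a documented query: it assumes the linked dataset is named `safebrowsing` (as in the `CALL` example above) and that `metadata` is a STRUCT whose `threat_types` field is an ARRAY of strings.

```sql
-- Sketch only. Assumes the subscribed dataset is named `safebrowsing`
-- and that metadata.threat_types is an ARRAY<STRING>.
SELECT
  name,
  threat_type,                          -- one row per element of the array
  metadata.description AS description,
  ingested_at
FROM `safebrowsing.hash_lists`,
  UNNEST(metadata.threat_types) AS threat_type
ORDER BY name, threat_type;
```

If the dataset was linked under a different name at subscription time, substitute that name for `safebrowsing`.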

hash_entries

Column	Description
hash_prefix	First 4 bytes of the SHA-256 hash, as a hex string
list_name	Threat list this entry belongs to
hash_length_bytes	Length of the hash prefix (always 4 for this dataset)
version	List version token
partial_update	Whether this was a partial update
ingested_at	Timestamp of last ingestion
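
To make the `hash_prefix` column concrete, the sketch below derives a 4-byte SHA-256 prefix for a couple of URL expressions and looks it up in `hash_entries`. Several things here are assumptions rather than documented behavior: the dataset name `safebrowsing`, the join of `hash_entries.list_name` to `hash_lists.name`, lowercase hex storage of `hash_prefix` (hence the `LOWER` call), and the input strings, which stand in for already-canonicalized host/path expressions such as `example.com/`. The bundled `check_urls` procedure presumably handles canonicalization itself, so this is only meant to show what a prefix lookup might look like.

```sql
-- Sketch only. Assumes dataset name `safebrowsing`, lowercase hex prefixes,
-- and that inputs are already canonicalized host/path expressions.
WITH candidates AS (
  SELECT
    expression,
    -- SHA256 returns BYTES; keep the first 4 bytes and hex-encode them.
    TO_HEX(SUBSTR(SHA256(expression), 1, 4)) AS hash_prefix
  FROM UNNEST(['example.com/', 'suspicious-site.example/']) AS expression
)
SELECT
  c.expression,
  c.hash_prefix,
  e.list_name,
  l.metadata.threat_types AS threat_types
FROM candidates AS c
JOIN `safebrowsing.hash_entries` AS e
  ON LOWER(e.hash_prefix) = c.hash_prefix
LEFT JOIN `safebrowsing.hash_lists` AS l
  ON l.name = e.list_name;
```

Because a 4-byte prefix can collide with unrelated hashes, a row returned here indicates only a possible match, not a confirmed threat.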