Google BigQuery

Connect to Google BigQuery for cloud data warehouse analytics

Google BigQuery is a serverless, highly scalable enterprise data warehouse from Google Cloud. Scoop connects directly to BigQuery, enabling AI-powered analysis on your cloud data.

Connection Details

Setting            Value
Default Port       443
Connection Type    HTTPS
Authentication     Service Account

Prerequisites

Before connecting Scoop to BigQuery:

  1. Google Cloud project with BigQuery enabled
  2. Service account with BigQuery Data Viewer role
  3. Service account key (JSON format)

Create a Service Account

  1. Go to Google Cloud Console > IAM & Admin > Service Accounts
  2. Click Create Service Account
  3. Enter a name (e.g., scoop-bigquery-reader)
  4. Click Create and Continue
  5. Grant the role BigQuery Data Viewer
  6. Click Done

Generate a Key

  1. Click on your new service account
  2. Go to the Keys tab
  3. Click Add Key > Create new key
  4. Select JSON format
  5. Download and securely store the key file
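Before pasting the key into Scoop, it can help to sanity-check that the downloaded file is a complete service account key. A minimal sketch in Python (the field names checked are the standard ones Google writes into every service account JSON key):

```python
import json

# Standard fields present in every Google service account JSON key.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}

def check_key(raw: str) -> dict:
    """Parse a service account key and verify it looks complete."""
    key = json.loads(raw)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"key is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"expected type 'service_account', got {key['type']!r}")
    return key
```

If this raises, re-download the key rather than editing it by hand; a truncated paste is a common cause of authentication failures later.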

Grant Dataset Access

If you need access to specific datasets only:

  1. Go to BigQuery in Google Cloud Console
  2. Select your dataset
  3. Click Share > Permissions
  4. Add your service account email
  5. Assign the BigQuery Data Viewer role

Connect in Scoop

  1. Navigate to Datasets and click New Dataset
  2. Select Database as your source
  3. Choose BigQuery from the database type dropdown
  4. Enter your connection details:
    • Server Host: Your project ID (e.g., my-project-123456)
    • Database: Your dataset name
    • Username: Not required (use service account)
    • Password: Paste the contents of your service account JSON key
  5. Click Test Connection to verify
  6. Enter your SQL query and proceed

Connection URL Format

If using a JDBC URL directly:

jdbc:BQDriver:project_id;Dataset=dataset_name
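As a sketch, the URL above can be assembled from the same two values entered in Scoop (the project ID and the dataset name):

```python
def bigquery_jdbc_url(project_id: str, dataset: str) -> str:
    """Build a JDBC URL in the format shown above."""
    return f"jdbc:BQDriver:{project_id};Dataset={dataset}"

# e.g. bigquery_jdbc_url("my-project-123456", "sales")
#   -> "jdbc:BQDriver:my-project-123456;Dataset=sales"
```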

Query Syntax Notes

BigQuery uses GoogleSQL (standard SQL), which differs from other dialects in a few ways:

-- Quote fully qualified table references with backticks
-- (project IDs often contain hyphens)
SELECT * FROM `project.dataset.table_name`

-- Construct timestamps with the TIMESTAMP() function
SELECT TIMESTAMP('2024-01-01')

-- Use STRUCT for complex types
SELECT STRUCT(field1, field2) AS my_struct

Best Practices

  • Use partitioned tables to reduce query costs and improve performance
  • Use clustered tables for frequently filtered columns
  • Set maximum bytes billed to prevent unexpected costs
  • Use table wildcards cautiously as they can scan many tables
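The "maximum bytes billed" setting is an integer byte count; a query that would scan more than the cap fails before running, so nothing is billed. A minimal sketch for converting a human-readable budget into that value (if you query BigQuery through the official Python client library, the same number is passed as QueryJobConfig(maximum_bytes_billed=...)):

```python
GIB = 1024 ** 3  # the setting is specified in bytes

def bytes_cap(gib: float) -> int:
    """Convert a GiB budget into the byte value the setting expects."""
    return int(gib * GIB)

# e.g. bytes_cap(1) -> 1073741824
```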

Cost Considerations

BigQuery charges based on the amount of data scanned by your queries:

  • Use SELECT * sparingly; select only needed columns
  • Filter on partitioned columns when possible
  • Note that LIMIT does not reduce the data scanned or billed; use the free table preview to inspect data during development
  • Monitor query costs in the BigQuery console
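As a rough sketch of the on-demand pricing model (the $6.25 per TiB rate is an assumption based on published on-demand pricing; check the BigQuery pricing page for the current rate in your region):

```python
TIB = 1024 ** 4
PRICE_PER_TIB_USD = 6.25  # assumed on-demand rate; varies by region and over time

def estimated_cost_usd(bytes_scanned: int) -> float:
    """Estimate on-demand cost for a query that scans `bytes_scanned` bytes."""
    return bytes_scanned / TIB * PRICE_PER_TIB_USD
```

A query that scans a full TiB would cost about $6.25 at this rate, which is why selecting only needed columns and filtering on partitions matters.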

Troubleshooting

Authentication Failed

  • Verify the service account JSON key is complete and valid
  • Check that the service account has BigQuery Data Viewer role
  • Ensure the project ID is correct
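One quick check, sketched below: the project ID entered as the Server Host should match the project_id embedded in the JSON key itself:

```python
import json

def project_matches(key_json: str, entered_project_id: str) -> bool:
    """True if the Scoop Server Host matches the key's own project_id."""
    return json.loads(key_json).get("project_id") == entered_project_id
```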

Dataset Not Found

  • Verify the dataset name is spelled correctly
  • Check that the service account has access to the dataset
  • Ensure you're using the correct project ID

Permission Denied

  • Verify the service account has the BigQuery Data Viewer role
  • Check dataset-level permissions if using fine-grained access
  • Ensure the service account is in the same organization (for org-restricted datasets)