Google BigQuery Advanced Settings
  • 01 Jul 2021

Warning:

We do not recommend changing advanced settings unless you are an experienced Panoply user.

For users with some experience working with their data in Panoply, the following settings can be customized for this data source.

  1. Destination Schema: The name of the target schema where the data is saved. The default schema for data warehouses built on Google BigQuery is panoply. The default schema for data warehouses built on Amazon Redshift is public. This cannot be changed once a source has been collected.
  2. Destination: Panoply selects the default destination for the tables where data is stored.
    • The default destination is bigquery_<table or view name>, where <table or view name> is a dynamic field. For example, for a table or view name customers, the default destination table is bigquery_customers.
    • To prefix all table names with your own prefix, use this syntax: prefix_<table or view name>, where prefix is your desired prefix name and <table or view name> is a variable representing the tables and views to be collected. For example, if you use the prefix gbdata and your table is named customers, the resulting table will be named gbdata_customers.
  3. Primary Key: The default is id. The primary key determines which field(s) to use as the deduplication key when ingesting data.
  4. Incremental Key: By default, Panoply fetches all of your BigQuery data on each run. If you only want to collect some of your data, enter a column name to use as your incremental key. The column must be logically incremental. Panoply keeps track of the maximum value reached during the previous run and starts there on the next run.
Warning:

If you set an incremental key, you can only collect one table per instance of BigQuery.
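
As a rough illustration of how the Primary Key and Incremental Key settings interact, the following Python sketch models a destination table as a dictionary keyed by the primary key. The function name collect, the sample columns, and the inclusive comparison against the previous maximum are assumptions made for this example; they are not Panoply's actual implementation.

```python
def collect(source_rows, destination, primary_key="id",
            incremental_key=None, last_max=None):
    """Upsert rows into `destination` and return the new maximum key value.

    Hypothetical model: `destination` is a dict keyed by the primary key,
    so a row with an existing key overwrites the old row (deduplication).
    """
    new_max = last_max
    for row in source_rows:
        if incremental_key is not None:
            value = row[incremental_key]
            # Assumption: the next run starts AT the previous maximum
            # (inclusive); rows strictly below it are skipped.
            if last_max is not None and value < last_max:
                continue
            if new_max is None or value > new_max:
                new_max = value
        destination[row[primary_key]] = dict(row)
    return new_max


source = [
    {"id": 1, "updated_at": 10, "name": "Ann"},
    {"id": 2, "updated_at": 20, "name": "Bob"},
]
warehouse = {}

# First run: no previous maximum, so every row is collected.
high_water = collect(source, warehouse, incremental_key="updated_at")

# Second run: only rows at or above the previous maximum are fetched.
source.append({"id": 3, "updated_at": 30, "name": "Cy"})
high_water = collect(source, warehouse,
                     incremental_key="updated_at", last_max=high_water)
```

Because rows are keyed by the primary key, re-reading the row that holds the previous maximum value overwrites it rather than creating a duplicate.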

  5. Exclude: The Exclude option allows you to exclude certain data, such as names, addresses, or other personally identifiable information. Enter the column names of the data to exclude.
  6. Parse String: If the data to be collected contains JSON, include the JSON text attributes to be parsed.
  7. Truncate: Truncate deletes all the current data stored in the destination tables, but not the tables themselves. Afterwards, Panoply recollects all the available data for this data source.
  8. Click Save Changes and then Collect.

  • The data source appears grayed out while the collection runs.
  • You may add additional data sources while this collection runs.
  • You can monitor this collection from the Jobs page or the Data Sources page.
  • After a successful collection, navigate to the Tables page to review the data results.
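
To make the Destination, Exclude, and Parse String settings above more concrete, here is a minimal Python sketch of their described effects on a single record. The helpers destination_table and prepare_row, the sample column names, and the sales prefix are hypothetical and exist only for this illustration; Panoply applies these settings internally and does not expose such functions.

```python
import json


def destination_table(source_name, prefix="bigquery"):
    """Build the destination name as prefix_<table or view name>."""
    return f"{prefix}_{source_name}"


def prepare_row(row, exclude=(), parse_string=()):
    """Drop excluded columns and parse JSON text in the listed columns."""
    out = {}
    for column, value in row.items():
        if column in exclude:
            continue  # Exclude: the column is never stored
        if column in parse_string and isinstance(value, str):
            try:
                value = json.loads(value)  # Parse String: JSON text -> object
            except json.JSONDecodeError:
                pass  # assumption: text that is not valid JSON is kept as-is
        out[column] = value
    return out


row = {"id": 7, "email": "a@example.com", "attrs": '{"plan": "pro"}'}

default_name = destination_table("customers")
custom_name = destination_table("customers", prefix="sales")
clean = prepare_row(row, exclude={"email"}, parse_string={"attrs"})
```

Here default_name follows the default bigquery_ prefix, custom_name shows a user-supplied prefix, and clean no longer contains the email column while attrs holds a parsed object instead of a string.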
