- 30 May 2023
Google Drive Advanced Settings
- Updated on 30 May 2023
We do not recommend changing advanced settings unless you are an experienced Panoply user.
For users who have some experience working with their data in Panoply, there are a number of items that can be customized for this data source.
Files Encoding: The following encodings are available:
- UTF-8
- ISO-8859-1
- Windows-1251
- Windows-1252
- Windows-1254
- Other
When Other is selected, the data source automatically detects each file's encoding during extraction. This is useful when you do not know the file encoding or when you are selecting multiple files that use different encodings.
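To give a feel for what automatic encoding detection involves, here is a minimal sketch that tries a few of the listed encodings in order and keeps the first one that decodes cleanly. This is not Panoply's detection logic, which this page does not describe; the function name `guess_encoding` and the candidate order are assumptions made for illustration.

```python
# Candidate encodings, tried in order. ISO-8859-1 goes last because it
# accepts every byte sequence and would otherwise mask the others.
CANDIDATES = ["utf-8", "windows-1252", "iso-8859-1"]


def guess_encoding(raw: bytes, candidates=CANDIDATES) -> str:
    """Return the first candidate encoding that decodes raw without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("none of the candidate encodings fit")
```

A trial-decode approach like this cannot tell apart encodings that both accept the same bytes, which is one reason to pick an explicit encoding when you know it.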
Destination Schema: This is the name of the target schema to save the data. The default schema for data warehouses built on Google BigQuery is panoply. The default schema for data warehouses built on Amazon Redshift is public. This cannot be changed once a source has been collected.
Destination Prefix: This is the prefix that Panoply will use in the name of the tables included in the collection.
- The default prefix for Google Drive is googledrive.
- The naming convention is googledrive_<file>, where file is the name of the file collected, such as googledrive_metrics.
- For spreadsheet files, such as .xlsx and Google Sheets, Panoply will also append the sheet name to the table name, such as googledrive_metrics_january.
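The naming convention above can be sketched as a small helper. This is only an illustration: `table_name` is a hypothetical function, and it assumes the file and sheet names are already in their final form (any lowercasing or character replacement that Panoply applies is not covered on this page).

```python
from typing import Optional


def table_name(file_name: str, sheet: Optional[str] = None,
               prefix: str = "googledrive") -> str:
    """Join prefix, file name, and (for spreadsheets) sheet name with underscores."""
    parts = [prefix, file_name]
    if sheet is not None:
        # Spreadsheet files get the sheet name appended to the table name.
        parts.append(sheet)
    return "_".join(parts)


print(table_name("metrics"))             # googledrive_metrics
print(table_name("metrics", "january"))  # googledrive_metrics_january
```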
Primary Key: This setting defines the column that contains the table's primary key, which Panoply keeps in an id column. If this option is left blank and the sheet does not contain an id column, Panoply will insert an id, formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2.

| Google Drive id column | Enter a primary key | Outcome |
| --- | --- | --- |
| yes | no | Panoply will automatically select the id column and use it as the primary key. |
| yes | yes | Not recommended. Panoply will use the id column but will overwrite the original source values. If you want Panoply to use your database table's id column, do not enter a value into the Primary Key field. |
| no | no | Panoply creates an id column formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2. |
| no | yes | Panoply creates a hashed id column using the primary key values entered, while retaining the source columns. |

Any user-entered primary key will be used across all the Google Drive files selected.
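The four primary key outcomes can be read as a short decision procedure. The sketch below is an approximation under stated assumptions: `resolve_id` is a hypothetical helper, SHA-256 and the `|` separator stand in for whatever hashing Panoply actually uses, and the "id column present, primary key entered" case is assumed to be hashed in the same way as the "no id column" case.

```python
import hashlib
import uuid
from typing import Sequence


def resolve_id(row: dict, primary_key: Sequence[str]) -> str:
    """Return the id value for one row, following the outcomes described above."""
    if primary_key:
        # A primary key was entered: derive id from those column values.
        # If the row already had an id, its original value is overwritten.
        joined = "|".join(str(row[col]) for col in primary_key)
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()
    if "id" in row:
        # No primary key entered and an id column exists: use it as is.
        return str(row["id"])
    # No primary key entered and no id column: generate a GUID.
    return str(uuid.uuid4())
```

Because the hashed id depends only on the chosen column values, two rows with the same values in those columns receive the same id, which is why the chosen columns should identify a row uniquely.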
- Incremental Load: The incremental key allows Panoply to load only the changes and additions made to a file since the last successful collection. Panoply uses each file's modified date (not the date the file was added to the drive). The incremental load is only used when selecting Collect All Files. When selecting specific files, Panoply will collect the entire file.
- Delimiter: For character-delimited files, like .csv or .txt, that do not use a comma or a tab as the delimiter, use the dropdown to indicate the correct delimiter.
- Skip XML attributes: When collecting XML files, some of the returned XML fields might have attributes attached to them. Select this option to skip all of the XML attributes and ingest only the XML values. For example, for a score element that carries attributes and contains the value 100, Panoply will ingest the value 100 to the score column.
- Exclude: The Exclude option allows you to exclude certain data, such as names, addresses, or other personally identifiable information. Enter the column names of the data to exclude.
- Parse String: If the data to be collected contains JSON, include the JSON text attributes to be parsed.
- Truncate: Truncate deletes all the current data stored in the destination tables, but not the tables themselves. Afterwards Panoply will recollect all the available data for this data source.
- Click Save Changes then click Collect.
- The data source appears grayed out while the collection runs.
- You may add additional data sources while this collection runs.
- You can monitor this collection from the Jobs page or the Data Sources page.
- After a successful collection, navigate to the Tables page to review the data results.
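For the Skip XML attributes setting described earlier, the following sketch shows what "ingest only the XML values" means in practice, using Python's standard `xml.etree.ElementTree` module. It is an illustration rather than Panoply's parser: the `record` wrapper element and the `unit` attribute are invented for the example, and `values_only` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def values_only(xml_text: str) -> dict:
    """Map each child element's tag to its text, ignoring all attributes."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


# Hypothetical input: the unit attribute is dropped, the value 100 is kept.
sample = '<record><score unit="points">100</score></record>'
print(values_only(sample))  # {'score': '100'}
```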