Google Sheets

This document describes the Google Sheets data source. Continue reading to learn more about:

  1. Collecting - what you should know about adding the data source.
  2. Data Dictionary - what data is available and how it is structured.

Collecting

NOTE: Google requires the logged-in user to have permission to access the data. If the permissions are not in place, some of the data will not be available.

NOTE: See our Google Sheets demo file with properly formatted sample data.

To configure this data source and collect Google Sheets data:

  1. From the Data Sources menu, click Add Data Source.
  2. Search for Google Sheets, then select that data source.
  3. Click Login and follow Google’s authorization process to allow Panoply to access Google Sheets data.
  4. Select the Google Sheets files from which to collect data.
    • The file must be a Google Sheets file. We do not support other file types.
    • For each file, Panoply collects each individual sheet (tab) as a unique table.
    • Sheets must include headers.
      • The first row that contains at least one value will be used as the header row. This row is converted into column names in the target tables. See our Google Sheets demo file with properly formatted sample data.
      • If a header is duplicated, the column letter is appended to the header name in the format <header name> column <column letter>. For example, if columns a and b are both titled date, the second one becomes date column b.
      • If a column contains data but has no header, Panoply names it column <column letter>, for example column b.
      • If a column has a header but no data, the column will not be collected.
  5. (Optional) Set the Advanced Settings.
    • We do not recommend changing advanced settings unless you are an experienced Panoply user.
    • Destination:
      • Panoply selects a default destination. These are the tables where data is stored. The default naming convention is sheets_<filename>_<sheetname>. For example, if you had a spreadsheet named “App Install Metrics” and it contained a sheet (tab) named “app_installs”, it would be stored in Panoply as sheets_app_install_metrics_app_installs.
      • To prefix all table names with your own prefix, use this syntax: prefix_<__tablename>, where prefix is your desired prefix name and <__tablename> is a variable that represents the <filename>_<sheetname>.
    • Primary Key - Users can define which column contains the table’s Primary Key. If this option is left blank and the sheet does not contain an ID column, Panoply will insert an id, formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2.
    • Truncate - Use truncate to delete any data collected previously, and then add new data to the same destination table(s) based on a new collection. This is useful when you don’t have a primary key and do not want to append rows to an existing data set.
  6. Click Save Changes, then click Collect.
    • The data source appears grayed out while the collection runs.
    • You may add additional data sources while this collection runs.
    • You can monitor this collection from the Jobs page or the Data Sources page.
    • After a successful collection, navigate to the Tables page to review the data results.
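
The header and naming rules in steps 4 and 5 can be sketched in Python. This is an illustration only, not Panoply's implementation: the function names (column_letter, derive_column_names, default_table_name) are hypothetical, and the table-name normalization (lowercasing and replacing runs of non-alphanumeric characters with underscores) is an assumption inferred from the single “App Install Metrics” example above.

```python
import re
from string import ascii_lowercase


def column_letter(index):
    """Zero-based column index to a spreadsheet letter (0 -> 'a', 26 -> 'aa')."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = ascii_lowercase[rem] + letters
    return letters


def _filled(cell):
    """True when a cell holds a non-blank value."""
    return cell is not None and str(cell).strip() != ""


def derive_column_names(rows):
    """Return ({column index: name}, data_rows) per the header rules above."""
    # The first row with at least one value is used as the header row.
    header_idx = next(
        (i for i, row in enumerate(rows) if any(_filled(c) for c in row)), None
    )
    if header_idx is None:
        return {}, []
    header, data = rows[header_idx], rows[header_idx + 1:]
    width = max(len(r) for r in rows[header_idx:])
    names, seen = {}, set()
    for col in range(width):
        has_title = col < len(header) and _filled(header[col])
        title = str(header[col]).strip() if has_title else ""
        has_data = any(col < len(r) and _filled(r[col]) for r in data)
        if not has_data:
            continue  # a column without data is not collected
        letter = column_letter(col)
        if not title:
            names[col] = f"column {letter}"           # data but no header
        elif title in seen:
            names[col] = f"{title} column {letter}"   # duplicated header
        else:
            names[col] = title
        # Assumption: only collected columns count toward duplicate detection.
        seen.add(title)
    return names, data


def default_table_name(filename, sheetname):
    """sheets_<filename>_<sheetname>, using an assumed normalization."""
    def slug(text):
        return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    return f"sheets_{slug(filename)}_{slug(sheetname)}"


rows = [
    [],                                      # leading empty row is skipped
    ["date", "date", "", "notes"],           # header row
    ["2024-01-01", "2024-01-02", "7", ""],   # "notes" has no data
]
names, data = derive_column_names(rows)
# names == {0: "date", 1: "date column b", 2: "column c"}
table = default_table_name("App Install Metrics", "app_installs")
# table == "sheets_app_install_metrics_app_installs"
```

In this sample, the notes column is dropped because it has a header but no data, which matches the last rule in step 4.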

Data Dictionary

Because Google Sheets data comes from a spreadsheet file, Panoply cannot provide a data dictionary. However, Panoply does automate the data schema for the collected data. Here is what is useful to know about these automations:

  • A column in a table uses the same data type for all values in that column. Panoply automatically chooses the data type for each column based on the available values. This is important to note for this data source. If even one value in a column has text, then the entire column is considered data type Text.
    • For example, the following combination of values in a single column will be data type Number:
      • 10000
      • 10,000
      • 10.10
    • For example, the following combination of values in a single column will be data type Text:
      • 10000
      • 10,000
      • 10.10
      • 10000x
  • Values that use a comma as the decimal separator (such as “12,45”) can be imported as data type Number, with some restrictions.
    • The locale (“location”) of the Google Sheet determines whether “12,45” is treated as a number or as text. See the discussion of decimal point and comma and the Google Sheets API documentation on ValueRenderOption.
    • If someone in the United States, using the United States version of Google Sheets, enters “12,45” into a Google Sheet cell, Google automatically formats that value as Text. Even if you manually change the cell format to Number, Google still treats it as Text when the data is added to Panoply.
    • If someone in France, using the French version of Google Sheets, enters “12,45” into a Google Sheet cell, Google automatically formats that value as a Number.
  • Dates are collected as formatted strings.
  • For each sheet, Panoply opens the individual sheet (tab) and collects the values row by row.
  • A column with a header but without values will be ignored. This is a limitation built into the Data Engine.
  • Empty columns and empty rows are not collected.
  • The following metadata columns are added by Panoply to the destination table(s):
    • id - If the user does not enter a primary key, and no id column exists in the source, Panoply will insert an id formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2.
    • __updatetime - Formatted as a datetime, such as 2018-06-26T01:26:14.695Z
    • __senttime - Formatted as a datetime, such as 2018-06-26T01:26:14.695Z
    • __tablename - The name of the sheet (tab), in Google Sheets, where the data originated. Formatted as <filename>_<sheet name>, such as app_install_metrics_app_installs.
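
The "one text value makes the whole column Text" rule and the comma-decimal behavior can be sketched as follows. This is a minimal illustration, not Panoply's actual type detection: the function names and regular expressions are hypothetical, built only from the example values listed above, and the comma_decimal flag is an assumed stand-in for the locale of the Google Sheet.

```python
import re

# Assumed patterns: the examples above (10000, 10,000, 10.10, 12,45) are the
# only specification available, so these are a guess at a plausible grammar.
_NUMBER_DOT_DECIMAL = re.compile(r"^-?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?$")
_NUMBER_COMMA_DECIMAL = re.compile(r"^-?\d+(,\d+)?$")


def looks_numeric(value, comma_decimal=False):
    """True if the string reads as a number under the assumed locale rule."""
    value = value.strip()
    pattern = _NUMBER_COMMA_DECIMAL if comma_decimal else _NUMBER_DOT_DECIMAL
    return bool(pattern.match(value))


def infer_column_type(values, comma_decimal=False):
    """Return 'Number' only if every non-empty value is numeric, else 'Text'."""
    non_empty = [str(v) for v in values if v is not None and str(v).strip()]
    if non_empty and all(looks_numeric(v, comma_decimal) for v in non_empty):
        return "Number"
    return "Text"


all_numbers = infer_column_type(["10000", "10,000", "10.10"])
# all_numbers == "Number"
one_text = infer_column_type(["10000", "10,000", "10.10", "10000x"])
# one_text == "Text"
us_sheet = infer_column_type(["12,45"])
# us_sheet == "Text"
french_sheet = infer_column_type(["12,45"], comma_decimal=True)
# french_sheet == "Number"
```

A single non-numeric value such as 10000x is enough to turn the whole column into Text, so it is worth cleaning stray units or notes out of numeric columns before collecting.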