Salesforce

Salesforce

This document provides instructions for integrating Salesforce data into Panoply. The following items will be covered:

Salesforce Data Integration

To integrate Salesforce data into Panoply using default selections, complete the following steps. For more advanced options, complete the following and refer to the subsequent sections for detailed information.

  1. Click Data Sources in the navigation menu.
  2. Click the Add Data Source button.
  3. In the Data Sources – Choose Source Type window, select Salesforce. Salesforce is listed under APIs.
  4. (Optional) To connect to your Salesforce sandbox instance, select the Sandbox checkbox.
  5. In the Data Sources – Salesforce screen, click Login with Salesforce.
  6. If you are not already logged into Salesforce, you will be prompted to log in. Enter your Salesforce username and password, and then click Log In.
  7. On the Salesforce permission screen, click Allow to grant Panoply access to your Salesforce data so that your data can be ingested.
  8. Select the objects you want ingest.
  9. (Optional) To customize the ingestion from your data source, review the advanced options.
  10. Click Collect.

The Data Sources – Salesforce window will appear grayed out while the data integration is pending. A small green progress bar appears below Salesforce.

You will be prompted to set up the integration of another data source. You can set up multiple data integrations without impacting the ingestion of the already scheduled or pending data integrations.

From the Data Sources main menu, you can monitor the data ingestion status of the scheduled and pending data integrations. After the data ingestion is complete, you can clean or transform your data in the Tables menu.

Advanced Options

Clicking Show next to Advanced will expand the Data Sources - Salesforce window to include Destination, Primary Key, Incremental Key, Exclude, Parse string, and Truncate table.

  • Destination - Default is salesforce. Panoply includes a _{__collection} suffix, where {__collection} is a dynamic field. For example, for any data where __collection is customers, the default destination table is salesforce_customers. If you specify a destination, do not add the _{__collection} suffix manually. Panoply adds the suffix automatically. For example, if you enter my_sf_data as your destination, the destination table would be my_sf_data_customers for any data where __collection is customers.

  • Primary Key - Default is id provided by Salesforce.

  • Incremental Key - By default, Panoply fetches all of your Salesforce data on each run for the Salesforce objects you selected. If you only want to collect some of your data, select Updated Since from the dropdown menu. If you leave the starting value empty, Panoply will collect all of the data on its first run and then set the starting value to the date of the last successful run. You can also enter a date and time starting value in the following format: yyyy-mm-ddTHH:MM:SS.MSZ.

    When the starting value is defined, Panoply collects data that was added or updated since the starting value, up to a maximum of 30 days. The 30-day maximum is a limitation imposed by Salesforce. However, this restriction is rarely a problem because it is common to schedule this data source to run weekly, daily, or even hourly.

    Possible incremental key configurations:

    1. No incremental key: Collects all data.
    2. Set incremental key but not starting value: Collects all data and sets starting value at the end of the run.
    3. Set incremental key and starting value (manually or after #2): Collects data added or updated since the starting value (30 days max).

Data Schema

Panoply uses Salesforce’s REST API to ingest data into your Panoply data warehouse. This article explains the details of that integration. For complete details regarding the Salesforce REST API, see the Salesforce REST API Developer Guide.

For a detailed explanation of all Salesforce objects, see the Salesforce Object Reference documentation. For detailed entity relationship diagrams for Salesforce objects, see the Salesforce Data Model.

Nested data - Because some Salesforce data is nested, such as the billing address details for an account (see the BillingAddress description in the Account object for details), you should be aware of what Panoply does with nested data. By default, Panoply transforms nested data into a set of many-to-many or one-to-many relationship tables. This is the classic solution for nested structures in relational databases, where the nested model is transformed into several flat tables that can be joined together. In the case of the billingaddress data for an account, if you use the default destination then Panoply would create a table called salesforce_account_billingaddress to store the billing address information associted with the accounts stored on the salesforce_account table.

Deleted records - The data import can include deleted records, which are indicated by a value of 1 in the isdeleted column.