Your Data in Panoply

Your Data in Panoply

Data Structure

After you collect your data into Panoply, your data will be organized into one or multiple tables, structured like this:

Schema: public (collection of tables and other “objects”):

  • Table: your_crm_contacts
    • Column: id
    • Column: email
    • Column: created_date
  • Table: your_crm_companies
    • Column: id
    • Column: name
    • Column: city
  • Table: your_crm_deals
    • Column: id
    • Column: company_id -- can join to
    • Column: signed_date
  • Table: your_analytics_tool_data
    • Column: event_id
    • Column: timestamp
    • Column: user_id
  • Views: Saved queries can be queried like tables (if defined)
  • Other Objects: User Defined Functions, Triggers, etc

To view the collected data, click Tables in the navigation menu.

The Table Rows count includes rows deleted today. Our automated process permanently removes deleted rows once per day. You can sort this list of tables by name, size, schema and row count.

When you click on a table, you can see the following:

  • Data: At the top of the page is a sample of the table’s data so that you can get a feel for its structure.
  • Metadata: This contains the list of the table’s columns and their data types. Clicking on the fields will allow you to further configure each field. Panoply also generates additional fields for each table to denote when each row was last retrieved from the source (__senttime) and when the row was last written into Panoply (__updatetime), among others. See the data sources documentation for your data source to see what metadata fields are added to those tables.

Different sources make different tables

When Panoply collects data from a connected data source, it will generate a table, or multiple tables, in your database that will contain all the data from that data source. There are several reasons why Panoply might create multiple tables from a single source. The most common is that the data source has multiple endpoints that all serve different segments of your data. For example, if you collect data from Facebook Ads, you’ll see at least three tables in your Panoply database after the collection is finished: Insights, Ads and Adsets. That’s the way the data is organized when Facebook API provides the data, and we try to make sure that structure is preserved, as closely as possible to the source.

Another common way for multiple tables to be created from a single data source involves subtable creation. Panoply is designed to optimize your data for analysis, which means that all data points need to be put into standardized tables. For many data sources, like CSVs and MySQL databases this is just a matter of porting over whatever table structures already exist. However, if your data source is a NoSQL database or an API that serves nested JSON, Panoply will automatically unpack the nested objects into their own tables.

In practice, this means that ingesting from these types of sources will create multiple, smaller tables in your Panoply database. Twitter makes for a great example of nested object flattening. When Panoply collects data from Twitter, which serves its data in highly nested structures, it will create multiple subtables. The table twitter_tweets will also generate twitter_tweets_user, twitter_tweets_retweeted—and multiple other tables—as the nested objects are each extracted and collected into their own tables.

This is Panoply’s default behavior, but you can opt to turn on table flattening if you want to avoid table proliferation. This will unpack nested data directly into columns in its parent tables, rather than creating new subtables. For more detailed information on how Panoply deals with nested data, check out the docs page.

Advanced data management

Tables and data types

Now that you’ve gotten a chance to get familiar with your data, you can start to use some of Panoply’s more advanced features to tune it to your needs. First, head back to the tables screen.

Select a table you’d like to modify or explore in more depth to view the table’s columns and associated Metadata.

Clicking a column header from the Metadata view allows you to dig deeper into the table’s inner structure. You can also rename the column, delete it, and change its type or other technical details.

Generally, there’s no need to change these values because Panoply automatically assigns them for you.

Job and query management

After running a query, you can save the query as a transformation or export the resulting table into your own system. Panoply offers two types of transformation views: simple views and materialised views. A transformation typically starts as a simple view. When needed, based on performance, simple views can be transformed into materialised views, which are always accessible and up to date.

To track your job progress, select Jobs from the navigation pane. You can filter the jobs list according to status (All Statuses, Pending, Running, Error, Success, and Error) and job type (All Types, Collect, Collect-Direct, Materialize, Alter, and Export).

Scheduling data collection

Once you get your data sources connected and your analysis pipeline built, you’ll want to make sure you’re working with the most up-to-date data you have at your disposal. You can manage future collections manually as new data flows into your individual data sources, but this can quickly become labor intensive as the number of sources increases. The easier option is to let Panoply automatically collect new data for you.

Panoply provides a simple-to-manage data collection scheduling interface that you can access from the data sources pane, on Schedule Collections on the left hand navigation menu.

Find the data source you’d like to schedule for collection, and select it. Click on the Schedule Collections on the left hand navigation menu, enable the schedule and set the day and time for collection.

Remember to save your changes.

If you’d like to take your Panoply data warehouse to the next level, be sure to set up a data review with a member of our Data Architect Team.

Getting started is easy! Get all your data in one place in minutes.
Try Free