Documentation

Nested Data

Nested data may appear in many data sources that produce nested arrays or objects, like document-stores (Mongo), APIs, and even relational databases like Postgres JSON fields. Additionally, some of Panoply.io’s data enhancement features may produce nested values for flat data. For example, a record that contains an IP field may be enhanced to include a nested object with the geo-location information for that IP address.

In order to be fully compatible with most popular analysis tools, like BI suite, and standard SQL queries, Panoply.io uses a strongly relational model that doesn’t allow these nested values to appear in the data. Therefore, Panoply.io supports several ways to transform these nested structures into a flat relational model:

  • Subtables (default)
  • Flattening
  • Skipping

Subtables

When presented with nested data, Panoply.io attempts to transform it into a flat relational model in order to be compatible with most analysis tools, like BI suites, which are unable to query nested data effectively.

By default, Panoply.io transforms nested data into a set of many-to-many or one-to-many relationship tables. This is the classic solution for nested structures in relational databases, where the nested model is transformed into several flat tables that can be joined together.

For example, consider the following record in the users destination table:

{id: 30, friends: [ ‘one’, ‘two’ ] }

Would be transformed to these new records:

// table: users table

{id: 30}

// table: users_friends

{users_id: 30, value: ‘one’}

{users_id: 30, value: ‘two’}

Similarly, this transformation can also be applied to nested objects or nested lists of objects. Panoply.io extracts every nested value and moves it to its own table, suffixed by the attribute name of the nested data, and creates the relevant foreign key to allow queries to easily join the two tables together.

Flattening

When flattening is enabled, Panoply.io flattens the nested structure onto the record that contains it, by prefixing the nested attributes with the name of the outer attribute that contains them.

For example, the following record contains a nested location object:

{location: {city: ‘San Francisco’, state: ‘CA’, country: ‘USA’}}

It would be transformed into this new, flat format:

{location_city: ‘San Francisco’, location_state: ‘CA’, location_country: ‘USA’}

Each nested property in the location object is replaced by a new attribute on the containing record, where the attribute name is prefixed with location_.

Note that flattening is applicable for nested objects only, not for nested lists. For example, the following record cannot be flattened:

{locations: [ ‘one’, ‘two’, ‘three’ ]}