Data Ingestion Engine

As your data travels from a data source into your Panoply database, it passes through Panoply’s Data Ingestion Engine. This article explains the Data Ingestion Engine’s constraints, the standards it adheres to, and the transformations it performs.

For example, you may have three data sources that each format dates differently. As data passes from those sources into your Panoply database, the Data Ingestion Engine standardizes the disparate formats into one consistent date format.

Data Ingestion Engine Specifications

The following sections explain how the Data Ingestion Engine handles destinations, dates, timestamps, numbers, materialization, and deleted source records.

Dates

Dates are converted to strings and saved in the format YYYY-MM-DDThh:mm:ss.sssZ, which complies with ISO 8601.

Panoply supports these date formats:

Date format                   Example
ANSI C                        Mon Jan _2 15:04:05 2006
Unix Date                     Mon Jan _2 15:04:05 MST 2006
Ruby Date                     Mon Jan 02 15:04:05 -0700 2006
RFC 1123                      Mon, 02 Jan 2006 15:04:05 -0700
RFC 3339 (ISO 8601 profile)   2013-03-31T10:05:04.9385623+03:00
year/month/day                2013-03-28 10:05:00 +0000 UTC
Date without day              2014-04
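
To illustrate the kind of standardization described above, here is a minimal Python sketch that parses a few of the listed layouts and emits the YYYY-MM-DDThh:mm:ss.sssZ form. It is not Panoply's implementation. The function name `normalize_date`, the `LAYOUTS` list, the choice to treat inputs without a zone as UTC, and the choice to default a missing day to the first of the month are all assumptions made for this example. The `_2` in the examples is read here as a space-padded day of the month.

```python
from datetime import datetime, timezone

# Hypothetical subset of the supported layouts, written as strptime patterns.
LAYOUTS = [
    "%a %b %d %H:%M:%S %Y",      # ANSI C
    "%a %b %d %H:%M:%S %z %Y",   # Ruby Date
    "%a, %d %b %Y %H:%M:%S %z",  # RFC 1123 with a numeric zone
    "%Y-%m",                     # date without day
]


def normalize_date(text):
    """Return text as YYYY-MM-DDThh:mm:ss.sssZ, or raise ValueError."""
    for layout in LAYOUTS:
        try:
            parsed = datetime.strptime(text, layout)
        except ValueError:
            continue  # this layout does not fit, try the next one
        if parsed.tzinfo is None:
            # Assumption: values without a zone are already in UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        utc = parsed.astimezone(timezone.utc)
        millis = utc.microsecond // 1000
        return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"
    raise ValueError(f"unsupported date format: {text!r}")


print(normalize_date("Mon Jan 02 15:04:05 -0700 2006"))
print(normalize_date("2014-04"))
```

Trying the layouts in a fixed order keeps the sketch short. The Unix Date and RFC 3339 rows are left out because zone abbreviations such as MST and seven-digit fractions need handling beyond plain `strptime` patterns.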

Timestamps

A timestamp is an exact point in time, with microsecond precision, independent of location.

Panoply on Redshift supports both string and integer timestamps that are between 8 and 14 bytes. Values outside that range are not treated as timestamps. Timestamps are resolved to the second, so the Data Ingestion Engine resolves both 1432399705 (seconds) and 1432399705000 (milliseconds) to the same UTC date, 2015-05-23T16:48:25Z.
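
The following Python sketch shows one way two such values could resolve to the same second. It is an illustration only: the function name `resolve_epoch` and the magnitude threshold used to tell milliseconds from seconds are assumptions, not Panoply's documented rule.

```python
from datetime import datetime, timezone

# Assumption: values at or above this magnitude are milliseconds, not seconds.
MILLISECOND_THRESHOLD = 10**11


def resolve_epoch(value):
    """Resolve a string or integer epoch value to a UTC date with second resolution."""
    number = int(value)
    if abs(number) >= MILLISECOND_THRESHOLD:
        number //= 1000  # drop the millisecond part
    moment = datetime.fromtimestamp(number, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(resolve_epoch(1432399705))       # 2015-05-23T16:48:25Z
print(resolve_epoch("1432399705000"))  # 2015-05-23T16:48:25Z
```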

Panoply on BigQuery supports 8-byte timestamps in the format:

YYYY-[M]M-[D]D[( |T)[H]H:[M]M:[S]S[.DDDDDD]][time zone]

  • YYYY: Four-digit year
  • [M]M: One or two-digit month
  • [D]D: One or two-digit day
  • ( |T): Space or a T separator
  • [H]H: One or two-digit hour (valid values from 00 to 23)
  • [M]M: One or two-digit minutes (valid values from 00 to 59)
  • [S]S: One or two-digit seconds (valid values from 00 to 59)
  • [.DDDDDD]: Up to six fractional digits (microsecond precision)
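
The format above can be approximated with a regular expression. The Python sketch below is an assumption-laden illustration, not Panoply's or BigQuery's validator: the names `TIMESTAMP_RE` and `matches_timestamp_format` are hypothetical, and because the list above does not describe the time zone part, the pattern accepts only `Z` or a numeric offset such as `+03:00`.

```python
import re

TIMESTAMP_RE = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{1,2})-(?P<day>\d{1,2})"
    r"(?:[ T](?P<hour>\d{1,2}):(?P<minute>\d{1,2}):(?P<second>\d{1,2})"
    r"(?:\.(?P<fraction>\d{1,6}))?)?"
    # Assumption: the time zone is either Z or a numeric offset.
    r"(?P<zone>Z|[+-]\d{1,2}(?::\d{2})?)?"
)


def matches_timestamp_format(text):
    """Return True if text fits the layout and the listed hour/minute/second ranges."""
    match = TIMESTAMP_RE.fullmatch(text)
    if match is None:
        return False
    parts = match.groupdict()
    if parts["hour"] is not None:
        if int(parts["hour"]) > 23:
            return False
        if int(parts["minute"]) > 59 or int(parts["second"]) > 59:
            return False
    return True


print(matches_timestamp_format("2015-05-23 16:48:25.123456"))   # True
print(matches_timestamp_format("2015-05-23 16:48:25.1234567"))  # False
```

The sketch checks only the ranges stated in the list (hours, minutes, seconds). It does not verify that the month and day form a real calendar date.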

Numbers

Panoply uses a double-precision floating-point format for numbers. This means the largest integer Panoply can parse without losing precision is 9,007,199,254,740,991 (2^53 − 1).
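
The limit follows from the 53-bit significand of a double. As a small Python illustration (the names `MAX_SAFE_INTEGER` and `survives_double` are hypothetical and used only for this example), an integer just above the limit no longer round-trips through a double:

```python
# Largest integer n such that both n and n + 1 are exactly representable as doubles.
MAX_SAFE_INTEGER = 2**53 - 1


def survives_double(n):
    """Return True if the integer n is unchanged after conversion to a double."""
    return int(float(n)) == n


print(MAX_SAFE_INTEGER)                       # 9007199254740991
print(survives_double(MAX_SAFE_INTEGER))      # True
print(survives_double(MAX_SAFE_INTEGER + 2))  # False
```

Some larger integers, such as 2^53 itself, still convert exactly, but beyond the limit not every integer does, so values above it should be treated as unsafe.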

 
