THE TECHNOLOGY

Our Platform

Panoply provides end-to-end data management-as-a-service. Its self-optimizing architecture uses machine learning and natural language processing (NLP) to model and streamline the data journey from source to analysis, reducing the time from data to value to as close to zero as possible.

The Engine

ETL-less Data Integration

Panoply automatically aggregates data as it streams in, allowing you to analyze everything in seconds – regardless of scale, and without data configuration, schema, or modeling.

Panoply offers a collection of pre-defined, open-source data source integrations for all of the popular databases and services, and provides SDKs in many of the most common programming languages, so that you can easily tailor the platform to your needs and connect to any data source.

Auto-Generated Schemas

When you insert data into Panoply, the platform scans through the data and discovers the underlying schema and metadata that best describe it – including all columns, data types, and foreign keys. It constructs this schema based on the data, or alters the existing schema in real time (when necessary), thereby eliminating the need to explicitly design database tables and columns.

Panoply makes it easy to change data types or add columns – you can simply input different value sets into the platform. If necessary, manual adjustments and customizations can also be made.
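
The idea of discovering a schema from incoming values, and widening a column when later values disagree, can be illustrated with a minimal Python sketch. This is not Panoply's implementation: the function names, the type labels, and the widening rules are all assumptions made for illustration, and foreign-key discovery is left out.

```python
from datetime import datetime


def infer_type(value):
    """Map one Python value to a hypothetical column type label."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)
            return "timestamp"
        except ValueError:
            return "text"
    return "text"


def merge_types(current, incoming):
    """Widen a column type when new values do not match the current one."""
    if current == incoming:
        return current
    if {current, incoming} == {"integer", "float"}:
        return "float"
    return "text"  # assumed fallback for any other disagreement


def infer_schema(rows, schema=None):
    """Build or update a {column: type} mapping from a batch of rows."""
    schema = dict(schema or {})
    for row in rows:
        for column, value in row.items():
            if value is None:  # nulls carry no type information
                continue
            incoming = infer_type(value)
            if column in schema:
                schema[column] = merge_types(schema[column], incoming)
            else:
                schema[column] = incoming
    return schema


first = infer_schema([{"id": 1, "price": 10, "created": "2016-03-01T12:00:00"}])
# A later batch adds a column and sends a float for "price", widening it.
second = infer_schema([{"id": 2, "price": 9.5, "note": "hi"}], first)
```

In this sketch, sending a different kind of value for an existing column is enough to change its type, which mirrors the behavior described above without claiming to reproduce it.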

Real Time Transformations

Panoply applies common transformations automatically: it identifies data formats such as CSV, TSV, JSON, XML, and many log formats, and flattens nested structures like lists and objects into separate tables with a one-to-many relationship.

Remodeling and reindexing are also automatic processes, taking place whenever the system detects changes in query patterns. Panoply uses statistical analysis to identify the columns and tables that are used most frequently in filters and group-bys, and uses that information to rebuild indexes.
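
As a rough illustration of this kind of usage analysis, the Python sketch below counts how often each column appears in filters and group-bys and ranks the most-used ones. The query log format (a list of dictionaries with "filters" and "group_by" keys) and the function name are hypothetical; the statistics Panoply actually collects are not described here.

```python
from collections import Counter


def rank_index_candidates(query_log, top_n=3):
    """Rank columns by how many queries filter or group on them."""
    usage = Counter()
    for query in query_log:
        # A set, so a column used in both clauses counts once per query.
        columns = set(query.get("filters", [])) | set(query.get("group_by", []))
        usage.update(columns)
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(usage.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical, pre-parsed query log.
query_log = [
    {"filters": ["orders.created_at"], "group_by": ["orders.country"]},
    {"filters": ["orders.created_at", "orders.status"], "group_by": []},
    {"filters": ["orders.created_at"], "group_by": ["orders.country"]},
]
candidates = rank_index_candidates(query_log, top_n=2)
```

The top-ranked columns would then be the natural candidates when indexes are rebuilt.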

3-Tier Storage Architecture

Panoply has a 3-tier stack of storage systems abstracted away behind a single JDBC endpoint: AWS S3 is used at the backend, as a massively scalable storage engine for semi-structured data; Redshift is used for most of the data, and especially for structured and frequently accessed tables and rows; and Elasticsearch provides fast access and searches through data and aggregations, and handles the indexing and storage of common daily queries.

Streamlined Data Utilization

Panoply delivers a set of pre-integrated, cloud-based analysis tools through a Data Apps framework, which is easily extendable to your own tools and platforms.

Panoply exposes a standard JDBC endpoint with ANSI-SQL support, providing plug-in support for your Tableau, Spark, or R analytics tools. The platform also allows you to write your own SQL code and build apps on top of the data.

Simplified User Management

Panoply provides streamlined management of users and permissions, avoiding the cumbersome SQL configuration generally required to manage lists of users, passwords, grants, and denies – and allowing you to send out invites via an easy-to-use UI.

Panoply allows you to specify which permissions users have and which tables they can access, and to view a complete activity log per user – making it easy to pinpoint why changes were made.

Enhanced Privacy & Security

Built on top of AWS, Panoply uses the latest security patches and encryption capabilities provided by the underlying platform, including permission controls, TLS, and hardware-accelerated RSA encryption.

Panoply also offers an extra layer of security to enhance data protection and privacy, including columnar encryption, two-step verification, anomaly detection, and handling of expiring accounts.

Efficient Monitoring

Panoply is a fully managed analytical data platform that provides maximum transparency about everything from uptime and average query time to low-level details such as the IO throughput of the physical disks.

Panoply’s monitoring capabilities include an analysis of all queries performed on the data by all users, making it easier to identify bottlenecks, catch unexpected behaviors, and “rewind” a database to any previous point in time.

Adaptive Auto-Scaling

Panoply handles the entire data infrastructure, eliminating the traditional concerns about scale, caching, IOPS, and memory. The platform auto-scales clusters seamlessly to keep up with the organization’s needs while reducing server costs.

Panoply adapts server configurations over time based on data scale and query patterns – scaling up or scaling out servers, as necessary. Scale changes take place on a regular basis and can occur multiple times throughout the week, optimizing the system’s performance.

ARCHITECTURE

Under the Hood

CAPABILITIES

Smart Data Infrastructure

Use-Case Optimization

Analyzes queries and data – identifying the best configuration for each use case, adjusting it over time, and managing indexes, sort keys, distribution keys, data types, vacuuming, and partitioning.

Query Optimization

Identifies queries that do not follow best practices – such as those that include nested loops or implicit casting – and rewrites each one as an equivalent query that requires a fraction of the runtime or resources.

Server Optimization

Optimizes server configurations over time based on query patterns and by learning which server setup works best. The platform switches server types seamlessly and measures the resulting performance.

Updates, Upserts, and Deletions

Supports standard SQL update and upsert operations out of the box – unlike many analytical databases, there is no need to worry about vacuuming or rebuilding.
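
For readers unfamiliar with the term, an upsert inserts a row if its key is new and updates the existing row otherwise. The sketch below shows those semantics using Python's built-in sqlite3 module as a stand-in database (SQLite 3.24 or later is assumed for the ON CONFLICT clause). The exact upsert syntax that Panoply accepts is not specified here, and the table and function names are hypothetical.

```python
import sqlite3


def upsert_user(conn, user_id, email):
    """Insert a user, or update the email if the id already exists."""
    conn.execute(
        """
        INSERT INTO users (id, email) VALUES (?, ?)
        ON CONFLICT (id) DO UPDATE SET email = excluded.email
        """,
        (user_id, email),
    )


# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

upsert_user(conn, 1, "a@example.com")  # new key: inserted
upsert_user(conn, 1, "b@example.com")  # existing key: updated in place
upsert_user(conn, 2, "c@example.com")  # new key: inserted

rows = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
```

After the three calls the table holds two rows, with the first user's email replaced rather than duplicated.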

Semi-Structured Data Parsing

Supports semi-structured text values like nested JSON, user-agent strings, some standard log formats, CSV, and serialized Ruby objects – parsing these objects and normalizing them into a relational database design.

Nested Structures

Handles nested structures automatically, flattening them into several tables with a one-to-many relationship. The result is a ready-to-use relational database design for all current and future datasets.
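
A minimal Python sketch of this kind of flattening follows. It is an illustration under stated assumptions, not Panoply's algorithm: the child-table naming scheme (parent name plus key), the surrogate "_id" and "_parent_id" columns, and the "value" column for scalar list items are all invented for the example.

```python
def flatten(record, table, tables=None, parent_id=None):
    """Split a nested record into flat rows, one list of rows per table."""
    tables = tables if tables is not None else {}
    rows = tables.setdefault(table, [])
    row = {"_id": len(rows) + 1}  # assumed surrogate key, unique per table
    if parent_id is not None:
        row["_parent_id"] = parent_id  # link back to the parent row
    rows.append(row)

    for key, value in record.items():
        child_table = f"{table}_{key}"
        if isinstance(value, dict):
            # A nested object becomes a single row in a child table.
            flatten(value, child_table, tables, parent_id=row["_id"])
        elif isinstance(value, list):
            # A list becomes many child rows: the one-to-many relationship.
            for item in value:
                child = item if isinstance(item, dict) else {"value": item}
                flatten(child, child_table, tables, parent_id=row["_id"])
        else:
            row[key] = value
    return tables


order = {
    "number": "A-1",
    "customer": {"name": "Ann"},
    "items": [{"sku": "x", "qty": 2}, {"sku": "y", "qty": 1}],
    "tags": ["new"],
}
tables = flatten(order, "orders")
```

Here a single order record yields four tables (orders, orders_customer, orders_items, and orders_tags), with every child row pointing back to its parent through "_parent_id".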

Columnar Storage

Provides seamless data storage and management in a multi-tiered columnar store based on Amazon Redshift, Elasticsearch, and Hadoop or S3.

Data Tracking and Alerts

Simplifies how you keep track of vast amounts of data – by identifying patterns, providing notification of anomalies, and generating alerts when the results of arbitrary SQL queries exceed predefined thresholds.
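
The threshold part of this capability can be sketched in a few lines of Python. The example uses the built-in sqlite3 module as a stand-in database, and the rule format (a name, a SQL query returning a single number, and a threshold) is a hypothetical structure chosen for illustration; pattern identification and anomaly detection are not covered.

```python
import sqlite3


def check_alerts(conn, rules):
    """Run each rule's query and report the ones that exceed their threshold."""
    alerts = []
    for rule in rules:
        # Each query is assumed to return one row with one numeric column.
        value = conn.execute(rule["sql"]).fetchone()[0]
        if value is not None and value > rule["threshold"]:
            alerts.append((rule["name"], value))
    return alerts


# In-memory SQLite database with sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (kind TEXT, latency_ms INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("ok", 120), ("error", 900), ("error", 1500), ("ok", 80)],
)

rules = [
    {"name": "too many errors",
     "sql": "SELECT COUNT(*) FROM events WHERE kind = 'error'",
     "threshold": 1},
    {"name": "slow requests",
     "sql": "SELECT MAX(latency_ms) FROM events",
     "threshold": 2000},
]
alerts = check_alerts(conn, rules)
```

With this sample data only the first rule fires, because two errors exceed its threshold of one while the slowest request stays under 2000 ms.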

Backup and Recovery

Automatically backs up data changes to redundant S3 storage, optionally replicated to two availability zones on different continents – enabling full recovery to any point in time.

From raw data to analysis in under 10 minutes
