MongoDB

This document describes the MongoDB data source. Continue reading to learn more about:

  • Collecting - what you should know about adding the data source.
  • Data Dictionary - what data is available and how it is structured.

Collecting

Before you start

  • Note the name, the host, and the port of the Mongo database.
  • Note the username and password for the user connecting to the MongoDB database.
  • Note: MongoDB stores documents in collections. Collections are analogous to tables in relational databases.

SRV: Panoply supports MongoDB Atlas connections that use SRV. MongoDB's documentation describes SRV connections in detail. In technical terms, SRV removes the requirement for every client to pass in a complete set of state information for the cluster. Instead, a single SRV record identifies all the nodes associated with the cluster (and their port numbers), and an associated TXT record defines the options for the URI.
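As a rough illustration of the difference, the sketch below builds a standard connection string and an SRV connection string side by side. The helper names, host names, and credentials are hypothetical and are not part of Panoply; only the mongodb:// and mongodb+srv:// schemes come from MongoDB's connection string format.

```python
from urllib.parse import quote_plus


def standard_uri(user, password, hosts, database):
    """Build a standard URI that lists every host (and port) explicitly."""
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{','.join(hosts)}/{database}"


def srv_uri(user, password, srv_host, database):
    """Build an SRV URI: a single DNS name and no port.

    The driver resolves the cluster nodes from the SRV record and reads
    extra URI options from the associated TXT record.
    """
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb+srv://{credentials}@{srv_host}/{database}"


# Hypothetical values, for illustration only.
print(standard_uri("reader", "p@ss", ["a.b.com:27017", "d.e.net:1234"], "sales"))
print(srv_uri("reader", "p@ss", "cluster0.example.mongodb.net", "sales"))
```

Note that the SRV form carries no node list at all, which is why the same string keeps working when nodes are added to or removed from the cluster.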

  1. If necessary, whitelist Panoply.
    • Mongo databases with production data are typically not publicly available. To allow Panoply to access your data, see Whitelisting.
  2. Click Data Sources in the navigation menu.
  3. Click the Add Data Source button.
  4. Search for MongoDB and select it.
  5. Enter the credentials to connect to MongoDB. If you’re not sure what your connection details are, contact your administrator. To find the information on the hostname, ask your administrator or see Host Info in the MongoDB documentation.
    • Host Address - The URL of the database or the IP address of the host server.
      • URL example: your.server.com
      • IP example: 123.45.67.89
    • Mongo allows you to specify more than one host by separating them with commas. When a port is provided with a host, that port is used; otherwise the value from the Port parameter is used. For example, if the host is a.b.com,d.e.net:1234,f.g.uk and the port is 27017, the connection string will be formed as a.b.com:27017,d.e.net:1234,f.g.uk:27017.
    • Port - The port number of the MongoDB server. This is 27017 for most connections.
  6. Enter your MongoDB username and password. This user must have permission to access the data in the collections to be used. If the permissions are not in place, some of the data will not be available.
  7. Select the MongoDB database to collect the data from. This will load the collections (tables and views) that the user has permissions to access.

Note: This may take time to load

8. Select one or more collections (tables or views).

9. (Optional) Set the Advanced Settings. We recommend not changing advanced options unless you are an experienced Panoply user.

  • Auth Source: For most sources this will be “admin”.
  • Replica Set: Enter your replica set name.
  • Destination: Panoply selects a default destination, the tables where data is stored. The default name of each destination table consists of the prefix mongo followed by _<collection>, where <collection> is a dynamic field that represents the name of the collection in your Mongo database. For example, if the collection name is customers, the resulting table will be mongo_customers.
  • For more detailed descriptions of Advanced Settings for the MongoDB Data Source, see the Data Dictionary below.

10. Click Save Changes and then click Collect.

  • The data source appears grayed out while the collection runs.
  • You may add additional data sources while this collection runs.
  • You can monitor this collection from the Jobs page or the Data Sources page.
  • After a successful collection, navigate to the Tables page to review the data results.
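
The host and port rule described in step 5 can be sketched as follows. This is a minimal illustration of the behavior described there, not Panoply's actual implementation; the function name normalize_hosts is hypothetical.

```python
def normalize_hosts(host_field, default_port=27017):
    """Return the host list with the default port added where none is given."""
    normalized = []
    for host in host_field.split(","):
        host = host.strip()
        if not host:
            continue  # ignore empty entries such as a trailing comma
        # Keep an explicit port; otherwise fall back to the Port setting.
        # Assumption: hosts are names or IPv4 addresses (no IPv6 literals).
        if ":" in host:
            normalized.append(host)
        else:
            normalized.append(f"{host}:{default_port}")
    return ",".join(normalized)


print(normalize_hosts("a.b.com,d.e.net:1234,f.g.uk", 27017))
# a.b.com:27017,d.e.net:1234,f.g.uk:27017
```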

Data Dictionary

Because MongoDB data comes from a database system, Panoply cannot provide a data dictionary. But Panoply does automate the data schema for the collected data. This section includes useful information about the Panoply automations. You can adjust these settings in your data source under Advanced Settings.

  • Destination - Panoply selects the default destination for the tables where data is stored.
    • The default destination is mongo_<table or view name>, where <table or view name> is a dynamic field. For example, for a table or view named customers, the default destination table is mongo_customers.
    • To prefix all table names with your own prefix, use this syntax: prefix_<table or view name> where prefix is your desired prefix name and <table or view name> is a variable representing the tables and views to be collected.
  • Primary Key - The Primary Key is the field or combination of fields that Panoply will use as the deduplication key when collecting data. Panoply sets the primary key depending on the scenario identified in the following table. To learn more about primary keys in general, see Primary Keys.

    MongoDB id column | Enter a primary key | Outcome
    ------------------|---------------------|--------
    yes | no  | Panoply will automatically select the _id column and use it as the primary key.
    yes | yes | Not recommended. Panoply will use the _id column but will overwrite the original source values. If you want Panoply to use your database table’s id column, do not enter a value into the Primary Key field.
    no  | no  | Panoply creates an id column formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2.
    no  | yes | Panoply creates a hashed id column using the primary key values entered, while retaining the source columns.

WARNING: Any user-entered primary key will be used across all the MongoDB tables selected.
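
The four primary key scenarios above can be read as a simple lookup. The sketch below only restates that table in code; the function name and the outcome labels are hypothetical, and the hashing method Panoply uses is not specified here, so it is not modeled.

```python
import uuid


def primary_key_outcome(source_has_id, user_key_entered):
    """Map the two yes/no conditions to the outcome described in the table."""
    outcomes = {
        (True, False): "use the source _id column as the primary key",
        (True, True): "use the _id column but overwrite its source values (not recommended)",
        (False, False): "create an id column formatted as a GUID",
        (False, True): "create a hashed id column from the entered key, keeping source columns",
    }
    return outcomes[(source_has_id, user_key_entered)]


print(primary_key_outcome(source_has_id=True, user_key_entered=False))

# A GUID has the same shape as the example in the table
# (assumption: a random version-4 UUID is used purely for illustration).
print(str(uuid.uuid4()))
```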

  • Incremental Key - By default, Panoply fetches all of your MongoDB data on each run. If you only want to collect some of your data, enter a column name to use as your incremental key. The column must be logically incremental. Panoply will keep track of the maximum value reached during the previous run and will start there on the next run.
    • Incremental Key configurations
      • If no Incremental Key is configured by the user, Panoply by default collects all the MongoDB data on each run for the tables or views selected.
      • If the Incremental Key is configured by column name, but not the column value, Panoply collects all data, and then automatically configures the column value at the end of a successful run.
      • If the Incremental Key is configured by column name and the column value (manually or automatically), then on the first collection, Panoply will use that value as the place to begin the collection.
        • The value is updated at the end of a successful collection to the last value collected.
        • In future collections, the new value is used as the starting value, so Panoply looks for data where the Incremental Key value is greater than the last value collected.
    • When an Incremental Key is configured, Panoply will look for that key in each of the selected tables and views. If the table or view does not have the column indicated as the Incremental Key, it must be collected as a separate instance of the data source.
    • Some records in a table or view may have a ‘null’ value for the incremental key, or may not contain the incremental key at all. In these situations Panoply omits these records instead of failing the entire data source.

WARNING: If you set an incremental key, you can only collect one table per instance of MongoDB.
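
The incremental key behavior described above can be sketched as a small simulation over in-memory records. This is an illustration under stated assumptions, not Panoply's implementation: the function name, the updated_at field, and the sample records are hypothetical, and records with a null or missing key are skipped on every run, including the first.

```python
def collect_incrementally(records, key, last_value=None):
    """Simulate one run; return (collected_records, new_last_value)."""
    collected = []
    for record in records:
        value = record.get(key)
        if value is None:
            continue  # null or missing incremental key: the record is omitted
        # With no stored value yet, everything that has the key is collected.
        if last_value is None or value > last_value:
            collected.append(record)
    if collected:
        # Remember the maximum value reached, to start there on the next run.
        last_value = max(record[key] for record in collected)
    return collected, last_value


orders = [
    {"_id": 1, "updated_at": 10},
    {"_id": 2, "updated_at": 20},
    {"_id": 3, "updated_at": None},  # null key: omitted
    {"_id": 4},                      # key not captured: omitted
]

first_run, bookmark = collect_incrementally(orders, "updated_at")
print([r["_id"] for r in first_run], bookmark)

orders.append({"_id": 5, "updated_at": 30})
second_run, bookmark = collect_incrementally(orders, "updated_at", bookmark)
print([r["_id"] for r in second_run], bookmark)
```

The first run collects records 1 and 2 and stores 20 as the starting value; the second run collects only record 5, because it is the only record whose key is greater than 20.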

A column in a table uses the same data type for all values in that column. Panoply automatically chooses the data type for each column based on the available values. This is important to note for this data source. If even one value in a column has text, then the entire column is considered data type Text.
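
To make the "one text value makes the whole column Text" rule concrete, here is a minimal sketch. The type names and the fallback rules are assumptions chosen for illustration; Panoply's actual type selection covers more types and is not documented in this section.

```python
def infer_column_type(values):
    """Pick a single type for a column from all of its values."""
    non_null = [v for v in values if v is not None]
    # One text value is enough to make the whole column text.
    if any(isinstance(v, str) for v in non_null):
        return "text"
    # bool is a subclass of int in Python, so exclude it from numeric checks.
    numeric = [v for v in non_null if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if non_null and len(numeric) == len(non_null):
        if all(isinstance(v, int) for v in numeric):
            return "integer"
        return "float"
    return "text"  # assumption: fall back to text for anything else


print(infer_column_type([1, 2, 3]))
print(infer_column_type([1, 2.5, None]))
print(infer_column_type([1, 2, "n/a"]))
```

In the last call a single "n/a" value turns an otherwise numeric column into text, which is the situation the paragraph above warns about.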

  • The following metadata columns are added to the destination table(s):

    • __databasename - The name of the MongoDB database where the data originated.
    • __collection - The name of the source table in MongoDB.
    • id - If you do not select a primary key, and no id column exists in the source table, Panoply will insert an id formatted as a GUID, such as 2cd570d1-a11d-4593-9d29-9e2488f0ccc2.
    • __senttime - Formatted as a datetime, such as 2020-04-26T01:26:14.695Z.
    • __updatetime - Formatted as a datetime, such as 2020-04-26T01:26:14.695Z.
    • __state - Reserved for internal Panoply use.

Data Type Mapping
