Connecting Jupyter Notebook

To connect to your Panoply data warehouse using Python and a Jupyter notebook:

  1. Install SQLAlchemy and pandas in your Python environment:

     pip install sqlalchemy pandas
    

    or, if conda is your package manager of choice:

     conda install sqlalchemy pandas
    

    Note that SQLAlchemy also needs a PostgreSQL driver to make the connection, so install psycopg2 as well if it isn't already in your environment (for example, pip install psycopg2-binary).

  2. Create a new Jupyter Notebook and import the packages you’ll need:

     import pandas as pd
     import sqlalchemy
     from sqlalchemy import create_engine
    
  3. Enter the connection parameters for your Panoply data warehouse (if your password contains special characters, see the note after these steps):

     POSTGRES_ADDRESS = 'db.panoply.io' ## INSERT YOUR DB ADDRESS IF IT'S NOT ON PANOPLY
     POSTGRES_PORT = '5439'
     POSTGRES_USERNAME = 'username' ## CHANGE THIS TO YOUR PANOPLY/POSTGRES USERNAME
     POSTGRES_PASSWORD = '*****' ## CHANGE THIS TO YOUR PANOPLY/POSTGRES PASSWORD
     POSTGRES_DBNAME = 'database' ## CHANGE THIS TO YOUR DATABASE NAME
    
     postgres_str = ('postgresql://{username}:{password}@{ipaddress}:{port}/{dbname}'.format(username=POSTGRES_USERNAME,
        password=POSTGRES_PASSWORD,
        ipaddress=POSTGRES_ADDRESS,
        port=POSTGRES_PORT,
        dbname=POSTGRES_DBNAME))
    
  4. Initiate the connection to your Panoply data warehouse:

     cnx = create_engine(postgres_str)
    
  5. Collect your data into a pandas DataFrame:

     df = pd.read_sql_query('''SELECT * FROM tablename''', cnx)
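
If your database password contains special characters such as @ or /, the format-string approach in step 3 can produce an invalid connection URL. A safer alternative (a minimal sketch, assuming SQLAlchemy 1.4 or later) is to build the URL with sqlalchemy.engine.URL.create, which escapes those characters for you:

     from sqlalchemy.engine import URL

     # Build the connection URL from the same parameters as above;
     # URL.create handles escaping of special characters in the password.
     postgres_url = URL.create(
         drivername='postgresql',
         username=POSTGRES_USERNAME,
         password=POSTGRES_PASSWORD,
         host=POSTGRES_ADDRESS,
         port=int(POSTGRES_PORT),
         database=POSTGRES_DBNAME,
     )

     cnx = create_engine(postgres_url)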
    

Note that the query in step 5 will import the entire table into memory, so take care to write a SQL query that pulls only the data you want to analyze.
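
For example, a query that filters the data at the source (a minimal sketch, using a hypothetical events table with user_id, created_at, and amount columns) keeps the result to a manageable size:

     # Pull only the columns and rows you need instead of the whole table.
     # Table and column names here are hypothetical placeholders.
     query = '''
         SELECT user_id, created_at, amount
         FROM events
         WHERE created_at >= '2023-01-01'
         LIMIT 10000
     '''
     df = pd.read_sql_query(query, cnx)

If even a filtered result is too large to hold in memory at once, pd.read_sql_query also accepts a chunksize argument, in which case it returns an iterator of smaller DataFrames rather than a single one.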
