Panoply on Redshift
On Redshift, you can save queries as tables and then use those views like tables.
The views that you save can rely on tables that you’ve ingested into Panoply (using Panoply data sources or external ingestion tools) and they can also rely on other views that you’ve created. By saving a view that relies on another view you will basically create a chain of views (chain of dependencies). Usually, in these cases, the user will query the last view in that chain (or tree if it’s a more complex configuration) but it’s definitely not a must and you can query each view separately as well.
When working with the classic ETL processes, you create the complete data flow from extraction to how the data will be saved. This entire process is usually saved in the same ingestion tool and is relatively easy to view, examine, and understand. But when you’re working with views, you can easily lose track of the view’s flow and what is in charge of what and which view depends on which table/view.
To overcome this issue, you can identify the dependencies per view and find the actual dependency tree for your specific setup. To see the dependencies, you can use the following query, but change the
schema_name to match the specific view you’re looking for:
SELECT DISTINCT c_p.oid AS tbloid, n_p.nspname AS dependee_schema, c_p.relname AS dependee, n_c.nspname AS dependent_schema, c_c.relname AS dependent, c_c.oid AS viewoid FROM pg_class c_p JOIN pg_depend d_p ON c_p.relfilenode = d_p.refobjid JOIN pg_depend d_c ON d_p.objid = d_c.objid JOIN pg_class c_c ON d_c.refobjid = c_c.relfilenode LEFT JOIN pg_namespace n_p ON c_p.relnamespace = n_p.oid LEFT JOIN pg_namespace n_c ON c_c.relnamespace = n_c.oid WHERE d_p.deptype = 'i' AND dependee_schema + dependee <> dependent_schema + dependent AND dependee = 'view_name_or_table_name' AND dependee_schema = 'schema_name';