Query Greenplum Database from Pivotal HAWQ

I was getting lot of requests on this so I decided to write a separate post. The question was how do I query a Greenplum Database (GPDB) from Pivotal HAWQ.

The way to do that is by creating an external web table in HAWQ. This would be true to query any external database like Oracle, Teradata or Sqlserver as long as you have client utilities installed on the HAWQ cluster.

Lets look at how to query Greenplum Database from HAWQ.

Pre-requisites:

1. Greenplum database running on a cluster
2. HAWQ database running on a cluster

Step 1: Create a test table in GPDB and insert data

create table public.demo
(empid bigint,
ename character varying(100)) ;

insert into public.demo values (1, ‘Jerry’) ;
insert into public.demo values (2, ‘Adam’) ;

Step 2: Build a command line script to get data from GPDB on HAWQ client

Using web external tables you can execute a command like shell script/sqlplus/plsql and capture the output. In this example I will execute a psql command that will query data
from a Greenplum Database (demo). By default when you execute a select query in psql
the output would look like this

select * from public.demo ;

empid |  ename
——-+———
1 | Jerry
2 | Adam
(2 rows)

I don’t want the header and count of rows in my output. I can also run psql by passing
the select statement as a command line argument.

$ psql -d demo -h <gpdb_master_hostname> -c “select * from public.demo” -A -t

1|Jerry
2|Adam

This is a output that I can stream to a web external table.

Explanation of each flag.
-d <gpdb_databasename>
-h <gpdb_master_hostname>
-c
-A unaligned table output mode
-t print rows only

Step 3: On HAWQ database create a web external table using EXECUTE option

create external web table public.demo_web_external
(empid bigint,
ename character varying(100))
EXECUTE ‘psql -d demo -h <gpdb_master_hostname> -c “select * from public.demo” -A -t’
ON MASTER
FORMAT ‘CSV’ ( DELIMITER ‘|’ NULL ‘ ‘) ;

Step 4: Query from web external table

demo# select * from public.demo_web_external ;

empid |  ename
——-+———
1 | Jerry
2 | Adam
(2 rows)

Hope this helps.

 

Advertisements

About Diwakar Kasibhotla

Oracle Database Tuning, VLDB designing, ETL/BI architect, Data Modelling, Exadata Architect.
This entry was posted in Greenplum Database, MPP, Pivotal Database, Pivotal HAWQ, Web External table and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s