Installing Greenplum Database: Community Edition on a Mac OS X 10.7

Some of the key features of Greenplum Database are:

  •     Massively Parallel Processing (MPP) Architecture for Loading and Query Processing
  •     Polymorphic Data Storage-MultiStorage/SSD Support
  •     Multi-level Partitioning with Dynamic Partitioning Elimination

If you want to test this database on your Mac, you can get a community edition that runs
on a single node.
Here are the installation steps that worked for me. The installation gives you an idea
of all the components of an MPP system.

You can download the software here

http://www.greenplum.com/products/greenplum-database

Step 1: Create group and user gpadmin

When a Greenplum Database system is first initialized, the system contains one predefined superuser role (also referred to as the system user), gpadmin. This is the user who owns and administers the Greenplum Database.

We need to find a free group ID for the gpadmin group.

Open a terminal window as a superuser.

$ sudo dscl . -list /Groups PrimaryGroupID | cut -c 34-40 | sort -n

This returns the group IDs already in use, in ascending order.

I have group id 1000 which is unused.
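
The fixed character columns in the cut command above depend on how wide dscl pads its output, so they may need adjusting on another machine. As a rough sketch (first_free_id is a name I made up, not part of dscl), the lookup can instead read a plain list of IDs and let awk find the first unused one at or above a starting value:

```shell
#!/bin/sh
# first_free_id MIN
# Reads numeric IDs from stdin, one per line, and prints the lowest
# ID >= MIN that does not appear in the input.
# (Hypothetical helper for illustration; not part of dscl.)
first_free_id() {
    sort -n | uniq | awk -v want="$1" '
        $1 == want { want++ }   # this ID is taken, so try the next one
        END        { print want }
    '
}
```

Assuming the ID is the last field on each line of the dscl listing, it could be used like this:

$ sudo dscl . -list /Groups PrimaryGroupID | awk '{print $NF}' | first_free_id 1000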

Create the group

$ sudo dscl . -create /Groups/gpadmin PrimaryGroupID 1000

Confirm the creation

$ sudo dscl . -list /Groups PrimaryGroupID | grep gpadmin
gpadmin    1000

The group has been created.

Now let's create the user "gpadmin". First, check which unique IDs are already in use.

$ sudo dscl . -list /Users UniqueID | cut -c 25-30 | sort -n

I found 1000 to be available.

$ sudo dscl . -create /Users/gpadmin UniqueID 1000
$ sudo dscl . -create /Users/gpadmin PrimaryGroupID 1000
$ sudo dscl . -create /Users/gpadmin UserShell /bin/bash
$ sudo dscl . -create /Users/gpadmin RealName "Greenplum Server"
$ sudo dscl . -create /Users/gpadmin NFSHomeDirectory /Users/gpadmin
$ sudo dscl . -append /Users/gpadmin RecordName gpadmin
$ sudo mkdir -p /Users/gpadmin
$ sudo chown -R gpadmin:gpadmin /Users/gpadmin
$ sudo dscl  . -passwd /Users/gpadmin
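
Because the same ID has to be typed into several commands, it is easy to mistype one of them. One way to keep them consistent is to generate the commands from a single value and review them before running anything. This is only a sketch: print_user_cmds is a name I made up, it covers just four of the commands above, and it prints them rather than executing them.

```shell
#!/bin/sh
# print_user_cmds NAME ID
# Prints (does not run) dscl commands that create user NAME with
# UniqueID and PrimaryGroupID both set to ID.
# (Hypothetical helper; review the output before running any of it.)
print_user_cmds() {
    rec="/Users/$1"
    printf 'sudo dscl . -create %s UniqueID %s\n' "$rec" "$2"
    printf 'sudo dscl . -create %s PrimaryGroupID %s\n' "$rec" "$2"
    printf 'sudo dscl . -create %s UserShell /bin/bash\n' "$rec"
    printf 'sudo dscl . -create %s NFSHomeDirectory %s\n' "$rec" "$rec"
}
```

$ print_user_cmds gpadmin 1000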

Step 2: Download Greenplum Community edition from

http://www.greenplum.com/community/downloads/database-ce/

Select Mac version and save the file under /Users/gpadmin.

$ unzip greenplum-db-4.2.1.0-build-3-CommunityEdition-OSX-i386.zip

This will create greenplum-db-4.2.1.0-build-3-CommunityEdition-OSX-i386.bin

Step 3: Configure Host settings

Open a terminal window, switch to root, and edit the /etc/sysctl.conf file so that it contains the settings below.

$ sudo -s
$ vi /etc/sysctl.conf

kern.sysv.shmmax=2147483648
kern.sysv.shmmin=1
kern.sysv.shmmni=64
kern.sysv.shmseg=16
kern.sysv.shmall=524288
kern.maxfiles=65535
kern.maxfilesperproc=65535
net.inet.tcp.msl=60

Save and exit the file.
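
The shmall value is expressed in memory pages rather than bytes, so it should correspond to shmmax divided by the page size. The sketch below assumes a 4096-byte page, which getconf PAGESIZE can confirm on your machine; shmall_pages is a hypothetical helper name.

```shell
#!/bin/sh
# shmall_pages SHMMAX_BYTES PAGE_SIZE_BYTES
# Prints how many pages of PAGE_SIZE_BYTES fit in SHMMAX_BYTES.
# (Hypothetical helper for illustration.)
shmall_pages() {
    echo $(( $1 / $2 ))
}
```

$ shmall_pages 2147483648 4096
524288

After the restart in step 4, the active values can be checked with:

$ sysctl kern.sysv.shmmax kern.sysv.shmall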

Add the following line in /etc/hostconfig

$ vi /etc/hostconfig

HOSTNAME=localhost

Save and exit the file

Step 4: Restart your Mac

Step 5: Create install folders

Open a terminal window and switch to root to create the installation and database folders.

$ sudo -s
$ mkdir /usr/local/greenplum-db-4.2.1.0
$ chown gpadmin:gpadmin /usr/local/greenplum-db-4.2.1.0

Create the master and segment folders. This is where the database files will
be stored.

$ mkdir /greenplumdb
$ chown gpadmin:gpadmin /greenplumdb

Exit from root.

Log in as "gpadmin".

$ su - gpadmin
$ mkdir /greenplumdb/master
$ mkdir /greenplumdb/data1

Step 6: Install Greenplum
As the gpadmin user, go to /Users/gpadmin.

$ ls
greenplum-db-4.2.1.0-build-3-CommunityEdition-OSX-i386.bin

$ bash greenplum-db-4.2.1.0-build-3-CommunityEdition-OSX-i386.bin

Follow the prompts and make sure that the installation folder is the one you created
in step 5.

You will notice an error saying that the installer failed to create a file link.

Log in as root and create the link yourself.

$ ln -fs /usr/local/greenplum-db-4.2.1.0 /usr/local/greenplum-db
$ exit

Open a terminal window to set Greenplum related session variables.

$ su - gpadmin
$ vi ~/.bashrc

Add this line:

source /usr/local/greenplum-db/greenplum_path.sh

Save and exit

$ source  ~/.bashrc

This sets all the Greenplum-related variables.

Step 7: Configure ssh with localhost

Generate ssh keys.

$ su - gpadmin
$ gpssh-exkeys -h localhost

You should see output similar to this:

[STEP 1 of 5] create local ID and authorize on local host
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts… send to localhost
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
… finished key exchange with localhost
[INFO] completed successfully

Note: If you get a permission error, remove group and other write permission from the /Users/gpadmin folder and run the key exchange again.

$ chmod go-w ~/
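
The reason this helps is that sshd, in its default strict mode, ignores a user's authorized keys when the home directory is writable by group or others. A small check along these lines can confirm the permissions before rerunning the key exchange; dir_is_ssh_safe is a hypothetical name, and the sketch only inspects the one directory it is given.

```shell
#!/bin/sh
# dir_is_ssh_safe DIR
# Succeeds (exit 0) when DIR is writable by neither group nor others.
# (Hypothetical helper for illustration.)
dir_is_ssh_safe() {
    # In the mode string from ls -ld (e.g. drwxr-xr-x), character 6 is
    # the group write bit and character 9 is the others write bit.
    bits=$(ls -ld "$1" | cut -c6,9)
    [ "$bits" = "--" ]
}
```

$ dir_is_ssh_safe /Users/gpadmin && echo "permissions look fine"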

Step 8: Configure Greenplum database and install

As gpadmin, copy the gpinitsystem_singlenode and hostlist_singlenode templates to your home folder.

$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_singlenode /Users/gpadmin/.
$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/hostlist_singlenode /Users/gpadmin/.

Open hostlist_singlenode and replace the existing line with localhost.

$ vi hostlist_singlenode
localhost

Save and exit

Open gpinitsystem_singlenode and make the following changes

#MACHINE_LIST_FILE=./hostlist_singlenode
declare -a DATA_DIRECTORY=(/greenplumdb/data1)
MASTER_HOSTNAME=localhost
MASTER_DIRECTORY=/greenplumdb/master

Save and exit
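
Before running gpinitsystem, it can be worth confirming that the folders named in the configuration file are the ones created in step 5. The following is a minimal sketch, assuming the file uses exactly the MASTER_DIRECTORY= and declare -a DATA_DIRECTORY=(...) forms shown above and that the paths contain no spaces; check_gp_dirs is a name I made up.

```shell
#!/bin/sh
# check_gp_dirs CONFIG
# Prints every directory named by MASTER_DIRECTORY or DATA_DIRECTORY
# in CONFIG that does not exist; succeeds only if none are missing.
# (Hypothetical helper; paths containing spaces are not handled.)
check_gp_dirs() {
    missing=0
    dirs=$(sed -n \
        -e 's/^MASTER_DIRECTORY=//p' \
        -e 's/^declare -a DATA_DIRECTORY=(\(.*\))$/\1/p' "$1")
    # $dirs is left unquoted on purpose so that several data
    # directories on one line are checked one by one.
    for d in $dirs; do
        if [ ! -d "$d" ]; then
            echo "missing: $d"
            missing=1
        fi
    done
    return "$missing"
}
```

$ check_gp_dirs gpinitsystem_singlenode && echo "all folders exist"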
Make sure that you are connected as gpadmin.

$ source ~/.bashrc
$ gpinitsystem -c gpinitsystem_singlenode -h hostlist_singlenode

20120605:12:20:40:010144 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------
20120605:12:20:40:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20120605:12:20:40:010144 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------
20120605:12:20:40:010144 gpinitsystem:localhost:gpadmin-[INFO]:-localhost /greenplumdb/data1/gpsne0        40000            2          0

Continue with Greenplum creation Yy/Nn>

y

20120605:12:20:44:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Building the Master instance database, please wait...
20120605:12:21:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Starting the Master in admin mode
20120605:12:21:30:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
.
.
.
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[WARN]:-*******************************************************
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Greenplum Database instance successfully created
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------------------
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-To complete the environment configuration, please
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/greenplumdb/master/gpsne-1"
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-   or, use -d /greenplumdb/master/gpsne-1 option for the Greenplum scripts
20120605:12:22:12:010144 gpinitsystem:localhost:gpadmin-[INFO]:-   Example gpstate -d /greenplumdb/master/gpsne-1
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Script log file = /Users/gpadmin/gpAdminLogs/gpinitsystem_20120605.log
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Review options for gpinitstandby
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------------------
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-The Master /greenplumdb/master/gpsne-1/pg_hba.conf post gpinitsystem
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-new array must be explicitly added to this file
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20120605:12:22:13:010144 gpinitsystem:localhost:gpadmin-[INFO]:----------------------------------------------------

Step 9: Verify installation

$ su - gpadmin
$ vi .bashrc

Add the following line after the source line:

export MASTER_DATA_DIRECTORY=/greenplumdb/master/gpsne-1

Save and exit

$ source ~/.bashrc
$ psql postgres

psql (8.2.15)
Type "help" for help.

postgres=#

Check the install by listing all the default databases.
postgres=# \l

                List of databases
   Name    |  Owner  | Encoding |  Access privileges
-----------+---------+----------+---------------------
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin
                                : gpadmin=CTc/gpadmin
(3 rows)

Quit psql:

postgres=# \q

You can also check the installation and the state of the database by using gpstate:

$ gpstate

20120605:12:24:31:020661 gpstate:localhost:gpadmin-[INFO]:-Starting gpstate with args:
20120605:12:24:31:020661 gpstate:localhost:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.2.1.0 build 3 Community Edition'
20120605:12:24:31:020661 gpstate:localhost:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.1.0 build 3 Community Edition) on i386-apple-darwin9.8.0, compiled by GCC gcc (GCC) 4.4.2 compiled on Feb 27 2012 17:31:15'
20120605:12:24:31:020661 gpstate:localhost:gpadmin-[INFO]:-Obtaining Segment details from master...
20120605:12:24:31:020661 gpstate:localhost:gpadmin-[INFO]:-Gathering data from segments...
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-Greenplum instance status summary
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Master instance                                = Active
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Master standby                                 = No master standby configured
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total segment instance count from metadata     = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Primary Segment Status
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total primary segments                         = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total primary segment valid (at master)        = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total primary segment failures (at master)     = 0
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of postmaster.pid files missing   = 0
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of postmaster.pid files found     = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing    = 0
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found      = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of /tmp lock files missing        = 0
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number of /tmp lock files found          = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number postmaster processes missing      = 0
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Total number postmaster processes found        = 1
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Mirror Segment Status
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-   Mirrors not configured on this array
20120605:12:24:32:020661 gpstate:localhost:gpadmin-[INFO]:-----------------------------------------------------
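
The line to watch in this summary is the primary segment failure count, which should be 0. If you want to check it from a script, something like the following may do. It is only a sketch that assumes the line is worded exactly as in the output above, and primary_failures is a hypothetical name.

```shell
#!/bin/sh
# primary_failures
# Reads gpstate output on stdin and prints the number reported on the
# "Total primary segment failures (at master)" line.
# (Hypothetical helper; relies on the wording shown in this post.)
primary_failures() {
    awk -F'= *' '/Total primary segment failures/ { print $NF }'
}
```

$ [ "$(gpstate | primary_failures)" = "0" ] && echo "no primary segment failures"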

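To remove the Greenplum installation, delete the database system as gpadmin, then remove the installation folder and the link (the two rm commands will likely need root privileges):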

$ gpdeletesystem -d /greenplumdb/master/gpsne-1
$ rm -rf /usr/local/greenplum-db-4.2.1.0
$ rm /usr/local/greenplum-db


About Diwakar Kasibhotla

Oracle Database Tuning, VLDB designing, ETL/BI architect, Data Modelling, Exadata Architect.
This entry was posted in Greenplum Database, Mac, MPP.

7 Responses to Installing Greenplum Database: Community Edition on a Mac OS X 10.7

  1. In .bashrc file its greenplum_path.sh and not greenplum-path.sh.

  2. Shrikant: Thanks for catching the typo. I have changed to the correct path.

  3. “Installing Greenplum Database: Community Edition on
    a Mac OS X 10.7 | Data Warehouse” was a quite awesome post, .
    Continue writing and I’m going to continue browsing! Thanks for your effort -Jake

  4. Lolo says:

    Excellent post. Step by step, all works very well.
