GridDB Quick Start Guide

Revision: CE-20180525

1 Introduction

1.1 The Purpose and Structure of This Document

This document describes basic operational procedures for GridDB(TM).

It is intended for engineers who develop systems using GridDB and for administrators in charge of operating and maintaining GridDB.

This document contains the following:

  • System design and configuration
    • Covers how to install and set up GridDB to make a basic environment.
  • Operations
    • Covers basic operations, such as starting and stopping GridDB, management operations while running GridDB, and essential actions to be taken in the event of a failure.

1.2 What is GridDB

1.2.1 Overview

GridDB is a distributed NoSQL database that manages sets of data (called Rows), each consisting of a key and multiple values.

  • GridDB performs in-memory data management, allowing high-speed processing.
    • Provides fast update and search capabilities, by storing a set of Rows in memory.
  • GridDB can be scaled out to enlarge storage capacity, in spite of performing in-memory processing.
    • Storage capacity can be enlarged by distributing data across multiple machines. Additionally, data management can be combined with disk storage, which is not covered by this document. Accordingly, even in a single node, storage capacity can be enlarged irrespective of its memory size.
  • GridDB provides high availability.
    • Can continue processing by using replicate data if a failure occurs in any node in a cluster storing replicate data. Additionally, each node stores persistent data update information in its disk and can restore previous data in the event of a failure.
  • GridDB can be scaled out up to about 1,000 nodes.
    • Provides high scalability by improving parallelism in a cluster, where each node performs per-Container transactions only.
  • GridDB requires no manual operations for managing a cluster.
    • GridDB performs autonomous control of its cluster, where nodes communicate with one another using the distribution protocol.
  • GridDB supports time-series data used by the social infrastructure.

1.2.2 Features

The features of GridDB (Community Edition) are summarized below.

Requirements                                Description
[Basic requirements]
Large capacity (in the order of petabytes)  Data storage utilizing the characteristics of in-memory storage and SSD in order to achieve both high-speed performance and large capacity.
High-speed (in-memory) performance          In-memory processing
High scalability                            Scalable up to more than 1,000 servers.
High availability                           Availability can be improved by storing replicate data in multiple servers and using HDD in combination.
High autonomy                               Autonomous control on replicating data and balancing data layout.
[Requirements for social infrastructure]
Time-series data                            Provision of specialized TimeSeries Containers
Guaranteed consistency                      Supports ACID transactions in a single Container.

1.3 Description of Terms

Below are descriptions of terms used to explain GridDB.

Terms          Meaning
Node           A server process which performs data management in GridDB.
Cluster        A node or a set of nodes which work together to perform data management.
Partition      A logical area for storing data, which exists only within GridDB and cannot be directly seen by users.
Row            A piece of data managed by GridDB, which is a unit of data consisting of a key and multiple values.
Container      A receptacle which stores a set of Rows. Two types are available: Collection and TimeSeries.
Collection     A type of Container storing Rows with general type keys.
TimeSeries     A type of Container storing Rows with time-type keys, provided with a special function to operate Rows with time-type keys.
Master node    A node which controls clustering behaviours.
Follower node  A node other than a master node participating in a cluster.
Owner node     A node holding a master Container among replicate Containers.
Backup node    A node holding a replica Container among replicate Containers.

2 System Design and Configuration

This chapter shows a basic flow of system design and configuration.

The design and construction of GridDB nodes and clusters is carried out according to the process below.

  1. Make sure that required resources are available.
  2. Install and set up GridDB (node).
  3. Configure environment-dependent parameters.
  4. Configure tuning parameters.
  5. Distribute the definition file to each node.

For the client settings, see 2.6 Installing and Setting Up GridDB (Client).

2.1 Make sure that required resources are available.

GridDB is a scalable database and, unlike conventional databases, requires no deliberate system design and sizing. However, you should consider the following as a guide for initial system design.

  • Memory usage
  • Number of nodes constituting a cluster
  • Disk usage

The following subsections show how to estimate these factors.

The memory size calculation shown below, however, takes no account of the function for enlarging capacity using SSD or other external storage.

2.1.1 Total Memory Usage

This subsection shows how to estimate memory usage based on the predicted amount of data to be stored in Containers.

First, predict the amount of data to be stored by your application. Predict the following size and quantity:

  • Data size of a Row
  • Number of Rows to be stored

Next, estimate the memory usage required to store the predicted amount of data.

  • Memory usage = Row data size × Number of Rows ÷ 0.75 + 8 × Number of Rows × (Number of indexes + 2) ÷ 0.66 (bytes)

Make this estimate for every Collection created and used by your application. The sum of these amounts is the memory usage for your GridDB cluster.

  • Total memory usage = Sum of memory usage for all Collections

The estimated figure should be used only as a guide, because precise memory usage varies depending on the frequency of update.
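
The estimate above can be written as a short calculation. The following is a minimal sketch in Python, not part of GridDB; the function names and the sample workload (row sizes, row counts, index counts) are assumptions chosen only for illustration.

```python
def container_memory_bytes(row_size_bytes, num_rows, num_indexes):
    """Estimated memory usage of one Container, following the formula above."""
    data_part = row_size_bytes * num_rows / 0.75
    index_part = 8 * num_rows * (num_indexes + 2) / 0.66
    return data_part + index_part


def total_memory_bytes(containers):
    """Sum the estimate over (row_size_bytes, num_rows, num_indexes) tuples."""
    return sum(container_memory_bytes(*c) for c in containers)


# Hypothetical workload: two Containers with made-up sizes.
workload = [
    (100, 1_000_000, 1),  # 100-byte Rows, 1 million Rows, 1 index
    (200, 500_000, 0),    # 200-byte Rows, 500 thousand Rows, no index
]
print(f"Estimated total: {total_memory_bytes(workload) / 2**20:.1f} MiB")
```

As with the formula itself, treat the result only as a guide.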

2.1.2 Number of Nodes Constituting a Cluster

This subsection shows how to estimate the number of nodes used by GridDB. The estimation below is based on the assumption that one node runs on one machine.

First, assume the memory size for one machine.

  • Memory size per machine

Next, assume the number of replicas to create. You can set the number of replicas as a parameter in GridDB.

  • Number of replicas

The default value of the number of replicas is 2.

Then, estimate the number of nodes as below:

  • Number of nodes = (Total memory usage ÷ Memory size per machine) × Number of replicas

The estimated figure should be used only as a guide, because a larger number of nodes is preferable in view of load balancing and higher availability.
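
The node count formula can be sketched as follows. This Python snippet is illustrative only: the function name is hypothetical, and rounding the result up to a whole number of nodes is an assumption added here, since the formula itself does not say how to treat fractions.

```python
import math

DEFAULT_REPLICAS = 2  # default number of replicas stated above


def estimate_node_count(total_memory, memory_per_machine, replicas=DEFAULT_REPLICAS):
    """Number of nodes = (total memory / memory per machine) x replicas.

    Both memory arguments must use the same unit. Rounding up is an
    assumption of this sketch, not something the formula prescribes.
    """
    raw = (total_memory / memory_per_machine) * replicas
    return max(1, math.ceil(raw))


# Hypothetical figures: 300 GB of data, 64 GB per machine, 2 replicas.
print(estimate_node_count(300, 64))
```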

2.1.3 Disk Usage

This subsection shows how to estimate the size of the files created by GridDB and, from that, the disk space required for a machine running a node. Two kinds of files are created: a checkpoint file and a transaction log file.

The memory usage in a single node can be calculated as below:

  • Memory usage per node = (Total memory usage × Number of replicas) ÷ Number of nodes (bytes)

Based on the calculation above, estimate the size of a checkpoint file as below:

  • Checkpoint file size = Memory usage per node × 2 (bytes)

And, since the size of a transaction log file varies depending on the frequency of update, predict the following:

  • Row update frequency (per second)

Then, assume a checkpoint interval. You can set the checkpoint interval as a parameter in GridDB.

  • Checkpoint interval

The default value of the checkpoint interval is 1200 seconds (20 minutes).

Based on these predictions, estimate the size of a transaction log file as below:

  • Transaction log file size = Row data size × Row update frequency × Checkpoint interval (bytes)

Estimate the disk space for a single node by summing up these calculated figures.

  • Disk usage per node = Transaction log file size + Checkpoint file size
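
The disk estimate combines the three formulas in this subsection. The sketch below is a minimal Python illustration; the function and argument names are hypothetical, and the sample values are invented for demonstration.

```python
DEFAULT_CHECKPOINT_INTERVAL_S = 1200  # default stated above (20 minutes)


def disk_usage_per_node(total_memory_bytes, replicas, nodes,
                        row_size_bytes, updates_per_sec,
                        checkpoint_interval_s=DEFAULT_CHECKPOINT_INTERVAL_S):
    """Disk usage per node = checkpoint file size + transaction log file size."""
    memory_per_node = total_memory_bytes * replicas / nodes
    checkpoint_file = memory_per_node * 2
    transaction_log = row_size_bytes * updates_per_sec * checkpoint_interval_s
    return checkpoint_file + transaction_log


# Hypothetical figures: 300 GiB in total, 2 replicas, 10 nodes,
# 100-byte Rows updated 5,000 times per second.
usage = disk_usage_per_node(300 * 2**30, 2, 10, 100, 5000)
print(f"Estimated disk usage per node: {usage / 2**30:.1f} GiB")
```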

2.2 Install and set up GridDB (node).

This section shows how to install GridDB on a single machine. For information about clustering, see 3 Operations.

2.2.1 Confirming the Environment

Operation has been confirmed on CentOS 6.7.

$ lsb_release -id
Distributor ID: CentOS
Description:    CentOS release 6.7 (Final)

[Note]

  • When installing the OS, select at least the following option under Package Group Selection.
    • Basic Server

2.2.2 Installing a Node

Download the GridDB source code package and build it to create the node.

$ git clone https://github.com/griddb/griddb.git
$ cd griddb
$ sh bootstrap.sh
$ ./configure
$ make
$ export GS_HOME=$PWD
$ export GS_LOG=$PWD/log

Two environment variables are defined as below.

Environment variable  Value                                             Meaning
GS_HOME               Directory where source code file is decompressed  GridDB home directory
GS_LOG                $GS_HOME/log                                      Event log file output directory

[Note]
  • These environment variables are referenced by the operational commands shown in the following subsections.

2.2.3 Confirmation After Installation

Confirm the directory structure of the installed GridDB node. First, check that the GridDB home directory and the related directories and files have been created.

The file below is created when the installation is completed normally.

$GS_HOME/bin/gsserver

Supplementary

If you start a GridDB node by taking the steps shown later, the following files are created.

[Database file]

$GS_HOME                                # GridDB home directory
                   data/                # Directory storing database files
                        gs_log_n_m.log  # File recording transaction logs (n, m: positive number)
                        gs_cp_n_p.dat   # Checkpoint file recording data regularly (n, p: positive number)

[Event log file]

$GS_HOME                                       # GridDB home directory
                   log/                        # Directory storing event log files
                       gridstore-%Y%m%d-n.log  # Event log file
                       gs_XXXX.log             # Operating tool log file

You can change the directories to store files by editing the relevant parameters in the node definition file.

2.2.4 Setting up an administrator user (Mandatory)

An administrator user is used for authentication purposes in nodes and clusters. Administrator user information is stored in the User definition file. The default file is as shown below.

  • $GS_HOME/conf/password

The following default users exist just after installation.

User   Password
admin  No settings

Administrator user information including the above-mentioned default users can be changed using the user administration command in the operating commands.

Command     Function
gs_adduser  Add an administrator user
gs_deluser  Delete an administrator user
gs_passwd   Change the password of an administrator user

Change the password as shown below when using a default user. The password is encrypted during registration.

[Note]

  • The default user password has not been set. Be sure to change the password, as the server will not start if the administrator user password is not set.

$ gs_passwd admin
Password:(Input password)
Retype password:(Input password again)

When adding a new administrator user other than the default user, the user name has to start with gs#.

One or more ASCII alphanumeric characters and the underscore sign “_” can be used after gs#.

An example of adding a new administrator user is shown below.

$ gs_adduser gs#newuser
Password:(Input password)
Retype password:(Input password again)
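
The naming rule for additional administrator users can be checked before calling gs_adduser. The following Python sketch is only an illustration of the rule as stated above (the prefix gs# followed by one or more ASCII alphanumeric characters or underscores); the function name is hypothetical and the check is not performed by GridDB itself.

```python
import re

# "gs#" followed by one or more ASCII letters, digits, or underscores.
_ADMIN_NAME = re.compile(r"gs#[A-Za-z0-9_]+")


def is_valid_new_admin_name(name):
    """Return True if name follows the rule for newly added administrator users."""
    return _ADMIN_NAME.fullmatch(name) is not None


for candidate in ("gs#newuser", "newuser", "gs#", "gs#bad-name"):
    print(candidate, is_valid_new_admin_name(candidate))
```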

[Note]

  • A change in the administrator user information using a user administration command becomes valid when a node is restarted.
  • User information is used for client authentication, so the common user information must be registered in all nodes. Make sure that the common user information is referred to by all nodes, by copying the user definition file.

2.3 Configure environment-dependent parameters.

After installation, configure the parameters required to run GridDB.

  1. Configuration of the network environment
  2. Configuration of the cluster name

You can configure GridDB by editing the following definition files:

  • Cluster definition file (gs_cluster.json)
  • Node definition file (gs_node.json)

The cluster definition file is a file which defines the parameters commonly used in the entire cluster.

The node definition file is a file which defines different parameters for each node.

Templates for these definition files are installed as shown below.

$GS_HOME                            # GridDB home directory

               conf/                # Directory storing definition files
                    gs_cluster.json # Template for cluster definition file
                    gs_node.json    # Template for node definition file

[Note]

  • Because the cluster definition file defines the parameters commonly used in the entire cluster, all the nodes participating in a cluster must share the same settings. A node with a different setting will fail to participate in the cluster and cause an error, as described later.

2.3.1 Configuration of the Network Environment (Mandatory)

First, configure the network environment. There are roughly two types of setting parameters as follows:

  • (1) Address information serving as the interface with a client
  • (2) Address information for cluster management

Although these settings need to be set to match the environment, basically default settings will also work.

However, the IP address to which the host name of the machine resolves must be an address that can be connected to from the outside, regardless of whether the GridDB cluster has a multiple node configuration or a single node configuration.

Normally, this can be set by stating the host name and the corresponding IP address in the /etc/hosts file.

Setting /etc/hosts

First, check with the following command to see whether the setting has been configured. If the IP address appears, it means that the setting has already been configured.

$ hostname -i
192.168.11.10

In the following case, the setting has not been configured.

$ hostname -i
hostname: Unknown host

In addition, a loopback address that cannot be connected from the outside may appear.

$ hostname -i
127.0.0.1

If the setting has not been configured or if a loopback address appears, use the following example as a reference to configure /etc/hosts. The host name and IP address, and the appropriate network interface card (NIC) differ depending on the environment.

  1. Check the host name and IP address.

$ hostname
GS_HOST
$ ip route | grep eth0 | cut -f 12 -d " " | tr -d "\n"
192.168.11.10

  2. As the root user, add the IP address and the corresponding host name that you checked to the /etc/hosts file.

192.168.11.10   GS_HOST

  3. Check that the settings have been configured correctly.

$ hostname -i
192.168.11.10

*If the displayed setting remains the same as before, it means that a setting higher in priority is given in the /etc/hosts file. Change the priority order appropriately.

Proceed to the next setting after you have confirmed that /etc/hosts has been configured correctly.
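
The check described above (the host name must resolve to a non-loopback address) can also be scripted. The sketch below is a rough Python equivalent and is not a GridDB tool; the function names are hypothetical, and resolving the host name through the standard socket module is assumed to approximate what "hostname -i" reports.

```python
import ipaddress
import socket


def is_usable_address(addr):
    """Return True if addr is a valid IP address that is not a loopback address."""
    if addr is None:
        return False
    try:
        return not ipaddress.ip_address(addr).is_loopback
    except ValueError:
        return False  # not a valid IP address string


def resolve_own_host():
    """Resolve this machine's host name to an IPv4 address, or None on failure."""
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return None  # roughly corresponds to "Unknown host"


if __name__ == "__main__":
    addr = resolve_own_host()
    if is_usable_address(addr):
        print(f"OK: host name resolves to {addr}")
    else:
        print(f"Check /etc/hosts: host name resolves to {addr}")
```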

(1) Address information serving as an interface with the client

The address information serving as an interface with the client is set in both the Node definition file and the Cluster definition file.

Node definition file

Parameter                    Data type  Meaning
/transaction/serviceAddress  string     Listening address for transactions
/transaction/servicePort     string     Listening port for transactions
/system/serviceAddress       string     Connection address for operational commands
/system/servicePort          string     Connection port for operational commands

The listening address and port for transactions are used by a client to request transactions from a GridDB cluster. Although this address is used when composing a cluster with a single node, it is not used explicitly when composing a cluster with multiple nodes using the API.

The connection address and port for operational commands are used to specify the destination of requests issued by the operational commands.

You do not have to define these listening / connection addresses unless you need to use more than one interface for different purposes.

Cluster definition file

Parameter                         Data type  Meaning
/transaction/notificationAddress  string     Interface address between a client and a cluster
/transaction/notificationPort     string     Interface port between a client and a cluster

A multicast address and port are specified for the interface between a client and a cluster. They are used by a GridDB cluster to send cluster information to its clients and by the clients to send processing requests via the API to the cluster. See the description of the GridStoreFactory class/method in (GridDB_API_Reference.html) for details.

(2) Address information for cluster administration and processing

The address information that the cluster uses to perform cluster administration and processing autonomously is set in both the Node definition file and the Cluster definition file. These addresses are used internally by GridDB to exchange the heartbeat (live check among clusters) and other information among the clusters. You do not need to change these settings unless the addresses conflict with other systems on the same network or the machine uses multiple network interface cards.

Node definition file

Parameter                Data type  Meaning
/cluster/serviceAddress  string     Listening address for cluster management
/cluster/servicePort     string     Listening port for cluster management

Cluster definition file

Parameter                     Data type  Meaning
/cluster/notificationAddress  string     Multicast address for cluster management
/cluster/notificationPort     string     Multicast port for cluster management

  • Although a synchronization process is carried out with a replica when the cluster configuration is changed, a timeout time can be set for the process.
    • /sync/timeoutInterval

[Note]

  • Specify addresses and ports that are not used by anything other than GridDB.
  • GridDB operates even if the same address is set for /transaction/serviceAddress, /system/serviceAddress, and /cluster/serviceAddress in the node definition file gs_node.json. If a machine has multiple network interfaces, the bandwidth can be increased by assigning a separate address to each respective interface.
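
Parameter names such as /cluster/serviceAddress can be read as paths into the JSON definition files. The Python sketch below builds a node definition fragment from such paths. It assumes that each path segment corresponds to a nested JSON object key, which should be confirmed against the gs_node.json template in your installation; the helper name set_param is hypothetical, and the address is the sample value used earlier in this section.

```python
import json


def set_param(config, path, value):
    """Set a value in a nested dict using a '/section/name' style path."""
    keys = path.strip("/").split("/")
    node = config
    for key in keys[:-1]:
        node = node.setdefault(key, {})  # create intermediate objects as needed
    node[keys[-1]] = value


# Assign the same (sample) address to all three service addresses.
node_config = {}
for param in ("/transaction/serviceAddress",
              "/system/serviceAddress",
              "/cluster/serviceAddress"):
    set_param(node_config, param, "192.168.11.10")

print(json.dumps(node_config, indent=4))
```

In practice you would edit the existing template rather than generate a new file, so that parameters not shown here keep their values.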

2.3.2 Setting the cluster name (mandatory)

Set the name of the cluster to be composed by the target nodes in advance. The name set will be checked to see if it matches the value specified in the command to compose the cluster. This prevents a node from being joined to the wrong cluster when the command is specified incorrectly.

The cluster name is specified with the following setting in the Cluster definition file.

Cluster definition file

Parameter             Data type  Meaning
/cluster/clusterName  string     Name of cluster to create

[Note]

  • A node fails to start with the default value ("").
  • A unique name on the sub-network is recommended.
  • A cluster name is a string composed of 1 or more ASCII alphanumeric characters and the underscore “_”. However, the first character cannot be a number. The name is also not case-sensitive. In addition, it has to be specified within 64 characters.
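
The cluster name rules in the note above can be expressed as a small check. This Python sketch is illustrative only and is not part of GridDB; the function names are hypothetical.

```python
import re

# First character: ASCII letter or underscore (not a digit).
# Remaining characters: ASCII letters, digits, or underscores.
_CLUSTER_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_cluster_name(name):
    """Return True if name satisfies the character and length rules above."""
    return len(name) <= 64 and _CLUSTER_NAME.fullmatch(name) is not None


def same_cluster_name(a, b):
    """Cluster names are not case-sensitive, so compare them case-insensitively."""
    return a.lower() == b.lower()


for candidate in ("setup_cluster_name", "", "1cluster", "my-cluster"):
    print(repr(candidate), is_valid_cluster_name(candidate))
```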

2.4 Configure tuning parameters.

The main tuning parameters are described here. These parameters are not mandatory but affect the processing performance of the cluster.

2.4.1 Configuring Tuning Parameters

GridDB creates a transaction log file and a checkpoint file for persistence. Because writing data to these files affects update performance, you can change how they are written by specifying the parameters below. The trade-off is that the probability of losing data in the event of a failure may become higher.

Below are the relevant parameters.

Node definition file

Parameter                   Data type  Meaning
/dataStore/persistencyMode  string     Persistence mode
/dataStore/logWriteMode     int        Log write mode

The persistence mode specifies whether to write to files at the time of updating data. The log write mode specifies the timing of writing to a transaction log file.

The following values are available for the persistence mode.

  • "NORMAL"
  • "KEEP_ALL_LOGS"

"NORMAL" indicates writing to a transaction log file and a checkpoint file at every update. Transaction log files no longer required due to a particular checkpoint are removed. "KEEP_ALL_LOGS" indicates writing to files at the same timing as in "NORMAL" but leaving all transaction log files. The default value is "NORMAL".

The following values are available for the log write mode.

  • 0: SYNC
  • 1 or larger integer: DELAYED_SYNC

"SYNC" indicates writing to a log file at every commit or abort of an update transaction. "DELAYED_SYNC" indicates writing to a log file at intervals of the specified number of seconds, irrespective of update timing. The default value is "1 (DELAYED_SYNC 1 second)."
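
The mapping between the integer value of the log write mode and its meaning can be sketched as follows. This Python helper is hypothetical and only restates the rule above; rejecting negative values is an assumption of this sketch, since the guide does not say how they are handled.

```python
def describe_log_write_mode(value):
    """Translate a /dataStore/logWriteMode value into the mode it selects."""
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        # Assumption: only non-negative integers are meaningful here.
        raise ValueError("log write mode must be an integer of 0 or larger")
    if value == 0:
        return "SYNC"
    return f"DELAYED_SYNC ({value} s)"


for mode in (0, 1, 10):
    print(mode, "->", describe_log_write_mode(mode))
```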

2.4.2 Parameters Related to Performance and Availability

GridDB can improve search performance and availability by storing replicate data in multiple nodes of a cluster. Because replicating data affects update performance, you can change replication behaviors by specifying the parameters below. The trade-off is that the probability of losing data in the event of a failure may become higher.

Below are the relevant parameters.

Cluster definition file

Parameter                     Data type  Meaning
/transaction/replicationMode  int        Replication mode

The replication mode indicates the method of replication. This mode must be shared by all nodes in a cluster.

  • "0": Asynchronous replication
  • "1": Semi-synchronous replication

"Asynchronous replication" performs replication asynchronously with the timing of an update transaction. "Semi-synchronous replication" performs replication synchronously with the timing of an update transaction, but does not wait for completion of replication. The default is "0".

2.4.3 Other Parameters

Other parameters are described below. Refer to the list of parameters in the annex for the default values.

Node definition file

Parameter                      Data type  Meaning
/dataStore/dbPath              string     Directory storing database files
/dataStore/storeMemoryLimit    string     Memory buffer size
/dataStore/concurrency         int        Concurrency level
/dataStore/affinityGroupSize   int        Number of data affinity groups
/checkpoint/checkpointInterval int        Checkpoint interval (in seconds)
/system/eventLogPath           string     Event log file output directory
/transaction/connectionLimit   int        Upper limit of connections
/trace/category                string     Event log output level

  • The database file directory is a directory storing transaction log files and checkpoint files which are created to make in-memory data persistent.
  • The memory buffer size is the memory size used for data management. Specify it as a string with the unit attached (example: "2048MB").
  • The concurrency level is the upper limit on the number of concurrent I/Os to secondary storage in GridDB.
  • The number of data affinity groups is the number of groups used when related data is collected and its layout is managed.
    • A value from 1 to 64 can be specified for the number of groups. Note that the larger the number of groups, the lower the memory operating efficiency will be.
  • The checkpoint interval is the interval at which checkpoint operations (related to data persistence) are performed internally and periodically.
  • The event log output directory is a directory storing messages about events, such as an Exception occurring in a node (event message files).
  • As a guide, set the upper limit of connections to at least twice the number of expected clients.
  • The event log output level is the output level for each category of the event log.
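
Two of the guidelines above are numeric and easy to check in a script. The Python sketch below is a hypothetical helper, not a GridDB tool: it checks the stated range for the number of data affinity groups and derives the suggested minimum for the upper limit of connections.

```python
def is_valid_affinity_group_size(size):
    """The number of data affinity groups must be between 1 and 64."""
    return 1 <= size <= 64


def minimum_connection_limit(expected_clients):
    """Guideline above: at least twice the number of expected clients."""
    return 2 * expected_clients


# Hypothetical deployment with 150 expected clients.
print(is_valid_affinity_group_size(4))
print(minimum_connection_limit(150))
```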

2.5 Distribute the definition file to each node

Among the definition files, the user definition file and cluster definition file need to have the same settings in all the nodes composing a GridDB cluster.

As a result, when composing a cluster with 2 or more nodes, follow the procedure below to set all the nodes. (When composing a cluster with a single node, the settings of the node and cluster are completed with the procedure so far.)

  1. Set up the administrator user and configure the environment-dependent parameters on one of the machines on which a node is installed.
  2. Copy the Cluster definition file and User definition file to the definition file directory of each of the other nodes, overwriting the existing files.
  3. Copy the Node definition file as well when configuring settings common to all the nodes.
  4. Configure settings that differ among the nodes separately (network environment settings, etc.).

2.6 Installing and Setting Up GridDB (Client)

This section shows how to install client libraries.

2.6.1 Confirming the Environment

Operation has been confirmed on CentOS 6.7.

$ lsb_release -id
Distributor ID: CentOS
Description:    CentOS release 6.7 (Final)

[Note]

  • When installing the OS, select at least the following option under Package Group Selection.
    • Software Development WorkStation

We have confirmed the operation on Oracle Java 7 as a Java development environment.

$ java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)

2.6.2 Installing a client library

The client library is built by running 'make' as described in 2.2.2 Installing a Node.

2.6.3 Confirmation After Installation

The file below is created when the installation is completed normally.

$GS_HOME/bin/gridstore.jar              # Java libraries

2.6.4 Setting Up Libraries

If you use a Java-based client, add the client library path to CLASSPATH.

$ export CLASSPATH=${CLASSPATH}:$GS_HOME/bin/gridstore.jar

2.6.5 Setting Up a Client

There is no definition file for setting up a client. Specify the connection point and user/password in the client program.

For details on the NoSQL specifications, refer to "GridDB API Reference" (GridDB_API_Reference.html).

3 Operations

This chapter shows the operational procedures for GridDB.

The following cases are covered:

  • Operations from starting to stopping

The following commands are available for operations.

[Command list]

Command          Function
gs_startnode     Starts a node.
gs_joincluster   Creates a cluster / joins a node to a cluster.
gs_stopcluster   Stops a cluster (makes a cluster stop working).
gs_stopnode      Stops (shuts down) a node.
gs_leavecluster  Isolates a node from a cluster.
gs_stat          Obtains internal information of a node.

[Points to note when using operating commands]

  • If the proxy environment variable "http_proxy" is defined, set the addresses of nodes in "no_proxy" to specify that the proxy should not be consulted for those addresses; otherwise, a REST/HTTP communication invoked by an operational command will be wrongly sent to the proxy server and the command will not work.
  • In the case of a command that has the option "CONNECTION_SERVER:PORT," you do not have to specify this option unless you have changed the setting of a port number from the default. If you specify the option "CONNECTION_SERVER:PORT," you can execute the command on a computer other than the computer on which you run a node.
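
The first point above amounts to adding every node address to the "no_proxy" list. The Python sketch below shows one way to build that value; the function name is hypothetical, the node addresses are sample values, and a comma-separated list is assumed as the format of "no_proxy".

```python
def build_no_proxy(current, node_addresses):
    """Append node addresses to an existing comma-separated no_proxy value."""
    entries = [e.strip() for e in current.split(",") if e.strip()]
    for addr in node_addresses:
        if addr not in entries:  # avoid duplicate entries
            entries.append(addr)
    return ",".join(entries)


# Sample node addresses; replace them with the addresses of your nodes.
value = build_no_proxy("localhost", ["192.168.10.11", "192.168.10.12", "192.168.10.13"])
print(f'export no_proxy="{value}"')
```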

The following sections show how to use the operational commands.

3.1 Operations from Starting to Stopping

3.1.1 Basic Flow

The flow of regular operations from starting to stopping a GridDB cluster, after a GridDB node has been installed and set up, is shown below.

  1. Start each node.
  2. Configure a cluster.
  3. Use GridDB services.
  4. Stop the cluster.
  5. Stop each node.

[Usage note]

  • The instructions shown below presuppose that the operations administrator is aware of the hostnames (or addresses) of all machines running nodes.
  • They also presuppose that the administrator is aware of the number of nodes participating in a cluster.
  • User “admin” and password “admin” are used as examples in the user authentication option (-u).

3.1.2 Starting Each Node

Execute the "gs_startnode" command to start a node on a machine on which to run the node. You need to execute this command for each node.

Use the command below to start a node.

  • gs_startnode

The node is started using the settings in the node definition file, cluster definition file and user definition file under the conf directory of the GridDB home directory. A command execution example is shown below.

[Example of command execution]

$ gs_startnode

You need to start a node on each machine constituting a cluster.

[Note]

  • In a cluster configuration, all participating nodes must have the same definitions in their Cluster definition files.
  • All nodes must also have the same definitions in their User definition files.

3.1.3 Configuring a Cluster

Join the started node to a cluster to constitute a cluster. This operation is necessary even if you run GridDB on a single node (not on multiple nodes of a cluster).

To join a node to a cluster, execute the "gs_joincluster" command as below:

  • gs_joincluster [-s CONNECTION_SERVER:PORT] -n|--nodeNum NUM_OF_NODES -c|--clusterName CLUSTER_NAME -u USERNAME/PASSWORD

Specify "CLUSTER_NAME" and "NUM_OF_NODES" as options.

Specify the number of nodes constituting a GridDB cluster for "NUM_OF_NODES." This value is used as a threshold in various services when starting GridDB for the first time.

The example below executes the command on the computer on which a node runs. It creates a cluster with the cluster name “setup_cluster_name” and “1” as the number of nodes constituting the cluster.

[Example of command execution]

$ gs_joincluster -c setup_cluster_name -n 1 -u admin/admin

The example below executes the command on a computer other than the one on which the node runs. It joins the node running on the computer with the address "192.168.10.11" to a cluster named "example_three_nodes_cluster," which initially consists of "3" nodes.

[Example of command execution]

$ gs_joincluster -s 192.168.10.11:10040 -c example_three_nodes_cluster -n 3 -u admin/admin

A cluster is composed by executing the command with the correct cluster name for each of the 3 machines that make up the cluster. Cluster service will start when the number of nodes participating in a cluster is equal to the number of nodes constituting the cluster. Once service is started, you will be able to access the cluster from the application.

This command returns control immediately after its request is received. Since a connection from the application may fail before the cluster is constituted, specify the -w option for the last node that composes the cluster and wait for the cluster constitution to be completed.

An example of composing a cluster with 3 nodes by executing the command in the same way for the other 2 machines is shown below.

[Example of command execution]

$ gs_joincluster -s 192.168.10.12:10040 -c example_three_nodes_cluster -n 3 -u admin/admin
$ gs_joincluster -s 192.168.10.13:10040 -c example_three_nodes_cluster -n 3 -u admin/admin -w
...
Joined node

[Note]

  • Specify 1 for the number of nodes constituting a cluster in a single node configuration.
  • If the cluster participation command ends in an error, it means that there is a discrepancy in the cluster definition file of the node. Check the cluster definition file again and adopt the same definition.
  • The cluster service will not start when the number of nodes participating in a cluster does not reach the number of nodes constituting the cluster. When service is not started, check whether the number of nodes is correct.
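
The join procedure for a multi-node cluster can be scripted. The Python sketch below only builds and prints the command lines; it does not run them. The function name is hypothetical, the addresses are the sample addresses from the examples above, and port 10040 is taken from those examples rather than verified as a default. As described above, -w is added only for the last node.

```python
import shlex


def join_commands(node_addresses, cluster_name, user, password, port=10040):
    """Build one gs_joincluster command per node; only the last one waits (-w)."""
    num_nodes = len(node_addresses)
    commands = []
    for i, addr in enumerate(node_addresses):
        argv = ["gs_joincluster",
                "-s", f"{addr}:{port}",
                "-c", cluster_name,
                "-n", str(num_nodes),
                "-u", f"{user}/{password}"]
        if i == num_nodes - 1:
            argv.append("-w")  # wait for the cluster to be constituted
        commands.append(argv)
    return commands


nodes = ["192.168.10.11", "192.168.10.12", "192.168.10.13"]
for argv in join_commands(nodes, "example_three_nodes_cluster", "admin", "admin"):
    print(shlex.join(argv))
```

Deriving the -n value from the length of the address list keeps the number of nodes consistent across all commands, which avoids the mismatch described in the note above.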

Separate the nodes from the cluster if a wrong number of nodes constituting a cluster is specified. Execute the following cluster separation command.

  • gs_leavecluster [-s CONNECTION_SERVER:PORT] -u USERNAME/PASSWORD

An example of executing the command on the machine on which the node to be separated from the cluster is running is shown below.

[Example of command execution]

$ gs_leavecluster -u admin/admin

[Note]

  • If this command is used for the purpose of stopping the cluster, there is a possibility that the data may no longer be viewable after the cluster comes into operation again.
  • If the cluster is already in operation, use the cluster stop command (gs_stopcluster).

3.1.4 Using a Service

After configuring a cluster, you can use data storage and search services in GridDB from a client program, using a registered user account.

For details on creating a client program, see "GridDB API Reference" (GridDB_API_Reference.html).

3.1.5 Stopping a Cluster

Stop a GridDB cluster. To stop each node, you need to first stop the GridDB cluster administration process, and then stop the nodes one by one.

First, stop the cluster administration process. To do so, execute the "gs_stopcluster" command. Execute the following command in one of the nodes participating in the cluster.

  • gs_stopcluster [-s CONNECTION_SERVER:PORT] -u USERNAME/PASSWORD

Below is shown an example of executing the command on a computer on which a node of the cluster to be stopped runs.

[Example of command execution]

$ gs_stopcluster -u admin/admin

After the command is executed, all the nodes participating in the cluster will stop their data storage and search services.

Then, stop (shut down) nodes. To do so, execute the "gs_stopnode" command as below:

  • gs_stopnode [-w [WAIT_TIME]] [-s CONNECTION_SERVER:PORT] [-f|--force] -u USERNAME/PASSWORD

Below is shown an example of executing the "gs_stopnode" command on a computer on which a node runs.

[Example of command execution]

$ gs_stopnode -w -u admin/admin

After the "gs_stopnode" command is executed, it might take a while for checkpoint operations (writing data in memory to files) to finish before the process actually stops. We recommend that you wait for the command to end by specifying the -w option.
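
The two-step stop procedure above can be sketched as a shell function. This is a minimal sketch under stated assumptions: stop_cluster_and_nodes is a hypothetical name, and GS_RUN is an assumed variable introduced here for illustration (set GS_RUN=echo to print the commands instead of running them). It uses only the gs_stopcluster and gs_stopnode options shown in this section.

```shell
#!/bin/sh
# Hypothetical helper (not a GridDB tool): stop the cluster, then shut down
# every node. GS_RUN is an assumed dry-run hook: with GS_RUN=echo the
# commands are printed instead of executed.
# Usage: stop_cluster_and_nodes USERNAME/PASSWORD HOST:PORT [HOST:PORT ...]
stop_cluster_and_nodes() {
    # At least the credentials and one node address are required.
    [ "$#" -ge 2 ] || return 2
    credentials=$1
    shift
    # Step 1: stop the cluster by sending the request to one participating node.
    ${GS_RUN:-} gs_stopcluster -s "$1" -u "$credentials" || return 1
    # Step 2: stop the nodes one by one; -w waits for each command to end.
    for server in "$@"; do
        ${GS_RUN:-} gs_stopnode -w -s "$server" -u "$credentials" || return 1
    done
}
```

Running it first with GS_RUN=echo lets you check the exact commands and their order before stopping a real cluster.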

3.1.6 Restarting a Stopped Cluster

After shutting down a GridDB cluster, you can restart it by following the same procedure as for normal startup, as follows:

  • Confirm beforehand the number of participant nodes at the time of shutdown.
  • Start node(s).
  • Join node(s) to the cluster specifying the number of nodes at the time of shutdown.

Below is shown an example of restarting a single-node cluster.

[Example of command execution]

$ gs_startnode
...
$ gs_joincluster -c setup_cluster_name -n 1 -u admin/admin
...
  • For setup_cluster_name, specify the cluster name set in the cluster definition file.
  • Specify 1 for the number of nodes constituting a cluster in a single node configuration. For a multiple unit configuration, specify the number of nodes at the shutdown point.
  • The number of nodes participating in the cluster is output to the event log file at the shutdown point.

When you restart a GridDB cluster, it reads the database files (transaction log files and checkpoint files) to restore the state at the time of shutdown. It starts services once as many nodes as specified by "NUM_OF_NODES" have joined the cluster.

[Note]

  • You must correctly specify the number of nodes at the time of shutdown for "NUM_OF_NODES". If you specify a number less than the value of "NUM_OF_INITIAL_NODES" specified when initially configuring the cluster, the cluster will not start any services. If no service is started, make sure that you specified the correct number of nodes.
  • If the wrong “Number of nodes constituting a cluster” is specified, separate the nodes from the cluster with a cluster separation command when the cluster is not in operation and specify the right “Number of nodes constituting a cluster” again before letting the nodes participate in the cluster.
  • If the wrong “Number of nodes constituting a cluster” is specified, there is a possibility of starting service in the wrong state when the cluster goes into operation. In this case, carry out the procedure to stop the cluster and then perform the restart procedure.
  • If the number of nodes changed after shutdown owing to a machine failure etc. (decreased after shutdown), go through the restarting procedure specifying the number of nodes that can be restarted. Data will then be reallocated as in the case of a failure occurring in operations. However, if the number of nodes decreases considerably, you might fail to access data.
  • You can change the IP addresses and port numbers of machines already participating in the cluster (/xxx/serviceAddress and /xxx/servicePort in the node definition file).

3.2 Obtaining Various Information

3.2.1 Obtaining Cluster Information

Obtain cluster information (cluster configuration information and internal information). To do so, execute the "gs_stat" command as below:

  • gs_stat [-s CONNECTION_SERVER:PORT] -u USERNAME/PASSWORD

Below is shown an example of executing the command on a computer on which a node runs.

[Example of command execution]

$ gs_stat -u admin/admin
{
                :
                :
    "cluster": {
        "activeCount": 3,
        "clusterName": "defaultCluster",
        "clusterStatus": "MASTER",
                :
                :
}

The cluster status (clusterStatus) takes one of the following values:

  • MASTER : Master
  • SUB_MASTER : Master candidate when there is a master failure
  • FOLLOWER : Follower
  • SUB_FOLLOWER : Follower candidate when there is a master failure
  • SUB_CLUSTER : Cluster is not in operation

The system status (nodeStatus) takes one of the following values:

  • INACTIVE : The node is down.
  • ACTIVATING : The node is starting.
  • ACTIVE : The node is running.
  • DEACTIVATING : The node is stopping.
  • ABNORMAL : The node has stopped abnormally.
  • NORMAL_SHUTDOWN : The node is stopping normally.

See Parameter List for the descriptions of the other items.
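
The status check above can be scripted. The following is a minimal sketch, not part of GridDB: json_string_field and cluster_not_in_operation are hypothetical names, and the sed-based extraction assumes that the field appears on its own line in the form shown in the example output above. It is not a general JSON parser and only handles string values.

```shell
#!/bin/sh
# Hypothetical helper (not a GridDB tool): read gs_stat output on stdin and
# print the value of the first string field with the given name.
# Assumes one '"name": "value"' pair per line, as in the example output.
json_string_field() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeeds when the given clusterStatus value is SUB_CLUSTER,
# that is, when the cluster is not in operation.
cluster_not_in_operation() {
    [ "$1" = "SUB_CLUSTER" ]
}
```

A typical use would be to pipe the command output into the helper, for example gs_stat -u admin/admin | json_string_field clusterStatus, and pass the result to cluster_not_in_operation before letting an application connect.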

4 Notice

The following notices apply only to the Community Edition.

  • The compression function is not supported.
  • Only very simple user authentication is supported.
  • Only one database, called "public", which all registered users can access, is supported.
  • The default build configuration disables the trigger function. To enable the trigger function, add the following option when building.
$ ./configure --enable-activemq

5 Annex

5.1 Parameter List

The parameters in the node definition file and cluster definition file of GridDB are listed below.

5.1.1 Node definition file (gs_node.json)

Parameter | Data type | Meaning | Default
/dataStore/dbPath | string | Directory storing database files | "data"
/dataStore/storeMemoryLimit | string | Memory buffer size | "1024MB"
/dataStore/concurrency | int | Concurrency level | 4
/dataStore/logWriteMode | int | Log write mode | 1
/dataStore/persistencyMode | string | Persistence mode | "NORMAL"
/dataStore/affinityGroupSize | int | Number of data affinity groups | 4
/checkpoint/checkpointInterval | string | Checkpoint execution interval | "60s"
/checkpoint/checkpointMemoryLimit | string | Checkpoint memory buffer size | "1024MB"
/checkpoint/useParallelMode | boolean | Checkpoint parallel operation (false: invalid, true: valid) | false
/cluster/serviceAddress | string | Listening address for cluster management | "127.0.0.1"
/cluster/servicePort | int | Listening port for cluster management | 10010
/sync/serviceAddress | string | Reception address used in data synchronization | "127.0.0.1"
/sync/servicePort | int | Reception port used in data synchronization | 10020
/system/serviceAddress | string | Connection address of operational command | "127.0.0.1"
/system/servicePort | int | Connection port of operational command | 10040
/system/eventLogPath | string | Event log file output directory | "log"
/transaction/serviceAddress | string | Reception address of transaction process | "127.0.0.1"
/transaction/servicePort | int | Reception port of transaction process | 10001
/transaction/connectionLimit | int | Upper limit of connections | 5000
/trace/default | string | Event log output level | "LEVEL_ERROR"
/trace/dataStore | string | | "LEVEL_ERROR"
/trace/collection | string | | "LEVEL_ERROR"
/trace/timeSeries | string | | "LEVEL_ERROR"
/trace/chunkManager | string | | "LEVEL_ERROR"
/trace/objectManager | string | | "LEVEL_INFO"
/trace/checkpointFile | string | | "LEVEL_ERROR"
/trace/checkpointService | string | | "LEVEL_INFO"
/trace/logManager | string | | "LEVEL_WARNING"
/trace/clusterOperation | string | | "LEVEL_INFO"
/trace/clusterService | string | | "LEVEL_ERROR"
/trace/syncService | string | | "LEVEL_ERROR"
/trace/systemService | string | | "LEVEL_INFO"
/trace/transactionManager | string | | "LEVEL_ERROR"
/trace/transactionService | string | | "LEVEL_ERROR"
/trace/transactionTimeout | string | | "LEVEL_WARNING"
/trace/sessionTimeout | string | | "LEVEL_WARNING"
/trace/replicationTimeout | string | | "LEVEL_WARNING"
/trace/recoveryManager | string | | "LEVEL_INFO"
/trace/eventEngine | string | | "LEVEL_WARNING"
/trace/triggerService | string | | "LEVEL_ERROR"

5.1.2 Cluster definition file (gs_cluster.json)

Parameter | Data type | Meaning | Default
/dataStore/partitionNum | int | Number of partitions | 128
/dataStore/storeBlockSize | string | Block size ("64KB", "1MB") | "64KB"
/cluster/clusterName | string | Cluster name | ""
/cluster/replicationNum | int | Number of replicas | 2
/cluster/notificationAddress | string | Multicast address for cluster management | "239.0.0.1"
/cluster/notificationPort | int | Multicast port for cluster management | 20000
/cluster/notificationInterval | string | Multi-cast interval for cluster administration | "5s"
/cluster/heartbeatInterval | string | Heart beat interval | "5s"
/cluster/loadbalanceCheckInterval | string | Load balance check interval | "180s"
/sync/timeoutInterval | string | Short-term synchronization timeout time | "30s"
/transaction/notificationAddress | string | Multicast address to clients | "239.0.0.1"
/transaction/notificationPort | int | Multicast port to clients | 31999
/transaction/notificationInterval | string | Multi-cast interval to client | "5s"
/transaction/replicationTimeoutInterval | string | Replication/timeout time | "10s"
/transaction/replicationMode | int | Replication method (0: non-synchronous, 1: quasi-synchronous) | 0

5.2 Build/execution method

An example of how to build and execute a program is shown below.

[Note]

  • The user and password in the sample program need to be changed appropriately.

[For NoSQL DB]

For Java

  1. Set the environment variables
  2. Copy the sample program to the gsSample directory
  3. Build
  4. Run
$ export CLASSPATH=${CLASSPATH}:$GS_HOME/bin/gridstore.jar
$ mkdir gsSample
$ cp $GS_HOME/docs/sample/program/Sample1.java gsSample/.
$ javac gsSample/Sample1.java
$ java gsSample/Sample1 239.0.0.1 31999 setup_cluster_name admin your_password

6 Trademark

  • GridDB is a trademark of Toshiba Digital Solutions Corporation.
  • Oracle and Java are registered trademarks of Oracle and/or its affiliates.
  • Linux is a trademark of Linus Torvalds.
  • Other product names are trademarks or registered trademarks of the respective owners.

Copyright (C) 2017 TOSHIBA Digital Solutions Corporation