For an on-premises production installation, you will need three physical or virtual machines with the following specifications:
- CPU: 4 x 2.0 GHz or higher
- 300 GB direct attached storage for data. SSD preferred.
Dedicated Cassandra Servers
It is advised not to run other applications alongside Cassandra. However, if you choose to do so, you will need to adjust the specifications to accommodate the application.
- 24 GB direct attached storage for the commit log. SSD preferred and on a different filesystem than the data.
- Solid State Drives (SSD) are preferred for production clusters. However, if hard disk drives are used:
- Use RAID-0 or JBOD rather than RAID-1 or RAID-5 if using a server with multiple disk drives.
- Place the data, saved_caches, and hints directories on one disk (or set of disk drives) and the commit log on another.
- Avoid using NFS or a SAN for data directories. Shared storage is a single point of failure and can, in some instances, cause performance issues.
- Keep at least 50% of the space free on the disks used to store Cassandra data. Cassandra's background processes, such as compaction, need that headroom to run.
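The free-space guideline above can be spot-checked with a short script. The following is a minimal sketch using only the Python standard library; the function names are illustrative and are not part of any Kinetic or Cassandra tooling.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_enough_headroom(path, minimum=0.50):
    """True if at least `minimum` of the filesystem holding `path` is still free."""
    return free_fraction(path) >= minimum
```

For example, calling `has_enough_headroom("/cassandra")` on a node would report whether that filesystem still meets the 50% guideline, assuming your data lives under that path.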
Kinetic supports running Cassandra on one of the following Linux distributions:
- Ubuntu 16.04 or higher
- Debian 8 and 9
- CentOS and Red Hat Enterprise Linux (RHEL), versions 6.6 through 7.7
On all Linux distributions, you should:
- Disable swap
- Set the maximum number of open files to 100000
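Both settings can be spot-checked from Python. This is a rough sketch, assuming a Linux host where active swap areas are listed in `/proc/swaps`; the helper names are hypothetical. The open-files limit is reported for the process running the script, so run it as the user that will run Cassandra.

```python
import resource


def parse_swap_devices(proc_swaps_text):
    """Return the swap device names listed in the text of /proc/swaps."""
    lines = proc_swaps_text.strip().splitlines()
    # The first line is the column header; each later line is an active swap area.
    return [line.split()[0] for line in lines[1:] if line.strip()]


def active_swap_devices(path="/proc/swaps"):
    """Read /proc/swaps (Linux only); an empty list means swap is disabled."""
    with open(path) as handle:
        return parse_swap_devices(handle.read())


def open_files_limit_ok(required=100000):
    """True if this process's soft open-files limit is at least `required`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= required
```

An empty result from `active_swap_devices()` together with `open_files_limit_ok()` returning `True` suggests the host matches both recommendations.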
It is important not to run Cassandra on any of the less popular Linux distributions unless you conduct extensive testing.
Do not deploy Cassandra on older versions of a distribution unless you have previous experience with that version in a production environment.
It is strongly recommended to avoid running Cassandra on Windows hosts.
On each of the servers that will be running Cassandra, verify or install:
- Java 8: The latest version of Java 8, either Oracle Java Standard Edition 8 or OpenJDK 8. Verify that the correct version of Java is installed.
- Python 3: The latest version of Python 3 (currently 3.9). Verify that the correct version of Python is installed.
- Package Installer (optional):
- APT for Debian and Ubuntu systems
- YUM for RHEL systems
- Otherwise, a binary tarball installation can be used as outlined in Installation
- If you are installing Cassandra using a binary tarball:
- Create a user and group named cassandra
- Create a folder
- Make cassandra the owner of
- Create a user and group named cassandra
- Create an xfs filesystem named /cassandra with 300 GB allocated. If you are installing Cassandra using a binary tarball, this will also be your installation directory.
- Make cassandra:cassandra the owner of
- Create a folder named /cassandra/tmp if you prohibit applications from mounting
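The Java and Python prerequisites above can also be checked from a script. The sketch below is one possible approach, not an official procedure: it relies on `java -version` printing a banner with a quoted version string to stderr, and on Java 8 reporting itself as `1.8.x`. The function names are illustrative.

```python
import re
import subprocess
import sys


def parse_java_version(banner):
    """Extract the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', banner)
    return match.group(1) if match else None


def is_java_8(version):
    """Java 8 reports itself as 1.8.x, for example 1.8.0_292."""
    return version is not None and version.startswith("1.8.")


def installed_java_version():
    """Run `java -version`; the banner is written to stderr, not stdout."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None  # no `java` executable on the PATH
    return parse_java_version(result.stderr)


def is_python_3():
    """True if this interpreter is a Python 3 release."""
    return sys.version_info.major == 3
```

On a server, `is_java_8(installed_java_version())` and `is_python_3()` should both return `True` before you continue with the installation.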
Cassandra uses the following TCP ports for inbound and outbound traffic:
- If using internode encryption (recommended)
- Native protocol clients (CQL)
- Encrypted native protocol clients
- Yum and Apt
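Once a node is running, you can confirm that the ports it needs are reachable through your firewall. The following is a minimal sketch using the Python standard library. The port numbers in `DEFAULT_PORTS` are the values commonly found in a stock `cassandra.yaml` and are an assumption here; substitute whatever your configuration uses. The function names are illustrative.

```python
import socket

# Assumption: common cassandra.yaml defaults, not values confirmed for your cluster.
DEFAULT_PORTS = {
    7000: "internode communication",
    7001: "internode communication with encryption",
    9042: "native protocol clients (CQL)",
    9142: "encrypted native protocol clients",
}


def port_is_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=DEFAULT_PORTS):
    """Map each port number to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port) for port in ports}
```

Running `check_ports("10.0.0.11")` from another node (the address is a placeholder) gives a quick view of which ports are blocked between the two machines.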
For a production cluster, the minimum number of nodes required for high availability is three (3). A three-node cluster is sufficient for most Kinetic Data Platform workloads.
For initial development work, a one-node cluster is enough. For example, you can set up a one-node cluster on a desktop using products such as Oracle VirtualBox or Docker Desktop.
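One way to see why three nodes is the minimum for high availability is to work through the quorum arithmetic. The sketch below assumes each keyspace is replicated to every node and that clients use QUORUM consistency; neither setting is specified on this page, so treat the numbers as an illustration only.

```python
def quorum(replication_factor):
    """Replicas that must respond for a QUORUM read or write to succeed."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)
```

With three replicas, `quorum(3)` is 2 and `tolerated_failures(3)` is 1, so the cluster keeps serving requests with one node down. With one or two replicas, `tolerated_failures` is 0, which is why smaller clusters suit development only.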