Kinetic Community

Cassandra Configuration

Overview

This article describes how to protect Cassandra’s communication protocols: Thrift, CQL, Gossip, and JMX.

Cassandra Configuration

The primary Cassandra configuration file is located in the {cassandra_home}/conf directory:

  • cassandra.yaml: This file provides global configuration options. There are a lot of options in this file, but only a few typically require modification from their defaults.

Cassandra Security Options

This section describes how to protect Cassandra’s communication protocols: Thrift, CQL, Gossip, and JMX.

Securing the Cassandra API

There are two application protocols for Cassandra: Thrift and CQL. Thrift is the original protocol and is not used directly by Kinetic Request. CQL is now considered the native protocol, and it is what Kinetic Request uses. Both protocols support TLS for encryption.

By default, both APIs use an unencrypted connection and allow any process to connect and authenticate. To prevent unauthorized applications from directly accessing Cassandra, you can enable TLS.

The general steps for enabling TLS are described below:

  1. In the cassandra.yaml file on each Cassandra node: enable TLS by setting enabled to true under client_encryption_options. Require client authentication by setting require_client_auth to true. When client authentication is enabled, the truststore and truststore_password options must also be set. Finally, cipher_suites should be set to one or more cipher suites that are available to the JRE. If your environment is set up to allow only AES 256-bit strong encryption (TLS_RSA_WITH_AES_256_CBC_SHA), you will need to ensure the Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files are installed into the ${java.home}/jre/lib/security/ directory.

    An example of the required settings is shown below:

      client_encryption_options:
        enabled: true
        keystore: /path/to/.keystore.jks
        keystore_password: cassandra
        require_client_auth: true
        truststore: /path/to/.truststore.jks
        truststore_password: cassandra
        cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA]
    
  2. Create a certificate that will be used by Kinetic Request and add it to the Cassandra trust store on each node.

  3. In the Kinetic Request cassandra.properties file for each Kinetic Request instance: enable TLS by setting cassandra.ssl to true, and configure the trust store and key store settings. An example of these settings is shown below:

    cassandra.ssl=true
    cassandra.ssl.truststore=/path/to/.truststore.jks
    cassandra.ssl.truststorePassword=cassandra
    cassandra.ssl.keystore=/path/to/.keystore.jks
    cassandra.ssl.keystorePassword=cassandra
    
  4. Add the certificate created in step 2 to the key store for each Kinetic Request instance. This requires that the keystore and keystorePassword settings in each Kinetic Request cassandra.properties file are also set.

    More information about enabling TLS and creating certificates can be found in the Cassandra documentation such as the following:

- https://docs.datastax.com/en/cassandra/3.x/cassandra/configuration/secureSSLClientToNode.html
- https://docs.datastax.com/en/cassandra/3.x/cassandra/configuration/secureSSLCertificates.html

Authentication / Authorization Options

By default, Cassandra does not require users to log in when accessing the API. The following options describe how to set up authentication and authorization to require users to log in.

  • authenticator: The authentication backend. It implements IAuthenticator for identifying users. The available authenticators are:

    • AllowAllAuthenticator: Disables authentication; no checks are performed. This is the default value.
    • PasswordAuthenticator: Authenticates users with user names and hashed passwords stored in the system_auth.credentials table. If you use this setting, it is highly recommended to increase the replication_factor on the system_auth keyspace to equal the number of nodes in your cluster. This ensures the keyspace will be replicated and user credentials will be available on all nodes.
  • internode_authenticator: The internode authentication backend. It implements IInternodeAuthenticator to allow or disallow connections from peer nodes. The default is org.apache.cassandra.auth.AllowAllInternodeAuthenticator, which allows connections from all peer nodes.

  • authorizer: The authorization backend. It implements IAuthorizer to limit access and provide permissions. The available authorizers are:
    • AllowAllAuthorizer: Disables authorization; allows any action to any user. This is the default value.
    • CassandraAuthorizer: Stores permissions in the system_auth.permissions table. If you use this setting, it is highly recommended to increase the replication_factor on the system_auth keyspace to equal the number of nodes in your cluster. This ensures the keyspace will be replicated and permissions will be available on all nodes.
  • permissions_validity_in_ms: How long permissions in cache remain valid. Depending on the authorizer, such as CassandraAuthorizer, fetching permissions can be resource intensive. This setting is disabled when set to 0 or when AllowAllAuthorizer is set. The default is 2000ms. It is recommended to increase this value if permissions do not frequently change in your environment.
  • permissions_update_interval_in_ms: Refresh interval for the permissions cache (if enabled). After this interval, cache entries become eligible for refresh. On the next access, an asynchronous reload is scheduled and the old value is returned until it completes. If permissions_validity_in_ms is non-zero, then this property must also be non-zero. The default is 2000ms. It is recommended to increase this value if permissions do not frequently change in your environment.
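
The options above might be combined in cassandra.yaml roughly as follows. This is a minimal sketch rather than a recommended configuration: the 60000 ms cache values are illustrative assumptions for an environment where permissions rarely change, and the rest of the file is assumed to keep its defaults.

```yaml
# cassandra.yaml (fragment): require users to log in and enforce permissions.
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer

# Illustrative values only: cache permissions for 60 seconds instead of the
# 2000 ms default, and refresh cached entries on the same interval.
permissions_validity_in_ms: 60000
permissions_update_interval_in_ms: 60000
```

As noted above, the replication_factor of the system_auth keyspace should also be raised to match the number of nodes, so that credentials and permissions are available on every node.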

Securing the Cassandra Gossip API

Gossip is the inter-node communication protocol used by Cassandra to replicate data, coordinate schema changes, and perform other activities. The protocol can be configured to use TLS for encryption.

In a multi-node cluster, each Cassandra node communicates with peer nodes using the Gossip protocol. For non-encrypted connections, the Gossip protocol uses a TCP port defined by the following cassandra.yaml option:

storage_port: 7000

When SSL is enabled for the Gossip protocol, the following cassandra.yaml file option defines the port number used:

ssl_storage_port: 7001

All nodes in a cluster should be configured to use the same storage_port and ssl_storage_port. To prevent eavesdropping or unauthorized disruptions, the Gossip protocol should be secured in production environments. However, because the protocol is used for high-performance operations such as replicating data between nodes, encryption is recommended only for communication between remote locations; co-located nodes are better protected at the network level.

For co-located nodes, the easiest way to secure the Gossip API is to deploy all Cassandra nodes on the same subnet and disallow access to the Gossip port from outside the subnet.

In large Cassandra deployments where multiple "racks" or "data centers" are deployed, each having some number of Cassandra nodes, the Gossip protocol can be secured for cross-rack or cross-data-center communication. This is done with the following options in the cassandra.yaml file:

server_encryption_options:
    internode_encryption: rack
    keystore: /path/to/.keystore.jks
    keystore_password: cassandra
    truststore: /path/to/.truststore.jks
    truststore_password: cassandra

Internode encryption (over the Gossip API) is enabled or disabled by the setting of the internode_encryption option. The following options are recognized:

  • none: This disables all inter-node encryption, meaning Cassandra nodes use unencrypted communication using the defined storage_port.

  • all: This enables encryption for all inter-node communication using the defined ssl_storage_port.

  • rack: This uses non-encrypted communication for nodes defined to be in the same rack (cabinet) and encrypted communication between nodes defined to be in different racks.

  • dc: This uses non-encrypted communication for nodes defined to be in the same data center and encrypted communication between nodes defined to be in different data centers.

When any encryption is enabled for the Gossip protocol, all authentication, key exchange, and data transfer occur over TLS v1 using either 1024-bit or 2048-bit RSA keys. The cipher suite used is either TLS_RSA_WITH_AES_128_CBC_SHA or TLS_RSA_WITH_AES_256_CBC_SHA. This requires that key store and trust store files are defined and initialized. These files are password-protected using the keystore_password and truststore_password options. Instructions for creating these files are publicly available, for example at http://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html#CreateKeystore.
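
As a rough illustration of the settings discussed above, the following fragment encrypts only traffic that crosses data centers and pins the cipher suite explicitly. It is a sketch under assumptions: it uses the server_encryption_options block name found in Cassandra 2.x and 3.x, the cipher_suites and require_client_auth sub-options are assumed to be supported by your Cassandra version, and the paths and passwords are placeholders.

```yaml
# cassandra.yaml (fragment): encrypt Gossip traffic between data centers only.
server_encryption_options:
    internode_encryption: dc
    keystore: /path/to/.keystore.jks
    keystore_password: cassandra        # placeholder; use a strong password
    truststore: /path/to/.truststore.jks
    truststore_password: cassandra      # placeholder; use a strong password
    # Assumed optional settings: restrict the cipher suite and require peer
    # nodes to present a certificate found in the trust store.
    cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA]
    require_client_auth: true
```

Nodes in the same data center would continue to use storage_port, while connections to other data centers would use ssl_storage_port.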

Securing the Cassandra JMX API

Cassandra also uses the JMX protocol for monitoring and to perform certain operational functions. JMX can be configured to require authorization and/or to encrypt data using TLS.

TODO: configuration

Cassandra API Configuration Options

In addition to the options described in this section, there are other options in the cassandra.yaml file that you might want to change in certain circumstances. Here is a list of the most common options:

The first two options are common to both the Thrift and CQL APIs:

  • rpc_address: This value controls the address(es) to which Cassandra binds when listening for client connections. The same value is used to control both the Thrift and CQL APIs. The value localhost will allow only local connections. A specific IP address can be used, or the address 0.0.0.0 can be used to cause Cassandra to accept connections on all network interfaces.

  • broadcast_rpc_address: RPC address to broadcast to drivers and other Cassandra nodes. This cannot be set to 0.0.0.0. If blank, it is set to the value of the rpc_address or rpc_interface. If rpc_address is set to 0.0.0.0, this property must be set. Default is unset.

The next options are specific to the CQL API:

  • start_native_transport: When this option is true, Cassandra enables the CQL API.

  • native_transport_port: This is the CQL API port that Cassandra-facing applications connect to. You can change it from its default of 9042, but it should be the same on all nodes.

The next options are specific to the Thrift API:

  • start_rpc: When true, this option causes Cassandra to initialize the Thrift API.

  • rpc_port: This is the Thrift API port that some Cassandra-facing applications connect to. You can change it from its default of 9160, but it should be the same on all nodes. (You must also configure the Cassandra-facing applications to use the same port.)

  • thrift_framed_transport_size_in_mb: This value controls the maximum size of a Thrift message that Cassandra will accept.
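
The API options above could appear together in cassandra.yaml along these lines. This is a sketch, not a recommended setup: the 10.0.0.11 address is a hypothetical node address, and disabling Thrift is an assumption based on Kinetic Request using only CQL.

```yaml
# cassandra.yaml (fragment): client-facing API options.
rpc_address: 0.0.0.0                # listen on all network interfaces
broadcast_rpc_address: 10.0.0.11    # hypothetical; required when rpc_address is 0.0.0.0

# CQL (native protocol) API
start_native_transport: true
native_transport_port: 9042

# Thrift API: assumed unnecessary here because Kinetic Request uses CQL
start_rpc: false
rpc_port: 9160
thrift_framed_transport_size_in_mb: 15
```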

Cassandra Cluster Configuration Options

By default Cassandra assumes that it is operating as a stand-alone node. It must be configured to operate in a cluster. The following cassandra.yaml options affect a node’s participation in a cluster:

  • cluster_name: All nodes in the cluster must have the same name, which differentiates the cluster from other nodes that might be working in the same network or even on the same machine. The default name is "Test Cluster", so you should change this to something else specific to your use case.

  • initial_token: This value defines the beginning range of key values for which the node will be the primary owner. It is not set by default, and it may be valid to leave it unset when configuring a new node. However, for a "balanced" cluster, you will need to set this value for each node.

  • seeds: Seeds are IP addresses of neighboring nodes that this node can contact using the Gossip protocol. The addresses provide only an initial set: after a node is running, it will memorize the addresses of other nodes in the network and contact them when necessary. The seeds are therefore necessary for the initial execution of a new node. Cassandra provides a generalized "seed provider" interface, but the built-in "simple seed provider" is sufficient for most situations.

  • listen_address: This is the IP address that other nodes use to communicate with this node. To participate in a cluster, you must change this from its default of "localhost". A host name can be used but is not recommended. The "any address" 0.0.0.0 will not work. You should use a static IP address visible to all other nodes.

  • partitioner: The default for this parameter is Murmur3Partitioner. This random partitioning algorithm is more efficient than the older RandomPartitioner scheme, although the two are incompatible. All nodes in the cluster should use the same partitioning scheme. If you upgrade from an older Cassandra release, you’ll need to ensure this parameter matches your existing value.
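
For a small cluster, the options above might be set as in the following sketch. The cluster name and the 10.0.0.x addresses are hypothetical; initial_token is deliberately left unset, which the description above notes can be valid for a new node.

```yaml
# cassandra.yaml (fragment): cluster membership settings for one node.
cluster_name: 'Kinetic Cluster'     # hypothetical; must match on every node

seed_provider:
    # Built-in simple seed provider with a comma-separated list of seed IPs.
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.11,10.0.0.12"   # hypothetical seed node addresses

listen_address: 10.0.0.11           # this node's static IP, not 0.0.0.0
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
```

Each node would use the same cluster_name, seeds, and partitioner, and its own listen_address.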

For more details on Cassandra configuration, see http://wiki.apache.org/cassandra/Operations.