Cassandra Interview Questions And Answers

Top 25 Cassandra Interview Questions And Answers

1) What is Apache Cassandra?

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many servers while providing high availability with no single point of failure. It is a type of NoSQL database.

● Cassandra was developed at Facebook for inbox search.
● It was open-sourced by Facebook in July 2008.
● Cassandra was accepted into Apache Incubator in March 2009.

2) What is NoSQL database?

A NoSQL database provides a mechanism to store and retrieve data that is modeled in ways other than the tabular relations used in relational databases. These databases are typically schema-free, support easy replication, have simple APIs, are eventually consistent, and can handle huge amounts of data.

3) What is the primary objective of NoSQL databases?

The primary objectives of NoSQL databases are:

● Simplicity of design
● Horizontal scalability
● Finer control over availability

4) What are the different features of Cassandra?

Following are the features of Cassandra:

● Elastic scalability
● Always on architecture
● Flexible data storage
● Fast linear-scale performance
● Easy data distribution
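● Transaction support (atomic, isolated, and durable writes at the row level, plus lightweight compare-and-set transactions, rather than full multi-row ACID transactions)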
● Fast writes

5) What is the design architecture of Cassandra?

Cassandra's design goal is to handle big data workloads across multiple nodes without any single point of failure. It uses a peer-to-peer distributed architecture, and data is distributed among all the nodes in a cluster.
● All the nodes in a cluster play the same role. Each node is independent and, at the same time, interconnected with the other nodes.
● Each node in a cluster can accept read and write requests, regardless of where the data is located in the cluster.
● When a node goes down, read/write requests can be served from other nodes in the network. This provides continuous availability.

6) What is Data Replication in Cassandra?

In Cassandra, one or more nodes in a cluster act as replicas for a given piece of data; the number of copies is set by the replication factor. If, during a read, some replicas respond with an out-of-date value, Cassandra returns the most recent value to the client and then performs a read repair in the background to update the stale replicas.
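
The read behaviour described in this answer can be sketched in Python. This is a minimal single-process simulation, not Cassandra's implementation: replicas are plain dictionaries, `Versioned` and `read_with_repair` are hypothetical names, and the repair runs synchronously here, whereas Cassandra performs it in the background.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    timestamp: int  # write timestamp; the larger timestamp wins


def read_with_repair(replicas, key):
    """Return the newest value for `key` and overwrite stale replicas with it."""
    # Ask every replica for its copy (a replica may not have the key at all).
    responses = [(replica, replica.get(key)) for replica in replicas]
    present = [cell for _, cell in responses if cell is not None]
    if not present:
        return None
    newest = max(present, key=lambda cell: cell.timestamp)
    # Read repair: bring every out-of-date replica up to the newest version.
    for replica, cell in responses:
        if cell is None or cell.timestamp < newest.timestamp:
            replica[key] = newest
    return newest.value


# Three replicas; the third one missed the latest write.
replica_a = {"user:1": Versioned("alice@new.example", 200)}
replica_b = {"user:1": Versioned("alice@new.example", 200)}
replica_c = {"user:1": Versioned("alice@old.example", 100)}

result = read_with_repair([replica_a, replica_b, replica_c], "user:1")
print(result)                          # alice@new.example
print(replica_c["user:1"].timestamp)   # 200, the stale replica was repaired
```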

7) What is gossip protocol in Cassandra?

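Cassandra uses the gossip protocol in the background, which allows the nodes to communicate with each other and detect any faulty nodes in the cluster. Each node periodically exchanges state information about itself and the nodes it knows about with other nodes.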
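
As a rough illustration of the idea, the sketch below simulates heartbeat-based gossip in Python. It rests on several simplifying assumptions: `GossipNode`, `gossip_round`, and the fixed `timeout` are hypothetical, each node gossips with a fixed neighbour so the run is deterministic, and a plain round-count timeout stands in for Cassandra's more sophisticated failure detector.

```python
class GossipNode:
    def __init__(self, name):
        self.name = name
        self.alive = True
        # node name -> (heartbeat counter, round in which it last increased)
        self.view = {name: (0, 0)}

    def beat(self, round_no):
        """Increment this node's own heartbeat."""
        heartbeat, _ = self.view[self.name]
        self.view[self.name] = (heartbeat + 1, round_no)

    def merge(self, other_view, round_no):
        """Adopt any heartbeat that is newer than the one already known."""
        for name, (heartbeat, _) in other_view.items():
            known, _ = self.view.get(name, (-1, 0))
            if heartbeat > known:
                self.view[name] = (heartbeat, round_no)

    def suspects(self, round_no, timeout):
        """Nodes whose heartbeat has not advanced for more than `timeout` rounds."""
        return sorted(
            name
            for name, (_, seen) in self.view.items()
            if name != self.name and round_no - seen > timeout
        )


def gossip_round(nodes, round_no):
    live = [node for node in nodes if node.alive]
    for node in live:
        node.beat(round_no)
    for i, node in enumerate(live):
        peer = live[(i + 1) % len(live)]  # fixed neighbour instead of a random peer
        snapshot = dict(node.view)
        node.merge(peer.view, round_no)   # exchange state in both directions
        peer.merge(snapshot, round_no)


nodes = [GossipNode(name) for name in ("a", "b", "c")]
a, b, c = nodes
for round_no in range(1, 4):
    gossip_round(nodes, round_no)
suspects_before = a.suspects(3, timeout=3)

c.alive = False  # node c stops gossiping
for round_no in range(4, 9):
    gossip_round(nodes, round_no)
suspects_after = a.suspects(8, timeout=3)

print(suspects_before, suspects_after)
```

Because node `c` stops sending heartbeats after round 3, the other nodes eventually notice that its counter is no longer advancing and flag it.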

8) What are the different components of Cassandra?

Following are the key components of Cassandra:

● Node
● Data center
● Cluster
● Commit log
● Memtable
● SSTable
● Bloom filter
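
How the last four components fit together on a single node can be sketched in Python. This is a toy model under stated assumptions, not Cassandra's storage engine: `StorageNode`, `SSTable`, `BloomFilter`, and `memtable_limit` are hypothetical names, everything lives in memory, and the Bloom filter uses SHA-256 purely for convenience.

```python
import hashlib


class BloomFilter:
    """Answers 'definitely absent' or 'possibly present' for a key."""

    def __init__(self, size=64, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Immutable, key-sorted snapshot of a flushed memtable."""

    def __init__(self, rows):
        self.rows = dict(sorted(rows.items()))
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


class StorageNode:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record used for crash recovery
        self.memtable = {}     # in-memory write buffer
        self.sstables = []     # flushed tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, then buffer in memory
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        self.sstables.append(SSTable(self.memtable))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs replaying

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            # The Bloom filter lets the node skip tables that cannot hold the key.
            if sstable.bloom.might_contain(key) and key in sstable.rows:
                return sstable.rows[key]
        return None


node = StorageNode(memtable_limit=2)
node.write("k1", "v1")
node.write("k2", "v2")  # reaches the limit, so the memtable is flushed
node.write("k3", "v3")  # stays in the new memtable
print(node.read("k1"), node.read("k3"), node.read("missing"))
```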

9) What is Cassandra Query Language?

Users can access Cassandra through any of its nodes using the Cassandra Query Language (CQL). CQL treats the database (keyspace) as a container of tables. Programmers work with CQL either through cqlsh, an interactive prompt, or through application language drivers.

10) What is cluster in Cassandra data model?

A Cassandra database is distributed over several machines that operate together. The outermost container is known as the cluster. For failure handling, data is replicated across nodes, so if a node fails, another node holding a replica serves its requests. Cassandra arranges the nodes of a cluster in a ring and assigns data to them.
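
The ring arrangement can be illustrated with a small Python sketch. It assumes a simple placement rule (the first node whose token is at or past the key's token, followed by the next nodes clockwise), a tiny hypothetical ring of 400 tokens, and SHA-256 as a stand-in for Cassandra's partitioner; `TokenRing` and the node names are made up for the example.

```python
import bisect
import hashlib

RING_SIZE = 400  # deliberately tiny so the token values are easy to read


def token_for(key):
    """Hash a partition key onto the ring (SHA-256 is only a stand-in)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


class TokenRing:
    def __init__(self, node_tokens):
        # node_tokens maps a node name to the token that node owns.
        self.ring = sorted((token, node) for node, token in node_tokens.items())
        self.tokens = [token for token, _ in self.ring]

    def replicas_for_token(self, token, replication_factor):
        # The first node whose token is >= the key's token owns the key;
        # past the last token the search wraps around to the start of the ring.
        start = bisect.bisect_left(self.tokens, token) % len(self.ring)
        count = min(replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def replicas_for_key(self, key, replication_factor):
        return self.replicas_for_token(token_for(key), replication_factor)


ring = TokenRing({"node-a": 0, "node-b": 100, "node-c": 200, "node-d": 300})
print(ring.replicas_for_token(150, 3))  # ['node-c', 'node-d', 'node-a']
print(ring.replicas_for_token(350, 2))  # ['node-a', 'node-b'] (wraps around)
print(ring.replicas_for_key("user:42", 3))
```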

11) What is keyspace in Cassandra?

A keyspace is the outermost container for data in Cassandra, comparable to a database in an RDBMS. The basic attributes of a keyspace are:
● Replication factor
● Replica placement strategy
● Column families
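
The first two attributes are what a `CREATE KEYSPACE` statement specifies. The Python helper below is a hypothetical convenience (`create_keyspace_cql` is not part of any driver) that only formats the CQL text for the `SimpleStrategy` placement strategy; the keyspace name `demo` is an arbitrary example.

```python
def create_keyspace_cql(name, replication_factor=3):
    """Build a CREATE KEYSPACE statement that uses SimpleStrategy."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} "
        f"WITH replication = {{'class': 'SimpleStrategy', "
        f"'replication_factor': {replication_factor}}};"
    )


statement = create_keyspace_cql("demo", replication_factor=3)
print(statement)
# CREATE KEYSPACE IF NOT EXISTS demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
```

The resulting text can be pasted into cqlsh or passed to a driver; tables (column families) are then created inside the keyspace.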

12) What is Column Family in Cassandra?

A column family is a container for an ordered collection of rows. Each row, in turn, is an ordered collection of columns. Following are the basic attributes of a column family:

● keys_cached
● rows_cached
● preload_row_cache

13) What is super column in Cassandra?

A super column is a special kind of column, so it is also a key-value pair. Unlike a regular column, however, its value is a map of sub-columns.

14) What are the key points of data model of Cassandra?

Here are the key points:
● Cassandra deals with unstructured data, whereas an RDBMS deals with structured data.
● Cassandra has a flexible schema.
● In Cassandra, a table is a list of "nested key-value pairs". (ROW x COLUMN key x COLUMN value)
● Keyspace is the outermost container that contains data corresponding to an application.
● Tables, or column families, are the entities of a keyspace.
● A row is the unit of replication in Cassandra.
● A column is the unit of storage in Cassandra.
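
The "nested key-value pairs" view of a table can be pictured with ordinary Python dictionaries. This is only a mental model with made-up table, row, and column names, not how Cassandra stores data on disk; `get_column` is a hypothetical helper.

```python
# keyspace -> table -> row key -> column name -> column value
keyspace = {
    "users": {
        "alice": {"email": "alice@example.com", "city": "Paris"},
        "bob": {"email": "bob@example.com"},  # rows need not fill the same columns
    }
}


def get_column(keyspace, table, row_key, column, default=None):
    """Look up one column value by walking the nested maps."""
    return keyspace.get(table, {}).get(row_key, {}).get(column, default)


print(get_column(keyspace, "users", "alice", "city"))  # Paris
print(get_column(keyspace, "users", "bob", "city"))    # None
```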

15) What is the significance of cluster class in Cassandra?

The Cluster class is the main entry point of the DataStax Java driver; it belongs to the com.datastax.driver.core package. Following are the key methods of this class:

● Session connect()
● void close()
● static Cluster.Builder builder()

16) What is the function of Cluster.Builder class in Cassandra?

It is used to build instances of the Cluster class. Following are the key methods of this class:

● Cluster.Builder addContactPoint(String address)
● Cluster build()

17) What is session in Cassandra?

Session is an interface in the DataStax Java driver. It holds the connections to a Cassandra cluster and is used to execute CQL queries. It belongs to the com.datastax.driver.core package. Following are the key methods of this interface:

● ResultSet execute(Statement statement)
● ResultSet execute(String query)
● PreparedStatement prepare(RegularStatement statement)
● PreparedStatement prepare(String query)

18) What is prepare() method in Cassandra?

This method prepares the provided query, which is given either as a query string or as a RegularStatement, and returns a PreparedStatement that can then be bound to values and executed.

19) What is CQLSH in Cassandra?

CQLSH stands for Cassandra Query Language shell. Cassandra ships with this prompt (cqlsh) by default so that users can communicate with the database by executing Cassandra Query Language (CQL) statements.

Using cqlsh, you can:

● Define a schema
● Insert data
● Execute a query

20) What is the function of consistency cqlsh command in Cassandra?

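This command shows the current consistency level or sets a new one. Run without an argument, CONSISTENCY displays the current level; run with a level (for example, CONSISTENCY QUORUM), it sets that level for subsequent requests.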
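
To make the effect of a level such as QUORUM concrete, the arithmetic can be written out in Python. The function names are hypothetical; the formulas are the usual ones: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest write when the number of replicas read plus the number written exceeds the replication factor.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                          # 2
print(reads_see_latest_write(quorum(rf), quorum(rf), rf))  # True: QUORUM reads and writes
print(reads_see_latest_write(1, 1, rf))                    # False: ONE reads and writes
```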

21) What is the use of expand cqlsh command in Cassandra?

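It controls vertical output: EXPAND ON prints each row of a query result vertically, one column per line, and EXPAND OFF restores the tabular layout.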

22) What is the use of paging cqlsh command in Cassandra?

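It enables or disables query paging (PAGING ON or PAGING OFF), which controls whether large result sets are fetched and displayed one page at a time.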

23) What is the use of tracing cqlsh command in Cassandra?

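It enables or disables request tracing (TRACING ON or TRACING OFF). With tracing on, cqlsh prints the steps Cassandra took to execute each request.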

24) What are the different CQL data definition commands in Cassandra?

Following are the CQL data definition commands:

● Create Keyspace
● Drop Keyspace
● Alter Keyspace
● Create Table
● Drop Table
● Alter Table
● Create Index
● Drop Index

25) What are the different CQL data manipulation commands in Cassandra?

Following are the CQL data manipulation commands:

● Insert
● Delete
● Update
● Batch