Deploying Cassandra on a single Azure Virtual Machine
Creating virtual machine
Creating a new virtual machine through the Windows Azure management portal is fairly easy. Just click the FROM GALLERY option.
The operating system I used is Ubuntu 12.04 LTS, which can be picked straight from the gallery.
Next, follow the steps of the virtual machine creation wizard. It is important that the SSH port is open (it should be by default), because SSH will be the main way of communicating with the machine.
Connecting to the machine
Once the VM is up, we can connect to it using a widely known tool, PuTTY. Run it and create an SSH session with the host name and port shown in the Azure management portal.
We will be prompted for credentials. Use the same ones as during machine creation. After that we should be logged in to the machine.
Installation of Cassandra on VM
Cassandra can be installed in many ways; in this guide I will describe how to build it straight from source. First we have to install Oracle’s Java. It is needed both for running Cassandra (JRE) and for building it with Ant (JDK). The easiest way to install Java is through a Debian package. We have to add the appropriate repository to the Advanced Packaging Tool (apt) sources list:
$ sudo add-apt-repository ppa:webupd8team/java
This step always has to be followed by:
$ sudo apt-get update
It downloads the package lists from the newly added repository (and also updates the package lists of the other repositories already in use). After these steps the Oracle Java package is available to us. To install it, use the following command:
$ sudo apt-get install oracle-java7-installer
During installation we will have to accept the license.
The next step is to install the Apache Ant build tool and Git:
$ sudo apt-get install ant
$ sudo apt-get install git
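Before moving on, it can be worth confirming that the tools installed so far are actually on the PATH. The following is only a minimal sketch; the function name missing_tools is made up for this example and is not part of any package:

```shell
# Print the name of every command from the argument list that cannot be
# found on PATH. missing_tools is a hypothetical helper name.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Calling `missing_tools java ant git` prints nothing when all three prerequisites are installed, and otherwise lists the ones that are still missing.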
To download the latest Cassandra sources from its repository, type:
$ sudo git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
This copies the Cassandra Git repository to a “cassandra” folder under the current location. To build Cassandra, go to that folder:
$ cd cassandra
and use:
$ sudo ant build
Cassandra needs the following two folders, one for data and one for logs. Create them with:
$ sudo mkdir /var/lib/cassandra /var/log/cassandra
Then change their owner to the current user and group:
$ sudo chown -R $USER:$GROUP /var/lib/cassandra
$ sudo chown -R $USER:$GROUP /var/log/cassandra
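A quick way to confirm that the folders exist and are writable by the current user is sketched below. The helper name unwritable_dirs is an assumption made for illustration only:

```shell
# Print every directory from the argument list that is missing or not
# writable by the current user. unwritable_dirs is a hypothetical helper.
unwritable_dirs() {
    local dir
    for dir in "$@"; do
        if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
            echo "$dir"
        fi
    done
}
```

Running `unwritable_dirs /var/lib/cassandra /var/log/cassandra` should print nothing once the ownership has been changed.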
Now we have a machine with a raw Cassandra installation. Do not run Cassandra yet; we want to keep the installation in this untouched state.
The last step is to create a startup script that adds the machine name to the /etc/hosts file. Why do it? It is a kind of workaround. When we capture an image of this machine and use it to create another machine with a different name, Azure won’t add the name for us, and starting Cassandra will fail with:
Error: Exception thrown by the agent : java.net.MalformedURLException: Local host name unknown: java.net.UnknownHostException: <name>:<name>
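Whether a given name is already present in a hosts file can be checked before Cassandra is started. The sketch below is an illustration under assumptions: the function name hosts_file_has_name is hypothetical, and it only inspects the file, it does not query DNS:

```shell
# Succeed (exit status 0) if the given name appears as a host name or alias
# in a hosts file. $1 = name to look for, $2 = hosts file (default /etc/hosts).
# hosts_file_has_name is a hypothetical helper name.
hosts_file_has_name() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            # field 1 is the address; the remaining fields are names
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # trailing comment starts here
                if ($i == name) found = 1
            }
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}
```

For example, `hosts_file_has_name "$(hostname)" || echo "machine name is missing"` reports the situation that leads to the exception above.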
To create this script, open a text editor:
$ nano
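And paste the following code into it:
#!/bin/bash
# Take the first address reported by `hostname -I` (it may list several).
local_address=$(hostname -I | awk '{print $1}')
cassandra_yaml="$HOME/cassandra/conf/cassandra.yaml"
# Add the machine name to the localhost entry in /etc/hosts.
sed -i "/^127.0.0.1 localhost/ c\127.0.0.1 localhost $HOSTNAME" /etc/hosts
# Point rpc_address and listen_address at the internal address.
sed -i "/^rpc_address: / c\rpc_address: $local_address" "$cassandra_yaml"
sed -i "/^listen_address: / c\listen_address: $local_address" "$cassandra_yaml"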
Save it with a .sh extension. Then make the script executable:
$ sudo chmod +x <script_name>.sh
To make it run during boot, create a symbolic link in /etc/init.d:
$ sudo ln -s /path/to/script/<script_name>.sh /etc/init.d/<script_name>.sh
And register the script to run at startup:
$ sudo update-rc.d <script_name>.sh defaults
As you probably noticed, this script does more than I described above. It also changes two Cassandra configuration options that must be set before Cassandra can run in a cluster:
- listen_address – the address on which the Cassandra node communicates with other nodes
- rpc_address – the address on which the node listens for clients
The script sets both options to the virtual machine’s internal address. Why set rpc_address to the internal address and not 0.0.0.0? It depends on the client, more specifically on the protocol. If you are going to use a client that supports only the Thrift protocol, set it to 0.0.0.0 and Thrift will listen on all interfaces. But if you are going to use the newer CQL native protocol and place the client within the cluster’s service, use the internal machine address. The Cassandra client drivers provided by DataStax (for Java, C# and Python) support the CQL native protocol and are fully asynchronous. Moreover, you can configure retry, reconnection and load balancing policies, so you have full control over cluster traffic.
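If you later need to switch one of these options by hand, for example to set rpc_address to 0.0.0.0 for a Thrift-only client, a small helper along the following lines may be convenient. This is only a sketch: set_yaml_option is a hypothetical name, it relies on GNU sed’s -i option, and it assumes the option sits at the start of a line, as in the script above:

```shell
# Replace the line that starts with "<key>:" in a YAML file with
# "<key>: <value>". set_yaml_option is a hypothetical helper; it assumes
# GNU sed and a value without "|", "&" or backslash characters.
set_yaml_option() {
    local file="$1" key="$2" value="$3"
    sed -i "s|^${key}:.*|${key}: ${value}|" "$file"
}
```

A call such as `set_yaml_option "$HOME/cassandra/conf/cassandra.yaml" rpc_address 0.0.0.0` then rewrites just that one line and leaves the rest of the file untouched.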
But there is still one more mandatory option that is not configured: seeds, which is a comma-delimited list of host addresses. Cassandra nodes use this list to find each other and learn the topology of the ring. At this point a virtual machine created from the image won’t know anything about other nodes in the service subnet, so later you will have to add the seed node addresses manually in cassandra/conf/cassandra.yaml.
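Editing the seeds entry can also be scripted. The sketch below assumes the default layout of cassandra.yaml, where the list appears on an indented line of the form `- seeds: "127.0.0.1"` under seed_provider; the function name set_seeds and the addresses in the usage note are hypothetical:

```shell
# Replace the value of the "- seeds:" line in cassandra.yaml, keeping the
# original indentation. $1 = path to cassandra.yaml, $2 = comma-delimited
# list of seed addresses. set_seeds is a hypothetical helper (GNU sed).
set_seeds() {
    local file="$1" seeds="$2"
    sed -i "s|^\([[:space:]]*- seeds:\).*|\1 \"${seeds}\"|" "$file"
}
```

For instance, `set_seeds cassandra/conf/cassandra.yaml "10.0.0.4,10.0.0.5"` would make the two machines with those (made-up) internal addresses the seed nodes.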
Now our default Cassandra node is almost finished. The last thing to do is to prepare it for capturing by undoing the provisioning customization. The following command does the trick:
$ sudo waagent -deprovision
And we are ready for capturing.
Capturing Virtual Machine
Clicking the capture option in the management portal opens the capture window. Set the name for the image and tick the “I have run the Windows Azure Linux Agent on the virtual machine” checkbox. As stated in the IMPORTANT NOTE section, this virtual machine will be deleted.
Creating Cassandra cluster
Now that we have an image of a machine with Cassandra on it, we can start deploying a cluster. In this example I will describe the scenario of deploying a cluster in a single service.
Creating nodes
This step is almost the same as Creating virtual machine, the difference being that we will use our captured image. You can find it in the VM gallery under MY IMAGES. Remember to use the same username as during image creation! This prevents a new profile from being created on that machine. Repeat this step for as many nodes as you want in your cluster, but remember to use the same service for all of them.
Running the cluster
When all of the machines are up, you will have to connect to each of them and configure the cassandra.yaml file by adding seeds to it. Select which nodes will be treated as seeds, get their internal addresses and put them in cassandra.yaml. Next, run Cassandra on each machine, starting with the seed machines. To run Cassandra, type:
$ cassandra/bin/cassandra
When Cassandra is up on all machines, you can check with nodetool whether all of the created nodes are in the ring. Just execute the following command on any machine:
$ cassandra/bin/nodetool ring
That’s it. Using this image we can add new nodes to the cluster quickly and easily. There is still a lot of room for automation in the process of adding new nodes, but that is a subject for another article.
Cognitum cooperates with Microsoft under the prestigious Azure Circle program, to which technology partners with experience in Windows Azure are invited. It provides IT solutions in the area of Cloud and BigData for customers both in Poland and abroad.
Cognitum is also a partner of DataStax, a major Cassandra vendor that provides worldwide training for Cassandra and enterprise-level appliances: DataStax Enterprise combines Cassandra, Hadoop, Hive and Solr into a single solution.