
Monday, 16 September 2013

Cassandra on Windows Azure Virtual Machines

In this guide I will show how to deploy Cassandra on an Azure Virtual Machine. I will also describe how to make a handy image of it, which will be used later to deploy a Cassandra cluster. I assume that anyone who ends up here knows what Cassandra and Windows Azure are. Just in case, here are the references for Cassandra and Azure.



Deploying Cassandra on a single Azure Virtual Machine


Creating virtual machine

Creating a new virtual machine through the Windows Azure management portal is fairly easy. Just click the FROM GALLERY option.



The operating system I used is Ubuntu 12.04 LTS. It can easily be picked from the gallery.


Next, just follow the steps of the virtual machine creation wizard. It is important to open the SSH port (it should be open by default), because SSH will be the main way of communicating with the machine.

Connecting to the machine

When the VM is up, we can connect to it with a widely known tool, PuTTY. Run it and create an SSH session with the host name and port shown in the Azure management portal:


We will be prompted for credentials. Use the same ones as during machine creation. After that we should be logged in to the machine.



Installation of Cassandra on VM

Cassandra can be installed in many ways; in this guide I will describe how to build it straight from source. First we have to install Oracle’s Java. It is needed both for running Cassandra (JRE) and for building it with Ant (JDK). The easiest way to install Java is through a Debian package. We have to add the appropriate repository to the Advanced Packaging Tool (apt) sources list:

$ sudo add-apt-repository ppa:webupd8team/java
The previous step always has to be followed by:

$ sudo apt-get update
It downloads the package lists from the newly added repository (and also updates the package lists of the other repositories already in use). After these steps Oracle’s Java package will be available to us. To install it, use the following command:

$ sudo apt-get install oracle-java7-installer
During installation we will have to accept the license:


The next step is to install the Apache Ant build tool and Git:

$ sudo apt-get install ant
$ sudo apt-get install git
To download the latest Cassandra sources from its repository, type:


It will copy the Cassandra git repository to the “cassandra” folder under the current location.
To build Cassandra, go to that folder:

$ cd cassandra
and use:

$ sudo ant build
Cassandra needs the following two folders, one for data and one for logs. Create them with:

$ sudo mkdir /var/lib/cassandra /var/log/cassandra
And change their owner to the current user and that user's primary group:

$ sudo chown -R $USER:$(id -gn) /var/lib/cassandra
$ sudo chown -R $USER:$(id -gn) /var/log/cassandra
Now we have a machine with a raw Cassandra installation. Do not run it yet; we want to keep it raw, as it is now.

The last step is to create a startup script that adds the machine name to the /etc/hosts file. Why do it? It is a kind of workaround. When we capture an image of this machine and use it to create another one with a different name, Azure won’t add the name for us, and we will run into:

Error: Exception thrown by the agent : java.net.MalformedURLException: Local host name unknown: java.net.UnknownHostException: <name>:<name>

while starting Cassandra.

To create this script, open a text editor:

$ nano
And paste the following code into it:

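#!/bin/bash
# Use the first address reported by hostname -I (it may list several).
local_address=$(hostname -I | awk '{print $1}')
# Path to the Cassandra configuration. At boot $HOME may not be the home of
# the user who built Cassandra, so adjust this path if needed.
cassandra_yaml="$HOME/cassandra/conf/cassandra.yaml"
# Map the current machine name to the loopback address.
sed -i "/^127.0.0.1 localhost/ c\127.0.0.1 localhost $HOSTNAME" /etc/hosts
# Point both Cassandra addresses at the internal address of the machine.
sed -i "/^rpc_address: / c\rpc_address: $local_address" "$cassandra_yaml"
sed -i "/^listen_address: / c\listen_address: $local_address" "$cassandra_yaml"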

Save it with a .sh extension. We have to make the script executable:

$ sudo chmod +x <script_name>.sh
To make it run during boot, create a symbolic link in /etc/init.d:

$ sudo ln -s /path/to/script/<script_name>.sh /etc/init.d/<script_name>.sh
And add the script to the startup sequence:

$ sudo update-rc.d <script_name>.sh defaults
As you have probably noticed, this script does more than I described above. It also changes some Cassandra configuration options that are mandatory for running Cassandra later:
  • listen_address – the address on which a Cassandra node communicates with other nodes
  • rpc_address – the address on which the node listens for clients
The script sets both options to the internal address of the virtual machine. Why set rpc_address to the internal address and not 0.0.0.0? It depends on the client, and more specifically on the protocol. If you are going to use a client that supports only the Thrift protocol, set it to 0.0.0.0 so that Thrift listens on all interfaces. But if you are going to use the new CQL native protocol and place the client within the cluster service, use the internal machine address. The Cassandra client drivers provided by DataStax (for Java, C# and Python) support the CQL native protocol and are fully asynchronous. Moreover, you can configure retry, reconnection and load balancing policies, so you have full control over cluster traffic.
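
Before wiring the script into the boot sequence, it may be worth rehearsing the same edits on copies of the files. The following is only a sketch: apply_node_settings is a hypothetical helper that takes the file paths, the address and the machine name as arguments instead of reading them from the system, and it assumes GNU sed as shipped with Ubuntu.

```shell
#!/bin/bash
# apply_node_settings is a hypothetical helper used for illustration only.
# It performs the same three line replacements as the startup script, but on
# the files passed as arguments, so it can be tried on copies first.
apply_node_settings() {
    local hosts_file="$1"   # copy of /etc/hosts
    local yaml_file="$2"    # copy of cassandra.yaml
    local address="$3"      # internal address of the machine
    local node_name="$4"    # machine name to append to the localhost entry

    # Replace the whole localhost line with one that also names the machine.
    sed -i "/^127\.0\.0\.1 localhost/ c\\127.0.0.1 localhost $node_name" "$hosts_file"
    # Replace the two address options with the internal address.
    sed -i "/^rpc_address: / c\\rpc_address: $address" "$yaml_file"
    sed -i "/^listen_address: / c\\listen_address: $address" "$yaml_file"
}
```

For example, after copying /etc/hosts and cassandra.yaml to /tmp, calling apply_node_settings /tmp/hosts /tmp/cassandra.yaml 10.0.0.4 node1 (both values are made up) lets you inspect the result without touching the real files.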

But there is still one more mandatory option that is not configured: seeds, which is in fact a comma-delimited list of host addresses. Cassandra nodes use this list to find each other and learn the topology of the ring. At this point a virtual machine created from the image won’t know anything about the other nodes in the service subnet, so later you will have to add the seed node addresses manually in cassandra/conf/cassandra.yaml.
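
The manual edit of the seeds option can also be done with sed. This is a minimal sketch, not part of the original setup: set_seeds is a hypothetical helper, the addresses in the usage note are made up, and it assumes that cassandra.yaml contains a single line of the form - seeds: "..." as in the stock configuration file.

```shell
#!/bin/bash
# set_seeds is a hypothetical helper used for illustration only.
# It replaces the value of the "- seeds:" entry in a cassandra.yaml file.
set_seeds() {
    local yaml_file="$1"   # path to cassandra.yaml
    local seed_list="$2"   # comma-delimited list of seed addresses

    # Keep the original indentation and the "- seeds:" key; swap only the value.
    sed -i "s/^\([[:space:]]*- seeds:\).*/\1 \"$seed_list\"/" "$yaml_file"
}
```

A call such as set_seeds cassandra/conf/cassandra.yaml "10.0.0.4,10.0.0.5" would then point the node at two seed nodes with those internal addresses.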
Now our default Cassandra node is almost finished. The last thing to do is to prepare it for capturing by undoing the provisioning customization. The following command does the trick:

$ sudo waagent -deprovision
And we are ready for capturing.

Capturing Virtual Machine


In the Azure management portal, shut down the machine we were working on. When it is off, the Capture icon becomes enabled.



Clicking it displays the following window:

Set the name for the image and tick the “I have run the Windows Azure Linux Agent on the virtual machine” checkbox. As stated in the IMPORTANT NOTE section, this virtual machine will be deleted.

Creating Cassandra cluster

Now that we have an image of a machine with Cassandra on it, we can start deploying a cluster. In this example I will describe the scenario of deploying a cluster in a single service.


Creating nodes

This step is almost the same as Creating virtual machine above; the difference is that we will use our captured image. You can find it in the VM gallery under MY IMAGES. Be sure to use the same username as during image creation! This prevents a new profile from being created on that machine. Repeat this step once for every node you want in your cluster, and remember to use the same service for all of them.

Running the cluster

When all of the machines are up, you will have to connect to each of them and configure the cassandra.yaml file by adding seeds to it. Select which nodes will be treated as seeds, get their internal addresses and update cassandra.yaml with them. Next, run Cassandra on each machine, starting with the machines chosen as seeds. To run Cassandra, type:

$ cassandra/bin/cassandra
When Cassandra is up on all machines, you can check with nodetool whether all of the created nodes are in the ring. Just execute the following command on any machine:

$ cassandra/bin/nodetool ring
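
With many nodes, reading the ring listing by eye gets tedious. The sketch below counts the distinct node addresses reported as up. count_up_nodes is a hypothetical helper, and it assumes that each ring line starts with the node address and contains the word Up as a separate column for live nodes; the exact layout may differ between Cassandra versions, so check it against your own output first.

```shell
#!/bin/bash
# count_up_nodes is a hypothetical helper used for illustration only.
# It reads nodetool ring output on standard input and prints the number of
# distinct addresses (first column) that have an "Up" column on their line.
count_up_nodes() {
    awk '
        {
            # Look for a column that is exactly "Up" after the address column.
            for (i = 2; i <= NF; i++) {
                if ($i == "Up") { seen[$1] = 1; break }
            }
        }
        END {
            n = 0
            for (address in seen) n++
            print n
        }
    '
}
```

It can be used as cassandra/bin/nodetool ring | count_up_nodes and the result compared with the number of machines you created. Counting distinct addresses rather than lines matters when a node owns several tokens and therefore appears on several lines.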

That’s it. Using this image we can add new nodes to the cluster quickly and easily. There is still a lot of room for automation in the above process of adding new nodes, but that is a subject for another article.


Thursday, 18 April 2013

Automated testing in the cloud

Overview

A modern enterprise usually maintains at least one web application or intranet system. A critical issue for the organization is ensuring that key business functions remain available to customers, suppliers, regulators, and other entities that must have access to those functions.
Maintaining business continuity requires system testing, in particular functional tests, performance tests, stress tests and continuous monitoring of services. The traditional approach to testing raises a number of problems and incurs significant costs. It requires an ongoing commitment to maintain a validation team, infrastructure, tools and licenses to plan, execute and report on testing, while every organization faces a limited budget and tight deadlines for delivering tested solutions. If you also count the cost of a single test, the number of tests needed for a full test cycle, the need for regression testing, poor reusability and the lack of testing in a distributed environment with multiple locations, it turns out that some tests cannot be carried out using traditional methods. Human resource and infrastructure costs are too high.


Solution

A testing platform is the solution to all these problems. Imagine that you have a team of hundreds or even thousands of validation engineers. Then imagine that your team executes a large number of testing scenarios day and night for several weeks, using various operating systems and browsers from distributed locations. Cognitum provides a flexible automated cloud testing platform based on Microsoft Azure. The platform shifts application testing into virtual infrastructure and simulates real-world user traffic from different locations, operating systems, browsers and test cases. The scalability of the infrastructure makes it possible to employ hundreds or even thousands of virtual computers on demand. Its flexibility lets you manage the number and configuration of those virtual machines. The distributed testing environment allows simulating users, maintaining business continuity and executing almost all types of testing cost-effectively. The solution scales with the needs of the company, up to the maximum capability of the infrastructure, the database or the Azure cost plan.

Customer’s issue

The customer is a blue chip company from the energy sector that has a number of internal systems, websites and a web portal for its customers. The company must keep its internal systems running continuously to support the accounting department, electric grid monitoring, the HR system and the client department. In addition, the website requires continuous monitoring, because it provides information about the company, prices and urgent messages about failures and network maintenance. The company also has a system for individual and business customers, which allows them to log on from anywhere and manage their bills and payments.
For testing internal systems we use a virtual network (based on Microsoft Azure), which allows secure access to the local network. Connecting through a VPN gateway protects corporate data, which is very important for any organization. Thus, our test platform has access to the live system and data in the production environment, which makes the solution more reliable and effective.
For websites and web applications, the testing platform can simulate massive user traffic, check continuous system availability, and perform regression testing after each update.
Moreover, the company requires a duplicate of the production environment for testing purposes. Applications and systems are moved and launched as a testing environment in the cloud. For security reasons, sensitive data is anonymized and reproduced by a statistical model of the production data. In this environment testing can be performed using traditional methods, but with a testing platform in the cloud both methods can be combined. The test platform running in the cloud carries out previously designed test scenarios. Execution of the test scenarios is managed by a special tool called Test Manager. In this way, the test scenarios can test the duplicated applications in a virtual environment. This completely frees the company from having a physical test infrastructure.

Benefits

The implemented solution has brought a new quality to the testing of web and enterprise infrastructure. By relying on the test platform in the cloud, the organization has reduced the size of its validation team and the total cost of system maintenance.


Cognitum cooperates with Microsoft under the prestigious Azure Circle program, which invites technology partners with experience in Windows Azure. It provides IT solutions in the area of Cloud and BigData for customers both in Poland and abroad.
 

Thursday, 21 March 2013

When does the cloud pay off?

Cloud (cloud computing) has recently become a buzzword. At this year's CeBIT in Hannover (the largest IT exhibition in the world), where we showed one of our solutions, many companies presented their cloud services, mainly leased infrastructure or applications. But is it just a fashionable term, or is it really a new quality?
 

When does the cloud really pay off? Wherever the utilization of computing power varies significantly over time and expanding your own data center is not part of the core business. Examples?


  • Financial industry: Generating hundreds of thousands of statement documents for customers every month, transferring them electronically and archiving them for five years. In such a scenario we need high computing power periodically, once a month, together with a large, secure data archive. With the cloud you pay only for what you use, and the solution can be scaled fully to current needs.
  • IT systems maintenance: Testing in the cloud. Instead of building another data center to create a test environment, you can lease the infrastructure and systems needed to reproduce it for testing and use them only when they are needed, at a size that reliably reproduces even a large production environment. The cloud also gives us the opportunity to virtualize the test process itself, so we can continuously monitor the availability of IT systems and study their behavior under heavy load.
  • Market research and PR: Monitoring and analysis of data from the Internet. With cloud computing we can obtain large amounts of data from the Internet and process them effectively to extract the information of interest (who wrote about a given topic, when and how; identifying opinion leaders; comparing different channels of information; collecting and analyzing e-commerce data; comparing the activity of PR companies, brands, names, etc.). An Azure cloud-based solution also allows full integration with the company's existing systems (CRM, ERP, security), and all the data obtained remains the property of the company.


