Category: Windows Azure


Continuing where I left off in my previous post, I’m going to explain how the Announcement service works and why we chose that approach.

The way JBoss and mod_proxy work today, every change in the topology (a proxy or a JBoss node being added or removed) means the proxy list has to be updated, and both the nodes and the proxies have to be made aware of each other.

Mod_proxy uses multicast to announce itself to the cluster, but as multicast is not supported on Windows Azure, we created our own service that runs on both the proxies and the nodes. Each time a proxy or a node is added or removed, the service notifies the rest of the instances that the topology has changed so they can update their lists accordingly.

The service does not run under a dedicated WorkerRole; it is part of the same deployment as the proxy and the JBoss node. It’s a WCF service hosted inside a Windows NT Service, listening on a dedicated port. That approach gives us greater flexibility, as we keep a clear separation of concerns between the services in the deployment and we don’t mix the code and logic of the proxy with the Announcement service. Originally, using an NT Service raised some concerns: how would the service be installed on the machines, and how could we keep one single code base for it while running in both scenarios?

First of all, you should be aware that any port you open through your configuration is only available to the host process of the Role. That means that unless the port is also explicitly opened on the firewall, your service won’t be able to communicate because the port is blocked. Once we realized that, we fixed it by adding an extra line to the Startup Task that installs the service on the machines. The command looks like this:

which is part of the installer startup task

To make our service even more robust and secure, we introduced a couple of NetworkRules that only allow communication between the proxies and the JBoss nodes:

All communication between the services is secured with certificate-based authentication and message-level encryption. The service is a vital component of our approach and we want it to be as secure as possible.

The service monitors a couple of things that also help us collect telemetry data from the JBoss nodes, and it’s wired to a couple of RoleEnvironment events like OnStopping and OnChanged. Every time there is an OnStopping, we send messages out to all of the other service instances to de-register that proxy from their lists, because it’s going down. The service also checks at specific intervals whether the other nodes are alive. If a node doesn’t respond 3 times in a row, it is removed. We do this to handle possible crashes of the proxy as fast as possible. Lastly, every time an OnChanged event fires, we verify that everything is as we expect it to be (nodes available etc.).
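The register, de-register and "remove after 3 missed checks" rules described above can be sketched roughly as follows. This is a minimal illustration, not the actual Announcement service code: the class name `PeerRegistry` and all of its members are hypothetical, and the sketch leaves out the WCF plumbing, the timer that drives the checks, and thread safety.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: tracks known peers (proxies or JBoss nodes) and
// drops a peer after three consecutive missed liveness checks.
public sealed class PeerRegistry
{
    private const int MaxMissedChecks = 3;

    // Peer address -> number of consecutive checks it failed to answer.
    private readonly Dictionary<string, int> _missedChecks =
        new Dictionary<string, int>();

    // Called when a peer announces itself (or announces itself again).
    public void Register(string address)
    {
        _missedChecks[address] = 0;
    }

    // Called when a peer tells us it is stopping. Returns false if unknown.
    public bool Deregister(string address)
    {
        return _missedChecks.Remove(address);
    }

    // Called after each periodic liveness check.
    // Returns true when this call caused the peer to be removed.
    public bool ReportCheck(string address, bool responded)
    {
        int missed;
        if (!_missedChecks.TryGetValue(address, out missed))
        {
            return false; // not a peer we know about
        }

        if (responded)
        {
            _missedChecks[address] = 0; // any answer resets the counter
            return false;
        }

        missed++;
        if (missed >= MaxMissedChecks)
        {
            _missedChecks.Remove(address);
            return true;
        }

        _missedChecks[address] = missed;
        return false;
    }

    public bool IsKnown(string address)
    {
        return _missedChecks.ContainsKey(address);
    }
}
```

In a real service the OnStopping handler would call something like `Deregister` on every other instance, while a timer would call `ReportCheck` with the outcome of each ping. Resetting the counter on any successful answer keeps a single slow response from evicting a healthy node.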

Next post in the series, the cluster setup.

PK

I’m going to start a series of posts to explain how we made JBoss run on Windows Azure, not just in standalone mode but with full cluster support.

Let me start with one simple disclaimer: I’m NOT a Java guy, but I work with some very talented people under the same roof and in the same practice at Devoteam.

It all started when I posted on our internal collaboration platform about the new Windows Azure Starter Kit CTP being released for Java. Since my very first days at Devoteam I’ve been brainwashing my colleagues to try it out, so that post was the kick-off they needed. Knowing our customer base, which consists of highly mixed, hybrid environments with Java, .NET, SAP systems and so on, we wanted to create this:

“An integrated, enterprise-grade demo of a Windows Azure cloud setup, containing a .NET front-end, a JEE application in the cloud and a local mainframe instance; completely integrated.”

We used:

  • mod_cluster
  • jGroups
  • mod_proxy

The basic reason for choosing JBoss, besides the fact that our customers use it too, was that it’s open and free, and open means we can change whatever we want to make it work and fit in our environment.

One of those talented guys, Francois Hertay, modified the cluster discovery code (jGroups) that was already provided to make it more robust and, more importantly, to make it work with JBoss 7, because it currently works only with version 6. We still use the BLOB approach, but we changed it a little bit. In a typical enterprise scenario we have the proxies in front of the JBoss cluster, and mod_proxy is also what provides the much-needed state consistency, since, as you might already know, Windows Azure uses a non-sticky load balancer. Also, given the dynamic nature of Windows Azure instances, it was impossible to have static IPs for the proxies and the nodes, so it was obvious we needed a little something for:

  • Discovering the proxies and announcing them to the JBoss cluster
  • Making sure a proxy is removed when it goes down, and that a new node joining the cluster finds the proxies and registers them

We needed something different, as mod_proxy uses multicast to announce itself to the cluster and this is not supported on Windows Azure. The solution was to create our own home-brewed announcer service that takes care of this.

Our final setup was 1 WorkerRole for the Proxy and 1 WorkerRole for the JBoss node.

We chose this setup so we can scale the proxies and the JBoss nodes independently, which is pretty typical in an enterprise environment.

In the next post, I will explain how the Announcement Service works and what can be improved in it in the future.

‘Till then,

PK.

Today Windows Azure SDK 1.5 and Windows Azure AppFabric SDK 1.5 were released, fixing issues and bugs detected during the beta. There are also some new enhancements to it:

  • Re-architected emulator, which enables higher fidelity between local and cloud developments.
  • Support for uploading service certificates in csupload.exe.
  • A new csencrypt.exe tool to manage remote desktop encryption passwords.
  • Enhancements in the Windows Azure Tools for Visual Studio for developing and deploying cloud applications.
  • The ability to create ASP.NET MVC3 Web Roles and manage multiple service configurations in one cloud project.
  • Improved validation of Windows Azure packages to catch common errors like missing .NET assemblies and invalid connection strings.

and also some changes and new features on AppFabric:

  • Asynchronous Cloud Eventing – Distribute event notifications to occasionally connected clients (for example, phones, remote workers, kiosks, and so on)
  • Event-driven Service Oriented Architecture (SOA) – Building loosely coupled systems that can easily evolve over time
  • Advanced Intra-App Messaging – Load leveling and load balancing for building highly scalable and resilient applications

Best of all, the new features in Windows Azure AppFabric Service Bus are free. You still only pay for the number of connections/relays. You will see some new meters on your monthly bill, called “Entity Hours” and “Message Operations”, but as I said, you’re not going to be billed for those.

I’ll be going into details in the next few posts, so stay tuned :)

PK.

Earlier this week the Windows Azure platform was named Best Cloud Service at the Cloud Computing World Forum in London. Now in its third year, the Cloud Computing World Series Awards celebrate outstanding achievements in the IT market.  This year’s winners were selected by an independent panel of industry experts.

“It’s fantastic for us to see this type of recognition for the Windows Azure platform. We’re seeing companies creating business solutions in record times, reinforcing the new possibilities created by the cloud,” said Michael Newberry, Windows Azure lead, Microsoft UK.

Click here to read the press release about this award.

Source: Windows Azure Blog

Windows Azure SDK 1.4 was released yesterday with no breaking changes and a lot of stability fixes. You can get the bits from here. It includes an important fix, especially if you use a source control system: web.config could be locked while Windows Azure Tools still needed write access to it to update the machine key. There are also new features for the CDN (quoting from the release announcement):

Windows Azure CDN for Hosted Services
Developers can use the Windows Azure Web and VM roles as “origin” for objects to be delivered at scale via the Windows Azure Content Delivery Network. Static content in your website can be automatically edge-cached at locations throughout the United States, Europe, Asia, Australia and South America to provide maximum bandwidth and lower latency delivery of website content to users.

Serve secure content from the Windows Azure CDN
A new checkbox option in the Windows Azure management portal to enable delivery of secure content via HTTPS through any existing Windows Azure CDN account.

Get the bits and enjoy, I’m already updated :)

PK.

A lot of interesting things have been going on lately on the Windows Azure MVP list, and I’ll try to pick the best ones, among those I can share, and turn them into posts.

During an Azure bootcamp, a fellow Windows Azure MVP got a very interesting question: “What happens if someone is updating the BLOB and a request comes in for that BLOB to serve it?”

The answer came from Steve Marx pretty quickly and I’m just quoting his email:

“The bottom line is that a client should never receive corrupt data due to changing content.  This is true both from blob storage directly and from the CDN.

The way this works is:
  • Changes to block blobs (put blob, put block list) are atomic, in that there’s never a blob that has only partial new content.
  • Reading a blob all at once is atomic, in that we don’t respond with data that’s a mix of new and old content.
  • When reading a blob with range requests, each request is atomic, but you could always end up with corrupt data if you request different ranges at different times and stitch them together.  Using ETags (or If-Unmodified-Since) should protect you from this.  (Requests after the content changed would fail with “condition not met,” and you’d know to start over.)

Only the last point is particularly relevant for the CDN, and it reads from blob storage and sends to clients in ways that obey the same HTTP semantics (so ETags and If-Unmodified-Since work).

For a client to end up with corrupt data, it would have to be behaving badly… i.e., requesting data in chunks but not using HTTP headers to guarantee it’s still reading the same blob.  I think this would be a rare situation.  (Browsers, media players, etc. should all do this properly.)

Of course, updates to a blob don’t mean the content is immediately changed in the CDN, so it’s certainly possible to get old data due to caching.  It should just never be corrupt data due to mixing old and new content.”

So, as you can see from Steve’s reply, there is no chance of getting corrupt data, only old data, unlike with other vendors.
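The range-request case from Steve’s reply can be sketched as a chunked download that pins every request to the ETag seen on the first response. This is only an illustration under a few assumptions: it uses `HttpClient` rather than the StorageClient library, the class name `ConsistentRangeReader` is made up, and it assumes the blob is readable without extra authentication headers (for example, a public blob or a URL that already carries a signature).

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical helper: downloads a blob in ranges and fails fast if the
// blob changes between two range requests.
public static class ConsistentRangeReader
{
    public static async Task<byte[]> DownloadAsync(
        HttpClient client, Uri blobUri, int chunkSize)
    {
        using (var buffer = new MemoryStream())
        {
            EntityTagHeaderValue etag = null; // learned from the first response
            long offset = 0;
            long? total = null;               // unknown until the first response

            while (total == null || offset < total.Value)
            {
                var request = new HttpRequestMessage(HttpMethod.Get, blobUri);
                request.Headers.Range =
                    new RangeHeaderValue(offset, offset + chunkSize - 1);
                if (etag != null)
                {
                    // Only serve this range if the blob is still the same version.
                    request.Headers.IfMatch.Add(etag);
                }

                using (var response = await client.SendAsync(request))
                {
                    if (response.StatusCode == HttpStatusCode.PreconditionFailed)
                    {
                        throw new IOException(
                            "Blob changed during download; start over.");
                    }
                    response.EnsureSuccessStatusCode();

                    if (etag == null)
                    {
                        // If the server sends no ETag, later requests are unprotected.
                        etag = response.Headers.ETag;
                    }

                    byte[] chunk = await response.Content.ReadAsByteArrayAsync();
                    if (chunk.Length == 0)
                    {
                        break; // nothing more to read
                    }
                    buffer.Write(chunk, 0, chunk.Length);
                    offset += chunk.Length;

                    // Content-Range carries the full length; without it the
                    // server sent the whole blob in one response.
                    var range = response.Content.Headers.ContentRange;
                    total = (range != null && range.Length.HasValue)
                        ? range.Length.Value
                        : offset;
                }
            }

            return buffer.ToArray();
        }
    }
}
```

A caller would typically wrap `DownloadAsync` in a small retry loop and simply start the download again when the `IOException` is thrown, which is the "start over" behaviour the quoted email describes.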

PK.

In general, there are two kinds of updates you’ll mainly perform on Windows Azure. One of them is changing your application’s logic (the so-called business logic), e.g. the way you handle/read queues, how you process data, or even protocol updates. The other is schema updates/changes. I’m not referring to SQL Azure schema changes, which are a different scenario with a different approach, but to Table storage schema changes, and to be more precise only to changes on specific entity types because, as you already know, Table storage is schema-less. The same logic applies here as with In-Place upgrades: introduce a hybrid version, which handles both the new and the old version of your entity (with its newly introduced properties), and then proceed to your “final” version, which handles only the new version of your entities (and properties). It’s a very easy technique. I’ll explain how to add new properties and also how to remove them, although that’s a less likely scenario.

During my presentation at Microsoft DevDays “Make Web not War”, I created an example using a Weather service and an entity called WeatherEntry, so let’s use it. My class looks like this:

[DataServiceKey("PartitionKey", "RowKey")]
public class WeatherEntry : TableServiceEntity
{
    public WeatherEntry()
    {
        PartitionKey = "athgr";
        // Reverse-chronological key so the newest entries sort first.
        // D19 zero-pads the tick count so string order matches numeric order.
        RowKey = string.Format("{0:D19}_{1}", DateTime.MaxValue.Ticks - DateTime.Now.Ticks, Guid.NewGuid());
    }
    public DateTime TimeOfCapture { get; set; }
    public string Temperature { get; set; }
}

There is nothing special about this class. It has two custom properties, TimeOfCapture and Temperature, and I’m going to make one small change: add a string property called “SchemaVersion”, which is needed to achieve the functionality I want. When I want to create a new entry, all I do now is instantiate a WeatherEntry, set the values and use a helper method called AddEntry to persist my changes.

public void AddEntry(string temperature, DateTime timeofc)
{
    // Every new entity is stamped with the schema version it was written with.
    this.AddObject("WeatherData", new WeatherEntry { TimeOfCapture = timeofc, Temperature = temperature, SchemaVersion = "1.0" });
    this.SaveChanges();
}

I’m using TableServiceContext from the newly released StorageClient, so methods like UpdateObject, DeleteObject, AddObject etc. exist in my data service context, which is where the AddEntry helper method lives. At the moment my Table schema looks like this:

It’s pretty obvious there is no special handling during saving of my entities but this is about to change in my hybrid version.

The hybrid

I made some changes to my class and added a new property. It holds the area where the temperature sample was taken, in my case Spata, where Athens International Airport is.

My class looks like this now:


[DataServiceKey("PartitionKey", "RowKey")]
public class WeatherEntry : TableServiceEntity
{
    public WeatherEntry()
    {
        PartitionKey = "athgr";
        // Reverse-chronological key so the newest entries sort first.
        // D19 zero-pads the tick count so string order matches numeric order.
        RowKey = string.Format("{0:D19}_{1}", DateTime.MaxValue.Ticks - DateTime.Now.Ticks, Guid.NewGuid());
    }
    public DateTime TimeOfCapture { get; set; }
    public string Temperature { get; set; }
    public string SampleArea { get; set; }    // new in the hybrid version
    public string SchemaVersion { get; set; }
}

So, this hybrid client somehow has to handle entities from both version 1 and version 2, because my schema is already on version 2. How do you do that? The main idea is that you retrieve an entity from Table storage and check whether SampleArea and SchemaVersion have a value. If they don’t, you put in a default value and save them. In my case the schema version number has to be 1.5, as this is the default schema number for this hybrid solution. One key point in this procedure: before you upgrade your client to this hybrid, you roll out an update that enables the “IgnoreMissingProperties” flag on your TableServiceContext. If IgnoreMissingProperties is true, then when a version 1 client tries to access entities which are on version 2 and have those new properties, it WON’T raise an exception; it will just ignore them.

var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
var context = new WeatherServiceContext(account.TableEndpoint.ToString(), account.Credentials);

/* Ignore missing properties on my entities */
context.IgnoreMissingProperties = true;

Remember, you have to roll-out an update BEFORE you upgrade to this hybrid.

Whenever I update an entity in Table storage, I check its schema version, and if it’s still “1.0” (or missing altogether) I set it to “1.5” and put a default value on SampleArea:

public void UpdateEntry(WeatherEntry wEntry)
{
    /* Entities written before SchemaVersion existed come back with null,
     * so null/empty is treated the same as "1.0" */
    if (string.IsNullOrEmpty(wEntry.SchemaVersion) || wEntry.SchemaVersion == "1.0")
    {
        /* If schema version is 1.0, update it to 1.5
         * and set a default value on SampleArea */
        wEntry.SchemaVersion = "1.5";
        wEntry.SampleArea = "Spata";
    }
    /* Put some try catch here to
     * catch concurrency exceptions */
    this.UpdateObject(wEntry);
    this.SaveChanges();
}

My schema now looks like this. Notice that both versions of my entities co-exist and are handled just fine by my application.

Upgrading to version 2.0

Upgrading to version 2.0 is now easy. All you have to do is change the default schema number when you create a new entity to version 2.0 and of course update your “UpdateEntry” helper method to check if version is 1.5 and update the value to 2.0.

this.AddObject("WeatherData", new WeatherEntry { TimeOfCapture = timeofc, Temperature = temperature, SchemaVersion = "2.0" });

and

public void UpdateEntry(WeatherEntry wEntry)
{
    /* == is null-safe, unlike calling Equals on a possibly null version */
    if (wEntry.SchemaVersion == "1.5")
    {
        /* If schema is version 1.5 it already has a default
         * value, all we have to do is update schema version so
         * our system won't ignore the default value */
        wEntry.SchemaVersion = "2.0";
    }
    /* Put some try catch here to
     * catch concurrency exceptions */
    this.UpdateObject(wEntry);
    this.SaveChanges();
}

Whenever you retrieve a value from Table Storage, you have to check if it’s on version 2.0. If it is, you can safely use its SampleArea value which is not the default any more. That’s because schema version is changed when you actually call “UpdateEntry” which means you had the chance to change SampleArea to a non-default value. But if it’s on version 1.5 you have to ignore it or update it to a new, correct value.
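The read-side rule above could be captured in a small helper like the one below. It is a sketch only: `WeatherReading` is a stripped-down stand-in for the WeatherEntry entity, and `SchemaRules` and `TryGetTrustedSampleArea` are hypothetical names that don’t appear in my project.

```csharp
// Hypothetical stand-in for WeatherEntry with just the fields used here.
public class WeatherReading
{
    public string SampleArea { get; set; }
    public string SchemaVersion { get; set; }
}

public static class SchemaRules
{
    // Returns true, and the stored area, only for entities already on 2.0.
    // Entities on 1.5 (or older) still carry the default value, so the
    // caller should ignore it or replace it with a correct one.
    public static bool TryGetTrustedSampleArea(WeatherReading entry, out string area)
    {
        if (entry != null
            && entry.SchemaVersion == "2.0"
            && !string.IsNullOrEmpty(entry.SampleArea))
        {
            area = entry.SampleArea;
            return true;
        }

        area = null;
        return false;
    }
}
```

Keeping this check in one place means the rest of the application never has to compare version strings itself, which makes the later clean-up, once everything is on 2.0, a one-line change.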

If you do want to use the default value anyway, you can create a temporary worker role which will scan the whole table and update all of your schema version numbers to 2.0.

How about when you remove properties

That’s a really easy modification. If you remove a property, you can pass the SaveChangesOptions value ReplaceOnUpdate to SaveChanges(), which will overwrite your entity with the new schema. Don’t forget to update your schema version number to something unique, and put some checks into your application to avoid failures when it tries to read properties that no longer exist in the newer schema version.

this.SaveChanges(SaveChangesOptions.ReplaceOnUpdate);

That’s all for today!

P.K

Performing an in-place upgrade on Windows Azure to change your service definition file is not possible unless you stop the service, upgrade and then start it again. Otherwise you can do a VIP Swap. VIP stands for Virtual IP, and a VIP swap can be done either from the Developer portal or through the Service Management API by calling the “Swap Deployment” method.

If you use VIP Swap, and as long as the endpoints in the old and the new service definition are identical, the upgrade process is seamless and pretty straightforward, without any service interruption. But if, for example, you introduce a new endpoint or delete an old one, then this process is not possible and you have to stop, upgrade, start.

So, how can you perform this operation from the Developer portal? Simply log on to your account, go to the Summary page, open your target project, open the service and then upload the new service definition file to Staging. Now, if you click Run on Staging, both versions of your service will work just fine (one in Production, one in Staging). When you hit the Upgrade button (the one with the arrows in the middle), Azure will promote your service from Staging to Production, changing the service definition file and completing the process by performing a VIP swap in the background.

You can perform the same operation using Service Management API and “Swap Deployment” method as I mentioned before.
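As a rough sketch of what that call involves: as far as I recall from the Service Management API documentation, “Swap Deployment” is a POST to the hosted service URL with a small XML body naming the production and staging deployments. The helper below only builds the URL and the body. The class and method names are hypothetical, the request also needs an x-ms-version header and a management certificate, and the exact format should be checked against the MSDN reference before you rely on it.

```csharp
using System.Xml.Linq;

// Hypothetical helper that prepares a "Swap Deployment" request.
public static class SwapDeploymentRequest
{
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/windowsazure";

    // The request is POSTed to the hosted service, not to a deployment.
    public static string BuildUri(string subscriptionId, string serviceName)
    {
        return string.Format(
            "https://management.core.windows.net/{0}/services/hostedservices/{1}",
            subscriptionId, serviceName);
    }

    // Body naming the current production deployment and the deployment
    // (normally the one in Staging) that should take its place.
    public static string BuildBody(string productionName, string stagingName)
    {
        var swap = new XElement(Ns + "Swap",
            new XElement(Ns + "Production", productionName),
            new XElement(Ns + "SourceDeployment", stagingName));

        return swap.ToString(SaveOptions.DisableFormatting);
    }
}
```

The operation is asynchronous on the service side, so a successful response only means the swap was accepted. You still have to poll the operation status to know when it has completed.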

For more information about VIP Swap using Service Management API, go here –> MSDN, Azure Service Management API, VIP Swap

Recently, I’ve been looking for a way to persist the status of an idling Workflow on WF4. There is a way to use SQL Azure to achieve this, after modifying the scripts because they contain unsupported T-SQL commands, but it’s total overkill to use it just to persist WF information if you’re not using the RDBMS for another reason.

I decided to modify the FilePersistence.cs of the Custom Persistence Service sample in WF 4 Samples Library and make it work with Windows Azure Blob storage. I’ve created two new methods to Serialize and Deserialize information to/from Blob storage.

Here is some code:


// Needs: using System; using System.IO;
//        using Microsoft.WindowsAzure; using Microsoft.WindowsAzure.StorageClient;

// Container names may only contain lowercase letters, digits and hyphens.
private const string ContainerName = "workflow-persistence";

private void SerialiazeToAzureStorage(byte[] workflowBytes, Guid id)
{
    var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var container = account.CreateCloudBlobClient().GetContainerReference(ContainerName);

    // One blob per workflow instance, named after the instance id.
    var blob = container.GetBlobReference(id.ToString());

    blob.Properties.ContentType = "application/octet-stream";
    // Wrap the serialized workflow in a stream positioned at 0 and upload it.
    using (var stream = new MemoryStream(workflowBytes))
    {
        blob.UploadFromStream(stream);
    }
}

private byte[] DeserialiazeFromAzureStorage(Guid id)
{
    var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var container = account.CreateCloudBlobClient().GetContainerReference(ContainerName);

    var blob = container.GetBlobReference(id.ToString());

    return blob.DownloadByteArray();
}

Just make sure you’ve created the “workflow-persistence” blob container before using these methods. Note that container names can’t contain underscores.

PK.

Once again, Microsoft proved that it values its customers, whether they are big enterprises or small startups. We’re a small privately held business and I personally have a major role in it, as I’m one of the founders. Recently, I’ve been giving some pretty nice presentations and a bunch of sessions for Microsoft Hellas about Windows Azure and Cloud computing in general.

I was using the CTP account(s) I’ve had since PDC 08, and I had a lot of services running there from time to time, all for demo purposes. But with the 2nd commercial launch wave Greece was included, and I had to upgrade my subscription and start paying for it. I was OK with that, because the MSDN Premium subscription includes 750 hours per month, SQL Azure databases and other stuff for free. I went through the upgrade process from CTP to Paid, everything went smoothly, and there I was, waiting for my CTP account to switch to read-only mode and eventually “fade away”. During that process, I made a small mistake: I miscalculated the number of instances I had running. I actually missed some. That turned out to be a mistake that would cost me some serious money for showcase/marketing/demo projects running on Windows Azure.

About two weeks ago, I had an epiphany during the day and I was like “Oh, crap.. Did I turn that project off? How many instances do I have running?”. I logged on to the billing portal and, sadly for me, I had been charged for something like 4500 hours because of the forgotten instances and my miscalculation. You see, I had done a demo about switching between instance sizes, and I had some instances running as big VMs. That’s four (4) times the price per hour.

It was clearly my mistake and I had to pay for it (literally!). But then I tweeted about my bad luck, to help others avoid the same mistake, the very thing I had been warning my clients about all this time. Some people from Microsoft got interested in my situation, I explained what happened, and we ended up with a pretty good deal just 3 days after I tweeted. But that was an exception, and you certainly SHOULDN’T count on it.

Bottom line: be careful and plan correctly. Mistakes do happen, but the more careful we are, the rarer they will be.

* I want to publicly say thank you to anyone who was involved in this and helped me sort things out so quickly.

PK.
