Tag Archives: Azure

Timestamp property on Windows Azure Table entity

Every entity in Azure Table Storage has PartitionKey, RowKey and Timestamp property. PartitionKey gives the “bucket” where the entity will be stored (entities in one partition can be on more machines, though). And the RowKey is simply the key of that row. Queries where you’re matching both keys are fast. Matches inside partition (you know PartitionKey but only part (or none) of RowKey) are slower. Cross partition matches even slower. You get the idea. But what about the Timestamp?

The Timestamp property is maintained by server and it’s the time the entity was last modified. Simple as it is. Because the property is always there, I thought it would be good to actually use it. But it’s not. This property is not being “indexed” (not even slightly :) ) or anything like that. That means any query including condition on it will be as slow as table scan (or any other property condition). And that’s something very very slow, especially if you have dozens of partitions with millions of records and you’re doing range condition (not mentioning the limit on one batch for result).

And I learned that by my experience. :) Because the secondary indices are not here yet, the only option, if you’re doing really a lot of queries based on date&time/last modified to (ab)use RowKey (or build and maintain secondary index yourself).

You might wonder how to create such secondary index. Easy, like in 1980s. You already have table (BTW did you noticed, how easy is to store something into table, but it’s order of magnitude harder (read slower) to get something from it, except if you’re doing PartitionKey&RowKey exact match?), if the inserts/updates/deletes are not super fast and you’re not fighting for throughput you can compute the key for secondary index directly while doing the operation and store it in another table with both keys from original table. But if you need the operation to be super fast, you can just put the data into queue message (if it’s too big for message (64kB in Jan 2013) you can put the data into blog and store just reference to blob) and use worker role(s) to process the data, compute secondary keys and insert/update/delete into table(s). Boring? Yes. Old school? Yes. But Azure Table Storage isn’t exactly smart storage, but it scales. :)

Let’s hope secondary indices (even stale would be good) will be added in the future.

Azure’s Silverlight SSMS to help manual reconnecting in sqlcmd

I’m a console guy. I like to work from whatever is text based (yes, I do remember DOS (the real one, not the black hole window in Windows), I’m old ;) ). That’s why I’m doing most of my MS SQL Server/Azure SQL database work though sqlcmd. But on Azure SQL I’m often being disconnected because I’m reading the results, testing something etc. and the connection is simply closed by server to save resources. Reconnecting is pain and slows me down.

But yesterday I was working in Silverlight version of SSMS that’s available on Azure (although you can connect to any other server), just to try it. Well it’s little bit slower, but it has a query window so I’m able to type command and that’s what I need (maybe some shortcut to execute it). And I realized, that even if I’m not doing anything for couple of minutes, the connection is reopened when I start doing something.

That’s nice. I don’t have to install full blown SQL Server Management Studio and still have comfortable work.

Using Firebird inside Azure (without VM role)

Recently I was playing with Amazon EC2 trying what I could use it for. I was playing with Firebird there as well. But in fact you have virtual machine you can use. So using Firebird there wasn’t that hard. But I got and idea about Azure, because it’s more platform for applications than computers.

Then the VM role was introduced and the challenge was somehow not so challenging. But …

Yes, it was still in my mind. My rough idea was abuse Firebird Embedded and load it inside Web or Worker role. Only problem was where to store the database. Sure, the scaling will be compromised, but it’s just “try it and see what could be done and let others (not) use it”. 8-) The CloudDrive solved my problems about where to store the database.

So it was no-brainer to try it.

First some helper for CloudDrive use.

public class CloudDriveHelper : IDisposable
{
	CloudDrive _drive;
	
	public string DriveLetter { get; private set; }

	public CloudDriveHelper(CloudStorageAccount storageAccount, string name, int cacheSize = 0, int driveSize = 1024)
	{
		CloudBlobClient client = storageAccount.CreateCloudBlobClient();
		CloudBlobContainer container = client.GetContainerReference("drives");

		container.CreateIfNotExist();

		string diskName = string.Format("{0}.vhd", name);
		_drive = storageAccount.CreateCloudDrive(container.GetPageBlobReference(diskName).Uri.ToString());
		try
		{
			_drive.Create(driveSize);
		}
		catch (CloudDriveException)
		{ }
		DriveLetter = _drive.Mount(cacheSize, DriveMountOptions.None);
	}

	public void Dispose()
	{
		Dispose(true);
	}

	~CloudDriveHelper()
	{
		Dispose(false);
	}

	void Dispose(bool disposing)
	{
		if (disposing)
		{
			_drive.Unmount();

			DriveLetter = null;
			_drive = null;
		}
	}
}

I added reference to FirebirdSql.Data.FirebirdClient assembly. I played with Firebird Embedded in package, but putting it to blob storage via CloudDrive would be easier, maybe next time. :) The environment in Azure is x64 (for Web and Worker roles, run by WaWebHost and WaWorkerHost respectively), so don’t forget to use proper version. Anyway, then I abused Web role to see some results.

Simple controller action.

public ActionResult Index()
{
	using (CloudDriveHelper drive = new CloudDriveHelper(Global.Account /* could be CloudStorageAccount.DevelopmentStorageAccount as well */, "firebird", driveSize: 1024))
	{
		string database = Path.Combine(drive.DriveLetter, "SomeDatabase.fdb");

		TestClass.CreateDatabase(database);

		ViewData["FirebirdVersion"] = TestClass.GetServerVersion(database);
		ViewData["SomeData"] = TestClass.SomeQuery(database).ToArray();

		return View();
	}
}

And some methods to do actual work.

public static void CreateDatabase(string databasePath)
{
	FbConnection.CreateDatabase(CreateConnectionString(databasePath), true);
}

public static string GetServerVersion(string databasePath)
{
	using (FbConnection conn = new FbConnection(CreateConnectionString(databasePath)))
	{
		conn.Open();
		return conn.ServerVersion;
	}
}

public static IEnumerable<Tuple<string, object>> SomeQuery(string databasePath)
{
	using (FbConnection conn = new FbConnection(CreateConnectionString(databasePath)))
	{
		conn.Open();
		using (FbCommand cmd = conn.CreateCommand())
		{
			cmd.CommandText = "select * from mon$database";
			using (FbDataReader reader = cmd.ExecuteReader())
			{
				if (reader.Read())
				{
					for (int i = 0; i < reader.FieldCount; i++)
					{
						yield return Tuple.Create(reader.GetName(i), reader[i]);
					}
				}
			}
		}
	}
}

Well, it did worked OK. Before you think how cool is that I have some bad news. The cloud computing is mainly about scaling and elasticity. With this you have one drive and (have to have) one instance of Firebird working with it – you’re not scaling. You can’t scale with this solution. So it’s more about being concept. However I came with two possible options, that are more realistic.

First is having one special Worker role processing some data and storing it in Firebird database (for whatever reason). The Azure machines are quite powerful and if you have everything there why to setup your own server… And I think this can be worth in some scenarios (apart not being fault tolerant etc.).

Other one builds on top of previous solution and abuses Firebird’s external tables. You can load or store data via external tables to blob storage via CloudDrive and (re-)use already written logic in stored procedures (or triggers). Little crazy, I know hence I’ve not tried it (so maybe it’s not doable). But technologies are here to use these on the edge.

As I said, the real world usage of all this isn’t big, but as a exploration project it was fun. :)

SQL Azure and Entity Framework

From time to time I get a question about SQL Azure and Entity Framework. Can I use it? Does it work? Etc. Because the SQL Azure is in fact a special version (you can check unsupported and partially supported T-SQL statements) of MS SQL Server it works without any hassle. That’s the short story.

The following lines will quickly guide you through the steps of using SQL Azure with Entity Framework. You need to have and Azure account ready. There’s no way to install SQL Azure to your own server or something like that. First login in to sql.azure.com and select some project you have there. This will bring you this screen.

By default only master database is created and it’s up to you to create additional database. Right now you can create from two editions and few sizes: Web and Business and from 1GB to 50GB. The price you’ll pay depends on this selection (but not only). For purpose of this article I created 1GB Web edition database.

On the main screen you have a button to get a connection string. I’m often using it, as it’s easier for me to get information from it than compose it from pieces on screen. :) Mine is:

Data Source=lskqxi46a0.database.windows.net;Initial Catalog=test;Persist Security Info=True;User ID=jiri@lskqxi46a0;Password=******;MultipleActiveResultSets=True;Encrypt=True

After these initial steps we can start using the database. Because I was too lazy to create tables and so on manually I tested one feature of Entity Framework I’m not much familiar with. It’s Model First. You simply create the conceptual model and later generate SQL script from it for your database (yes, Firebird has support too ;) ). I created basic “blogging system” model. Author has many Posts. Posts have many Tags. Right clicking on model and selecting Generate database from model resulted in my case in simple script.

-- --------------------------------------------------
-- Entity Designer DDL Script for SQL Server 2005, 2008, and Azure
-- --------------------------------------------------
-- Date Created: 01/29/2011 16:40:18
-- Generated from EDMX file: C:\Users\Jiri\Desktop\ConsoleApplication3\ConsoleApplication3\Model1.edmx
-- --------------------------------------------------

SET QUOTED_IDENTIFIER OFF;
GO
USE [test];
GO
IF SCHEMA_ID(N'dbo') IS NULL EXECUTE(N'CREATE SCHEMA [dbo]');
GO

-- --------------------------------------------------
-- Dropping existing FOREIGN KEY constraints
-- --------------------------------------------------

IF OBJECT_ID(N'[dbo].[FK_BlogPostTag_BlogPost]', 'F') IS NOT NULL
    ALTER TABLE [dbo].[BlogPostTag] DROP CONSTRAINT [FK_BlogPostTag_BlogPost];
GO
IF OBJECT_ID(N'[dbo].[FK_BlogPostTag_Tag]', 'F') IS NOT NULL
    ALTER TABLE [dbo].[BlogPostTag] DROP CONSTRAINT [FK_BlogPostTag_Tag];
GO
IF OBJECT_ID(N'[dbo].[FK_BlogPostAuthor]', 'F') IS NOT NULL
    ALTER TABLE [dbo].[BlogPosts] DROP CONSTRAINT [FK_BlogPostAuthor];
GO

-- --------------------------------------------------
-- Dropping existing tables
-- --------------------------------------------------

IF OBJECT_ID(N'[dbo].[BlogPosts]', 'U') IS NOT NULL
    DROP TABLE [dbo].[BlogPosts];
GO
IF OBJECT_ID(N'[dbo].[Authors]', 'U') IS NOT NULL
    DROP TABLE [dbo].[Authors];
GO
IF OBJECT_ID(N'[dbo].[Tags]', 'U') IS NOT NULL
    DROP TABLE [dbo].[Tags];
GO
IF OBJECT_ID(N'[dbo].[BlogPostTag]', 'U') IS NOT NULL
    DROP TABLE [dbo].[BlogPostTag];
GO

-- --------------------------------------------------
-- Creating all tables
-- --------------------------------------------------

-- Creating table 'BlogPosts'
CREATE TABLE [dbo].[BlogPosts] (
    [ID] int IDENTITY(1,1) NOT NULL,
    [Created] datetime  NOT NULL,
    [Heading] nvarchar(max)  NOT NULL,
    [Content] nvarchar(max)  NOT NULL,
    [Author_ID] int  NOT NULL
);
GO

-- Creating table 'Authors'
CREATE TABLE [dbo].[Authors] (
    [ID] int IDENTITY(1,1) NOT NULL,
    [Name_FirstName] nvarchar(max)  NOT NULL,
    [Name_LastName] nvarchar(max)  NOT NULL,
    [LastLoggedIn] datetime  NOT NULL
);
GO

-- Creating table 'Tags'
CREATE TABLE [dbo].[Tags] (
    [ID] int IDENTITY(1,1) NOT NULL,
    [Name] nvarchar(max)  NOT NULL
);
GO

-- Creating table 'BlogPostTag'
CREATE TABLE [dbo].[BlogPostTag] (
    [BlogPosts_ID] int  NOT NULL,
    [Tags_ID] int  NOT NULL
);
GO

-- --------------------------------------------------
-- Creating all PRIMARY KEY constraints
-- --------------------------------------------------

-- Creating primary key on [ID] in table 'BlogPosts'
ALTER TABLE [dbo].[BlogPosts]
ADD CONSTRAINT [PK_BlogPosts]
    PRIMARY KEY CLUSTERED ([ID] ASC);
GO

-- Creating primary key on [ID] in table 'Authors'
ALTER TABLE [dbo].[Authors]
ADD CONSTRAINT [PK_Authors]
    PRIMARY KEY CLUSTERED ([ID] ASC);
GO

-- Creating primary key on [ID] in table 'Tags'
ALTER TABLE [dbo].[Tags]
ADD CONSTRAINT [PK_Tags]
    PRIMARY KEY CLUSTERED ([ID] ASC);
GO

-- Creating primary key on [BlogPosts_ID], [Tags_ID] in table 'BlogPostTag'
ALTER TABLE [dbo].[BlogPostTag]
ADD CONSTRAINT [PK_BlogPostTag]
    PRIMARY KEY NONCLUSTERED ([BlogPosts_ID], [Tags_ID] ASC);
GO

-- --------------------------------------------------
-- Creating all FOREIGN KEY constraints
-- --------------------------------------------------

-- Creating foreign key on [BlogPosts_ID] in table 'BlogPostTag'
ALTER TABLE [dbo].[BlogPostTag]
ADD CONSTRAINT [FK_BlogPostTag_BlogPost]
    FOREIGN KEY ([BlogPosts_ID])
    REFERENCES [dbo].[BlogPosts]
        ([ID])
    ON DELETE NO ACTION ON UPDATE NO ACTION;
GO

-- Creating foreign key on [Tags_ID] in table 'BlogPostTag'
ALTER TABLE [dbo].[BlogPostTag]
ADD CONSTRAINT [FK_BlogPostTag_Tag]
    FOREIGN KEY ([Tags_ID])
    REFERENCES [dbo].[Tags]
        ([ID])
    ON DELETE NO ACTION ON UPDATE NO ACTION;

-- Creating non-clustered index for FOREIGN KEY 'FK_BlogPostTag_Tag'
CREATE INDEX [IX_FK_BlogPostTag_Tag]
ON [dbo].[BlogPostTag]
    ([Tags_ID]);
GO

-- Creating foreign key on [Author_ID] in table 'BlogPosts'
ALTER TABLE [dbo].[BlogPosts]
ADD CONSTRAINT [FK_BlogPostAuthor]
    FOREIGN KEY ([Author_ID])
    REFERENCES [dbo].[Authors]
        ([ID])
    ON DELETE NO ACTION ON UPDATE NO ACTION;

-- Creating non-clustered index for FOREIGN KEY 'FK_BlogPostAuthor'
CREATE INDEX [IX_FK_BlogPostAuthor]
ON [dbo].[BlogPosts]
    ([Author_ID]);
GO

-- --------------------------------------------------
-- Script has ended
-- --------------------------------------------------

Now we can finally write some code to let Entity Framework do some dirty job. Let's create some authors and issue a non-trivial query to see what's what.

using (Model1Container ctx = new Model1Container())
{
	Author a1 = new Author();
	a1.Name.FirstName = "Foo";
	a1.Name.LastName = "Bar";
	a1.LastLoggedIn = DateTime.Now;
	Author a2 = new Author();
	a2.Name.FirstName = "Jiri";
	a2.Name.LastName = "Cincura";
	a2.LastLoggedIn = DateTime.Now;
	ctx.Authors.AddObject(a1);
	ctx.Authors.AddObject(a2);
				
	ctx.SaveChanges();
	Debug.Assert(a1.ID != default(int));
	Debug.Assert(a2.ID != default(int));

	var blogPosts = ctx.Authors
		.Where(a => a.Name.LastName == "Cincura")
		.SelectMany(a => a.BlogPosts)
		.Where(bp => bp.Tags.Any(t => t.Name == "Databases" || t.Name == "Azure" || t.Name == "Cloud"))
		.Select(bp => new { bp.Heading, bp.Created });
	Console.WriteLine((blogPosts as ObjectQuery).ToTraceString());
	foreach (var blogPost in blogPosts)
	{
		Console.WriteLine(blogPost.Heading);
	}
}

Surprise. Nothing bloodthirsty. 8-) It works as expected. Two authors are created. You can check that by querying the table or by looking at database size at sql.azure.com. The query itself is standard T-SQL query you can run on MS SQL Server as well, nothing magic.

SELECT 
[Extent1].[ID] AS [ID], 
[Extent2].[Heading] AS [Heading], 
[Extent2].[Created] AS [Created]
FROM  [dbo].[Authors] AS [Extent1]
INNER JOIN [dbo].[BlogPosts] AS [Extent2] ON [Extent1].[ID] = [Extent2].[Author_ID]
WHERE (N'Cincura' = [Extent1].[Name_LastName]) AND ( EXISTS (SELECT 
	1 AS [C1]
	FROM  [dbo].[BlogPostTag] AS [Extent3]
	INNER JOIN [dbo].[Tags] AS [Extent4] ON [Extent4].[ID] = [Extent3].[Tags_ID]
	WHERE ([Extent2].[ID] = [Extent3].[BlogPosts_ID]) AND ([Extent4].[Name] IN (N'Databases',N'Azure',N'Cloud'))
))

Theory tells us, that if SQL Azure is more or less MS SQL Server database using more or less same wire protocol and more or less same T-SQL it should work. And it does. Here you have a small proof. Happy database-clouding.