Monthly Archives: April 2011

Using Firebird inside Azure (without VM role)

Recently I was playing with Amazon EC2 trying what I could use it for. I was playing with Firebird there as well. But in fact you have virtual machine you can use. So using Firebird there wasn’t that hard. But I got and idea about Azure, because it’s more platform for applications than computers.

Then the VM role was introduced and the challenge was somehow not so challenging. But …

Yes, it was still in my mind. My rough idea was abuse Firebird Embedded and load it inside Web or Worker role. Only problem was where to store the database. Sure, the scaling will be compromised, but it’s just “try it and see what could be done and let others (not) use it”. 8-) The CloudDrive solved my problems about where to store the database.

So it was no-brainer to try it.

First some helper for CloudDrive use.

public class CloudDriveHelper : IDisposable
{
	CloudDrive _drive;
	
	public string DriveLetter { get; private set; }

	public CloudDriveHelper(CloudStorageAccount storageAccount, string name, int cacheSize = 0, int driveSize = 1024)
	{
		CloudBlobClient client = storageAccount.CreateCloudBlobClient();
		CloudBlobContainer container = client.GetContainerReference("drives");

		container.CreateIfNotExist();

		string diskName = string.Format("{0}.vhd", name);
		_drive = storageAccount.CreateCloudDrive(container.GetPageBlobReference(diskName).Uri.ToString());
		try
		{
			_drive.Create(driveSize);
		}
		catch (CloudDriveException)
		{ }
		DriveLetter = _drive.Mount(cacheSize, DriveMountOptions.None);
	}

	public void Dispose()
	{
		Dispose(true);
	}

	~CloudDriveHelper()
	{
		Dispose(false);
	}

	void Dispose(bool disposing)
	{
		if (disposing)
		{
			_drive.Unmount();

			DriveLetter = null;
			_drive = null;
		}
	}
}

I added reference to FirebirdSql.Data.FirebirdClient assembly. I played with Firebird Embedded in package, but putting it to blob storage via CloudDrive would be easier, maybe next time. :) The environment in Azure is x64 (for Web and Worker roles, run by WaWebHost and WaWorkerHost respectively), so don’t forget to use proper version. Anyway, then I abused Web role to see some results.

Simple controller action.

public ActionResult Index()
{
	using (CloudDriveHelper drive = new CloudDriveHelper(Global.Account /* could be CloudStorageAccount.DevelopmentStorageAccount as well */, "firebird", driveSize: 1024))
	{
		string database = Path.Combine(drive.DriveLetter, "SomeDatabase.fdb");

		TestClass.CreateDatabase(database);

		ViewData["FirebirdVersion"] = TestClass.GetServerVersion(database);
		ViewData["SomeData"] = TestClass.SomeQuery(database).ToArray();

		return View();
	}
}

And some methods to do actual work.

public static void CreateDatabase(string databasePath)
{
	FbConnection.CreateDatabase(CreateConnectionString(databasePath), true);
}

public static string GetServerVersion(string databasePath)
{
	using (FbConnection conn = new FbConnection(CreateConnectionString(databasePath)))
	{
		conn.Open();
		return conn.ServerVersion;
	}
}

public static IEnumerable<Tuple<string, object>> SomeQuery(string databasePath)
{
	using (FbConnection conn = new FbConnection(CreateConnectionString(databasePath)))
	{
		conn.Open();
		using (FbCommand cmd = conn.CreateCommand())
		{
			cmd.CommandText = "select * from mon$database";
			using (FbDataReader reader = cmd.ExecuteReader())
			{
				if (reader.Read())
				{
					for (int i = 0; i < reader.FieldCount; i++)
					{
						yield return Tuple.Create(reader.GetName(i), reader[i]);
					}
				}
			}
		}
	}
}

Well, it did worked OK. Before you think how cool is that I have some bad news. The cloud computing is mainly about scaling and elasticity. With this you have one drive and (have to have) one instance of Firebird working with it – you’re not scaling. You can’t scale with this solution. So it’s more about being concept. However I came with two possible options, that are more realistic.

First is having one special Worker role processing some data and storing it in Firebird database (for whatever reason). The Azure machines are quite powerful and if you have everything there why to setup your own server… And I think this can be worth in some scenarios (apart not being fault tolerant etc.).

Other one builds on top of previous solution and abuses Firebird’s external tables. You can load or store data via external tables to blob storage via CloudDrive and (re-)use already written logic in stored procedures (or triggers). Little crazy, I know hence I’ve not tried it (so maybe it’s not doable). But technologies are here to use these on the edge.

As I said, the real world usage of all this isn’t big, but as a exploration project it was fun. :)

Some thoughts on denormalization

From time to time I have an idea for helpful project I could do to help myself or people around me to finish some task faster/easier. And, of course, it often involves database. And though I’m a strong believer in normalization (3NF is a must), at least during initial design, sometimes I see this project being partly simple and partly I want it done in no time (because I’m doing it in my own spare time) I think about storing data denormalized and doing the work in application.

Classical variation is user and his/her permissions. It’s a standard 1:N case (also might be M:N). The proper way is to create two tables and use foreign key to ensure data integrity. But often you feel it’s not so crucial function and maybe you’ll have less than five permissions, you think, storing it in column comma separated is good idea and will cut the time significantly.

You might be right. But you feel it’s not correct. It’s bad. It’s not for future extending of application (and we all know it’ll happen 8-)). Today I realized I can have all from both worlds – good design and quick development.

Design it normalized, because it’s what you should do. But then you can create a simple view where you use some kind of “LIST” function (i.e. Firebird has exactly that named one). This view will create you denormalized form of data. With a small help from triggers you can also update that view hence underlying data (left as an exercise for reader).

select list(r, ',') from
(
  select 'r' as r from rdb$database
  union all
  select 'r' from rdb$database
  union all
  select 'r' from rdb$database
);

Because you write this only once (yes, probably you have to rename some table next time), for next “fun” project you’re designing tables as it should be and because you have it already ready in tool-belt you have the ease of development too.

My head-split is solved. :)

Gopas TechEd 2011 – mé přednášky

Tento rok, podobně jako v předchozích letech, se uskuteční velká akce s názvem Gopas TechEd. Konkrétně 2.-5. května. Celý program je již dostupný na webu. Nicméně já bych vám doporučil středu 12:45 – 14:00 a 15:45 – 17:00. ;)

První přednáška nese název “Entity Framework Futures” a jde o představení novinek od vydání Entity Frameworku 4.0. Především tedy Code First, DbContext API atp. Celá anotace:

Entity Framework verze 4 se ještě nestihl ohřát a již se můžeme podívat, jakým směrem je budou další verze vyvíjet. Jednou z nejvíce skloňovaných novinek je Code First přístup. Ale nejen to. Podpora pro Table-Valued funkce, migrace DB, zjednodušený databinding, lokální dotazy, …, abych vyjmenoval některé. Prozkoumejme společně tyto připravované novinky.

Druhá v pořadí je přednáška s názvem “Task Parallel Library”. Moc originální název to není, prostě název balíku vylepšení v .NET Frameworku 4, které ulehčují psaní paralelních programů, především těch, které využívají fine-grained paralelismus. Celá anotace:

Vícejádrové procesory a stanice s více procesory už nejsou hudbou budoucnosti, ale realitou. A tato oblast se každým měsícem více a více rozvijí. Bohužel .NET Framework až do verze 4 neposkytoval mnoho prostředků pro příjemné využití všech výpočetních jednotek. Task Parallel Library tot vše mění. Od jednoduché paralelizace smyček, přes stromy paralelně běžících funkcí až po pipelining a vylepšené synchronizační prostředky. Zkusme využít procesory naplno.

Těším se na vás na (nejen) mých přednáškách uvidím a rád zodpovím vaše dotazy (pokud budu znát odpověď).

TeamCity, PowerShell and sqlcmd problem (and solution)

In my current project we’re using TeamCity as a continuous integration server and psake to run all out build and deployment tasks. Part of the deployment is execution of SQL scripts to create database and create structures in it. And as a heavy-duty console user, I’m using sqlcmd to do all my work with database. So I simply called sqlcmd with according parameters and to execute the scripts. Sadly for some strange reason, the PowerShell runner in TeamCity kept running in a loop eventually ending with timeout. Even worse, running it locally directly in PowerShell was fine. After small research done by my colleague we found the reason is sqlcmd.

The quick’n'dirty solution was to run it using Start-Process, sadly we lost the output (using -noNewWindow resulted in same problem), so any error was about guessing. Simple techniques, like redirecting output from cmdlet ended with same problem.

But I actually found a solution. I redirected the output from sqlcmd directly with -o switch and used Unicode using -u. Then I echoed the file via Get-Content cmdlet with -encoding Unicode switch.

After I rigorously specified the encodings for output and input, everything started running fine. Probably there was some problem with BOM being in input, causing headache to the runner.

Never mind, now it’s fine, and I’m happy, because I see errors in processing SQL scripts eventually.