Archives for August, 2011
From time to time I’m working on a method that returns some collection, mainly by processing input data. Often it’s really just a couple of conditions: get something from here and there and return it. Because I’m composing these methods too, if I return IEnumerable<T> and later in another method I need to add something (if you’re lost, you’ll see what I mean in the example below), I need to use some variable, like an array or a list, and append (or prepend). Boring.
For a while I was wondering how slow it would be if I simply created new IEnumerable<T> sequences and concatenated them. I was expecting it to be slower, but is it only a couple of percent or an order of magnitude? Today, when I came to my office, I simply decided to test it.
The first version looks like
static IEnumerable<int> Test1(int[] part1, int[] part2, int[] part3)
{
    IEnumerable<int> result = Enumerable.Empty<int>();
    result = result.Concat(part1);
    result = result.Concat(part2);
    result = result.Concat(part3);
    return result;
}
And the other one
static IEnumerable<int> Test2(int[] part1, int[] part2, int[] part3)
{
    IList<int> result = new List<int>();
    foreach (var item in part1)
        result.Add(item);
    foreach (var item in part2)
        result.Add(item);
    foreach (var item in part3)
        result.Add(item);
    return result;
}
Although both, especially the second one, can be written in different ways, I think they’re OK as a measure. And it’s close to how I often process the data.
I did a couple of runs to eliminate some errors, with a “Release” build and without an attached debugger. The part1 and part3 parameters were always the same size. I was playing with the size of part2, because the size affects the speed too.
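The harness itself isn’t shown here, so the following is only a sketch of how such a measurement could be set up, not the exact code behind the numbers. The names ConcatBenchmark, AverageMs and Sink are made up for illustration, the part1/part3 size of 1000 is an assumption, and so is the choice to fully enumerate each result inside the timed region (without that, the lazy Concat chain would do almost no work).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class ConcatBenchmark
{
    // Written to while enumerating so the loop cannot be optimized away.
    public static long Sink;

    // Same shape as Test1: chain lazy Concat calls.
    public static IEnumerable<int> Test1(int[] part1, int[] part2, int[] part3)
    {
        IEnumerable<int> result = Enumerable.Empty<int>();
        result = result.Concat(part1);
        result = result.Concat(part2);
        result = result.Concat(part3);
        return result;
    }

    // Same shape as Test2: copy everything into a list.
    public static IEnumerable<int> Test2(int[] part1, int[] part2, int[] part3)
    {
        IList<int> result = new List<int>();
        foreach (var item in part1) result.Add(item);
        foreach (var item in part2) result.Add(item);
        foreach (var item in part3) result.Add(item);
        return result;
    }

    // Average wall-clock milliseconds over the given number of runs.
    // Assumption: building AND enumerating the sequence is what gets timed.
    public static double AverageMs(Func<IEnumerable<int>> test, int runs)
    {
        double total = 0;
        for (int i = 0; i < runs; i++)
        {
            var watch = Stopwatch.StartNew();
            foreach (var item in test())
                Sink += item;
            watch.Stop();
            total += watch.Elapsed.TotalMilliseconds;
        }
        return total / runs;
    }

    public static void Main()
    {
        // Assumed fixed size for part1 and part3; only part2 varies.
        var part1 = Enumerable.Range(0, 1000).ToArray();
        var part3 = Enumerable.Range(0, 1000).ToArray();

        foreach (var size in new[] { 1000, 20000, 100000, 1000000 })
        {
            var part2 = Enumerable.Range(0, size).ToArray();
            double time1 = AverageMs(() => Test1(part1, part2, part3), 20);
            double time2 = AverageMs(() => Test2(part1, part2, part3), 20);
            Console.WriteLine("Size: {0} Time1: {1} Time2: {2}", size, time1, time2);
        }
    }
}
```

Numbers from a sketch like this will differ from machine to machine, and the first run includes JIT warm-up, so treat them as rough.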
If the part2 size was roughly under 10k items, the speed difference was within the margin of measurement error. From 10k+ to 1M items the Concat approach was clearly slower: the list version took only about 25% of the Concat time (the % column below appears to be Time2 relative to Time1). Some absolute numbers (averages from 20 runs, “Release” build, without attached debugger) from my laptop:
Size: 1 Time1: 0 Time2: 0 %: 0
Size: 2 Time1: 0 Time2: 0 %: 0
Size: 3 Time1: 0 Time2: 0 %: 0
Size: 4 Time1: 0 Time2: 0 %: 0
Size: 5 Time1: 0 Time2: 0 %: 0
Size: 6 Time1: 0 Time2: 0 %: 0
Size: 10 Time1: 0 Time2: 0 %: 0
Size: 100 Time1: 0 Time2: 0 %: 0
Size: 1000 Time1: 0 Time2: 0 %: 0
Size: 6000 Time1: 0,2 Time2: 0 %: 0
Size: 20000 Time1: 2,05 Time2: 0,5 %: 24,390243902439
Size: 60000 Time1: 6,55 Time2: 1,6 %: 23,3576642335766
Size: 100000 Time1: 11,8 Time2: 3,15 %: 24,7899159663866
Size: 1000000 Time1: 124,85 Time2: 33,2 %: 26,9609775325187
Conclusion? If the data is relatively small, the path you choose doesn’t really matter. For “bigger” collections the imperative approach provides better performance.
Sometimes I get into a discussion about Entity Framework not being able to use (map) a particular stored procedure somebody wrote to do something very quickly and/or efficiently (kind of). You know, it’s boiling water for coffee, printing an invoice and sending flowers to the cafeteria girl down the hall.
This isn’t always a good optimization. Don’t get me wrong, I like stored procedures, if used properly. But sometimes the solution is easier. More and more people are forgetting about indices, something databases are very good at using. And not only using, but also maintaining, defining and so on. A proper index for a heavily used query can make it an order of magnitude faster, especially for huge tables (when it’s on the proper fields).
The conclusion? Don’t immediately try to jump from sets and plain query definitions into imperative programming in stored procedures. Set operations are still very fast, and database optimizers can do magic when it’s just a query definition and indices are in place. And it’s way easier to live with an index than to maintain a stored procedure.
When I’m teaching my Entity Framework trainings, I’m always begging people to look at the generated SQL statement, at least from time to time or when the query looks complex. And if you have (near to) real data, at the execution plan as well. Although Entity Framework helps you with a standard data access layer, it’s not magic. The query translation is a complex process and sometimes what you capture in a LINQ query isn’t exactly how you’d express it in SQL. You simply have different concepts in LINQ vs. in SQL.
Last week I was writing some decision algorithms based on data and I was accessing it, of course, using Entity Framework. Because the conditions were complex, I was writing them as they came from my head to my fingers. The day after, I was writing a similar condition, with only one or two options negated, and I wrote it differently. Basically I was swapping the All and Any methods. These two are interchangeable if you change the conditions accordingly.
As an example let’s have a condition: “All apples are green.” aka “All(apple => apple.Color == Green)”. But you can also say “No apple is non-green.” aka “!Any(apple => apple.Color != Green)”.
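To make the equivalence concrete, here is a small in-memory sketch. The Apple class, the Color enum and the AllVersusAny wrapper are hypothetical, invented just for this example, and it runs on LINQ to Objects only, so it says nothing about what SQL Entity Framework would generate for either form.

```csharp
using System;
using System.Linq;

public static class AllVersusAny
{
    // Hypothetical types, only for illustration.
    public enum Color { Green, Red }

    public sealed class Apple
    {
        public Color Color { get; set; }
    }

    // "All apples are green."
    public static bool AllGreen(Apple[] apples)
    {
        return apples.All(apple => apple.Color == Color.Green);
    }

    // "No apple is non-green."
    public static bool NoneNonGreen(Apple[] apples)
    {
        return !apples.Any(apple => apple.Color != Color.Green);
    }

    public static void Main()
    {
        var allGreen = new[] { new Apple { Color = Color.Green }, new Apple { Color = Color.Green } };
        var mixed = new[] { new Apple { Color = Color.Green }, new Apple { Color = Color.Red } };
        var empty = new Apple[0];

        // Both forms agree for every basket, including the empty one.
        foreach (var basket in new[] { allGreen, mixed, empty })
            Console.WriteLine("{0} {1}", AllGreen(basket), NoneNonGreen(basket));
    }
}
```

Note the empty basket: All returns true for an empty sequence and Any returns false, so the two forms still match there.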
Now the magic comes into play. You might think, well, if it’s interchangeable, then it’s good, as Entity Framework can always utilize the EXISTS predicate from SQL. For simple queries, maybe. But if you think about the various places where the condition can occur and how easy it is to negate the condition, you immediately have a lot of problems in front of you. Add to this the database engine’s optimizer, which may or may not be able to use indices properly, reorder conditions to create smaller intermediate result sets etc. There are a lot of places where the machine needs to (try to) figure out the best way of getting your data for you.
Sadly there’s no rule of thumb, like “always use Any”. The only good and 100% working advice is to always check the query and the execution plan. Even with, e.g., All the result could be absolutely fine.
Kindle has a nice feature that keeps your furthest page read synchronized across all devices. Sadly it has one or two problems. First, it’s really the furthest page read, hence if you later start reading the book again from the start, it still keeps the furthest location, which is basically the end of the book. The same happens if you have more Kindles under one account and more people are reading the same book. But this one is kind of expected.
Though you can reset the furthest page read easily through Amazon support (I didn’t try it) or by juggling with turning synchronization off and on and redownloading the book from the archive (I didn’t try that either), I found an easy solution. Before e-books we were using bookmarks, I mean real bookmarks. A piece of paper (or some fancy material like leather) inserted between the pages where you stopped reading. Voilà. We have the same concept in Kindle. It’s a little bit more powerful, but the basics are the same.
So my solution works like this. When I stop reading, I put a bookmark there. When I later begin to read again, on a different device, I simply go to the last bookmark (if you have some other bookmarks further in the book, you’ll need to recall from the excerpt which one is correct, but I believe you, like me, often end/start on “milestones” like (sub)chapters or at least paragraphs where some idea ended). Bookmarks are synchronized across devices and you can have more than one in a book, which is good when you’re reading the book with somebody else. Of course, from time to time, I remove previous bookmarks, to keep just the last one and have it clean.
Most of the time I’m reading a book only once, and then looking for specific passages, but sometimes it’s just worth reading it again. Not knowing where I ended bothered me, and asking support to reset it isn’t, in my opinion, a good experience. But I think bookmarks solve it pretty well.
It has been a few days since my first BlackBerry application was approved in AppWorld. I’m trying to learn how things are done in this world, and a real-world application is, in my opinion, the best way. In fact I had the same application previously on my Windows Mobile devices, though only I and, I believe, one friend used it. Anyway, because I knew what I wanted, it was a good starting point and good motivation, because I am using it and will keep using it.
So what is this application about? It’s pretty simple. There’s a local company called Student Agency that’s running nice bus lines between the biggest cities and also allows you to book a seat, change a reservation, check availability etc. via SMS. Great if you need to change your plans during the day without access to the internet. The only problem is that you need to send these text messages in a specific and exact format. Learning these is boring and typing them even more so. Here the SA SMS Booking application comes in handy. You simply select from options on the screen and the message is created for you (you see it, so you can learn all the commands if you want) and you can immediately send it from the application as well (no need to copy and paste). And that’s it. Nothing more, nothing less. Actually, one more thing. The application remembers your ticket card number (kind of your internal ID in the booking system), because that’s almost always the same.
The SA SMS Booking is free.
Here’s the screen (only one) of the SA SMS Booking 1.0:

It’s basic. It focuses only on doing the thing you want to do as quickly as possible (I’m sometimes trying to book a seat while trying to catch the bus itself).
The design is, yes, none. I’m not a designer, hence for 1.0 I used the default look of the elements. Anyway, if you’re interested in creating a design for it, feel free to drop me a line. The application is free, so the only paycheck will be your name on the screen or something like that.
Maybe you still remember the days when we were trying to make an application smaller to fit it on a floppy. UPX, ARJ and all this stuff. Then the internet came and the limits shifted, not to the application size itself, but to the download time of the installer. Yes, in those days every application had an installer, even the simplest one. And do you remember the magic numbers around how quickly it would download on a 56 kbit/s modem? Nostalgia.
And I think history is repeating itself now. Obviously nobody cares how big the application is if it’s distributed on CD or DVD; the disc is big enough. Same for download. In fact I think only a few of you are using CDs/DVDs now. It’s easier to download the image from the internet (or the application itself). But more and more applications run in the browser, using JavaScript (and HTML, CSS) for doing something useful. And though we have reasonably fast lines now, it still matters how quickly the code files are downloaded, because it might mean significant waiting when, e.g., there are a lot of JavaScript files or the files are huge. Every good site is minifying its JS code and also compressing it, like in the old days.
Wondering what the next step will be…
Yep, it’s done. Now you can download FirebirdClient from NuGet. From nuget.org/List/Packages/FirebirdSql.Data.FirebirdClient to be precise.
It took me a while to find some time to create the package and publish it. But recently I started using NuGet quite often, so I assigned this task a higher priority.
The build there is the same as the default one (targets .NET 4 CLR) you can download from the site. Later I’d like to incorporate other versions (different CLRs, Mono builds, …) into the package too. Maybe the other pieces like WebProviders, DDEX (?) and unstable builds could be there too. I’ll think about it more.
Hope you’re as excited as I am and you’ll enjoy it.
Is it better to call ObjectContext.SomeEntitySet.AddObject() or ObjectContext.AddToSomeEntitySet()? Short answer is: It doesn’t matter.
Long answer: the AddToSomeEntitySet method calls base.AddObject("SomeEntitySet", someEntitySet);, as you can see from the generated code. The other method calls base.Context.AddObject(this.FullyQualifiedEntitySetName, entity);. Hence it’s almost the same. The only difference is the FullyQualifiedEntitySetName property that is used. So it might be a little bit slower, but I think the difference is unmeasurable. Also take into account the other parts of your application where, honestly, you’re probably wasting more time.
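The delegation described above can be mimicked with a toy model. Everything in this sketch (ToyContext, ToySet, the "ToyContainer" name, the AddedTo list) is a hypothetical stand-in and not the real Entity Framework types. It only shows the shape of the two call paths: both end in the same context-level AddObject, and they differ only in where the entity set name comes from.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a generated context; NOT the real ObjectContext.
public class ToyContext
{
    // Records the set name each AddObject call received.
    public readonly List<string> AddedTo = new List<string>();
    public readonly ToySet Apples;

    public ToyContext()
    {
        Apples = new ToySet(this, "ToyContainer", "Apples");
    }

    // The single method both entry points end up in.
    public void AddObject(string entitySetName, object entity)
    {
        AddedTo.Add(entitySetName);
    }

    // Shape of a generated AddTo... helper: a hard-coded set name.
    public void AddToApples(object apple)
    {
        AddObject("Apples", apple);
    }
}

// Hypothetical stand-in for an entity set; NOT the real ObjectSet<T>.
public class ToySet
{
    private readonly ToyContext context;
    private readonly string containerName;
    private readonly string setName;

    public ToySet(ToyContext context, string containerName, string setName)
    {
        this.context = context;
        this.containerName = containerName;
        this.setName = setName;
    }

    // The extra property read that makes this path (marginally) longer.
    public string FullyQualifiedEntitySetName
    {
        get { return containerName + "." + setName; }
    }

    public void AddObject(object entity)
    {
        context.AddObject(FullyQualifiedEntitySetName, entity);
    }
}

public static class ToyProgram
{
    public static void Main()
    {
        var context = new ToyContext();
        context.AddToApples(new object());     // helper on the context
        context.Apples.AddObject(new object()); // method on the set
        Console.WriteLine(string.Join(", ", context.AddedTo));
    }
}
```

In this toy both calls land in the same AddObject, once with "Apples" and once with "ToyContainer.Apples".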
What’s your preferred call?
I’m a console guy. I like to work from whatever is text based (yes, I do remember DOS (the real one, not the black hole window in Windows), I’m old). That’s why I’m doing most of my MS SQL Server/Azure SQL database work through sqlcmd. But on Azure SQL I’m often being disconnected, because while I’m reading the results, testing something etc., the connection is simply closed by the server to save resources. Reconnecting is a pain and slows me down.
But yesterday I was working in the Silverlight version of SSMS that’s available on Azure (although you can connect to any other server), just to try it. Well, it’s a little bit slower, but it has a query window, so I’m able to type a command, and that’s what I need (maybe also some shortcut to execute it). And I realized that even if I’m not doing anything for a couple of minutes, the connection is reopened when I start doing something.
That’s nice. I don’t have to install the full-blown SQL Server Management Studio and I can still work comfortably.