Some SQL Server Compact Toolbox usage graphs

The anonymously collected usage statistics provide by SmartAssembly in the latest release of the Toolbox also provides some general counters, that may be of general interest. Keep in mind that these figures are collected amongst Visual Studio users, and therefore do not represent the general public. The figures represent about 2000 Toolbox installations.

Visual Studio version used

image

More than 50% of the Toolbox users have Visual Studio 2012 installed.

OS platform

image

30% of the Toolbox users are using Windows 8, 66% Windows 7 and hardly any use Vista/XP

OS bitness

image

Only 15% of the Toolbox users run a 32 bit OS.

CPU cores

image

Toolbox users have modern machines – 75% of the users have a CPU with 4 cores or more

Fix for Entity Framework poor INSERT performance with SQL Server Compact and server generated keys

In this blog post I will describe the steps I took in order to find out why the title above was the case, and how it could be fixed.

On Stackoverflow the general opinion was that the reported slowness was “by design” and could not be fixed, but looking at recent tests posted on Stackoverflow pointed to the fact that something was not done right.

Since Entity Framework is now Open Source and available on CodePlex, I decided to have a deeper look.

To test if the process could be improved, I created the following console app:

 

   1:  namespace EF6SqlCETest
   2:  {
   3:      using System;
   4:      using System.Data.Entity;
   5:      using System.Diagnostics;
   6:   
   7:      class Program
   8:      {
   9:          static void Main(string[] args)
  10:          {
  11:              using (var db = new StudentContext())
  12:              {
  13:                  Stopwatch sw = new Stopwatch();
  14:                  db.Database.Delete();
  15:                  sw.Start();
  16:                  db.Database.CreateIfNotExists();
  17:                  db.Configuration.AutoDetectChangesEnabled = false;
  18:                  db.Configuration.ProxyCreationEnabled = false;
  19:                  Console.WriteLine(
  20:                      "Db created in {0}", sw.Elapsed.ToString());
  21:                  sw.Restart();
  22:                  for (int i = 0; i < 4000; i++)
  23:                  {
  24:                      var student = new Student { Name = Guid.NewGuid().ToString() };
  25:                      db.Students.Add(student);
  26:                  }
  27:                  Console.WriteLine(
  28:                      "Entities added in {0}", sw.Elapsed.ToString());
  29:   
  30:                  sw.Restart();
  31:                  int recordsAffected = db.SaveChanges();
  32:                  sw.Stop();
  33:                  Console.WriteLine(
  34:                      "Saved {0} entities to the database, press any key to exit.",
  35:                      recordsAffected);
  36:                  Console.WriteLine(
  37:                      "Saved entities in {0}", sw.Elapsed.ToString());
  38:                  Console.ReadKey();
  39:              }
  40:   
  41:          }
  42:      }
  43:   
  44:      public class Student 
  45:      {
  46:          public int Id { get; set; }        
  47:          public string Name { get; set; }
  48:      }
  49:   
  50:      public class StudentContext : DbContext
  51:      {
  52:          public DbSet<Student> Students { get; set; }
  53:      }
  54:   
  55:  }

The test project and the related app.config is available for download here: http://sdrv.ms/UCL2j5

The test code is a simple Code First DbContext model. For each run I start with a new blank database, and creates it before doing SaveChanges, so that part of the process can be timed individually. The 2 options on lines 17 and 18 are there to ensure that the for loop runs quickly, without these option the loop adding objects takes much longer (test for yourself).

The resulting table looks like this:

CREATE TABLE [Students] (
[Id] int NOT NULL IDENTITY (1,1)
, [Name] nvarchar(4000) NULL
);
GO
ALTER TABLE [Students] ADD CONSTRAINT [PK_dbo.Students] PRIMARY KEY ([Id]);
GO

In order to find out where time was spent during SaveChanges, I ran a Visual Studio Performance Analysis. It turned out that all the time was spent in sqlceqp40.dll, the SQL Server Compact 4.0 unmanaged query processor – so something was amiss.

As described in my earlier blogpost, the SQL statements generated in order to return the server generated id (the IDENTITY value), looked like the following:

SELECT [Id] FROM [Students] WHERE [Id] = @@IDENTITY

So using the SQL Server Compact Toolbox, I coulde analyze the 2 statements:

image

And got the following result:

image

So for every INSERT, a table scan was performed, as for some reason, the SQL Server Compact query processor could not figure out to do an Index Seek. And the more rows to be scanned, the worse the performance got. And all the time for the operation was spent doing this.

In order to avoid this, I decided that the goal of the statement executed should be to avoid table scans, but return a value with the exact same shape as the previous statement executed, that is; it should have the name of the IDENTITY column, and be of the correct type (only bigint and int are supported as IDENTITY types with SQL Server Compact).

The return value of @@IDENTITY is numeric, so simply using “SELECT @@IDENTITY AS [Id]” would not work. So the statement could be:

SELECT [Id] FROM [Students] WHERE [Id] = CAST(@@IDENTITY as int)

The type could then be either int or bigint and the column alias should of course be the correct column name.

I could then analyze the modified statement:

image

 

image

And no table scan, instead an index seek, consuming only 19% of the total query cost instead of 100%.

And so this is what I have implemented in my fix, that I now need to figure out how to “submit a pull request” for.

SQL Server Compact Code Snippet of the Week #7 : get the full path to a database file

A SQL Server Compact connection string allows you to specify the database name in various ways:

Data Source=c:datamydb.sdf

Data Source=mydb.sdf

Data Source=|DataDirectory|mydb.sdf

But sometimes you need the full path to the database file based on a user defined connection string. This week’s code snippet allows you to do exactly that, and it is a little bit quirky, as it works around a bug in SQL Server Compact 4.0 SP1, that causes use of |DataDirectory| to not be resolved correctly using only the SqlCeConnectionStringBuilder.

 public string PathFromConnectionString(string connectionString)
{
SqlCeConnectionStringBuilder sb = new SqlCeConnectionStringBuilder(GetFullConnectionString(connectionString));
return sb.DataSource;
}

public string GetFullConnectionString(string connectionString)
{
using (SqlCeReplication repl = new SqlCeReplication())
{
repl.SubscriberConnectionString = connectionString;
return repl.SubscriberConnectionString;
}
}

Notice that the code above only works with version 4.0, not 3.5

SQL Server Compact Code Snippet of the Week #6 : list all user tables in a database

This week’s short code snippet shows how to list all the user table objects in a SQL Server Compact database file. Notice that a SQL Server Compact database can also contain a number of system tables, these can be created by the Merge Replication/RDA/Sync Framework APIs. The metadata (like list of tables) is not exposed as tables, but as something called views, specifically the INFORMATION_SCHEMA views (despite the fact that SQL Server Compact does not support user defined views).

SELECT table_name AS Name FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE <> N’SYSTEM TABLE’

SQL Server Compact Code Snippet of the Week #5 : rename a table

The SQL Server Compact ADO.NET provider allows you to rename a table name by using the special sp_rename T-SQL “command” (SQL Server Compact does not support stored procedures, but this “command” corresponds to the sp_rename system stored procedure on SQL Server).

You cannot only use sp_rename against a table in SQL Server Compact using the ADO.NET interfaces, but using the native OLEDB interfaces, you can also rename columns, as Joao demonstrates here. This is used in his Data Port Console product. Otherwise you can just script a DROP and CREATE statement with the SQL Server Compact Toolbox, and rename both column and table names in the CREATE script.

To rename a table, use the following T-SQL code:

sp_rename 'OldName', 'NewName'

Fixing the Entity Framework designer “Generate Database from Model” T4 template

The Entity Framework Designer include a reverse engineer feature called “Generate Database from Model” that enables a “Model first” workflow, and also enable you to persist any model modification in a new database.

image

The T4 template used for this feature supports both SQL Server and SQL Server Compact, but unfortunately lacks consistency in it’s use of the GO keyword. GO is used after each statement, except FOREIGN KEY constraint creation statements. Apart for being inconsistent, this also prevents SQL Server Compact Toolbox from executing the script, without a number of manual edits.

I have proposed a fix on Codeplex for the next version of the designer, but it will not happen until version 7 (if ever).

So in the meantime, I have updated the template to fix this, you can start using it today as replacement for the current one as follows:

1: Download my updated T4 file from here. (The same file is used in both Visual Studio 2010 and 2012)

2: Copy the downloaded SSDLToSQL10GOFixed.tt file to the
C:Program Files (x86)Microsoft Visual Studio 11.0Common7IDEExtensionsMicrosoftEntity Framework ToolsDBGen folder
(for VS 2012)
or to
C:Program Files (x86)Microsoft Visual Studio 10.0Common7IDEExtensionsMicrosoftEntity Framework ToolsDBGen
(for VS 2010).

3: With the Entity Framework designer open, go to properties, and select the new file as the DDL Generation template:

image

4: Generate the script.