Rediscovering Postgres and EF Core


It’s OK folks, you can relax. No, I’m not ill; yes, I have gotten around to writing another blog post; and yes, I know… I don’t write enough anymore and I should 🙂

With that out of the way, why am I suddenly getting all excited about Postgres and EF again (specifically EF Core)?

Well, it all started with the fact that, after over two years of working with a very poorly designed database that had grown organically on SQL Server, I was finally asked to redesign it and trim it down to a more usable and sane model.

I didn’t design the original, so I’ve had to work around a lot of things that were poorly implemented, messy and, well, just downright chaotic. In the course of working with it, I’ve worked out in my head where improvements needed to be made and how things needed to change for the better, and in my mind the best tool for the job of the new database had to be Postgres.

Returning after a leave of absence

The last time I did anything serious with a PG database and Entity Framework was back on .NET Core 2.0, and the interface between the two, “npgsql”, was, to say the least, lacking. The team tasked with making it all work had done a great job up to that point, but as anyone who’s ever used Postgres will tell you, it’s a database with some very serious tricks up its sleeve; in fact I’d even go as far as to say that Postgres has some very surprising capabilities that you just don’t find in any other database system. As of Core 2.0, the EF provider for PG implemented pretty much only the very basics needed for standard ORM functionality: standard data types, standard object tracking and marshalling, and similar things.

It was usable, but it lacked exposure for a lot of the clever things that PG can do, and so there wasn’t really any reason to use it over, say, SQL Server if all you wanted to do was standard data store operations.

With this recent change, however, I’ve come back to using Postgres in a major way, with .NET Core 3 and the latest incarnation of “npgsql”, and oh my goodness, what a difference a version change makes.

And that brings me to the reason I decided to write this blog post: “npgsql” now allows you to get at a lot more of Postgres’s functionality, specifically some of its very specialist data types, and in a very transparent fashion.

When is a datatype not a “standard” datatype?

As many of you in data circles will be familiar with, there is a standard set of data types that all database servers implement. These fixed data types exist for a reason: if we do decide to swap out data storage providers mid-project, the data types remain the same and we don’t end up rewriting large chunks of our code; maintenance needs to be kept easy, after all.

Unfortunately, this often means that when you have a data storage technology that can implement all sorts of funky and useful types, it takes a lot of extra direct SQL code to make use of them.

Postgres has a staggering number of very useful extra data types, and if you’re doing any kind of mapping or GIS-based work, there are a few hundred more on top of that.

One of the greatest additions is the way that Postgres can natively handle JSON data directly in a column, alongside standard, traditional data types such as strings, integers, floats, bools and so on.

SQL Server and many others can also handle JSON, but frequently only as a JSON string. Postgres, on the other hand, can actually manipulate the JSON data, not just in string format but also in a highly efficient binary JSON format. This might not seem all that exciting, until you realize that you can index, search, perform windowing and aggregate functions, and a whole host more, directly on that JSON column inside a regular RDBMS-style table.

Those readers among you who understand the value of document-style databases and “NoSQL” might be sitting here reading this and realizing that it means you can have the BEST of BOTH WORLDS. You can manage document-style free-form data directly inside a column that sits within a rigidly managed traditional database schema.

Rather than just acting as a serializer/deserializer (which it does incredibly well), the new version of “npgsql” now also exposes this ability at the application level, in standard C#/EF code, and that’s just the tip of the iceberg.


Shutup shawty, give meh some codez

Fire up Visual Studio 2019 (it needs to be 2019 due to the use of .NET Core 3), or your command line and VS Code; it doesn’t matter which, as long as you create a new “.NET Core 3 console mode program”.

Once you have the standard boilerplate “hello world” app loaded, check the project properties to make sure you’re targeting Core 3.0.

And then hop into your NuGet panel and add the “npgsql” Entity Framework Core provider, “Npgsql.EntityFrameworkCore.PostgreSQL”.
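From the Package Manager Console, that’s:

    Install-Package Npgsql.EntityFrameworkCore.PostgreSQL

(or “dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL” if you’re on the command line).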

Adding this should also add all the dependencies needed by the provider to work correctly.

I’m not going to go through setting up a Postgres database server or anything like that; I’m going to assume you know how to do that, or already have a Postgres instance available.

Since this is just a plain old console mode application, and we’re not using the Startup.cs class, we’ll need to manually load the connection string in our data context. We do this by overriding the “OnConfiguring” method in our database context.

Create a “DatabaseContext” class in your project containing the following code:
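A minimal sketch of what that looks like (the connection string values here are placeholders; substitute your own host, database and credentials):

    using Microsoft.EntityFrameworkCore;

    public class DatabaseContext : DbContext
    {
        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            // Placeholder connection details; point this at your own server.
            optionsBuilder.UseNpgsql(
                "Host=localhost;Database=efcoredemo;Username=postgres;Password=secret");
        }
    }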

The next thing we need to do is add some classes to model the data we wish our app to handle. Add two new classes to your project as follows:

You don’t have to call your data objects “Person” and “Product”, and feel free to model the data any way you see fit for experimentation; I do encourage you to use the same attributes and data types, however, so you get a good feel for this.

In “Product.cs” add the following code:
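Something along these lines (a sketch, with property names of my own choosing; the important parts are the two array columns and the “json” typed column):

    using System.ComponentModel.DataAnnotations;
    using System.ComponentModel.DataAnnotations.Schema;

    public class Product
    {
        [Key]
        public int Id { get; set; }

        public string Name { get; set; }

        // These two map to native Postgres array (text[]) columns.
        public string[] Colours { get; set; }
        public string[] Sizes { get; set; }

        // A strongly typed object stored in a native Postgres json column.
        [Column(TypeName = "json")]
        public CatalogSections Catalog { get; set; }
    }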

And in “Person.cs” add the following:
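Again a sketch; the interesting property is “OnShiftPeriods”, whose “NpgsqlRange<T>” type comes from the “NpgsqlTypes” namespace:

    using System;
    using System.ComponentModel.DataAnnotations;
    using NpgsqlTypes;

    public class Person
    {
        [Key]
        public int Id { get; set; }

        public string Name { get; set; }

        // Maps to a native Postgres range column (tsrange for DateTime).
        public NpgsqlRange<DateTime> OnShiftPeriods { get; set; }
    }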

As you can see, what you’ve done so far is just standard data modelling for Entity Framework, but look closer at the data objects.

In the Product class we have two columns that are string arrays, and one column that is strongly typed and has the type name “json” set on it.

Meanwhile, “OnShiftPeriods” in the Person class has a data type of “NpgsqlRange<T>”.

All 4 of these data types will get mapped directly and transparently to native data types inside the Postgres database.

Add references to the two table types to your data context:
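That’s the usual pair of DbSet properties inside the context (the set names are my own choice):

    // Inside the DatabaseContext class, alongside OnConfiguring:
    public DbSet<Person> People { get; set; }
    public DbSet<Product> Products { get; set; }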

Then add the final data model, “CatalogSections”, as a class to your project containing the following:
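“CatalogSections” is just a plain POCO that the provider serializes in and out of the json column; a minimal illustrative version (the fields are my own invention):

    // Plain POCO persisted inside the Product "json" column.
    public class CatalogSections
    {
        public string Section { get; set; }
        public string SubSection { get; set; }
        public int DisplayOrder { get; set; }
    }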

At this point you should be able to open the Package Manager Console and add a new migration step. Before we do, however, there’s something you need to be cautious of.

By default, npgsql will now use the new “identity column” type for primary keys. This column type ONLY works on PostgreSQL version 10 onwards.

Many of the Linux distributions are, as of my writing this, still shipping 9.x versions in their OS package managers.

In my case, the database I’m using to demo this is only a 9.6 version, but I can’t upgrade at the moment as it’s also in use for other things.

If you have a situation where you’re going to be running this against an older version of PG, then you need to turn on compatibility mode. You do this by overriding the version when you set the connection string in “OnConfiguring”, as follows:
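Something like this, reusing the placeholder connection string from earlier; “SetPostgresVersion” tells the provider to generate SQL compatible with the version you name:

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        optionsBuilder.UseNpgsql(
            "Host=localhost;Database=efcoredemo;Username=postgres;Password=secret",
            options => options.SetPostgresVersion(9, 6));
    }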

If you get any strange exceptions while trying to apply your migrations to the database, or any strange SQL runtime errors, then this is the most likely candidate, and you need to use backward-compatibility mode.

All being well, if you open the Package Manager Console and add a migration step, everything should work as expected.

You should follow that with an “Update-Database” command to apply your migration step.
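In the Package Manager Console that’s something like (the migration name is arbitrary):

    Add-Migration InitialCreate
    Update-Database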

If everything is successful you’ll see a lot of SQL statements fly past, and if you then look in your database management tool (in my case pgAdmin 4), you’ll see your newly created database.

Now that we have a data context we can use, we can add and remove data in our console mode program. Let’s add some new records:
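A sketch using the illustrative models from above (the values are made up; note that the arrays, the POCO and the range are all just normal C# objects):

    using System;
    using NpgsqlTypes;

    class Program
    {
        static void Main()
        {
            using (var context = new DatabaseContext())
            {
                context.Products.Add(new Product
                {
                    Name = "Baseball Cap",
                    Colours = new[] { "Red", "Blue" },
                    Sizes = new[] { "S", "M", "L" },
                    Catalog = new CatalogSections
                    {
                        Section = "Clothing",
                        SubSection = "Hats",
                        DisplayOrder = 1
                    }
                });

                context.People.Add(new Person
                {
                    Name = "Jane Bloggs",
                    // A 9am-to-5pm shift stored as a single tsrange value.
                    OnShiftPeriods = new NpgsqlRange<DateTime>(
                        new DateTime(2019, 11, 4, 9, 0, 0),
                        new DateTime(2019, 11, 4, 17, 0, 0))
                });

                context.SaveChanges();
            }
        }
    }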

If you run the program, then check your database management tool, you’ll see the data in Postgres using PG’s native data types.

At this point you can use all the fancy stuff inside your own PG functions, allowing you to slice and dice arrays, search on JSON, and a whole heap of other stuff.

Better than that, however: we can now take full advantage of these data types inside our standard EF code.

For example, selecting only “Red” products from the database becomes a simple LINQ query.
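A sketch, using the hypothetical “Colours” array property from the Product model above; the provider translates “Contains” on an array column into a native Postgres “= ANY” test:

    using System.Linq;

    var redProducts = context.Products
        .Where(p => p.Colours.Contains("Red"))
        .ToList();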

No longer do you need to pull a delimited string back, split it, search it, make a list of the IDs, then go and fetch those IDs from the database.

Likewise, we can use the JSON data column in the same way.
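One way to sketch that is to drop to raw SQL and use Postgres’s JSON operators directly (“->>” extracts a field as text; the table and column names below assume the illustrative models from earlier, and depending on your provider version you may find LINQ traversal of the mapped POCO is translated for you too):

    using Microsoft.EntityFrameworkCore;

    var hatProducts = context.Products
        .FromSqlRaw(@"SELECT * FROM ""Products"" WHERE ""Catalog""->>'SubSection' = 'Hats'")
        .ToList();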

Do exercise caution, though, with the “json” querying operators, as many of them, for efficiency reasons, only work with binary JSON (“jsonb”) rather than plain old text-based JSON.

In a future post we’ll dig into this a bit further and cover some spatial and GIS operations, as well as the range type I added to the data model in this post. For now, however, I highly encourage you to read the official documentation at

https://www.npgsql.org/efcore/mapping/general.html

To better understand just what’s now exposed, and how you can make use of Postgres’s native types to better model your data store.

Shawty
