May 31, 2012

Spatiality Part 4, Fiddling with SQL Server Spatial SQL

SQL Server is quite potent when it comes to spatial queries. It follows the Open Geospatial Consortium Simple Features for SQL, Version 1.1 and implements about 70 functions covering areas such as intersects, unions and the like. Basic GIS-stuff.

It is a bit different syntax from what most people are used to, but once you get past that it's a breeze.

SQL Server 2008 supports two kinds of spatial data.

Geometry, that means flat.
Geography, that means spherical.
I still believe that the world is spherical even though I read a lot of hard evidence suggesting otherwise here. 


So I would choose the geography style. However, all my data is flat so...


In my spatial database I have lots and lots of areas. And it would be nice to get all areas within say 20 km from my position. (I can't believe some third world countries still hang on to miles)


You can probably do this the hard way and use a lot of math to get an answer, or you can do it the GIS way.


GIS (Geographical Information Systems) is a neat science. One of its strengths is to get new information by making calculations on thematic layers on a map. For me it started with reading the book Map Algebra by Dana Tomlin and I was hooked. If you are interested read more here.


Spatial SQL is a bit different but you can achieve a lot by using similar techniques.

Back to the task.


My plan is as follows:
  1. Create a geographical area covering a 20 km circle with it's center in the middle of stockholm.
  2. Use that circle to overlay my data and by SQL return all the objects that intersects my buffer.
  3. Show it in my blog.
Let's start with number one in the list.

DECLARE @stockholm geometry = geometry::STPointFromText('POINT (18.06861 59.32944)', 4326)

So... what do we do here? We (or actually I) create a point geometry by using the geometry::STPointFromText function. As parameter we have the coordinates as a string followed by the mystical 4326 which simply points out that we are using the reference system WGS84. The same reference system as my own data.

This will give us a one dimensional point.


To make it into a circle we use:

DECLARE @stockholmbuffer geometry = @stockholm.STBuffer(0.2)

We simply tell SQL Server to make a buffer around our point with the distance 0.2. 

The distance on my map is in degrees. Lats and longs. 

And a degree is around 110 km and 20% of that is just close enough 20 km to make me happy. 
I'm not picky.

So now we have declared a 20ish km radius circle with its center in Stockholm.

Next step is to overlay it with the existing data.

This is how we do it:


SELECT *
FROM areas
WHERE ogr_geometry.STIntersects(@stockholmbuffer) = 1

Just a simple call to the intersect function that returns 1 on an overlap.


So to summarize:


DECLARE @stockholm geometry = geometry::STPointFromText('POINT (18.06861 59.32944)', 4326)


DECLARE @stockholmbuffer geometry = @stockholm.STBuffer(0.2)


SELECT *
FROM areas
WHERE ogr_geometry.STIntersects(@stockholmbuffer) = 1

And the result is this:






May 24, 2012

Spatiality (a short interlude with a hint of bitterness)

Importing data to SQL Server needs its own entry in this blog. Especially when it comes to international characters. You know: 'Ö', 'Ü', 'Ñ',  and the likes. I am talking about UTF-8, UTF-16 and other ways to map a character to bits and bytes.

Character encoding.

The problems started in 1963 with ASCII (and probably earlier in the analog world with Morse codes, viking runes and smoke signals). ASCII and IBM's ECBDIC managed to fit a character into seven and eight bits respectively. Memory space was a very limited resource those days so all was good until someone realized that there actually exists other languages than English and they tried to hack it into ASCII using ISO 8859.

This time they forgot small regions such as the Arabic world, Japan, India and China...

1988 they finally started working on UNICODE and nowadays we only have one way to save international characters: UTF-8, UTF-16 and UTF-32...

Most stuff you find on Internet is UTF-8, SQL Server handles UTF-16 and when you import stuff into the database without spending an hour of googling ahead of the task everything ends up in the non UTF fields called varchar...

Some people probably love code pages and character encodings. I don't. For me it is one of those totally uninteresting side issues that distracts my creativity from really achieving something. It is fully in the line with issues such as coordinate transformations, spending time fixing web stuff to make it readable in any browser and trying to access web services not made for Microsoftians as myself.

I hate it.

Give me one type of coordinate, browser, service and one type of string!

I want to use my brain for content and functionality, not details.

So... this long rambling obviously cooks down to the fact that I goofed up and managed to import all that data in spatiality 3 to a varchar field instead of an nvarchar field. And as I have a blog I have the opportunity to channel my bitterness...

Getting the stuff all the way to Azure made it even less interesting to go through the whole process again and try to import the data 'the right way'.

So I googled.

Ye who google shall find.


I found this. A function that fixed my mistakes and saved a good day of work for me. I just added an nvarchar to the table, updated it with the function and dropped the old one. Worked like a charm.

I love google and I love people like Jason that cleans up after my mistakes. Thank you!



May 12, 2012

Spatiality Part 3

Finally, after a barrage of work, I have the possibility to do some own coding and make one more step toward finishing my next app.

So... were are we?

In part 2 I finally managed to export a shape file to Sql Server 2008, this time I plan to export that spatial data to my Azure SQL.

So what to do when you start with something you never done before?

Google!

My first finding was this link. Since I enjoy simplicity the Import and Export Wizard sounded perfect. I've used it countless times before and it has never failed me.

I follow the steps, opens up an ip address to azure and everything works brilliant until I get to the Review Data Type Mapping phase, when the wizard gives me something like this:


Reason? The Import and Export Wizard can't export spatial datatypes...

So time to google again!

This time I found the SQL Azure Migration Wizard.

This tool has a bit of crappy GUI (I use the 125% size on text and it doesn't handle it). But i transfers data using Bcp so I had least hope of getting it to work.

My first attempt crashed, another googling and I realized that I needed to upgrade my SQL Server to service pack 1.

I tried again and this time:



I ran  a query agains the table in SQL Server Management Studio and lo and behold, all is there!

Next up: exposing the data through an azure web service.

PS. If you like sharepoint follow this!

May 9, 2012

Another don't in SQL Server...

Found out the hard way that it's not a good idea to rename your default data base in SQL Server 2005 (yeah, some of us still work with ancient technology).

If you do, you will be stuck for some time googling and guessing sqlcmd commands. More info here.

May 3, 2012

Spatiality Part 2

As I almost promised in part 1, I got my hands dirty in converting shape files to SQL Server.
It wasn't as easy as I hoped.
I started out by downloading the source code for GDAL at http://www.gdal.org/ and tried my C++ skills. 

They were not too impressive.

After failing to build the project I started googling for "ogr2ogr binaries" and found some windows binaries here.

I unpacked the zipfile, navigated to the apps folder and tried ogr2ogr.exe. Of course it didn't work. It complained that it couldn't find the gdal19.dll file. 

I found that file in the bin catalogue, copied ogr2ogr.exe to the same directory and tried again.

Finally a response.

Trying out the full ogr2ogr command for my shape file and SQL Server was up next.

ogr2ogr -overwrite -f MSSQLSpatial "MSSQL:server=[my server name];database=[my database name];trusted_connection=yes" "[my shape file]"

Of course it complained of more missing files. This time it missed the gcs.csv file. 
I found it in the gdal-data folder and copied all of the folders content to the bin folder and tried again and voilá! 160000 objects imported into SQL Server like magic. Pretty quick too.

Maybe I was a bit drastic in copying files like a maniac, but hey, it worked! And now I can just type 

SELECT TOP 500 name, ogr_geometry FROM areas

Resulting in this in SQL Server (I love the spatial results tab):



May 1, 2012

Spatiality Part 1

I have an old love affair with large data sets. It really feels cool to have a database over ALL birds in the world, a network graph of all connecting flights in the world or detailed data over all the countries in the world.

When I was nine or ten years old all I wanted for Christmas was Statistisk Årsbok.

I got it. I loved it.

My bookshelves contains a plethora of different non fiction titles chosen to pick every subject possible. I got books on medieval weapons, about steam machines, gardening and tropical diseases.
I don't know why I am so fascinated on the thought of collecting all the world's data but I do suspect that collecting is one of the reasons. Another reason is probably the pursue of knowledge. A third the sense of having control over the world. The rest probably comes from my childhood, they say that most stuff originates from it. Mine was good.
Even better than data sets are data sets with a position. Maps. They say that 95% of all information can be positioned. I don't know if it's true, but I used it as a fact when I was teaching Geographic Information Systems(GIS) at the university.

GIS can be a troublesome experience for a developer. The main players are huge systems that not always are easy to develop in. If you ever tried coding in AML or MapBasic you know what I mean.
Maps, how easy they may seem, are complicated beasts and in a GIS system you add data layer by layer usually from different sources. Land data from one place, vegetation from another, roads from a third and so on. Each source can be of a different cartographic projection and level of detail meaning that your aggregated map very well can end up with roads in the ocean and country borders that overlap. I really hate cartographic projections by the way. It is old technique when you had to map the round world onto a flat paper. Let the world be round in a computer! Projections just makes it more complex and I want my life simple.  (I guess that is one reason why I love Google Earth)

Since version 2008 you can also put maps into  SQL Server, which opens up new hacking possibilities that I am eager to explore. The problem, however, is that most of the map data out there is in a different format.
The industry standard in map data is the Shape File used in ArcGIS, nowadays in competition with KMZ used by Google.

SQL Server supports none of these.

What's worse, I haven't really found any easy free way to import data. I installed GeoKettle after first having to learn how to install a .jar file with adminstrator rights. When I finally got it started, I didn't really understand it. I just want to import some data and I don't want a GUI that is more complex than choosing a source and a destination. I checked out FME, which is good but not free, but they can just read from SQL Server Spatial data. I need to write. I finally ended up with OGR2OGR from GDAL, which means using the good old DOS prompt instead of same fancy GUI. Not a simple solution (I even had to compile it myself) but I guess I'll just have to try it as soon as my huge data set comes in. This link made it seem promising. So hopefully I will soon control the world! :)