Like probably a lot of people, I recently converted to “thinner clients”. Most of my geoprocessing is now done on a server or on my office workstation, which I access via ssh. Naturally, after having been in the geo-whatever business for ten years, I have a lot of data lying around (some of which I am probably not allowed to re-use any more, but I still keep it around for no reason in particular), and this data is lying around in numerous locations. First, there are literally a dozen directories on my laptop’s disk, holding data that is unsorted, slightly sorted according to one ordering system or another, or prepared to be put into an archive and handed over to friends or colleagues. Then, there are a number of project directories residing on the same disk, each containing more or less geodata. The same goes for my various servers and the two computers I have access to at work. And of course, apart from the real backup disks, there is a bunch of external drives which might contain project directories or small sub-collections of geodata on one topic or another.
Enough said about the starting point of a long and slow effort I am currently going through: I’m consolidating my geodata into one place. I hope to eliminate a lot of duplicates and, at the end of the day, be able to access the data easily and quickly from my laptop, my tablet, or any of my colleagues’ or friends’ places. I started with the vector data, because I feel raster data is an entirely different, much more difficult topic with more and different constraints to be met. For the vector data, I decided on a PostGIS database. It gives me read/write access to the data via an SSL-encrypted connection, QGIS has more than decent support for it, and I can set up accounts for clients, friends or colleagues that can access only parts of the data. I can embed those accounts directly into the QGIS project files, so that nobody has to configure anything.
PostGIS comes with a handy shp2pgsql tool to import data into the database, which unfortunately lacks one thing: it cannot determine the imported shapefile’s spatial reference. You have to supply the SRID on the command line (via the -s option) unless you want it to be set to 0.
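To make that concrete, here is a small sketch of how the SRID can be passed to shp2pgsql and the resulting SQL piped into psql. The function names, the file and table names and the database name are all placeholders I made up for illustration; -s (SRID) and -I (create a spatial index) are regular shp2pgsql options, and psql is assumed to find its connection settings in the environment.

```python
import subprocess


def shp2pgsql_command(shapefile, table, srid):
    """Build the shp2pgsql call; -s sets the SRID, -I adds a spatial index."""
    return ["shp2pgsql", "-s", str(srid), "-I", shapefile, table]


def import_shapefile(shapefile, table, srid, database):
    """Pipe the SQL that shp2pgsql writes to stdout into psql."""
    shp = subprocess.Popen(
        shp2pgsql_command(shapefile, table, srid),
        stdout=subprocess.PIPE,
    )
    # psql reads the generated SQL from shp2pgsql's stdout
    subprocess.run(["psql", "-q", "-d", database], stdin=shp.stdout, check=True)
    shp.stdout.close()
    if shp.wait() != 0:
        raise RuntimeError("shp2pgsql failed for " + shapefile)
```

Calling import_shapefile("roads.shp", "roads", 4326, "geodata") would then load a (hypothetical) roads.shp as WGS 84 data. The annoying part is knowing the 4326 for each and every file.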
Finding out the EPSG code is easy-peasy to accomplish with a little Python script employing GDAL/OGR 🙂
The script accepts one parameter, which should be the path to a vector dataset (such as a shapefile). It returns an EPSG code (or nothing in case it cannot resolve the embedded SRS to EPSG). That’s perfectly simple, and at the same time incredibly convenient if you have a bash script which sorts or imports or renames or [insert your use case here] a bunch of vector geodata files.
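For illustration, here is a minimal sketch of how such a lookup can be written with the GDAL/OGR Python bindings. It is not necessarily the code from the repository: the names get_epsg and main are made up for this example, and it assumes that only the first layer of the dataset matters.

```python
import sys


def get_epsg(path):
    """Return the EPSG code of the dataset's first layer, or None."""
    # Imported here so the usage check in main() also works without GDAL
    from osgeo import ogr

    ogr.UseExceptions()
    try:
        dataset = ogr.Open(path)
        layer = dataset.GetLayer(0) if dataset is not None else None
        srs = layer.GetSpatialRef() if layer is not None else None
    except RuntimeError:
        # With exceptions enabled, GDAL reports unreadable files this way
        return None
    if srs is None:
        return None
    try:
        # Tries to attach EPSG authority codes where the SRS is recognisable
        srs.AutoIdentifyEPSG()
    except RuntimeError:
        pass  # not identifiable; fall through to the authority check
    if srs.GetAuthorityName(None) != "EPSG":
        return None
    return srs.GetAuthorityCode(None)


def main(argv):
    """Print the EPSG code (or nothing) for the file named in argv[1]."""
    if len(argv) != 2:
        print("usage: " + argv[0] + " VECTOR_FILE", file=sys.stderr)
        return 2
    epsg = get_epsg(argv[1])
    if epsg is not None:
        print(epsg)
    return 0
```

To use the sketch as a command, save it under a name of your choice and add sys.exit(main(sys.argv)) as its last line. A bash script can then capture the printed code and hand it to shp2pgsql via -s.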
You’ll find the source code, as always under your favourite open source license (I would prefer if you used GPLv2 or MIT), in my Bitbucket repository at https://bitbucket.org/christophfink/getepsg/. Have fun, use it, and leave me a comment if you do 🙂