Managing Regional Spatial Data
I had a bit of time to think about how regional GIS data is managed in the context of a national system. That is, how national data sets that are split between regional teams are best controlled and potentially, eventually, merged, or at least conceptualised as a single data set. Some time ago I wrote a requirements document on how a regional data store system could work. There are a couple of bogeymen at work here:
- Some regional data sets are captured to local data standards (i.e. locally decided attributes, locally decided positional accuracy)
- Some regional data sets are captured to national data standards
That’s just the data side. Add to that the presentation picture:
- Sometimes, users want to see regional portions of a data set
- Sometimes, users want to see the whole national picture, and expect seamlessness around regional boundaries
This does, indeed, cause problems. Clearly, national standards are the way forward, but they aren’t always available, or haven’t been decided yet, or, more often than not, regional groups have done it ‘their way’. There are obviously a number of choices here:
- Make all the data sets conform to a nationally agreed standard post-event, which might be a subset of the ‘best bits’ of each
- Start again with the national standard that you really wanted everyone to capture to
- Don’t bother with standards
If you were to choose the last of these, it might just be business-reasonable if it’s a low-value data set. However, this implies that you don’t want to see the data in its seamless entirety, or that you’re happy with a Cartesian join across all regions that might just give you a massive (and potentially duplicate-bearing) attribute set.
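To make the ‘massive attribute set’ point concrete, here is a minimal Python sketch of what a no-standards merge tends to look like. The region names, attribute names and the naive_merge helper are all made up for illustration, and it simply stacks the records rather than performing a true join, but the effect on the attribute list is the same: every local name for the same thing survives as its own column.

```python
# Hypothetical regional records: the same road, captured with local attribute names.
north = [{"id": 1, "road_name": "A1", "surface": "asphalt"}]
south = [{"id": 1, "name": "A1", "surf_type": "tarmac", "lanes": 2}]


def naive_merge(*regions):
    """Stack regional records and pad each one out to the union of all attribute names."""
    columns = []
    for region in regions:
        for record in region:
            for key in record:
                if key not in columns:
                    columns.append(key)
    # Attributes a region never captured are filled with None.
    rows = [
        {col: record.get(col) for col in columns}
        for region in regions
        for record in region
    ]
    return columns, rows


columns, rows = naive_merge(north, south)
print(columns)
```

Two regions and a handful of attributes already give you two columns for the road name and two for the surface, plus clashing ids. Scale that up to every region and you can see why this only suits a low-value data set.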
Don’t forget edge matching
And then, of course, there’s the seamless boundary creation. If data has been created by different folk in different regions, you’re bound to get some edge-match issues. You’ve got your regional boundaries as templates, so they’re going to be useful for clipping the edges to where they’re supposed to be. You might also choose to extend certain features to the boundary where required, so as to avoid gaps. Where linear features don’t meet across a boundary, you have to pick a strategy: join at the mid-point, or go with one region’s geometry over the other’s.
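The two strategies for mismatched line ends can be sketched in a few lines of plain Python. This is only an illustration under some assumptions: coordinates are simple (x, y) tuples in a projected system, the match_endpoints function and its tolerance are hypothetical, and it doesn’t snap the result onto the regional boundary itself, which a real workflow would probably want to do.

```python
import math


def match_endpoints(end_a, end_b, tolerance, strategy="midpoint"):
    """Return a shared join point for two line ends, or None if they are too far apart.

    end_a is the end of a line from region A, end_b the end from region B.
    """
    gap = math.dist(end_a, end_b)
    if gap > tolerance:
        return None  # too far apart to assume they are the same feature
    if strategy == "midpoint":
        return ((end_a[0] + end_b[0]) / 2, (end_a[1] + end_b[1]) / 2)
    if strategy == "prefer_a":
        return end_a  # region A's geometry wins
    if strategy == "prefer_b":
        return end_b  # region B's geometry wins
    raise ValueError(f"unknown strategy: {strategy}")


# Two road ends that should meet at the boundary but sit 2 units apart.
print(match_endpoints((100.0, 50.0), (100.0, 52.0), tolerance=5.0))
print(match_endpoints((100.0, 50.0), (100.0, 52.0), tolerance=5.0, strategy="prefer_a"))
```

The tolerance is the business decision here: set it too tight and real mismatches stay as gaps, too loose and you start joining features that were never meant to meet.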
The practical option, in this case, is to conform to a best-shot standard. After all, that’s what data cleansing is about. I’d go as far as saying that 1Spatial’s Radius Studio is probably your best friend on this one because, if you’ve got more than one copy and want to do some spatial checking on the way in, you’re probably going to go down that route.
Is it justifiable? Well, again, it depends on the value of the data set. If you’re going for something that needs to be nice and clean (as it should; don’t forget we NEED data quality out there), you should be coming up with a master schema that takes in the best elements of each, perhaps with a bit of massaging and a bit of edge cleaning.
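As a rough idea of what ‘a master schema with a bit of massaging’ might look like in practice, here is a small Python sketch. Everything in it is assumed for illustration: the master field names, the per-region field maps, the value lookup and the conform function are hypothetical, not taken from any real national standard or tool.

```python
# Hypothetical master schema: the 'best bits' agreed across the regions.
MASTER_FIELDS = ("name", "surface")

# Hypothetical mapping from each region's local attribute names to the master names.
FIELD_MAPS = {
    "north": {"road_name": "name", "surface": "surface"},
    "south": {"name": "name", "surf_type": "surface"},
}

# The 'massaging': local codes translated to the agreed master values.
VALUE_MAPS = {"surface": {"tarmac": "asphalt"}}


def conform(record, region):
    """Translate one regional record into the master schema, dropping unmapped attributes."""
    out = {"region": region}
    out.update({field: None for field in MASTER_FIELDS})
    for local_name, master_name in FIELD_MAPS[region].items():
        if local_name in record:
            value = record[local_name]
            out[master_name] = VALUE_MAPS.get(master_name, {}).get(value, value)
    return out


print(conform({"road_name": "A1", "surface": "asphalt"}, "north"))
print(conform({"name": "A1", "surf_type": "tarmac", "lanes": 2}, "south"))
```

Both records come out with the same fields and the same surface value, and the locally decided extras (like the lane count) are dropped. Whether an extra is dropped or promoted into the master schema is exactly the ‘best bits’ decision you have to make per data set.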
But voilà, you can come up with a great number of data sets that look as though one person managed the whole country’s load in their spare time.