Thursday, March 21, 2013

GIS metadata standard deficiencies

My recent post on the GIS standards dilemma generated quite a lot of interest so, as a follow-up, I am publishing today a post that explains my position in more detail and illustrates the deficiencies of one of the standards with concrete examples.

The conception of a GIS metadata standard was a long-awaited breakthrough and raised the hopes of the entire spatial community that it would finally be possible to describe, in a consistent way, not only the vast amount of geographic information created over the years but also the new data generated on a daily basis. The expected benefits of the standard were far-reaching, because it would allow consistent cataloguing, discovery and sharing of all that information. In other words, if successfully implemented, it would deliver a great economic benefit for all. The concept of the Spatial Data Infrastructure (SDI) was born… That was more than a decade ago.

Fast forward to 2013. The creation of an SDI has been a holy grail of the GIS community for quite a long time, so why, despite all the good intentions and millions of dollars poured into various initiatives, do we still not have one in Australia? Why do we not even have in place the first element of that infrastructure – a single catalogue of spatial data? In my opinion the answer is simple – because we are trying to act on a flawed concept.

At the core of the problem is a conceptual flaw in the underlying metadata standard that makes it impossible to successfully implement any nationwide or international SDI. In other words, the SDI concept will never work beyond a closely controlled community of interest with a “dictatorship-like” implementation of rules that go far beyond the loosely defined standards. Until that flaw is widely acknowledged we cannot move forward. Any attempt to build a national SDI, or even a simple catalogue, based on the flawed ISO 19115 standard is bound to fail and is a total waste of money. The reason why follows...

For years many were led to believe that if you “follow the standard, everything will take care of itself”. But a reality check provides a totally different picture. For a start, it took years to formalise the Australian profile of the ISO 19115 standard. Then everybody started working on their own extensions, because it turned out to be quite hard to implement the standard in a meaningful way for all data types, as well as for historical data, for which many details are simply not known. But the true nature of the problem lies somewhere else...

You see, the standard prescribes the structure of the metadata record – that is, what information should be included – but to a large degree it does not mandate the content. The result is a “free text”-like entry for almost everything included in a metadata record. Just to illustrate: access constraints are specified as “legal” and “use” related, and both are limited to the following categories: “copyright, patent, patentPending, trademark, license, intellectualPropertyRights, restricted, otherConstraints”. But the information is optional, so that metadata element may also be empty. Now consider a user who tries to find free data… impossible.
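To make this concrete, here is a minimal Python sketch of how a catalogue harvester might read the access-constraint element. The XML fragment is a simplified, namespace-free illustration (real ISO 19115 records use the gmd/gco namespaces), but it shows the point: a record that omits the optional element entirely is just as “compliant” as one that fills it in.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free sketch of the constraint element in a
# metadata record (real ISO 19115 XML uses gmd/gco namespaces).
record_with_constraint = """
<MD_Metadata>
  <resourceConstraints>
    <MD_LegalConstraints>
      <accessConstraints codeListValue="copyright"/>
    </MD_LegalConstraints>
  </resourceConstraints>
</MD_Metadata>
"""

# The same record with the optional element omitted entirely -
# still a valid record, but now silent on access conditions.
record_without_constraint = "<MD_Metadata></MD_Metadata>"

def access_constraints(xml_text):
    """Return the list of access-constraint codes found in a record."""
    root = ET.fromstring(xml_text)
    return [el.get("codeListValue")
            for el in root.iter("accessConstraints")]

print(access_constraints(record_with_constraint))     # ['copyright']
print(access_constraints(record_without_constraint))  # []
```

A search for “free data” has nothing to latch onto in the second record – the harvester cannot tell “unrestricted” apart from “never filled in”.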

The inclusion of so much free text means the key benefit of creating a structured metadata record in the first place is almost entirely lost. Yes, the record describes the dataset it refers to, but in a totally unique way, which means searching a collection of records can be limited only to very generic criteria – in practice, with any certainty, only to time and location (i.e. a bounding box for the dataset). The problem is compounded if you start looking across different collections of metadata records, created and maintained by different individuals with different ideas of what is important and what is not… But don’t blame the creators of the metadata records for this – the standard does not prescribe the content in the first place!
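In practice, the only query that works reliably across catalogues looks something like the following Python sketch – a date cut-off plus a bounding-box overlap test. The record fields and sample values are invented for illustration:

```python
# A minimal sketch of the only search that works reliably across
# catalogues: filtering records by date and bounding-box overlap.

def bbox_intersects(a, b):
    """True if two (west, south, east, north) boxes overlap."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

records = [
    {"title": "TOPO-250K Series 3", "year": 2006,
     "bbox": (112.0, -44.0, 154.0, -9.0)},   # all of Australia
    {"title": "Victorian imagery",  "year": 2012,
     "bbox": (141.0, -39.2, 150.0, -34.0)},
]

# "Anything covering Bendigo, Vic, from 2010 onwards"
bendigo = (144.2, -36.8, 144.4, -36.7)
hits = [r["title"] for r in records
        if r["year"] >= 2010 and bbox_intersects(r["bbox"], bendigo)]
print(hits)  # ['Victorian imagery']
```

Anything beyond time and location – accuracy, licence, resolution – is buried in free text and cannot be filtered this way.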

The second problem is that the current metadata standard is applied primarily to collections (like, for example, the TOPO-250K Series 3 topographic vector data for Australia, or its raster representation) but is generally not applied to individual data layers within a collection (which, in the case of the TOPO-250K Series 3 data, would be any of the 92 layers that comprise the collection). Therefore, a simple search for, say, “road vector data in Australia” will not yield any results unless you revert to the free-text search option and “roads” happens to be specifically mentioned somewhere within a metadata record (more on this below).
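The collection-level problem can be sketched in a few lines of Python. The catalogue entry below is invented for illustration, but it mirrors the situation: one record for the whole collection, so a layer-level term finds nothing unless it happens to appear in the free text:

```python
# One record per collection: a layer-level search finds nothing unless
# the layer name happens to appear somewhere in the free text.
# Abstract and keyword values here are made up for illustration.

catalogue = [
    {"title": "TOPO-250K Series 3",
     "abstract": "Topographic vector data for Australia in 92 layers.",
     "keywords": ["topography", "vector", "Australia"]},
]

def search(term):
    """Naive free-text search over title, abstract and keywords."""
    term = term.lower()
    return [r["title"] for r in catalogue
            if term in r["title"].lower()
            or term in r["abstract"].lower()
            or any(term in k.lower() for k in r["keywords"])]

print(search("roads"))        # [] - the roads layer exists but is invisible
print(search("topographic"))  # ['TOPO-250K Series 3']
```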

Not to mention that it would be almost impossible, from a practical point of view, to apply metadata in the existing format to individual features, or to the points making up a feature. This aspect of information about spatial data, especially important for data originators and maintainers, has been totally overlooked by the creators of the metadata standard.

Then there is the data user’s perspective. The key benefit of a comprehensive metadata record is that it provides all the relevant information enabling the user, firstly, to find the data and, secondly, to decide whether it is fit for the intended purpose. In the most general terms, users apply “when, where and what” criteria to find data (not necessarily in that specific order). In particular, they specify the reference date (relatively well defined in metadata records, so the least of the problems), the location (limited to a bounding box, although a data-footprint concept is also addressed within the existing metadata standard) and some characteristics of the dataset… and this is where things are not so great, because each data type has its own set of characteristics and these are mostly optional in an ISO 19115 compliant metadata record (so they may not be implemented at all by data providers).

Take, for example, users who are interested in “a 2m-accuracy roads dataset for Bendigo, Vic”… or “imagery over Campbelltown, NSW, acquired no later than 3 months ago and with under 1m resolution”. It is virtually impossible to specify search criteria in this way, so users have to fit their criteria to the information that is captured in the metadata. That is, location becomes a bounding-box constraint, the time criterion becomes a date constraint (either a specific date or a from–to range) and the characteristics of datasets can only be specified as keywords…

And this leads me to the final point: the need for ISO 19115 compliant metadata in the first place. Since the only truly comprehensive way to find what you are looking for is to conduct a free-text search, the structured content of the metadata record is redundant. The result would be exactly the same if the information were compiled into “a few paragraphs of text”. That is the essence of the argument Ed Parsons, Geospatial Technologist at Google, presented to the Australian spatial community as far back as 2009, but which remains mostly ignored to this day…

There is only one practical use for all the metadata records already created. You can dump the entire content of the catalogues – the ones that contain information about the data you care about – onto your own server and reprocess it to your liking into something more meaningful, or just expose it to Google’s robots so the content can be indexed and becomes discoverable via a standard Google search. Unfortunately, this totally defeats another implied benefit of an SDI – that metadata records will be maintained and updated at the source and that there will be no need for duplication of information…

I believe it is time to close the chapter on a national SDI and move on. Another failed attempt to create “an infrastructure that will serve all users in Australia” cannot be reasonably justified. The bar has to be lowered to cater only for the needs of your own community of practice. Which also means you have to do it all by yourself, and according to your own rules (i.e. most likely creating your own metadata standard). That is the only way to move forward.

Related Posts:
Ed Parsons on Spatial Data Infrastructure
Data overload makes SDI obsolete

GIS standards dilemma

Wednesday, March 20, 2013

New attempt to build disaster management platform

A few days ago the University of Melbourne, IBM and NICTA announced a joint project to develop the Australia Disaster Management Platform (ADMP). The aim is to build “…an innovative, integrated, open standards-based disaster management platform designed to gather, integrate and analyse vast amounts of geo-spatial and infrastructure information from multiple data sets to create real-time practical information streams on disaster events.” The platform is expected to include 3D visualisation, simulation, forecasting, behavioural modelling and sensors. It is quite an ambitious undertaking, but given the profile of the institutions involved some good should come of it, eventually – the initial pilot will focus on Melbourne and will only be a proof of concept.

The announcement indicates that the ADMP will be developed and implemented in close collaboration with emergency services, and will be based on existing roadmaps such as the Victorian Emergency Management Reform White Paper (December 2012). The platform will then facilitate informed decision-making by communicating information, via various channels and at appropriate levels of detail, to the wide spectrum of people involved in making emergency decisions – from the central coordinating agencies charged with directing activities, to on-ground emergency services personnel, through to the local community.

The comments published under the Sydney Morning Herald article announcing the project highlight the community’s initial scepticism towards this initiative. It probably reflects a general perception of a lack of progress on the disaster mitigation and response front since the 2009 Victorian bushfires. To be fair, a lot has happened since then, and emergency response organisations have learnt and improved a great deal, but it appears members of the community feel that still not enough has been done to date.

Related Posts
Disasters and maps
Google public alerts map

Thursday, March 14, 2013

GIS standards dilemma

Standards underpin the entire discipline of geomatics, yet they are also the cause of the biggest failures when applied without much insight into their limitations.

Just imagine trying to create a map by overlaying two data layers with different and unspecified datums and projections. It is simply not possible. Datums and projections are examples of practical standards that work, because they perform a very important function in the capture, management and use of spatial information.

Yet there are also many cases where standards are an obstacle – for example, when people take them too literally. Many standards published by the Open Geospatial Consortium (OGC) for data interchange fall into this category (I have to declare here that I have been an avid critic of the inappropriate use of spatial standards for a long time). Take, for example, GML, which defines geographic data structures – conceptually it is great, but when was the last time you downloaded or exchanged spatial data in this format? Never? I thought so…

Web Feature Service (WFS) is another standard that, in my opinion, failed to deliver. It was designed for accessing and/or updating data stored in dispersed databases via a common exchange protocol. True, you can quote “hundreds of millions of dollars” in projects implementing systems that utilise WFS, but beyond closely gated communities with strict internal implementation rules, these are not part of a “global spatial data exchange”, which simply does not exist despite more than a decade of concerted effort by many, many individuals and organisations. At the end of the day, people still prefer to get the data in “good old” CSV or SHP format… I am trivialising the whole issue, but you get the point.

The spatial metadata standard is another example of a great concept (in theory) that failed to deliver in practice. The standard is too complex to implement, and the categorisations used are open to interpretation, so even if a particular record is “compliant” you cannot assume the information will be compatible with a record compiled for a similar dataset on another system. I am sorry to say, but that is why any Spatial Data Infrastructure project based on the ISO 19115 Metadata Standard is doomed to fail from the start…

Human nature is such that we are drawn to simple things because of their practicality, yet when we design things by committee they tend to be bloated with complexity due to an ever-growing list of requirements. This is the case with many spatial standards…

I still remember the hype about WFS and GML at conferences and presentations about interoperability a decade or so ago (funny how quickly that word fell off the vocabulary list), and how the “dumb image” Web Map Service (WMS) was downplayed to the extent that it was seen as an inferior solution not worth implementing. Yet an even dumber solution from Google (i.e. a static, tile-based representation of spatial data) succeeded as the dominant online mapping approach (every respectable GIS software vendor now offers tiled maps as part of the standard package). The promoters totally ignored a much simpler format for data transfer offered by the Simple Features (SFS) standard, which would have had a much better chance of being widely accepted, opting instead for “the biggest thing in town” – WFS.

However, despite all the efforts and good intentions, the non-GIS-centric rest of the world didn’t buy into the arguments and invented alternatives such as RESTful services and the GeoRSS and GeoJSON formats for the transfer of spatial data – in order to address specific requirements and, most importantly, to keep things simple! The OpenStreetMap project (which has grown to the extent that it now contains roads and topographic features for almost the entire world) invented its own spatial data structure instead of following the officially sanctioned GML format. Meanwhile, Google pushed its own spatial data format, KML, which was created specifically for Google’s 3D mapping application (KML was subsequently handed to the OGC for ongoing management). All of these became de facto standards – by acceptance, not by design.
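To see why GeoJSON won on simplicity, here is a complete, valid GeoJSON Feature in plain Python. The sample coordinates and properties are mine, but the structure is the whole story – there is nothing more to it:

```python
import json

# A complete GeoJSON Feature: geometry plus free-form properties.
# Sample point and attributes are invented for illustration.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [151.21, -33.87]},
    "properties": {"name": "Sydney"},
}

# Round-trip through plain JSON to show any web stack can consume it.
text = json.dumps(feature)
print(json.loads(text)["geometry"]["type"])  # Point
```

Compare that to the page count of the GML specification and the “acceptance, not design” point makes itself.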

So, should standards be followed or ignored? The key message I am trying to convey is that standards should be used for guidance only, and implemented when you can gain an advantage by following them. But standards should be ignored when they are an obstacle. For example, persisting with the implementation of solutions incorporating WFS, WCS, CSW and other OGC standards for backend processes, just for the sake of “compliance”, totally misses the intention of the original creators of those standards. Pick the best bits and create your own “standards” if it gets you to the finish line faster and/or delivers extra benefits. Sure, if there is demand for your data in WFS or a similar format you should cater for that need, but it should not stop you from implementing a backend architecture that is optimal for your specific requirements, regardless of whether it is “compliant” or not.

All in all, forget the standards if they prevent you from creating better solutions, and create your own instead. But use existing standards when they are fit for purpose, because there is no need to reinvent the wheel… Common sense, just common sense.

Related Posts:
Ed Parsons on Spatial Data Infrastructure
Data overload makes SDI obsolete 
What's the benefit of gov data warehouses?

Monday, March 11, 2013

Point cloud 3D map technology

Progress in online mapping technologies happens in leaps and bounds, and then it stalls for a while until the next breakthrough energises a new cohort of followers and imitators. It has been eight years since Google introduced Google Maps and Google Earth to the world. Since then there has been good progress with “flat mapping” applications, but the 3D branch did not progress at the same pace...

A few years back a UK online directory was the first major internet portal I came across that deployed point cloud 3D mapping technology. It was quite an impressive application and certainly worked well in a browser without any plug-ins. For a moment it looked as though this technology could be the next catalyst for rapid advancement in presenting spatial data in three dimensions. The UK government even termed 3D point cloud spatial capability “a map of the future”. I write about it in the past tense because you can’t find the application on any more (although you can probably still find some old YouTube videos demonstrating its capability). So, is now using the “old and boring” Google Map instead, and its users are deprived of quite an innovative 3D visualisation capability.

Little did I know that the application was actually developed by a Swedish company, C3 Technologies, an offshoot of SAAB. When the company was acquired by Apple, the application was pulled from the market – presumably to be improved and re-released at some later date.

But that capability is not entirely lost to the rest of the world. It is not clear whether Apple sold the technology on or not, but Nokia has now emerged as a primary user of the C3 Technologies mapping capability in its 3D online maps.

Also, an Australian company, Euclideon, is now offering an SDK with a very similar 3D point cloud capability to anyone who would like to integrate it with their own spatial software. The technology was long in development and has had its fair share of sceptics – it was first unveiled in 2003 and was initially aimed at the interactive gaming market, but it turns out it is better suited to spatial applications. In 2010 Euclideon received a $2 million grant, the largest awarded by the Australian Federal Government under the Commercialisation Australia initiative, to take this technology to the market. Hopefully it is only a matter of time before a new generation of 3D point cloud based maps starts to appear on the web. The first taker of the Euclideon technology is AEROmetrex with aero3D Pro, but it is only a desktop version...

Related Posts:
Map of the future

Apple 3D mapping quest

Sunday, March 3, 2013

AllHomes property map

Australian real estate portals do contain maps to aid in searching property listings and in market research, but these are nowhere near the sophistication of some of their overseas counterparts. Amongst the home-grown varieties worth mentioning is the AllHomes property portal – by far the most popular with buyers as well as agents in the Australian Capital Territory. The design and functionality are pretty basic, but the site contains information that those researching the local market will find of great interest.

In particular, the site provides past sales records for entire suburbs as well as annual median prices for the last 15 years. There is also a simple mapping application that comes with optional base layers depicting planning zones and easements (very useful if you are looking for properties for redevelopment), unimproved land values for individual parcels (useful for comparisons), and the locations of schools and school zone boundaries (primary, secondary and colleges). The site offers a basic yet quite attractive presentation of key data that will help in making an informed decision.

Related Posts:
Presenting property prices on maps
Map of Melbourne house prices
Sydney house prices
WA housing affordability index