Cascading and Federated WFS and the Concept of Geolinking

As many people have pointed out, especially here in Canada, there is a great deal of geographic or geographically related information that does not reside in spatial or GIS databases. Nonetheless, there is a need to link this information with associated geospatial entities (e.g. administrative or jurisdictional boundaries) for the purposes of spatial analysis and map display. In fact, a geolinking service has been proposed at the Open Geospatial Consortium (OGC) for just this purpose.

An apparently unrelated issue is that of modeling geographic features and supporting the resulting models in a Web Feature Service (WFS). One organization might, for example, model a road as a generic “RoadWay” and define specific subtypes for “Street”, “Highway”, and “Expressway”, while another might simply add a classification attribute to the “RoadWay”. Clearly the two models are not equivalent, but they are very similar.

Another apparently unrelated issue is that of traditional conflation. In this case you may have two different descriptions of a building, with different geometric and non-spatial properties. In conflating these two descriptions, you might like to use the geometry from one description and the spatial and non-spatial properties of the other.

How are these three issues (geolinking, model representation, and conflation) related to each other? And what has this got to do with the WFS (Web Feature Service)?

For a quick review, a WFS is a web service that provides query and transactional (insert/update/delete) access to geospatial data using XML messages, in a manner that is vendor neutral and that hides the underlying data store (e.g. storage technology, schemas, etc.). Most WFS implementations use a client-server architecture, and many even employ RPC (Remote Procedure Call), but this is not strictly required. REST-based architectures are not inconsistent with the WFS specification.
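As a rough illustration of the request side, the sketch below builds key-value-pair (KVP) request URLs of the kind a WFS can accept over HTTP GET. The endpoint http://example.org/wfs, the helper name build_wfs_url, and the choice of version 1.1.0 are assumptions made for illustration only; nothing is sent over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_wfs_url(endpoint, request, **params):
    """Build a KVP-encoded WFS request URL (hypothetical helper)."""
    query = {"service": "WFS", "version": "1.1.0", "request": request}
    query.update(params)  # operation-specific parameters, e.g. typeName
    return endpoint + "?" + urlencode(query)


# Hypothetical endpoint; the URLs are only constructed, never fetched.
ENDPOINT = "http://example.org/wfs"

capabilities_url = build_wfs_url(ENDPOINT, "GetCapabilities")
describe_url = build_wfs_url(ENDPOINT, "DescribeFeatureType", typeName="RoadWay")
feature_url = build_wfs_url(ENDPOINT, "GetFeature", typeName="RoadWay")

# The client sees only the service interface, not the underlying data store.
print(parse_qs(urlsplit(feature_url).query))
```

The same three operations (GetCapabilities, DescribeFeatureType, GetFeature) are the ones a cascading service would have to re-expose.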

Let’s start with the idea of geolinking. To make matters more concrete, assume that we have two databases. The first is a relational database containing population data (birth rates, mortality rates, and current populations) for a variety of jurisdictional entities (e.g. cities, municipalities, provinces, counties, states, etc.). The schema of this database is as follows:

Jurisdiction | Jurisdiction Type | Birth Rate (births/year) | Mortality (deaths/year) | Population (current)

A sample fragment from the database table might then look like:

Jurisdiction | Jurisdiction Type | Birth Rate (births/year) | Mortality (deaths/year) | Population (current)
Niagara      | County            | 10.62                    | 6.81                    |
Welland      | Municipality      | 10.7                     | 6.9                     | 51,275

Completely separate from this database (i.e. located in a different data store and likely managed by a different organization) is a database that contains the boundaries or extents of the jurisdictional entities, for example the Province of Ontario (the Canadian province containing Niagara and Welland). Assume that this database provides the spatial extent, expressed as a polygon, for all of the jurisdictional entities in Canada, and that there are features defined for Municipalities, Counties and Provinces, with each instance of these feature types having an ID value (e.g. type = Municipality, ID = “Welland”).

Now let’s proceed to link these two databases together – to geolink them – which is to associate the attributes in the relational database with the geometry in the spatial database.

To begin with, take a feature perspective on the relational database and implicitly assert features, with types defined by the values of the enumerated attribute “Jurisdiction Type” and with the “Jurisdiction” value serving as the local database resource identifier. Such a mapping could readily be supported by installing a WFS on the relational database and suitably configuring the WFS schema mapping (see http://www.galdosinc.com/archives/525 ). Note that this will give rise to a particularly simple GML schema representing the demographic data.

Now let’s also install a WFS on the geospatial database, so that in both cases we can request features by ID and by other properties, using the WFS request protocol.

To link these two datasets there needs to be a special kind of cascading WFS that can perform the needed schema mapping and effect the desired geolinking. This special WFS presents the usual WFS interfaces to the rest of the world, namely the GetCapabilities, DescribeFeatureType, and GetFeature operations. It then translates these operations into further operations against the two WFS instances installed above. A GetCapabilities request to this Cascading WFS would result in GetCapabilities requests to each of the underlying WFS instances, with the Cascading WFS using its mapping rules to create a single Capabilities document in response. For example, it could return a single list of feature types, namely Province, County, and Municipality. If we then issued a DescribeFeatureType request for Municipality, it would return a single application schema for Municipality that combines the spatial information from one WFS with the attribute information from the other (Mortality, Birth Rate, etc.) by doing a “join” on the feature ID. To generate a map of Ontario showing the birth rate by county, a client would make a request to the Cascading WFS, which would in turn translate this request into queries to the underlying WFS instances and effect the required join operation on the returned data.
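The join described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the two back-end services are replaced by in-memory dictionaries, all function and property names (get_feature, cascading_get_feature, birthRate, extent, and so on) are hypothetical, and the polygon coordinates are made-up placeholders rather than real boundaries. The demographic values are the ones from the sample table.

```python
# Stand-in for the WFS installed on the relational (demographic) database.
DEMOGRAPHIC_WFS = {
    ("County", "Niagara"): {"birthRate": 10.62, "mortality": 6.81},
    ("Municipality", "Welland"): {"birthRate": 10.7, "mortality": 6.9,
                                  "population": 51275},
}

# Stand-in for the WFS installed on the spatial database.
# Coordinates are placeholders, NOT real jurisdiction boundaries.
BOUNDARY_WFS = {
    ("Province", "Ontario"): {"extent": [(0, 0), (9, 0), (9, 9), (0, 9), (0, 0)]},
    ("County", "Niagara"): {"extent": [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]},
    ("Municipality", "Welland"): {"extent": [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]},
}


def get_feature(backend, feature_type, feature_id=None):
    """Mimic a GetFeature call: return {id: properties} for one feature type."""
    return {fid: dict(props)
            for (ftype, fid), props in backend.items()
            if ftype == feature_type and feature_id in (None, fid)}


def cascading_get_capabilities():
    """Merge the feature type lists of both back ends into a single list."""
    types = {ftype for ftype, _ in BOUNDARY_WFS}
    types |= {ftype for ftype, _ in DEMOGRAPHIC_WFS}
    return sorted(types)


def cascading_get_feature(feature_type, feature_id=None):
    """Query both back ends and join the results on the feature ID."""
    boundaries = get_feature(BOUNDARY_WFS, feature_type, feature_id)
    demographics = get_feature(DEMOGRAPHIC_WFS, feature_type, feature_id)
    linked = {}
    for fid, spatial_props in boundaries.items():
        merged = dict(spatial_props)               # geometry from one service
        merged.update(demographics.get(fid, {}))   # attributes from the other
        linked[fid] = merged
    return linked


for fid, props in cascading_get_feature("County").items():
    print(fid, props["birthRate"], len(props["extent"]), "boundary points")
```

A real implementation would of course issue WFS requests and merge GML responses instead of reading dictionaries, but the join on feature type and ID would have the same shape.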

Such a specialized cascading or federated WFS can also deal with the issue of variant models for geographic features. Consider two spatial databases for roads, modeled in the two ways discussed earlier. Suppose we now deploy a Cascading WFS which exposes a different road model, namely one with a generic notion of a NavigablePath and with subtypes for Road, Railway, and FerryRoute. For the Road subtype, assume also a “type” attribute specifying the kind of road as an enumerated value, namely Road, Street, Boulevard, Highway, Freeway, or Tollway. The Cascading WFS is then configured to map its feature types to the feature types of the cascaded WFS instances. For example, a Road whose type is “Road”, “Street”, or “Boulevard” is mapped to the feature type “Street” in one database, and to (“RoadWay”, classification=“Street”) in the other. When a client issues a request to the Cascading WFS, the WFS uses its mapping rules to generate queries to the cascaded WFS instances, and then transforms and integrates the responses. With this approach, different models for the road system can be handled using an extended Cascading WFS.
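A minimal sketch of such mapping rules follows. The back-end names subtype_wfs and attribute_wfs, the BackendQuery type, and the translate_query function are all hypothetical. The “Street” mapping is the one given in the text; the “Highway” mapping is an assumption added for illustration, and the remaining enumerated values are deliberately left unmapped.

```python
from typing import NamedTuple, Optional, Tuple


class BackendQuery(NamedTuple):
    """One query to issue against a cascaded WFS (hypothetical structure)."""
    backend: str
    feature_type: str
    property_filter: Optional[Tuple[str, str]] = None  # (property, value)


# Cascading-model road types that both back ends treat as a "Street".
STREET_LIKE = {"Road", "Street", "Boulevard"}


def translate_query(feature_type, road_type):
    """Translate a request on the cascading model into back-end queries."""
    if feature_type != "Road":
        raise ValueError("only the Road subtype is mapped in this sketch")
    if road_type in STREET_LIKE:
        backend_kind = "Street"    # mapping given in the text
    elif road_type == "Highway":
        backend_kind = "Highway"   # ASSUMPTION: not specified in the text
    else:
        raise LookupError("no mapping rule configured for " + road_type)
    return [
        # Back end that models each kind of road as its own feature type.
        BackendQuery("subtype_wfs", backend_kind),
        # Back end that uses a generic RoadWay plus a classification attribute.
        BackendQuery("attribute_wfs", "RoadWay", ("classification", backend_kind)),
    ]


for query in translate_query("Road", "Boulevard"):
    print(query)
```

Keeping the rules as data-like code of this kind makes it plain that the Cascading WFS has to do two things per request: fan the query out, and translate the returned features back into its own model (the second step is not shown here).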

By now it should be apparent that conflation is also something that “could” be handled by a suitably configured Cascading WFS. Of course, this discussion has glossed over issues of performance and the complexity of the mappings involved, some of which will clearly require numerical, string or even geometric transformations. Nonetheless, it makes sense to think of these three different issues as related to particular Cascading WFS implementations, each performing data translation in addition to cascading of requests, and then to explore the needed types of translations. This will come in a future blog post.

Ron Lake, Chairman and CEO, Galdos Systems Inc.

Davos of Geo in Vancouver

GeoWeb 2008 is coming soon! This year is shaping up as a kind of Davos of Geo: a meeting of the leaders in geographic information systems and in particular the GeoWeb.

Such a meeting of leaders is essential to forge new business alliances, to focus attention on the concept of the GeoWeb, and to work together for its realization. The GeoWeb, the global and local integration of spatial information systems, is an emerging reality.

The GeoWeb of course means different things to different people. For some, it is a specialization of the semantic web. For others, it is the aggregation of data for search engines and regional or national governments. For still others, the GeoWeb is a kind of accounting system for the planet, providing visibility into the state of all things connected with the earth. GeoWeb is all of these things, and the GeoWeb conference is an opportunity to move these ideas forward, to forge new business relationships, and to network with your colleagues.

This year's GeoWeb will host a $22,500 Student Contest (see http://geowebconference.org/students-academia/contest-information) to stimulate GeoWeb software and theoretical developments that lead to the evolution of the GeoWeb. The contest is sponsored by Galdos Systems, Inc., Google, OSGEO and the FGDC. Other sponsors are welcome to participate. Students interested in this contest may also be interested in the Model Your Campus contest at Google (see http://contest.sketchup.com/intl/en/index.php).

This year's GeoWeb also features three guest speakers, namely Michael Goodchild (University of California, Santa Barbara), Kimon Onuma, FAIA (Onuma, Inc.) and Michael Kay (Saxonica). These speakers will also be providing three of the dozen or more workshops that will be featured at the event.

This year's GeoWeb also has keynotes from Michael Jones (Google) and Alex Miller (ESRI).

Excellent technical presentations have been a hallmark of GeoWeb, and based on the abstracts already submitted this year will be no different. There is still time to get your abstract in: submissions close on March 7th, 2008 (see http://geowebconference.org/papers-workshops/call-for-papers).

GeoWeb 2008 will have 3D/BIM/CAD/GIS integration as a major theme as we build toward GeoWeb 2009 Cityscapes!

If you are a leader in Geo or want to become a leader, come and make your voice heard at GeoWeb 2008! More about that Davos idea: as in Davos, we will have an onsite film crew that will interview people on the issue of the day, announced each morning. Interviews will be posted each evening on a GeoWeb YouTube channel.

The fireworks are back – the venue is fantastic – it is Vancouver after all!! See you there, July 21-25, 2008.

Ron Lake, Chairman and CEO, Galdos Systems, Inc.

KML as Observations: (KML to GML and back again!)

KML as a mapping language is clearly gaining momentum around the world. It is used for map display not only in web sites like Google Maps and Virtual Earth, but also in a range of more conventional mapping and GIS products. Web Map Servers (OGC) can now provide maps in KML as well as in more traditional image formats like GIF or JPG. This posting looks at the use of KML in a slightly different context, namely that of creating observations.

We motivate the discussion by considering the use of probe cars to digitize highway and road infrastructure. An increasing number of vehicles are equipped today with navigation devices, so-called “GPS” units that provide on-board map displays and guide you from one destination to another. Of course, such units can also be used to capture the vehicle's position for an external data collector, and in doing so capture aspects of the geometry of the road, or the current speed of the traffic where the vehicle is located. This raises an obvious question: what is the relationship between the track recorded by the GPS unit in the vehicle and the road features that populate the database on which the navigation unit’s map display is based?

We assert that the GPS track, however recorded, is an observation and that such observations can be used to generate or update the feature model of the road network. Furthermore we propose that KML can be used as a means to capture such observations.

Observations (see the GML specification) model the “act of observing or measuring”. As a result, an observation has specific properties such as the time of the observing, possibly the location of the observer (which is the result in this case), the result of the observing (e.g. vehicle speed, vehicle location), the target or subject of the observing, and the instrument or procedure that is employed. Now we should be clear that no such construct currently exists in KML. There is no supported means to distinguish an observation from the styling of a feature (e.g. the road) for presentation. Nonetheless, the idea of using KML to capture this information is attractive, since we can anticipate the use of KML for map display in future navigation devices. So how can it be done?

Assuming that we do not make any extensions to KML (more about that in a future blog post), how should we represent an observation? One approach is to use the KML ExtendedData element and add some elements from GML. Specifically, inside ExtendedData we add elements from the GML observation schema, namely Observation, target, and using. Note that an application schema extending Observation could have been created if additional user properties, such as observer name or description, were desired. To simplify matters we will stick to the core elements of GML. Note that we do not include gml:validTime or gml:location in the observation, as in this case these are captured by the KML encoding of the track as a Placemark. We also do not include gml:resultOf in this specific case, as only location information is being acquired. We could include a gml:resultOf if the observation were to carry other parameters such as vehicle direction and speed.

The presence of a gml:Observation inside the Placemark designates that Placemark as an Observation.
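To make the encoding concrete, here is a hedged sketch that builds such a Placemark with Python's standard xml.etree.ElementTree module and then checks for the embedded gml:Observation. The function names, the example.org identifiers for the road and the GPS unit, the timestamps, and the track coordinates are all made up for illustration. The use of xlink:href on gml:target and gml:using, and of a KML TimeSpan and LineString to carry the time and track, is one plausible reading of the approach rather than a prescribed encoding.

```python
import xml.etree.ElementTree as ET

KML = "http://www.opengis.net/kml/2.2"
GML = "http://www.opengis.net/gml"
XLINK = "http://www.w3.org/1999/xlink"
for prefix, uri in (("kml", KML), ("gml", GML), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def q(namespace, tag):
    """Return the namespace-qualified tag name ElementTree expects."""
    return "{%s}%s" % (namespace, tag)


def track_observation(name, begin, end, coords, target_href, instrument_href):
    """Encode a GPS track as a KML Placemark tagged as a GML Observation."""
    placemark = ET.Element(q(KML, "Placemark"))
    ET.SubElement(placemark, q(KML, "name")).text = name
    # Time of observing: carried by KML, so no gml:validTime is added.
    span = ET.SubElement(placemark, q(KML, "TimeSpan"))
    ET.SubElement(span, q(KML, "begin")).text = begin
    ET.SubElement(span, q(KML, "end")).text = end
    # The observation marker, with instrument (using) and subject (target).
    extended = ET.SubElement(placemark, q(KML, "ExtendedData"))
    observation = ET.SubElement(extended, q(GML, "Observation"))
    ET.SubElement(observation, q(GML, "using"), {q(XLINK, "href"): instrument_href})
    ET.SubElement(observation, q(GML, "target"), {q(XLINK, "href"): target_href})
    # Observed locations: carried by the KML geometry (lon,lat pairs).
    line = ET.SubElement(placemark, q(KML, "LineString"))
    ET.SubElement(line, q(KML, "coordinates")).text = " ".join(
        "%s,%s" % (lon, lat) for lon, lat in coords)
    return placemark


def is_observation(placemark):
    """A Placemark counts as an observation if it embeds gml:Observation."""
    path = "%s/%s" % (q(KML, "ExtendedData"), q(GML, "Observation"))
    return placemark.find(path) is not None


# All values below are invented sample data.
track = track_observation(
    "Probe car track", "2008-02-01T08:00:00Z", "2008-02-01T08:05:00Z",
    [(-79.250, 42.990), (-79.248, 42.992), (-79.245, 42.995)],
    target_href="http://example.org/roads#segment-1",
    instrument_href="http://example.org/instruments#gps-unit-1")
print(ET.tostring(track, encoding="unicode"))
print(is_observation(track))
```

An ordinary Placemark without the ExtendedData block would fail the is_observation check, which is how a consumer could separate observations from features that are merely styled for display.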

So how do we get from Observations to “authorized” features?

A feature is a model of some aspect of the world. Features are named, typed entities with properties specific to their type (e.g. a Road has a number of lanes, driving directions for the lanes, lane width, surface type and so forth). Features arise from some form of community agreement. This may be formal or even legal (e.g. the boundary of a land parcel), or it may be quite informal. For roads it is both formal and legal; however, this formal and legal agreement is a function of jurisdiction and varies from one part of the world to another.

We can thus create a particular road model (a list of properties, or fields, and their types) that represents the community understanding of a road for a particular jurisdiction. This road model can be expressed in a GML application schema. The tracks encoded in KML can then be used quite directly to construct GML observations about the road network, which are then forwarded back to the road network custodian for integration into the road network model. Accumulation and analysis of multiple such tracks then leads to modifications to the road network, which may then be propagated by OGC WFS transactions to the network of road databases and to the in-vehicle navigation devices themselves. In the navigation device the updated data can be styled into KML for presentation to the driver, with styling appropriate to the driver’s locale.

Note that this process of presentation => content (capture observation content) => presentation (style updated content for presentation) is a basic part of the separation of presentation and content and exploits the power of both KML and GML.

Ron Lake, Chairman and CEO, Galdos Systems, Inc.
