The GeoSeer Blog

A Midyear Update

Posted on 2019-08-07

The GeoSeer index of OGC services continues to grow and now stands tantalisingly close to 200,000 services: there are currently 197,911 from over 4,400 different hosts. And of course this only includes active services; the index is kept in an "evergreen" state, consisting only of services that actually worked when we last queried them. There are many more services that are only intermittently available, but these aren't useful to you, so they don't feature in the index.

On adventures we go

As well as continuing to hone and expand the service, we've also been getting involved in some community events. In June we took part in the OGC's API Hackathon in London, part of the process for developing the next generation of OGC spatial standards. They're at an early phase - with API Features being the furthest along - and we went with the aim of making sure that discoverability was kept in mind during their development. After all, there's no point developing cutting-edge standards if no-one can find implementations.

Then we went to Italy, to the European Commission's Joint Research Centre (JRC) - home of INSPIRE - to present at and participate in a workshop about service discovery and search engines with regard to INSPIRE services. We met some of the people behind a few of the portals we harvest from, and exchanged thoughts on how services and data can be made more discoverable.

Statistics - SRS

We received a user enquiry as to which Spatial Reference System (SRS) was most common in OGC services, so we did a quick check and wanted to share the top results with everyone, because who doesn't like stats? Note that there are lots of caveats that we won't go into here; we're sharing these as-is. It doesn't come as a surprise that EPSG:4326 (aka WGS84) and Web Mercator are the most common.

SRS Code     | Name                    | Number of datasets | Notes
EPSG:4326    | WGS84                   | 892,331            | Standard Latitude-Longitude
CRS:84       | WGS84                   | 514,924            | Longitude-Latitude swapped version of WGS84
EPSG:3857    | Web Mercator            | 394,736            | De facto web mapping projection
EPSG:900913  | Web Mercator            | 259,519            | Deprecated code for Web Mercator
EPSG:4258    | ETRS89                  | 166,684            | Europe
EPSG:25832   | ETRS89 / UTM zone 32N   | 133,604            | Europe between 6°E and 12°E
EPSG:102100  | Web Mercator            | 102,823            | ArcGIS Online version of EPSG:3857

The datasets define 1,318 different SRSs; the ones above are just those with more than 100,000 datasets. We're always open to doing some stats analysis, just ask.
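For the curious, a tally like this can be sketched in a few lines of Python. This is only an illustration, not the query we actually ran: the function names and the toy input (one list of advertised SRS codes per dataset) are assumptions made for the example.

```python
from collections import Counter


def count_srs(datasets):
    """Count how many datasets advertise each SRS code.

    `datasets` is assumed to be an iterable where each item is the list of
    SRS codes one dataset advertises (a hypothetical input shape).
    """
    counts = Counter()
    for srs_codes in datasets:
        # Use a set so a dataset listing the same code twice is counted once.
        counts.update(set(srs_codes))
    return counts


def above_threshold(counts, minimum):
    """Return (code, count) pairs with count > minimum, most common first."""
    return [(code, n) for code, n in counts.most_common() if n > minimum]


# Hypothetical toy data, not real index contents.
sample = [
    ["EPSG:4326", "EPSG:3857"],
    ["EPSG:4326", "CRS:84"],
    ["EPSG:4326", "EPSG:4326"],
]
counts = count_srs(sample)
print(above_threshold(counts, 1))  # [('EPSG:4326', 3)]
```

With the real data the threshold would be 100,000 rather than 1.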

Licensing?

Finally, we've started investigating making the database available to third parties via licensing. If you're interested, let us know. Watch this space.


GeoSeer API Goes Live

Posted on 2019-04-09

We've hinted at it in previous blog posts, but now it's time for the big reveal: the GeoSeer API is live!

Designed to allow you to integrate the power of GeoSeer's search into your business's Web GIS or other application, the API allows your users to easily and seamlessly search for datasets without having to leave their normal tooling. There's an entire page with information about it here.

As well as including all the features you're used to in the web-search, the API also includes some cool new features:
  • Bounding Box Search - Search for datasets that are within, disjoint from, or intersecting a given bounding box, while also using a search term. Ideal for searching for layers that overlap the user's current viewing area.
  • Lat/Lon Search - Easily find datasets that intersect a specific point. Your user selects a location and now they can find data that intersect it. Simple.
  • Service Type filter - Only find datasets that are of the OGC service type(s) that you're interested in. Does your application only support WMS and WFS for instance? Then filter results to only search those service types.
  • Service Search - The GeoSeer web search only allows users to search datasets/layers, but the API also allows searching by service. Readily find services hosted by anyone from local government through to globe-spanning organisations like the World Food Programme, and everyone in between.
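
To make the spatial terms above concrete, here is a minimal Python sketch of what "within", "disjoint" and "intersecting" mean for axis-aligned bounding boxes, plus the point test behind a lat/lon search. It is a geometry illustration only, not the API's implementation or its request format: the `BBox` type and function names are made up for the example, and boxes that cross the antimeridian are deliberately ignored.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned bounding box (hypothetical helper type for this sketch)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def intersects(a: BBox, b: BBox) -> bool:
    """True when the two boxes share at least one point (touching counts)."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def disjoint(a: BBox, b: BBox) -> bool:
    """True when the two boxes share no points at all."""
    return not intersects(a, b)


def within(a: BBox, b: BBox) -> bool:
    """True when box `a` lies entirely inside box `b`."""
    return (b.min_x <= a.min_x and a.max_x <= b.max_x
            and b.min_y <= a.min_y and a.max_y <= b.max_y)


def contains_point(box: BBox, x: float, y: float) -> bool:
    """True when the point (x, y) falls inside or on the edge of `box`."""
    return box.min_x <= x <= box.max_x and box.min_y <= y <= box.max_y


# Made-up coordinates in longitude/latitude order.
view = BBox(-10.0, 35.0, 30.0, 60.0)       # the user's current map view
layer = BBox(5.0, 45.0, 15.0, 55.0)        # a layer inside the view
far_away = BBox(100.0, -10.0, 120.0, 10.0)  # a layer elsewhere

print(within(layer, view))               # True
print(intersects(layer, view))           # True
print(disjoint(far_away, view))          # True
print(contains_point(layer, 10.0, 50.0))  # True
```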

We've created the snazzy GeoSeer API WebGIS that demonstrates the API in action, giving you a feeling for what you can do with it and how it could integrate with your own application(s).

The API has several plans to cover various needs, and the Enterprise plan allows for considerable customisation so you can get exactly what you need. So take a look and find out more about the API.


One Year Old and Better Than Ever

Posted on 2019-03-03

Today GeoSeer celebrates its first birthday, having originally gone live on 2018-03-03, so we want to look back at the year and take a glimpse into the future.

This year we...

It has been a busy year for us. We added CSW harvesting in May, a stats page for our fellow big-data nerds in September, and a new look, along with an API beta in January. This blog was itself created in April, and got its own RSS feed back in January.

Ever more data

We've also done a lot of general work to try and increase the index size, but not at the cost of spurious results. When we went live a year ago, our index had (using our current methodology) 836,917 distinct layers from 89,825 services. Today we boast 1,229,623 distinct layers (roughly 47% more) from 167,882 services (roughly 87% more).
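
The growth figures can be checked from the counts quoted above with a quick bit of arithmetic; here is a small Python sketch (the helper name is just for illustration):

```python
def percent_increase(old: int, new: int) -> float:
    """Return how much larger `new` is than `old`, as a percentage."""
    return (new - old) / old * 100


layers = percent_increase(836_917, 1_229_623)
services = percent_increase(89_825, 167_882)
print(f"layers: {layers:.1f}% more, services: {services:.1f}% more")
# layers: 46.9% more, services: 86.9% more
```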

During our latest crawl, our index size jumped to well over 3 million layers. "Jackpot!" we thought. But we're always suspicious of anything anomalous (you have to be if you want to develop something good), and on further investigation we discovered that the jump came from a single host that claimed to have 2.1 million layers across about 5,000 services. Digging deeper, it appears to be the same 500 layers shared thousands of times. So we removed the duplicates from the index and kept only one copy of each layer, to ensure the best possible results for you, our users.
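
The general idea of keeping one copy of each layer can be sketched as below. This is a simplified illustration under stated assumptions, not our actual pipeline: the record fields and the choice of fingerprint (host, layer name and title) are hypothetical, and real-world duplicates usually need a more careful comparison.

```python
def dedupe_layers(layers):
    """Keep the first occurrence of each layer, judged by a fingerprint."""
    seen = set()
    unique = []
    for layer in layers:
        # Hypothetical fingerprint: same host, name and title means "same layer".
        key = (layer["host"], layer["name"], layer["title"])
        if key not in seen:
            seen.add(key)
            unique.append(layer)
    return unique


# Toy data: three services on one host all advertising the same two layers.
layers = [
    {"host": "example.org", "service": f"svc{i}", "name": name, "title": name.title()}
    for i in range(3)
    for name in ("roads", "rivers")
]
unique = dedupe_layers(layers)
print(len(layers), "->", len(unique))  # 6 -> 2
```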

What's next?

It's becoming apparent that we're reaching the point of diminishing returns. We don't think there are many more readily discoverable OGC services and layers out there. We currently scrape over 300 data portals, plus many other data sources, to try and find every service we can, but we can only find services that are publicly advertised somewhere, hence the "readily discoverable". But we're not giving up yet, and we have ideas for several more scraping methodologies to further enhance the index.

And of course we'll also be releasing the API shortly. Watch this space!