We created GeoSeer to solve a problem: it's an absolute pain to find spatial data.
There are thousands of public-facing web-services out there using the OGC standards (WMS, WFS, WCS, WMTS) to serve data, but they have only limited discoverability. We wanted to create a search engine that would bring all of these services into a single place, so we created GeoSeer.
Out of the literally millions of free and open-source datasets that are available, a huge number are spatial, but how do you find the one that you need? No-one really wants to go rummaging through dozens of CKAN portals and outdated lists of services simply to find a dataset about the location of fire stations in Warwickshire, for example.
In particular we wanted to solve the problem for spatial data because spatial data - by definition - already has a location associated with it. This means we can return more relevant results: in the fire stations example, the fire stations in Butte, California, USA are unlikely to be of interest to you.
How do I use these results?
All of these results are spatial datasets based around OGC web-services. Specifically they're:
- WMS (Web Map Service) - basically, a map
- WFS (Web Feature Service) - raw vector data
- WCS (Web Coverage Service) - raw raster data
- WMTS (Web Map Tile Service) - a pre-rendered basemap
To access them, you'll need to use a GIS. QGIS is a popular and free GIS and supports all of these standards and many other spatial and non-spatial formats besides. There are lots of tutorials online explaining how to add services - this search is a good start, for instance.
Once you have a GIS, you can find the GetCapabilities URL at the top of the layer result page; you'll need to feed this into the GIS. You can also find the layer name/title with it - you'll need these to choose the right layer.
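If you'd rather peek at a service before loading it into a GIS, here's a minimal sketch of pulling layer names and titles out of a WMS 1.3.0 GetCapabilities document. The sample XML is hand-written for illustration (the layer names are made up), and `list_layers` is a hypothetical helper, not part of GeoSeer or QGIS. Other service types and older WMS versions use different XML namespaces, so treat this as a starting point only.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed WMS 1.3.0 capabilities document (illustrative only).
SAMPLE_CAPABILITIES = """<WMS_Capabilities version="1.3.0" xmlns="http://www.opengis.net/wms">
  <Service><Name>WMS</Name><Title>Example service</Title></Service>
  <Capability>
    <Layer>
      <Title>Root layer</Title>
      <Layer>
        <Name>fire_stations</Name>
        <Title>Fire stations</Title>
      </Layer>
      <Layer>
        <Name>hydrants</Name>
        <Title>Fire hydrants</Title>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>
"""

# Namespace used by WMS 1.3.0 capabilities documents.
WMS_NS = {"wms": "http://www.opengis.net/wms"}


def list_layers(capabilities_xml: str) -> list[tuple[str, str]]:
    """Return (name, title) pairs for every named layer in the document."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for layer in root.iter("{http://www.opengis.net/wms}Layer"):
        name = layer.find("wms:Name", WMS_NS)
        title = layer.find("wms:Title", WMS_NS)
        # Grouping layers without a <Name> cannot be requested, so skip them.
        if name is not None and name.text:
            title_text = title.text if title is not None and title.text else ""
            layers.append((name.text, title_text))
    return layers


if __name__ == "__main__":
    for layer_name, layer_title in list_layers(SAMPLE_CAPABILITIES):
        print(f"{layer_name}: {layer_title}")
```

The name is what you hand to the GIS when requesting the layer; the title is the human-readable label you'll usually see in the layer picker.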
Do you have an API?
Yes we do! There's an entire page filled with information about the GeoSeer API. The API is a great way to programmatically access our services from within your own web-GIS. Alas, it's not free (we do have to pay the bills!), but we think it's worth it.
The API schema is comprehensively documented and built around the OpenAPI 3 standard, so there's plenty of tooling you can use to integrate it into your workflow and/or products.
Do you have a blog?
We do indeed. If you want to find out about OGC services and scraping them, then GeoSeer Blog is the place for you.
Is there a stats page?
So you're a data nerd too? Cool! There's a page stuffed with stats just waiting for you here. And if you can't find what you want on there, contact us and ask and we'll see what we can do.
How does it work?
The GeoSeer spider scrapes lots of different sources, using various APIs to discover the many OGC web-services that are registered with them. It then downloads the GetCapabilities document for every web-service it can find (GetCapabilities is an OGC standard XML document with lots of information about what layers exist in a service). We then post-process all of those GetCapabilities documents, removing duplicate layers, cleaning them up, determining the spatial extent, and finally making them searchable.
The end result is a database with hundreds of thousands of layers that sits behind a simple, fast web-page.
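To give a flavour of the post-processing described above, here's a rough sketch of two of the steps: removing duplicate layers and using the spatial extent to filter search results. This is not GeoSeer's actual code. `LayerRecord`, `service_key`, `deduplicate` and `search` are hypothetical names, the bounding boxes are approximate made-up values, and the "same host and path plus same layer name" rule is just one plausible assumption about what counts as a duplicate.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class LayerRecord:
    service_url: str
    name: str
    title: str
    # (west, south, east, north) in decimal degrees; illustrative values only.
    bbox: tuple[float, float, float, float]


def service_key(url: str) -> str:
    """Reduce a service URL to host + path so trivial variants compare equal."""
    parts = urlsplit(url)
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def deduplicate(records: list[LayerRecord]) -> list[LayerRecord]:
    """Keep the first record seen for each (service, layer name) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (service_key(rec.service_url), rec.name)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def search(records, text, point=None):
    """Match text against titles; optionally require a (lon, lat) point inside the bbox."""
    hits = []
    for rec in records:
        if text.lower() not in rec.title.lower():
            continue
        if point is not None:
            lon, lat = point
            west, south, east, north = rec.bbox
            if not (west <= lon <= east and south <= lat <= north):
                continue
        hits.append(rec)
    return hits


RECORDS = [
    LayerRecord("https://Example.org/wms/?service=WMS", "fire_stations",
                "Fire stations", (-1.96, 51.95, -1.17, 52.69)),
    # Same service and layer reached via a slightly different URL.
    LayerRecord("http://example.org/wms", "fire_stations",
                "Fire stations", (-1.96, 51.95, -1.17, 52.69)),
    LayerRecord("https://example.com/ows", "stations",
                "Fire stations of Butte County", (-122.1, 39.3, -121.0, 40.2)),
]

if __name__ == "__main__":
    unique = deduplicate(RECORDS)
    for hit in search(unique, "fire stations", point=(-1.58, 52.28)):
        print(hit.service_url, hit.name)
```

Searching with a point roughly in Warwickshire keeps the first layer and drops the Butte County one, which is the fire stations example from earlier in miniature.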
I don't like the results...
Unfortunately, the results brought back by GeoSeer aren't perfect, in large part because the source data is often lacking. Let's be honest here, no-one likes writing metadata!
GeoSeer does its best to clear up bad data and remove results when it can't. The following problems are things we can't do anything about, yet are remarkably common:
- Bad/missing/unclear layer descriptions/abstracts in the GetCapabilities documents - often they don't describe the layer at all.
- No suitable keywords in the layer.
- Incomplete/non-existent service-level metadata (contact info, service abstract, name, etc.).
- Bad/missing/unclear layer names and titles. A layer called "1" doesn't help anyone know what it's about.
- Bad spatial information - wrong coordinates and/or wrong projections.
- ArcGIS Server - Even if the data curator has taken the time and trouble to enter lots of user information about a service, ArcGIS Server doesn't typically expose this via its OGC GetCapabilities. If you're an ESRI customer, please report this as a bug to them.
So, if you host an OGC service, please make sure your metadata is correct! If you go and update it now and we already index it, we'll find it on our next crawl and everyone will benefit from better results.
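If you host a service and want a quick sanity check, here's a small sketch that flags some of the problems listed above. The dictionary layout, the `metadata_problems` and `bbox_ok` helpers, and the specific rules (for instance, treating a purely numeric name as uninformative) are all assumptions made for illustration; they're not the checks GeoSeer itself runs. The bounding-box test is also simplified and would wrongly reject extents that cross the antimeridian.

```python
def bbox_ok(bbox) -> bool:
    """True if bbox is (west, south, east, north) within valid lon/lat ranges."""
    if bbox is None or len(bbox) != 4:
        return False
    west, south, east, north = bbox
    return -180 <= west < east <= 180 and -90 <= south < north <= 90


def metadata_problems(layer: dict) -> list[str]:
    """Return a list of human-readable problems found in one layer's metadata."""
    problems = []
    name = (layer.get("name") or "").strip()
    title = (layer.get("title") or "").strip()
    abstract = (layer.get("abstract") or "").strip()
    keywords = layer.get("keywords") or []

    # A name or title such as "1" tells nobody what the layer is about.
    if not name or name.isdigit():
        problems.append("uninformative name")
    if not title or title.isdigit():
        problems.append("uninformative title")
    if not abstract:
        problems.append("missing abstract")
    if not keywords:
        problems.append("no keywords")
    if not bbox_ok(layer.get("bbox")):
        problems.append("bad bounding box")
    return problems


GOOD_LAYER = {
    "name": "fire_stations",
    "title": "Fire stations",
    "abstract": "Point locations of fire stations.",
    "keywords": ["fire", "emergency services"],
    "bbox": (-1.96, 51.95, -1.17, 52.69),
}

BAD_LAYER = {
    "name": "1",
    "title": "1",
    "abstract": "",
    "keywords": [],
    "bbox": (200, 10, 20, 5),
}

if __name__ == "__main__":
    print(metadata_problems(GOOD_LAYER))
    print(metadata_problems(BAD_LAYER))
```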
For our part, we're working to improve both the quality and quantity of the results, including adding even more datasets to the mix.
There are no adverts; where are the ads?
We won't mince words here: We hate seeing adverts, so why should we foist them on you? They invade your privacy by tracking you around the internet, they're a prime source of malware, they're frequently deceitful, and they're largely obnoxious. We have yet to meet anyone who actually likes them (excepting the people who sell them), so we don't have ads.
So rather than selling you out for a quick buck, we instead seek to make GeoSeer sustainable via the paid-for GeoSeer API. If you like the service and want it to stay around, then why not buy into the GeoSeer API and integrate it into your own application and/or web-GIS? The GeoSeer API also has more features than the free web-search version.
I have a suggestion for...
Cool! Email it to us at the Contact Us address below.
We're always up for new datasets, new data portals, new search ideas, and anything else that will help us solve this problem.
If you have any feedback, questions, suggestions, or comments about GeoSeer, we'd love to hear them. Send us an email at: firstname.lastname@example.org
We're particularly interested in new data sources.
We'd like to credit the following:
Boring legal stuff
We value privacy, including yours, and we don't like lawyers (who does?), or companies that have multi-thousand-word EULAs, so we keep this short and simple:
- We don't endorse any of the results. There's no filtering or censorship here; we only do the minimum-necessary data-cleansing for "make it work" and "no empty results" purposes.
- In the incredibly unlikely situation this website/service breaks your computer, burns down your house, or otherwise starts armageddon, it's not our fault!
- We don't set any cookies.
- We don't do any tracking.
- Please don't mine our site. If you want access to the raw data, talk to us (email in the Contact Us section).
- We do keep server-side access logs.