Provider

Add support for any remote API or datasource to Koop. Dive into the docs below or check out a working sample here.

Index.js

Every provider must have a file called index.js. Its purpose is to tell Koop how to load and use the provider. The keys and values are enumerated in the example below.
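As a minimal sketch of such a file, the block below assumes the commonly used keys type, name, version, hosts, disableIdParam and Model. The name example-provider, the version string and the stub Model are placeholders, and the full set of supported keys should be checked against the Koop version in use.

```javascript
// index.js (sketch): describes the provider so Koop can register it.

// Placeholder model; a real provider would usually require('./model') instead.
function Model (koop) {}
Model.prototype.getData = function (req, callback) {
  callback(null, { type: 'FeatureCollection', features: [] })
}

const provider = {
  type: 'provider',         // the kind of Koop plugin
  name: 'example-provider', // hypothetical name, used as the route namespace
  version: '0.1.0',         // usually read from package.json
  hosts: false,             // assumed: whether routes include a :host parameter
  disableIdParam: false,    // assumed: whether routes omit the :id parameter
  Model                     // the model class described in the next section
}

module.exports = provider
```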

Model.js

Every provider must have a Model. This is where almost all of the business logic of the provider will occur. Its primary job is to fetch data from a remote source like an API or database and return GeoJSON to Koop for further processing.

Model Functions

| Name | Required? | Summary |
| --- | --- | --- |
| getData | Yes | Fetches data and translates it to GeoJSON. See below. |
| createKey | No | Generates a string to use as a key for the data-cache. See below. |
| getAuthenticationSpecification | No | Delivers an object for use in configuring authentication in output-services. See authorization spec. |
| authenticate | No | Validates credentials and issues a token. See authorization spec. |
| authorize | No | Verifies request is made by an authenticated user. See authorization spec. |

Function: getData

Models are required to implement a function called getData. It receives the request object (req) and a callback function. It should fetch data from the remote API, translate the data into GeoJSON (if necessary), and call the callback with the GeoJSON as the second parameter. If there is an error in fetching or processing data from the remote API, it should call the callback with the error as the first parameter and stop processing.
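The callback contract described above can be sketched as follows. The fetchRemoteRecords helper and the record shape (id, name, lon, lat) are hypothetical stand-ins for a real remote API, and the helper invokes its callback synchronously only to keep the sketch short.

```javascript
// Hypothetical stand-in for a remote API call; a real provider would issue
// an HTTP request or database query here.
function fetchRemoteRecords (callback) {
  callback(null, [
    { id: 1, name: 'Station A', lon: -122.68, lat: 45.52 }
  ])
}

// Translate one remote record (assumed shape) into a GeoJSON Feature.
function toFeature (record) {
  return {
    type: 'Feature',
    properties: { id: record.id, name: record.name },
    geometry: { type: 'Point', coordinates: [record.lon, record.lat] }
  }
}

function Model (koop) {}

Model.prototype.getData = function (req, callback) {
  fetchRemoteRecords((err, records) => {
    // On failure, pass the error as the first argument and stop.
    if (err) return callback(err)
    // On success, pass the GeoJSON as the second argument.
    callback(null, {
      type: 'FeatureCollection',
      features: records.map(toFeature)
    })
  })
}
```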

GeoJSON passed to the callback should be valid with respect to the GeoJSON specification. Some operations in output-services expect standard GeoJSON properties and/or values. In some cases, having data that conforms to the GeoJSON spec’s right-hand rule is essential for generating expected results (e.g., features crossing the antimeridian). Koop includes GeoJSON validation that is suitable for non-production environments and can be activated by setting NODE_ENV to anything except production. In this mode, invalid GeoJSON from getData will trigger informative console warnings.

Function: createKey

Koop uses an internal createKey function to generate a string for use as a key for the data-cache’s key-value store. Koop’s createKey uses the provider name and route parameters to define a key. This allows all requests with the same provider name and route parameters to leverage cached data.

Models can optionally implement a function called createKey. If defined, the Model’s createKey overrides Koop’s internal function. This can be useful if the cache key should be composed with parameters in addition to those found in the internal function. For example, the createKey below uses query parameters startdate and enddate to construct the key (if they are defined):
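A sketch of such an override, assuming the provider name is the first path segment of req.url and that the parts of the key are joined with "::" (both choices are illustrative, not Koop's internal format):

```javascript
function Model (koop) {}

// Build a cache key from the provider name and route parameters, then append
// the date range when both startdate and enddate are present in the query.
Model.prototype.createKey = function (req) {
  // Assumption: the provider name is the first path segment of the URL.
  let key = req.url.split('/')[1]
  if (req.params.host) key = `${key}::${req.params.host}`
  if (req.params.id) key = `${key}::${req.params.id}`

  const { startdate, enddate } = req.query
  if (startdate && enddate) key = `${key}::${startdate}::${enddate}`
  return key
}
```

With this override, two requests for the same provider and id but different date ranges are cached under different keys.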

Metadata

You can add a metadata property to the GeoJSON returned from getData and assign it an object for use in Koop output services. In addition to name and description noted in the example above, the following fields may be useful:

The data type and values used for idField can affect the output of koop-output-geoservices and the behavior of some consumer clients. FeatureServer and winnow (dependencies of koop-output-geoservices) will create a separate OBJECTID field and set its value to the value of the attribute referenced by idField. OBJECTIDs that are not integers, or that fall outside the range 0 to 2,147,483,647, can break features in some ArcGIS clients.
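The sketch below attaches a metadata object with name, description and idField, and checks the integer range mentioned above. The helper names withMetadata and isSafeObjectId, and the metadata values, are hypothetical and only serve the illustration.

```javascript
// Return a copy of the GeoJSON with a metadata object attached.
function withMetadata (geojson) {
  return Object.assign({}, geojson, {
    metadata: {
      name: 'Example layer',                        // hypothetical value
      description: 'Dataset used for illustration', // hypothetical value
      idField: 'id' // attribute whose value becomes the OBJECTID
    }
  })
}

// True when a value is an integer within 0 to 2,147,483,647.
function isSafeObjectId (value) {
  return Number.isInteger(value) && value >= 0 && value <= 2147483647
}
```

A provider could run a check like isSafeObjectId over the attribute named by idField during development to catch problematic identifiers early.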

Cached vs Pass-Through

Providers typically fall into two categories: cached and pass-through.

Pass-Through

Pass-through providers do not store any data; they act as a proxy/translator between the client and the remote API.

Koop-Provider-Yelp is a good example. The Yelp API supports filters and geographic queries, but it only returns 20 results at a time and there is no way to download the entire Yelp dataset.

It makes sense to use a pass-through strategy if at least one of the following is true:

The request below fetches data from Yelp and translates it into Geoservices JSON:

http://localhost:8080/yelp/FeatureServer/0?where=term=pizza

GeoJSON can be retrieved as well:

http://localhost:8080/yelp/FeatureServer/0?where=term=pizza&f=geojson

Cached

Cached providers periodically request entire datasets from the remote API.

Koop-Provider-Craigslist is a good example. The Craigslist API returns the entire set of postings for a given city and type in one call (e.g. Atlanta apartments). The data also does not change that frequently. Therefore the Craigslist provider uses the Koop cache with a TTL of 1 hour, guaranteeing that data will never be more than an hour out of date.
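A one-hour cache like this can be sketched as below. It assumes that Koop reads a top-level ttl property, in seconds, from the GeoJSON returned by getData; confirm that against the Koop version in use. The fetchAllPostings helper is a hypothetical stand-in for the remote call.

```javascript
const ONE_HOUR_IN_SECONDS = 60 * 60

// Hypothetical stand-in for a remote call that returns an entire dataset.
function fetchAllPostings (callback) {
  callback(null, { type: 'FeatureCollection', features: [] })
}

function Model (koop) {}

Model.prototype.getData = function (req, callback) {
  fetchAllPostings((err, geojson) => {
    if (err) return callback(err)
    // Assumption: a top-level ttl (in seconds) asks Koop to cache this result.
    geojson.ttl = ONE_HOUR_IN_SECONDS
    callback(null, geojson)
  })
}
```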

It makes sense to use a cache strategy if at least one of the following is true:

Advanced

Providers can do more than simply implement getData and hand GeoJSON back to Koop’s core. In fact, they can extend the API space in an arbitrary fashion by adding routes that map to controller functions. Those controller functions call functions on the model to fetch or process data from the remote API.

Request parameters in getData

Recall the getData function receives req, an Express.js request object. req includes a set of route parameters accessible with req.params, as well as a set of query-parameters accessible with req.query. Parameters can be used by getData to specify the particulars of data fetching. For example, route parameters :host and :id provide the Craigslist getData function with information necessary to generate URLs for requests to the Craigslist API.
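As a sketch of this pattern, the block below builds a remote URL from req.params.host and req.params.id. The URL template and the buildRemoteUrl helper are made up for illustration and do not reflect the actual Craigslist API.

```javascript
// Build a remote URL from route parameters. The URL pattern is hypothetical.
function buildRemoteUrl (params) {
  const host = encodeURIComponent(params.host)
  const id = encodeURIComponent(params.id)
  return `https://${host}.example.com/search/${id}`
}

function Model (koop) {}

Model.prototype.getData = function (req, callback) {
  const url = buildRemoteUrl(req.params)
  // A real provider would request `url` here and translate the response.
  callback(null, {
    type: 'FeatureCollection',
    features: [],
    metadata: { description: `Data from ${url}` }
  })
}
```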

Output-services route parameters

By default, Koop includes the koop-output-geoservices output-service. It adds a set of FeatureServer routes, some of which include additional route parameters that can be used in your Model’s getData function.

Query parameters

As noted above, any query-parameters added to the request URL can be accessed within getData and leveraged for data fetching purposes. For example, a request /provider/:id/FeatureServer/0?foo=bar would supply getData with req.query.foo equal to bar. With the proper logic, it could then be used to limit fetched data to records that have an attribute foo with a value of bar.
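The foo=bar case can be sketched as follows. The RECORDS array is a hypothetical stand-in for data from a remote API; a real provider would more likely forward the filter to the remote source than filter in memory.

```javascript
// Hypothetical records standing in for a remote API response.
const RECORDS = [
  { type: 'Feature', properties: { foo: 'bar' }, geometry: null },
  { type: 'Feature', properties: { foo: 'baz' }, geometry: null }
]

function Model (koop) {}

Model.prototype.getData = function (req, callback) {
  const { foo } = req.query
  // Only filter when the caller supplied ?foo=...
  const features = foo === undefined
    ? RECORDS
    : RECORDS.filter(feature => feature.properties.foo === foo)
  callback(null, { type: 'FeatureCollection', features })
}
```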

Generation of provider-specific output-routes

The position of the provider-specific fragment of a route path can vary depending on the path assignment in the routes array object of your output-services plugin. By default, Koop will construct the route with the provider’s parameters first, and subsequently add the route fragment defined by an output-services plugin. However, if you need the route path configured differently, you can add the $namespace and $providerParams placeholders anywhere in the output-services path. Koop will replace these placeholders with the provider-specific route fragments (i.e., namespace and :host/:id). For example, an output path defined as $namespace/rest/services/$providerParams/FeatureServer/0 would translate to provider/rest/services/:host/:id/FeatureServer/0.
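The substitution described above can be illustrated with a small function. This mimics the documented behavior only; composeRoutePath is a hypothetical helper and is not Koop's internal implementation.

```javascript
// Illustration only: compose a route path the way the text describes.
function composeRoutePath (outputPath, namespace, providerParams) {
  const hasPlaceholder =
    outputPath.includes('$namespace') || outputPath.includes('$providerParams')

  if (hasPlaceholder) {
    // Replacer functions avoid special handling of "$" in replacement strings.
    return outputPath
      .replace('$namespace', () => namespace)
      .replace('$providerParams', () => providerParams)
  }

  // Default: provider fragment first, then the output-services fragment.
  return [namespace, providerParams, outputPath].join('/')
}

console.log(composeRoutePath(
  '$namespace/rest/services/$providerParams/FeatureServer/0',
  'provider',
  ':host/:id'
))
// prints "provider/rest/services/:host/:id/FeatureServer/0"
```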

Output-routes without provider parameters

You may need routes that skip the addition of provider-specific parameters altogether. This can be accomplished by adding an absolutePath: true key-value to the routes array object in your output-services plugin. On such routes, Koop will define the route without any additional provider namespace or parameters.
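A sketch of a routes array with one such entry. The keys path, methods and handler, as well as the paths and handler names, are assumptions about the output-services plugin format; only absolutePath: true comes from the text above.

```javascript
// Hypothetical routes array for an output-services plugin.
const routes = [
  {
    // Combined with the provider namespace and parameters by default.
    path: 'FeatureServer/0',
    methods: ['get'],
    handler: 'featureServer'
  },
  {
    // Registered exactly as written, with no provider namespace or parameters.
    path: '/rest/info',
    methods: ['get'],
    handler: 'restInfo',
    absolutePath: true
  }
]
```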

Routes.js

This file is simply an array of routes that should be handled in the namespace of the provider, e.g. http://adapters.koopernetes.com/agol/arcgis/datasets/e5255b1f69944bcd9cf701025b68f411_0

In the example above, the provider namespace is agol, and arcgis and e5255b1f69944bcd9cf701025b68f411_0 are parameters. Anything that matches /agol/*/datasets/* will be handled by the model.

Each route has:

Example:

Controller.js

The purpose of the Controller file is to handle additional routes that are specified in routes.js. It is a class that is instantiated with access to the model. Most processing should happen in the model, while the controller acts as a traffic director.

As of Koop 3.0, you do not need to create a controller. If you want to add functionality to Koop that is not supported by an output plugin, you can add your own functions to a controller.

Example:
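A sketch of a controller that acts as a traffic director: it reads the request, delegates to the model, and writes the response. The handler name getDataset and the model function findDataset are hypothetical; req and res are assumed to be Express.js request and response objects.

```javascript
// controller.js (sketch): instantiated with access to the model.
function Controller (model) {
  this.model = model
}

// Handle a route such as /agol/:id/datasets/:dataset by delegating to the model.
Controller.prototype.getDataset = function (req, res) {
  this.model.findDataset(req.params.dataset, (err, dataset) => {
    if (err) return res.status(500).json({ error: err.message })
    res.status(200).json(dataset)
  })
}

module.exports = Controller
```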

Model.js

In addition to implementing getData, the model exists to interact with the remote API and to serve the controller. Any function the controller would need to call to handle a request should be implemented as a public function on the model.

Example:
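A sketch of a model that exposes one extra public function for a controller alongside getData. The function name findDataset and the in-memory DATASETS catalog are hypothetical stand-ins for calls to a remote API.

```javascript
// Hypothetical in-memory catalog standing in for a remote API.
const DATASETS = new Map([
  ['abc', { id: 'abc', title: 'Example dataset' }]
])

function Model (koop) {}

// Required by Koop: return GeoJSON through the callback.
Model.prototype.getData = function (req, callback) {
  callback(null, { type: 'FeatureCollection', features: [] })
}

// Public helper for controllers: look up one dataset by id.
Model.prototype.findDataset = function (id, callback) {
  const dataset = DATASETS.get(id)
  if (!dataset) return callback(new Error(`Dataset not found: ${id}`))
  callback(null, dataset)
}

module.exports = Model
```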