Vermillion is a high-performance, scalable and secure open-source data exchange platform developed using Vert.x. It is a general-purpose resource server that data providers and consumers can use to exchange time-series as well as static datasets. Vermillion exposes a simple search interface that can be used to query resources by time, geo-coordinates, full text or any combination thereof.
The datasetu-auth-server is the authentication, authorisation and accounting (AAA) server of Datasetu. Data providers can set fine-grained access control policies to regulate access to their resources/datasets. Data consumers can request access tokens to get access to resources. For more information, please refer to the Datasetu Auth Server documentation. Providers can grant access to a resource using a read scope or a write scope. With the read scope, consumers can invoke read-related APIs on the datasets. With the write scope, consumers can invoke APIs that "write" to the resource. All APIs except publish need a read scope. The publish API needs a write scope for invocation.
Security Scheme Type | API Key |
---|---|
Query parameter name: | token |
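As an illustration of the API-key scheme above, the following sketch appends an access token to a request URL as the token query parameter. The host name in the usage note is a placeholder, and the helper name with_token is hypothetical; neither comes from this reference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, token: str) -> str:
    """Return `url` with the access token set as the `token` query parameter."""
    parts = urlsplit(url)
    # Keep query parameters already present on the URL, dropping any old token.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "token"
    ]
    query.append(("token", token))
    # urlencode percent-encodes the "/" inside the token value.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, with_token("https://vermillion.example.org/download", "auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e153") yields a URL whose query string carries the token, percent-encoded.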
A data consumer is any user or entity that is interested in a data resource that Vermillion hosts (or acts as an intermediary for). Consumers discover resources on the datasetu catalogue and use the search interface to query the datasets.
This endpoint is meant for downloading secure file datasets for which access has been obtained beforehand. If the fully qualified resource ID is known, this endpoint can be invoked from programs or from user-agents like curl. Otherwise, invoking the endpoint with just an access token brings up an HTML page listing the datasets the consumer has requested. This API can be used in two modes. In the first mode, a specific resource ID or a specific set of resource IDs is requested (a subset of the resources that the token is authorised for). These are then made available in the consumer's directory, from which they can be downloaded. In the second mode, the consumer simply passes an access token, and all resources that the token is authorised for are made available in the consumer's directory. The pre-condition for the second mode is that the token presented must not be authorised for heterogeneous resources, i.e., a mixture of time-series datasets and files (or files residing on other resource servers). The download API merely symlinks the requested resources into the consumer's directory. Once the symlinks are created, this API internally redirects to the /consumer/ API.
Required scope: read
ACCESS_TOKEN required | string Example: ACCESS_TOKEN=auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e153 A token granted by the datasetu auth server to access resources. |
RESOURCE_ID | string Example: RESOURCE_ID=rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource.public A fully qualified resource name obtained from the datasetu catalogue. One or more resource IDs can be specified in this API. Multiple resource IDs must be separated by commas. |
This is a sample text from a file.
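The two modes of the download API described above can be sketched as a small query-string builder. This is only an illustration: the token parameter name comes from the security scheme, while the id parameter name and the function name download_query are assumptions that this section does not confirm.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def download_query(token: str, resource_ids: Optional[Iterable[str]] = None) -> str:
    """Build the query string for either mode of the download API."""
    params = {"token": token}
    ids = list(resource_ids or [])
    if ids:
        # First mode: request a specific subset of resources.
        # Several resource IDs are joined with commas.
        params["id"] = ",".join(ids)  # "id" is an assumed parameter name
    # Second mode (no IDs): the token alone selects every authorised resource.
    # "/" and "," are left unescaped to keep the IDs readable.
    return urlencode(params, safe="/,")
```

Calling download_query(token) corresponds to the second mode, and download_query(token, [id_1, id_2]) to the first.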
This API is for getting the latest datapoint of a resource. This is typically meant to be used on time-series datasets. However, it could be used to query the latest metadata of static files as well. It supports both open and secure datasets. An access token is required in the latter case.
Required scope: read
RESOURCE_ID required | string Example: RESOURCE_ID=rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource A fully qualified resource name obtained from the datasetu catalogue. |
ACCESS_TOKEN | string Example: ACCESS_TOKEN=auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e179 An access token granted by the datasetu auth server. |
{
  "data": {
    "data": {
      "Project": "Vermillion",
      "ApiDocs": "Redoc",
      "Definition": "OpenAPI"
    }
  },
  "timestamp": "2021-03-05T10:18:00.952628Z",
  "id": "rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource.public",
  "category": "test-category"
}
The scroll API returns the data points matched by a search in chunks of a specified size. The chunk size determines the pagination of data points and is set in the search API request. The search API must be called first to obtain a scroll_id.
Required scope: read
ACCESS_TOKEN | string Example: ACCESS_TOKEN=auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e179 An access token granted by the datasetu auth server. |
scroll_id required | string The scroll ID returned by the search API for the data being scrolled. |
scroll_duration required | string The duration for which the scroll context should be kept alive while scrolling through the data. |
{
  "scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFDFXVGpZbmdCLXVCbkdFcEk3TFF1AAAAAAAAAAIWZWNVMWdVVkVUNHlub1kzdldYR2d3Zw==",
  "scroll_duration": "30m"
}
{
  "hits": {
    "data": {
      "data": {
        "Project": "Vermillion",
        "ApiDocs": "Redoc",
        "Definition": "OpenAPI"
      }
    },
    "timestamp": "2021-03-03T10:18:00.952628Z",
    "id": "rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource.public",
    "category": "test-category",
    "co-ordinates": [
      "56.9",
      "76.5"
    ],
    "mime-type": "application/json"
  },
  "scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFDFXVGpZbmdCLXVCbkdFcEk3TFF1AAAAAAAAAAIWZWNVMWdVVkVUNHlub1kzdldYR2d3Zw=="
}
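The search-then-scroll flow can be sketched as a generator that keeps requesting pages until none are left. The HTTP calls are passed in as callables so that the sketch does not assume endpoint paths or a particular HTTP library. Treating an empty hits value as the end of the data, and the max_pages safety limit, are assumptions rather than documented behaviour.

```python
from typing import Any, Callable, Dict, Iterator

Json = Dict[str, Any]


def scroll_all(
    search: Callable[[Json], Json],
    scroll: Callable[[Json], Json],
    search_body: Json,
    scroll_duration: str = "30m",
    max_pages: int = 1000,
) -> Iterator[Any]:
    """Yield the `hits` of the initial search and of each following scroll page."""
    # The search request is expected to carry `scroll_duration` and `size`,
    # so that the response includes a `scroll_id`.
    page = search(search_body)
    for _ in range(max_pages):
        hits = page.get("hits")
        if not hits:
            return  # assumption: an empty page marks the end of the data
        yield hits
        scroll_id = page.get("scroll_id")
        if not scroll_id:
            return  # nothing to continue from
        page = scroll({"scroll_id": scroll_id, "scroll_duration": scroll_duration})
```

A caller would supply two small functions that POST a JSON body to the search and scroll endpoints and return the decoded JSON response.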
This API provides a search interface for the data hosted on Vermillion. Both public and secure datasets can be queried using this API, with an access token being required in the latter case. This interface provides options to query using time, geo-spatial co-ordinates, text or any combination thereof. The resource ID is a mandatory field across all search types. Along with the resource ID, at least one of the other three parameters is necessary for the search query.
Required scope: read
ACCESS_TOKEN | string Example: ACCESS_TOKEN=auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e179 A token granted by the datasetu auth server to access resources. |
The following filters can be used in the search API. A filter can be combined with any other filter to perform a complex search.
This can be used to query resources using a time-based filter.
This can be used to query resources using a geo-spatial filter, i.e., using geo co-ordinates.
This can be used to query resources using a text-based or numeric filter.
When one or more of the above filters are used, all of them are applied while querying the DB.
id required | resourceId (string) or Array of resourceId-array (strings) |
time required | object (time) A JSON object specifying the start and end times. |
scroll_duration | string The duration for which the ES scroll context is kept alive so that the results can subsequently be scrolled through. |
size | integer The number of hits that the consumer is interested in. |
{
  "id": "rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource.public",
  "time": {
    "start": "2021-02-3",
    "end": "2021-03-3"
  },
  "scroll_duration": "60m",
  "size": 3
}
{
  "hits": {
    "data": {
      "data": {
        "Project": "Vermillion",
        "ApiDocs": "Redoc",
        "Definition": "OpenAPI"
      }
    },
    "timestamp": "2021-03-03T10:18:00.952628Z",
    "id": "rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource.public",
    "category": "test-category",
    "co-ordinates": [
      "56.9",
      "76.5"
    ],
    "mime-type": "application/json"
  },
  "scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFDFXVGpZbmdCLXVCbkdFcEk3TFF1AAAAAAAAAAIWZWNVMWdVVkVUNHlub1kzdldYR2d3Zw=="
}
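The rule stated in the search API description (a mandatory resource ID plus at least one filter) can be checked on the client before a request is sent. The sketch below assembles a request body in the shape of the sample above. Only the id, time, scroll_duration and size field names appear in this section. The geo-spatial and text filters are passed through other_filters because their field names are not shown here, and the function name is hypothetical.

```python
from typing import Any, Dict, List, Optional, Union


def build_search_body(
    resource_id: Union[str, List[str]],
    time: Optional[Dict[str, str]] = None,
    other_filters: Optional[Dict[str, Any]] = None,
    scroll_duration: Optional[str] = None,
    size: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a search request body, enforcing the documented minimum."""
    if not resource_id:
        raise ValueError("a resource ID is mandatory for every search")
    # Collect whichever filters the caller supplied.
    filters: Dict[str, Any] = dict(other_filters or {})
    if time is not None:
        filters["time"] = time
    if not filters:
        raise ValueError("at least one filter is required besides the resource ID")
    body: Dict[str, Any] = {"id": resource_id, **filters}
    # Optional pagination settings, needed only when scrolling afterwards.
    if scroll_duration is not None:
        body["scroll_duration"] = scroll_duration
    if size is not None:
        body["size"] = size
    return body
```

Passing the time range, scroll duration and size from the sample request reproduces that request body as a Python dictionary.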
This API is for consumers to get access to secure file datasets of providers. The /download API must be invoked before this API. The former creates symlinks for the requested datasets in the consumer's directory. This API can be used in a browser, in which case an HTML page containing the folders is returned. Alternatively, it can be invoked from a user-agent such as curl if the fully qualified resource ID is known.
Required scope: read
ACCESS_TOKEN required | string Example: auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e179 A token granted by the datasetu auth server to access resources. In the above example, the endpoint the consumer needs to invoke would be |
secure-resource-1 secure-resource-2
This API allows a consumer to browse files/datasets that providers have made available publicly. This API can be used in a browser, in which case an HTML page containing the folders is returned. Alternatively, it can be invoked from a user-agent such as curl if the fully qualified resource ID is known.
Required scope: read
RESOURCE_ID | string Example: rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource A fully qualified resource name obtained from the datasetu catalogue. In the above resource ID, the full path to access the file is |
rbccps.org, iisc.com
A data provider is any user or entity that is responsible for a dataset that Vermillion hosts. Providers can be data owners or have delegated access to act as custodians for resources. Providers upload details, access mechanisms, licences and other metadata of resources onto the datasetu catalogue. They also manage access control rules for their resources on the datasetu auth server. Providers use the publish interface of Vermillion to upload datasets and the dynamic metadata associated with them.
This endpoint lets providers publish data into Vermillion. Resource ID and access token are mandatory parameters. This API can be used to publish either time-series data or static files. Depending on the mode, the request will have to be either application/json or multipart/form-data.
Required scope: write
RESOURCE_ID required | string Example: RESOURCE_ID=rbccps.org/e096b3abef24b99383d9bd28e9b8c89cfd50be0b/example.com/test-category/test-resource A fully qualified resource name obtained from the datasetu catalogue. |
ACCESS_TOKEN required | string Example: ACCESS_TOKEN=auth.datasetu.org/36a83204ea6ad6690a0eccda0f37e179 An access token granted by the datasetu auth server. |
As mentioned previously, this API can be used to publish time-series data or static files. The request varies depending on the mode used.
application/json: Time-series data in JSON, formatted as per the schema specified below.
multipart/form-data: Any file that the provider wishes to host on Vermillion.
timestamp | string An optional parameter to indicate the relevant timestamp of the resource (created, modified etc.). When not specified, this field defaults to the time at which the data was published. |
data required | object A mandatory field that contains the data of the resource. This is encased in the |
coordinates | Array of strings An array of co-ordinates specified as [longitude, latitude]. |
{
  "timestamp": "2021-03-03T10:18:00.952628Z",
  "data": {
    "data": {
      "PM10": {
        "value": "70",
        "unit": "micrograms per cubic metre"
      }
    }
  },
  "coordinates": [
    "56.898989",
    "67.4939"
  ]
}
Ok
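For the application/json mode, a publish body matching the schema above might be assembled as follows. This is a minimal sketch: the function name and the lon_lat argument are hypothetical, and the caller supplies the data object as-is because its exact nesting is not fully specified in this section.

```python
import json
from typing import Any, Dict, Optional, Tuple


def build_publish_payload(
    data: Dict[str, Any],
    timestamp: Optional[str] = None,
    lon_lat: Optional[Tuple[float, float]] = None,
) -> str:
    """Serialise a time-series data point for the publish API."""
    if not isinstance(data, dict) or not data:
        raise ValueError("`data` is mandatory and must be a non-empty object")
    payload: Dict[str, Any] = {"data": data}
    if timestamp is not None:
        # When omitted, the server defaults to the time of publishing.
        payload["timestamp"] = timestamp
    if lon_lat is not None:
        lon, lat = lon_lat
        # The schema lists coordinates as strings, ordered [longitude, latitude].
        payload["coordinates"] = [str(lon), str(lat)]
    return json.dumps(payload)
```

The returned string would be sent as the request body with the Content-Type header set to application/json. File uploads use multipart/form-data instead and are not covered by this sketch.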