class documentation
APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/latest/
Example usage for the YAML API:
usda:
  type: api.APIDataSet
  url: https://quickstats.nass.usda.gov
  params:
    key: SOME_TOKEN
    format: JSON
    commodity_desc: CORN
    statisticcat_des: YIELD
    agg_level_desc: STATE
    year: 2000
Example usage for the Python API:
>>> from kedro.extras.datasets.api import APIDataSet
>>>
>>> data_set = APIDataSet(
...     url="https://quickstats.nass.usda.gov",
...     params={
...         "key": "SOME_TOKEN",
...         "format": "JSON",
...         "commodity_desc": "CORN",
...         "statisticcat_des": "YIELD",
...         "agg_level_desc": "STATE",
...         "year": 2000,
...     },
... )
>>> data = data_set.load()
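To make the role of params in the examples above more concrete, the sketch below builds the query string for the same endpoint using only the standard library. It is an approximation: APIDataSet passes params to requests, which performs its own encoding, so the exact URL sent may differ in minor details. The `build_url` helper is hypothetical and not part of Kedro or requests.

```python
from urllib.parse import urlencode


def build_url(base_url, params):
    """Hypothetical helper: append params to base_url as a query string."""
    query = urlencode(params)  # non-string values such as 2000 are converted with str()
    return f"{base_url}?{query}" if query else base_url


url = build_url(
    "https://quickstats.nass.usda.gov",
    {
        "key": "SOME_TOKEN",
        "format": "JSON",
        "commodity_desc": "CORN",
        "statisticcat_des": "YIELD",
        "agg_level_desc": "STATE",
        "year": 2000,
    },
)
print(url)
```

Each key in params becomes one `name=value` pair in the query string, in the order the dictionary lists them.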
Method | __init__ | Creates a new instance of APIDataSet to fetch data from an API endpoint. |
Method | _describe | Undocumented |
Method | _execute | Undocumented |
Method | _exists | Undocumented |
Method | _load | Undocumented |
Method | _save | Undocumented |
Instance Variable | _request | Undocumented |
Inherited from AbstractDataSet:
Class Method | from | Create a data set instance using the configuration provided. |
Method | __str__ | Undocumented |
Method | exists | Checks whether a data set's output already exists by calling the provided _exists() method. |
Method | load | Loads data by delegation to the provided load method. |
Method | release | Release any cached data. |
Method | save | Saves data by delegation to the provided save method. |
Method | _copy | Undocumented |
Method | _release | Undocumented |
Property | _logger | Undocumented |
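The delegation described in the inherited methods (public load and exists calling the underscore-prefixed hooks that a subclass provides) can be sketched roughly as follows. This is a simplified illustration, not Kedro's actual code: `SketchDataSet` and `InMemorySketch` are hypothetical names, and the real AbstractDataSet may do more (for example error handling), which is omitted here.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchDataSet(ABC):
    """Hypothetical stand-in for AbstractDataSet (illustration only)."""

    def load(self) -> Any:
        # The public method delegates to the hook implemented by the subclass.
        return self._load()

    def exists(self) -> bool:
        # The public method delegates to the subclass's _exists() hook.
        return self._exists()

    @abstractmethod
    def _load(self) -> Any:
        """Subclasses must implement the actual loading logic."""

    def _exists(self) -> bool:
        # Assumed default for this sketch: report that nothing exists.
        return False


class InMemorySketch(SketchDataSet):
    """Hypothetical subclass that serves a payload held in memory."""

    def __init__(self, payload: Any = None):
        self._payload = payload

    def _load(self) -> Any:
        return self._payload

    def _exists(self) -> bool:
        return self._payload is not None


data_set = InMemorySketch({"commodity_desc": "CORN"})
print(data_set.exists(), data_set.load())
```

Callers only use the public methods; a subclass such as APIDataSet supplies the underscore-prefixed ones.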
def __init__(self, url: str, method: str = 'GET', data: Any = None, params: Dict[str, Any] = None, headers: Dict[str, Any] = None, auth: Union[Iterable[str], AuthBase] = None, json: Union[List, Dict[str, Any]] = None, timeout: int = 60, credentials: Union[Iterable[str], AuthBase] = None):
Creates a new instance of APIDataSet to fetch data from an API endpoint.
Parameters | |
url:str | The API URL endpoint. |
method:str | The HTTP method of the request: GET, POST, PUT, DELETE, HEAD, etc. |
data:Any | The request payload, used for POST, PUT, etc. requests. See https://requests.readthedocs.io/en/latest/user/quickstart/#more-complicated-post-requests |
params:Dict[str, Any] | The URL parameters of the API. See https://requests.readthedocs.io/en/latest/user/quickstart/#passing-parameters-in-urls |
headers:Dict[str, Any] | The HTTP headers. See https://requests.readthedocs.io/en/latest/user/quickstart/#custom-headers |
auth:Union[Iterable[str], AuthBase] | Anything requests accepts. Normally either a ('login', 'password') pair or, for more complex cases, an AuthBase instance such as HTTPBasicAuth. Any iterable will be cast to a tuple. |
json:Union[List, Dict[str, Any]] | The request payload, used for POST, PUT, etc. requests, passed to the json keyword argument of the request. See https://requests.readthedocs.io/en/latest/user/quickstart/#more-complicated-post-requests |
timeout:int | The wait time in seconds for a response. Defaults to 60 (1 minute). See https://requests.readthedocs.io/en/latest/user/quickstart/#timeouts |
credentials:Union[Iterable[str], AuthBase] | Same as auth. Allows specifying auth secrets in credentials.yml. |
Raises | |
ValueError | if both credentials and auth are specified. |
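The rules documented for auth and credentials (only one of the two may be given, and any iterable is cast to a tuple) can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: `resolve_auth`, `TokenAuthSketch`, and the error message are hypothetical, and `TokenAuthSketch` merely stands in for an AuthBase subclass so the example runs without requests installed.

```python
from collections.abc import Iterable
from typing import Any, Optional


class TokenAuthSketch:
    """Hypothetical stand-in for a requests AuthBase subclass."""

    def __init__(self, token: str):
        self.token = token


def resolve_auth(auth: Optional[Any] = None, credentials: Optional[Any] = None) -> Optional[Any]:
    """Hypothetical helper mirroring the documented auth/credentials rules."""
    if auth is not None and credentials is not None:
        # Documented behaviour: specifying both is an error.
        raise ValueError("Cannot specify both auth and credentials.")
    chosen = auth if auth is not None else credentials
    if isinstance(chosen, Iterable):
        # Documented behaviour: any iterable (e.g. a list read from YAML) becomes a tuple.
        return tuple(chosen)
    # Non-iterable auth objects pass through unchanged; None stays None.
    return chosen


print(resolve_auth(credentials=["login", "password"]))

try:
    resolve_auth(auth=("login", "password"), credentials=["login", "password"])
except ValueError as err:
    print(f"ValueError: {err}")
```

A list such as `["login", "password"]` is what a credentials.yml entry would typically yield after parsing, which is why the cast to a tuple matters for the credentials argument.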