class documentation

class APIDataSet(AbstractDataSet[None, requests.Response]):

APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/latest/

Example usage for the YAML API:

usda:
  type: api.APIDataSet
  url: https://quickstats.nass.usda.gov
  params:
    key: SOME_TOKEN
    format: JSON
    commodity_desc: CORN
    statisticcat_des: YIELD
    agg_level_desc: STATE
    year: 2000

Example usage for the Python API:

>>> from kedro.extras.datasets.api import APIDataSet
>>>
>>> data_set = APIDataSet(
...     url="https://quickstats.nass.usda.gov",
...     params={
...         "key": "SOME_TOKEN",
...         "format": "JSON",
...         "commodity_desc": "CORN",
...         "statisticcat_des": "YIELD",
...         "agg_level_desc": "STATE",
...         "year": 2000,
...     },
... )
>>> data = data_set.load()  # returns a requests.Response

Method __init__: Creates a new instance of APIDataSet to fetch data from an API endpoint.
Method _describe: Undocumented
Method _execute_request: Undocumented
Method _exists: Undocumented
Method _load: Undocumented
Method _save: Undocumented
Instance Variable _request_args: Undocumented

Inherited from AbstractDataSet:

Class Method from_config: Create a data set instance using the configuration provided.
Method __str__: Undocumented
Method exists: Checks whether a data set's output already exists by calling the provided _exists() method.
Method load: Loads data by delegation to the provided load method.
Method release: Release any cached data.
Method save: Saves data by delegation to the provided save method.
Method _copy: Undocumented
Method _release: Undocumented
Property _logger: Undocumented

def __init__(self, url: str, method: str = 'GET', data: Any = None, params: Dict[str, Any] = None, headers: Dict[str, Any] = None, auth: Union[Iterable[str], AuthBase] = None, json: Union[List, Dict[str, Any]] = None, timeout: int = 60, credentials: Union[Iterable[str], AuthBase] = None):

Creates a new instance of APIDataSet to fetch data from an API endpoint.

Parameters
url (str): The API URL endpoint.
method (str): The HTTP method of the request: GET, POST, PUT, DELETE, HEAD, etc.
data (Any): The request payload, used for POST, PUT, etc. requests. See https://requests.readthedocs.io/en/latest/user/quickstart/#more-complicated-post-requests
params (Dict[str, Any]): The URL parameters of the API. See https://requests.readthedocs.io/en/latest/user/quickstart/#passing-parameters-in-urls
headers (Dict[str, Any]): The HTTP headers. See https://requests.readthedocs.io/en/latest/user/quickstart/#custom-headers
auth (Union[Iterable[str], AuthBase]): Anything requests accepts. Normally it is either a ('login', 'password') pair or, for more complex cases, an AuthBase instance such as HTTPBasicAuth. Any iterable will be cast to a tuple.
json (Union[List, Dict[str, Any]]): The request payload, used for POST, PUT, etc. requests, passed to the json keyword argument of the request. See https://requests.readthedocs.io/en/latest/user/quickstart/#more-complicated-post-requests
timeout (int): The wait time in seconds for a response. Defaults to 60 (1 minute). See https://requests.readthedocs.io/en/latest/user/quickstart/#timeouts
credentials (Union[Iterable[str], AuthBase]): Same as auth. Allows specifying auth secrets in credentials.yml.
Raises
ValueError: If both credentials and auth are specified.
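
The rules stated for auth and credentials (only one of the two may be given, and any iterable is cast to a tuple) can be sketched as below. This is a minimal illustration of the documented behaviour, not the library's actual code: the helper name resolve_auth is hypothetical, and the AuthBase class here is only a stand-in for requests.auth.AuthBase so that the sketch runs without third-party packages.

```python
from typing import Iterable, Optional, Tuple, Union


class AuthBase:
    """Stand-in for requests.auth.AuthBase (assumption: used only for this sketch)."""


def resolve_auth(
    auth: Optional[Union[Iterable[str], AuthBase]] = None,
    credentials: Optional[Union[Iterable[str], AuthBase]] = None,
) -> Optional[Union[Tuple[str, ...], AuthBase]]:
    """Hypothetical helper: choose between auth and credentials, then normalise."""
    # The documentation states that passing both is an error.
    if auth is not None and credentials is not None:
        raise ValueError("Cannot specify both auth and credentials.")

    chosen = auth if auth is not None else credentials

    # Nothing supplied, or an AuthBase object: pass it through unchanged.
    if chosen is None or isinstance(chosen, AuthBase):
        return chosen

    # Any other iterable (for example a list loaded from YAML) becomes a tuple.
    return tuple(chosen)


print(resolve_auth(credentials=["login", "password"]))
```

A list is the natural shape for a value read from a YAML file, which is presumably why the cast to a tuple is documented.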
def _describe(self) -> Dict[str, Any]:

Undocumented

def _execute_request(self) -> requests.Response:

Undocumented

def _exists(self) -> bool:

Undocumented

def _load(self) -> requests.Response:

Undocumented

def _save(self, data: None) -> NoReturn:

Undocumented
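
Although _save is undocumented, its signature (data: None, return type NoReturn) suggests that saving is not supported and that the method always raises. A minimal sketch of that pattern follows; the class ReadOnlySketch and the exception ReadOnlyDataSetError are hypothetical names chosen for illustration, and the real data set may use a different exception type and message.

```python
from typing import NoReturn


class ReadOnlyDataSetError(Exception):
    """Hypothetical error type (assumption: not the library's own exception)."""


class ReadOnlySketch:
    """Hypothetical stand-in for a data set that can be loaded but not saved."""

    def _save(self, data: None) -> NoReturn:
        # NoReturn tells type checkers that this method never returns normally.
        raise ReadOnlyDataSetError(
            f"{type(self).__name__} is a read-only data set type"
        )


try:
    ReadOnlySketch()._save(None)
except ReadOnlyDataSetError as err:
    print(err)
```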

_request_args: Dict[str, Any]

Undocumented
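
The type Dict[str, Any] suggests that _request_args holds the constructor arguments as keyword arguments for the underlying request call. The sketch below shows one way such a dictionary could be assembled; it is an assumption about the general shape, not the library's implementation, and the function name build_request_args is hypothetical. Only a subset of the constructor parameters is included to keep it short.

```python
from typing import Any, Dict, Optional


def build_request_args(
    url: str,
    method: str = "GET",
    params: Optional[Dict[str, Any]] = None,
    headers: Optional[Dict[str, Any]] = None,
    timeout: int = 60,
) -> Dict[str, Any]:
    """Hypothetical helper: collect request settings into one keyword dictionary."""
    return {
        "url": url,
        "method": method,
        "params": params,
        "headers": headers,
        "timeout": timeout,
    }


args = build_request_args(
    "https://quickstats.nass.usda.gov",
    params={"format": "JSON", "year": 2000},
)
print(args["method"], args["timeout"])

# With the requests library installed, a dictionary of this shape could be
# unpacked into a call such as requests.request(**args).
```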