module documentation

This module provides a set of classes which underpin the data loading and saving functionality provided by kedro.io.

Exception VersionNotFoundError Raised by AbstractVersionedDataSet implementations when no load versions are available for the data set.
Function generate_timestamp Generate the timestamp to be used by versioning.
Function get_filepath_str Returns the filepath as a string. The protocol is included only if it is HTTP(s).
Function get_protocol_and_path Parses a filepath into its protocol and path.
Function parse_dataset_definition Parse and instantiate a dataset class using the configuration provided.
Function validate_on_forbidden_chars Validate that string values do not include whitespace or semicolons (;).
Constant CLOUD_PROTOCOLS Undocumented
Constant HTTP_PROTOCOLS Undocumented
Constant PROTOCOL_DELIMITER Undocumented
Constant VERSION_FORMAT Undocumented
Constant VERSION_KEY Undocumented
Constant VERSIONED_FLAG_KEY Undocumented
Function _load_obj Undocumented
Function _local_exists Undocumented
Function _parse_filepath Split a filepath into protocol and path. Based on fsspec.utils.infer_storage_options.
Constant _CONSISTENCY_WARNING Undocumented
Constant _DEFAULT_PACKAGES Undocumented
Type Variable _DI Undocumented
Type Variable _DO Undocumented
def generate_timestamp() -> str: (source)

Generate the timestamp to be used by versioning.

Returns
str: String representation of the current timestamp.
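
As a rough illustration of how a version timestamp could be produced from the VERSION_FORMAT value listed on this page, here is a minimal sketch. The helper name `generate_timestamp_sketch` is hypothetical, and the use of UTC is an assumption suggested by the trailing "Z"; the library's own function may post-process the string further.

```python
from datetime import datetime, timezone

# Value listed for VERSION_FORMAT on this page.
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"


def generate_timestamp_sketch() -> str:
    """Format the current time with VERSION_FORMAT (hypothetical helper)."""
    # Assumption: the timestamp is taken in UTC, as the trailing "Z" suggests.
    return datetime.now(tz=timezone.utc).strftime(VERSION_FORMAT)


timestamp = generate_timestamp_sketch()
print(timestamp)
```

Dots replace the usual colons in the time part, which keeps the string usable as a directory or file name on most file systems.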
def get_filepath_str(path: PurePath, protocol: str) -> str: (source)

Returns the filepath as a string. The protocol is included only if it is HTTP(s).

Parameters
path (PurePath): Filepath without protocol.
protocol (str): Protocol.
Returns
str: Filepath string.
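
The behaviour described above can be sketched as follows. This is an illustration built only from the description and the HTTP_PROTOCOLS and PROTOCOL_DELIMITER values on this page, not the library's source; `filepath_str_sketch` is a hypothetical name.

```python
from pathlib import PurePosixPath

# Values listed for these constants on this page.
HTTP_PROTOCOLS = ("http", "https")
PROTOCOL_DELIMITER = "://"


def filepath_str_sketch(path: PurePosixPath, protocol: str) -> str:
    """Return the path as a string, re-attaching the protocol for HTTP(s) only."""
    text = path.as_posix()
    if protocol in HTTP_PROTOCOLS:
        text = f"{protocol}{PROTOCOL_DELIMITER}{text}"
    return text


http_result = filepath_str_sketch(PurePosixPath("example.com/data.csv"), "https")
s3_result = filepath_str_sketch(PurePosixPath("bucket/data.csv"), "s3")
print(http_result)  # https://example.com/data.csv
print(s3_result)    # bucket/data.csv
```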
def get_protocol_and_path(filepath: str, version: Version = None) -> Tuple[str, str]: (source)

Parses a filepath into its protocol and path.

Parameters
filepath (str): Raw filepath, e.g. gcs://bucket/test.json.
version (Version): Instance of kedro.io.core.Version or None.
Returns
Tuple[str, str]: Protocol and path.
Raises
DataSetError: When protocol is http(s) and version is not None.
Note: HTTP(s) data sets do not support versioning.
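
A simplified sketch of the contract described above, assuming a plain split on the first "://" and treating anything without a protocol as a local file. The name `protocol_and_path_sketch` is hypothetical, and `ValueError` stands in for the documented DataSetError so that the example stays self-contained.

```python
from typing import Optional, Tuple

# Values listed for these constants on this page.
HTTP_PROTOCOLS = ("http", "https")
PROTOCOL_DELIMITER = "://"


def protocol_and_path_sketch(
    filepath: str, version: Optional[object] = None
) -> Tuple[str, str]:
    """Split a raw filepath into (protocol, path)."""
    # Simplification: split on the first "://"; no delimiter means a local file.
    if PROTOCOL_DELIMITER in filepath:
        protocol, path = filepath.split(PROTOCOL_DELIMITER, 1)
    else:
        protocol, path = "file", filepath
    if protocol in HTTP_PROTOCOLS and version is not None:
        # The documented function raises DataSetError; ValueError is a stand-in.
        raise ValueError("HTTP(s) data sets do not support versioning.")
    return protocol, path


print(protocol_and_path_sketch("gcs://bucket/test.json"))  # ('gcs', 'bucket/test.json')
```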
def parse_dataset_definition(config: Dict[str, Any], load_version: str = None, save_version: str = None) -> Tuple[Type[AbstractDataSet], Dict[str, Any]]: (source)

Parse and instantiate a dataset class using the configuration provided.

Parameters
config (Dict[str, Any]): Data set config dictionary. It must contain the type key with fully qualified class name.
load_version (str): Version string to be used for load operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.
save_version (str): Version string to be used for save operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.
Returns
2-tuple: (Dataset class object, configuration dictionary)
Raises
DataSetError: If the function fails to parse the configuration provided.
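
To make the shape of the config dictionary concrete, the sketch below separates the type key from the remaining arguments and shows one plausible way the VERSIONED_FLAG_KEY and VERSION_KEY constants could interact with load_version and save_version. All names here (`VersionSketch`, `split_definition_sketch`) are hypothetical, the class lookup is skipped (the type is returned as a string), and `ValueError` stands in for DataSetError.

```python
import copy
from typing import Any, Dict, NamedTuple, Optional, Tuple

# Values listed for these constants on this page.
VERSION_KEY = "version"
VERSIONED_FLAG_KEY = "versioned"


class VersionSketch(NamedTuple):
    """Hypothetical stand-in for kedro.io.core.Version."""
    load: Optional[str]
    save: Optional[str]


def split_definition_sketch(
    config: Dict[str, Any],
    load_version: Optional[str] = None,
    save_version: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Return (type name, remaining constructor arguments)."""
    config = copy.deepcopy(config)  # do not mutate the caller's dictionary
    if "type" not in config:
        # Stand-in for the documented DataSetError.
        raise ValueError("'type' is missing from the data set configuration")
    type_name = config.pop("type")
    # Assumption: version strings only take effect when the flag is set.
    if config.pop(VERSIONED_FLAG_KEY, False):
        config[VERSION_KEY] = VersionSketch(load_version, save_version)
    return type_name, config


type_name, params = split_definition_sketch(
    {"type": "pandas.CSVDataSet", "filepath": "data/cars.csv", "versioned": True},
    save_version="2023-01-15T10.30.45.123Z",
)
print(type_name, params)
```

Without the versioned flag, the version strings are ignored in this sketch, which matches the documented "has no effect" behaviour.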
def validate_on_forbidden_chars(**kwargs): (source)

Validate that string values do not include whitespace or semicolons (;).
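
A minimal sketch of such a check is shown below. `validate_sketch` is a hypothetical name, `ValueError` is a stand-in for whatever exception the library raises, and treating every whitespace character (not only the space) as forbidden is an assumption.

```python
def validate_sketch(**kwargs: str) -> None:
    """Raise if any keyword value contains whitespace or a semicolon."""
    for key, value in kwargs.items():
        # Assumption: any whitespace character counts, not just " ".
        if ";" in value or any(ch.isspace() for ch in value):
            raise ValueError(
                f"Neither whitespace nor semicolons are allowed in '{key}'."
            )


validate_sketch(table_name="cars", schema="public")  # passes silently
```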

CLOUD_PROTOCOLS: tuple[str, ...] = (source)

Undocumented

Value
('s3', 's3n', 's3a', 'gcs', 'gs', 'adl', 'abfs', 'abfss', 'gdrive')
HTTP_PROTOCOLS: tuple[str, ...] = (source)

Undocumented

Value
('http', 'https')
PROTOCOL_DELIMITER: str = (source)

Undocumented

Value
'://'
VERSION_FORMAT: str = (source)

Undocumented

Value
'%Y-%m-%dT%H.%M.%S.%fZ'
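
Because the format is a standard strftime/strptime pattern, a version string can be parsed back into a datetime. The version string below is made up for illustration.

```python
from datetime import datetime

# Value listed for VERSION_FORMAT on this page.
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"

# Hypothetical version string; %f accepts one to six fractional digits when parsing.
parsed = datetime.strptime("2023-01-15T10.30.45.123Z", VERSION_FORMAT)
print(parsed)  # 2023-01-15 10:30:45.123000
```

The "Z" is matched as a literal character, so the resulting datetime is naive (it carries no timezone information).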
VERSION_KEY: str = (source)

Undocumented

Value
'version'
VERSIONED_FLAG_KEY: str = (source)

Undocumented

Value
'versioned'
def _load_obj(class_path: str) -> Optional[object]: (source)

Undocumented

def _local_exists(filepath: str) -> bool: (source)

Undocumented

def _parse_filepath(filepath: str) -> Dict[str, str]: (source)

Split a filepath into protocol and path. Based on fsspec.utils.infer_storage_options.

Parameters
filepath (str): Either local absolute file path or URL (s3://bucket/file.csv)
Returns
Dict[str, str]: Parsed filepath.
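
The sketch below shows one way such a split could be done with the standard library. It is a simplification written for illustration, not the library's code: `parse_filepath_sketch` is a hypothetical name, the "protocol" and "path" dictionary keys are assumptions, and the host or bucket is joined to the path for every non-local protocol.

```python
import re
from typing import Dict
from urllib.parse import urlsplit


def parse_filepath_sketch(filepath: str) -> Dict[str, str]:
    """Split a filepath into protocol and path (simplified)."""
    # Windows drive paths (C:\data) and strings without "scheme://" are local.
    is_windows_drive = re.match(r"^[a-zA-Z]:[\\/]", filepath) is not None
    has_scheme = re.match(r"^[a-zA-Z0-9]+://", filepath) is not None
    if is_windows_drive or not has_scheme:
        return {"protocol": "file", "path": filepath}
    parts = urlsplit(filepath)
    # Simplification: bucket/host and path are joined for all remote protocols.
    return {"protocol": parts.scheme, "path": parts.netloc + parts.path}


print(parse_filepath_sketch("s3://bucket/file.csv"))
# {'protocol': 's3', 'path': 'bucket/file.csv'}
```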
_CONSISTENCY_WARNING: str = (source)

Undocumented

Value
'Save version \'{}\' did not match load version \'{}\' for {}. This is strongly discouraged due to inconsistencies it may cause between \'save\' and \'load\' operations. Please refrain from setting exact load version for intermediate data sets where possible to avoid this warning.'
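
The value is a str.format template with three positional placeholders: save version, load version, and a description of the data set. A short sketch of filling it in follows; the version strings and data set description are made up, and emitting the message through `warnings.warn` is an assumption about how it is used.

```python
import warnings

# Value listed for _CONSISTENCY_WARNING on this page.
_CONSISTENCY_WARNING = (
    "Save version '{}' did not match load version '{}' for {}. This is strongly "
    "discouraged due to inconsistencies it may cause between 'save' and 'load' "
    "operations. Please refrain from setting exact load version for intermediate "
    "data sets where possible to avoid this warning."
)

# Hypothetical values for illustration.
message = _CONSISTENCY_WARNING.format(
    "2023-01-02T00.00.00.000Z",
    "2023-01-01T00.00.00.000Z",
    "CSVDataSet(filepath=data/cars.csv)",
)
warnings.warn(message)
```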
_DEFAULT_PACKAGES: list[str] = (source)

Undocumented

Value
['kedro.io.', 'kedro_datasets.', 'kedro.extras.datasets.', '']

_DI = (source)

Undocumented

Value
TypeVar('_DI')

_DO = (source)

Undocumented

Value
TypeVar('_DO')