module documentation
AbstractVersionedDataSet implementation to access Spark dataframes using pyspark
Class |
|
Subclasses hdfs.InsecureClient and implements hdfs_exists and hdfs_glob methods required by SparkDataSet |
Function | _dbfs |
Perform an ls list operation in DBFS using the provided pattern. It is assumed that version paths are managed by Kedro. A broad Exception is caught because dbutils.fs.ExecutionError cannot be imported directly. |
Function | _dbfs |
Perform a custom glob search in DBFS using the provided pattern. It is assumed that version paths are managed by Kedro only. |
Function | _get |
Get the instance of 'dbutils', or None if it could not be found. |
Function | _parse |
Undocumented |
Function | _split |
Undocumented |
Function | _strip |
Undocumented |
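The 'dbutils' lookup described in the table above could be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: the function name `find_dbutils_sketch` and its `namespace` parameter are hypothetical, and the fallback to the IPython user namespace is an assumption about where a notebook environment might expose the object.

```python
from typing import Any, Dict, Optional


def find_dbutils_sketch(namespace: Dict[str, Any]) -> Optional[Any]:
    """Return a 'dbutils' object if one can be found, otherwise None."""
    # First look in the namespace supplied by the caller (for example globals()).
    dbutils = namespace.get("dbutils")
    if dbutils is not None:
        return dbutils

    # Assumption: in a notebook, the object may live in the IPython user namespace.
    try:
        from IPython import get_ipython
    except ImportError:
        return None

    shell = get_ipython()  # None when not running inside an IPython session
    if shell is None:
        return None
    return shell.user_ns.get("dbutils")


# Usage with a placeholder object standing in for a real dbutils instance.
placeholder = object()
from_namespace = find_dbutils_sketch({"dbutils": placeholder})
```

Returning None instead of raising lets the caller decide whether a missing 'dbutils' is an error or just a sign that the code is running outside Databricks.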
Perform an ls list operation in DBFS using the provided pattern. It is assumed that version paths are managed by Kedro. A broad Exception is caught because dbutils.fs.ExecutionError cannot be imported directly.
Parameters | |
pattern:str | Filepath to search for. |
dbutils:Any | dbutils instance to operate with DBFS. |
Returns | |
bool | True if the filepath exists, False otherwise. |
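A minimal sketch of the existence check described above, assuming that listing a missing path raises an exception. The names `dbfs_exists_sketch` and `FakeFs` are hypothetical and exist only for illustration; the sketch takes a plain path and skips any prefix stripping or pattern parsing the real helper may perform.

```python
from types import SimpleNamespace
from typing import Any


def dbfs_exists_sketch(path: str, dbutils: Any) -> bool:
    """Return True if listing the path succeeds, False if the listing raises."""
    try:
        dbutils.fs.ls(path)
        return True
    except Exception:  # broad on purpose: dbutils.fs.ExecutionError cannot be imported
        return False


class FakeFs:
    """Hypothetical stand-in for dbutils.fs so the sketch runs outside Databricks."""

    def __init__(self, known_paths):
        self.known_paths = set(known_paths)

    def ls(self, path):
        if path not in self.known_paths:
            raise RuntimeError(f"no such path: {path}")
        return []


fake_dbutils = SimpleNamespace(fs=FakeFs({"/data/out"}))
found = dbfs_exists_sketch("/data/out", fake_dbutils)
missing = dbfs_exists_sketch("/data/missing", fake_dbutils)
```

Catching the broad Exception also hides unrelated failures such as permission errors, which is the trade-off the description accepts.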
Perform a custom glob search in DBFS using the provided pattern. It is assumed that version paths are managed by Kedro only.
Parameters | |
pattern:str | Glob pattern to search for. |
dbutils:Any | dbutils instance to operate with DBFS. |
Returns | |
List[ | List of DBFS paths prefixed with '/dbfs' that satisfy the glob pattern. |
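The glob search described above might look roughly like the sketch below. It assumes that `dbutils.fs.ls` returns entries with a `path` attribute that may carry a `dbfs:` scheme prefix; the names `dbfs_glob_sketch` and `FakeGlobFs` are hypothetical, and matching is done with the standard library's `fnmatch` rather than whatever the module actually uses.

```python
from fnmatch import fnmatch
from types import SimpleNamespace
from typing import Any, List


def dbfs_glob_sketch(pattern: str, dbutils: Any) -> List[str]:
    """List the directory before the first wildcard and keep matching paths."""
    # Collect the leading path components that contain no glob characters.
    prefix_parts = []
    for part in pattern.split("/"):
        if any(ch in part for ch in "*?["):
            break
        prefix_parts.append(part)
    prefix = "/".join(prefix_parts) or "/"

    matched = set()
    for info in dbutils.fs.ls(prefix):
        path = info.path
        if path.startswith("dbfs:"):
            path = path[len("dbfs:"):]  # drop the scheme prefix (assumed format)
        path = path.rstrip("/")
        if fnmatch(path, pattern):
            matched.add("/dbfs" + path)  # results are prefixed with '/dbfs'
    return sorted(matched)


class FakeGlobFs:
    """Hypothetical stand-in for dbutils.fs backed by a dict of listings."""

    def __init__(self, listings):
        self.listings = listings

    def ls(self, path):
        return [SimpleNamespace(path=p) for p in self.listings[path]]


fake_glob_dbutils = SimpleNamespace(
    fs=FakeGlobFs({"/data/out": ["dbfs:/data/out/2021/", "dbfs:/data/out/2020/"]})
)
glob_result = dbfs_glob_sketch("/data/out/*", fake_glob_dbutils)
```

Sorting the matches gives a deterministic order, which helps when the newest version directory is picked by name.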