class documentation
class GeoJSONDataSet(AbstractVersionedDataSet[
GeoJSONDataSet loads/saves data to a GeoJSON file using an underlying filesystem (e.g. local, S3, GCS). The underlying functionality is provided by geopandas, so it supports all geopandas (pandas) options allowed for loading and saving GeoJSON files.
Example:
>>> import geopandas as gpd
>>> from shapely.geometry import Point
>>> from kedro.extras.datasets.geopandas import GeoJSONDataSet
>>>
>>> data = gpd.GeoDataFrame({'col1': [1, 2], 'col2': [4, 5],
...                          'col3': [5, 6]}, geometry=[Point(1,1), Point(2,4)])
>>> data_set = GeoJSONDataSet(filepath="test.geojson", save_args=None)
>>> data_set.save(data)
>>> reloaded = data_set.load()
>>>
>>> assert data.equals(reloaded)
Method | __init__ |
Creates a new instance of GeoJSONDataSet pointing to a concrete GeoJSON file on a specific fsspec filesystem. |
Method | invalidate |
Invalidate underlying filesystem cache. |
Constant | DEFAULT |
Undocumented |
Constant | DEFAULT |
Undocumented |
Method | _describe |
Undocumented |
Method | _exists |
Undocumented |
Method | _load |
Undocumented |
Method | _release |
Undocumented |
Method | _save |
Undocumented |
Instance Variable | _fs |
Undocumented |
Instance Variable | _fs |
Undocumented |
Instance Variable | _fs |
Undocumented |
Instance Variable | _load |
Undocumented |
Instance Variable | _protocol |
Undocumented |
Instance Variable | _save |
Undocumented |
Inherited from AbstractVersionedDataSet:
Method | exists |
Checks whether a data set's output already exists by calling the provided _exists() method. |
Method | load |
Loads data by delegation to the provided load method. |
Method | resolve |
Compute the version the dataset should be loaded with. |
Method | resolve |
Compute the version the dataset should be saved with. |
Method | save |
Saves data by delegation to the provided save method. |
Method | _fetch |
Undocumented |
Method | _fetch |
Generate and cache the current save version |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Method | _get |
Undocumented |
Instance Variable | _exists |
Undocumented |
Instance Variable | _filepath |
Undocumented |
Instance Variable | _glob |
Undocumented |
Instance Variable | _version |
Undocumented |
Instance Variable | _version |
Undocumented |
Inherited from AbstractDataSet (via AbstractVersionedDataSet):
Class Method | from |
Create a data set instance using the configuration provided. |
Method | __str__ |
Undocumented |
Method | release |
Release any cached data. |
Method | _copy |
Undocumented |
Property | _logger |
Undocumented |
def __init__(self, filepath: str, load_args: Dict[str, Any] = None, save_args: Dict[str, Any] = None, version: Version = None, credentials: Dict[str, Any] = None, fs_args: Dict[str, Any] = None):
Creates a new instance of GeoJSONDataSet pointing to a concrete GeoJSON file on a specific fsspec filesystem.
Parameters | |
filepath:str | Filepath in POSIX format to a GeoJSON file, prefixed with a protocol like s3://. If no prefix is provided, the file protocol (local filesystem) is used.
The prefix should be any protocol supported by fsspec.
Note: http(s) doesn't support versioning. |
load_args:Dict[str, Any] | GeoPandas options for loading GeoJSON files. All available arguments are listed at: https://geopandas.org/en/stable/docs/reference/api/geopandas.read_file.html |
save_args:Dict[str, Any] | GeoPandas options for saving GeoJSON files. All available arguments are listed at: https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.to_file.html The default driver in save_args is 'GeoJSON'; all other defaults are preserved. |
version:Version | If specified, should be an instance of kedro.io.core.Version. If its load attribute is None, the latest version will be loaded. If its save |
credentials:Dict[str, Any] | Credentials required to access the underlying filesystem.
E.g. for GCSFileSystem it would look like {'token': None}. |
fs_args:Dict[str, Any] | Extra arguments to pass into the underlying filesystem class constructor
(e.g. {"project": "my-project"} for GCSFileSystem), as well as
to pass to the filesystem's open method through the nested keys
open_args_load and open_args_save.
All available arguments for open are listed at:
https://filesystem-spec.readthedocs.io/en/latest/api.html#fsspec.spec.AbstractFileSystem.open
All defaults are preserved, except mode, which is set to wb when saving. |
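
The argument handling described in the parameter table can be sketched in plain Python. This is an illustration only, not the library's actual implementation: the helper names split_protocol, merge_save_args and split_fs_args, and the constant DEFAULT_SAVE_ARGS, are hypothetical and chosen for this example. It assumes that a missing protocol prefix falls back to "file", that user-supplied save_args override the default driver, and that mode defaults to "wb" for saving unless given explicitly.

```python
from copy import deepcopy
from typing import Any, Dict, Optional, Tuple

# Hypothetical constant mirroring the documented default save driver.
DEFAULT_SAVE_ARGS: Dict[str, Any] = {"driver": "GeoJSON"}


def split_protocol(filepath: str) -> Tuple[str, str]:
    """Return (protocol, path); fall back to "file" when no prefix is given."""
    if "://" in filepath:
        protocol, path = filepath.split("://", 1)
        return protocol, path
    return "file", filepath


def merge_save_args(save_args: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Start from the defaults, then let user-supplied options override them."""
    merged = deepcopy(DEFAULT_SAVE_ARGS)
    if save_args is not None:
        merged.update(save_args)
    return merged


def split_fs_args(
    fs_args: Optional[Dict[str, Any]],
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    """Separate constructor arguments from the nested open() arguments."""
    remaining = deepcopy(fs_args) if fs_args else {}
    open_args_load = remaining.pop("open_args_load", {})
    open_args_save = remaining.pop("open_args_save", {})
    # Assumption: "wb" is only a default and does not overwrite an explicit mode.
    open_args_save.setdefault("mode", "wb")
    return remaining, open_args_load, open_args_save


protocol, path = split_protocol("s3://bucket/test.geojson")
save_args = merge_save_args({"index": False})
fs_kwargs, load_open, save_open = split_fs_args(
    {"project": "my-project", "open_args_load": {"mode": "rb"}}
)
```

Here fs_kwargs holds only the arguments intended for the filesystem constructor, while load_open and save_open hold the arguments intended for the filesystem's open method.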