werkzeug.urls

module documentation

Functions for working with URLs. Contains implementations of functions from :mod:`urllib.parse` that handle bytes and strings.

Class	`BaseURL`	Superclass of :py:class:`URL` and :py:class:`BytesURL`.
Class	`BytesURL`	Represents a parsed URL in bytes.
Class	`URL`	Represents a parsed URL. This behaves like a regular tuple but also has some extra attributes that give further insight into the URL.
Function	`iri_to_uri`	Convert an IRI to a URI. All non-ASCII and unsafe characters are quoted. If the URL has a domain, it is encoded to Punycode.
Function	`uri_to_iri`	Convert a URI to an IRI. All valid UTF-8 characters are unquoted, leaving all reserved and invalid characters quoted. If the URL has a domain, it is decoded from Punycode.
Function	`url_decode`	Parse a query string and return it as a :class:`MultiDict`.
Function	`url_decode_stream`	Works like :func:`url_decode` but decodes a stream. The behavior of stream and limit follows functions like :func:`~werkzeug.wsgi.make_line_iter`. The generator of pairs is directly fed to the `cls` so you can consume the data while it's parsed.
Function	`url_encode`	URL encode a dict/`MultiDict`. If a value is `None` it will not appear in the result string. By default only values are encoded into the target charset strings.
Function	`url_encode_stream`	Like :func:`url_encode` but writes the results to a stream object. If the stream is `None` a generator over all encoded pairs is returned.
Function	`url_fix`	Sometimes you get a URL from a user that just isn't a real URL because it contains unsafe characters like ' ' and so on. This function can fix some of the problems in a similar way browsers handle data entered by the user:...
Function	`url_join`	Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.
Function	`url_parse`	Parses a URL from a string into a :class:`URL` tuple. If the URL is lacking a scheme it can be provided as second argument. Otherwise, it is ignored. Optionally fragments can be stripped from the URL by setting `allow_fragments` to `False`.
Function	`url_quote`	URL encode a single string with a given encoding.
Function	`url_quote_plus`	URL encode a single string with the given encoding and convert whitespace to "+".
Function	`url_unparse`	The reverse operation to :func:`url_parse`. This accepts arbitrary tuples as well as :class:`URL` tuples and returns a URL as a string.
Function	`url_unquote`	URL decode a single string with a given encoding. If the charset is set to `None` no decoding is performed and raw bytes are returned.
Function	`url_unquote_plus`	URL decode a single string with the given `charset` and decode "+" to whitespace.
Class	`_URLTuple`	Undocumented
Function	`_codec_error_url_quote`	Used in :func:`uri_to_iri` after unquoting to re-quote any invalid bytes.
Function	`_fast_url_quote_plus`	Undocumented
Function	`_make_fast_url_quote`	Precompile the translation table for a URL encoding function.
Function	`_unquote_to_bytes`	Undocumented
Function	`_url_decode_impl`	Undocumented
Function	`_url_encode_impl`	Undocumented
Function	`_url_unquote_legacy`	Undocumented
Variable	`_always_safe`	Undocumented
Variable	`_bytetohex`	Undocumented
Variable	`_fast_quote_plus`	Undocumented
Variable	`_fast_url_quote`	Undocumented
Variable	`_hexdigits`	Undocumented
Variable	`_hextobyte`	Undocumented
Variable	`_scheme_re`	Undocumented
Variable	`_to_iri_unsafe`	Undocumented
Variable	`_to_uri_safe`	Undocumented
Variable	`_unquote_maps`	Undocumented

def iri_to_uri(iri: t.Union[str, t.Tuple[str, str, str, str, str]], charset: str = 'utf-8', errors: str = 'strict', safe_conversion: bool = False) -> str:

Convert an IRI to a URI. All non-ASCII and unsafe characters are quoted. If the URL has a domain, it is encoded to Punycode.

>>> iri_to_uri('http://\u2603.net/p\xe5th?q=\xe8ry%DF')
'http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF'

:param iri: The IRI to convert.
:param charset: The encoding of the IRI.
:param errors: Error handler to use during ``str.encode``.
:param safe_conversion: Return the URL unchanged if it only contains ASCII characters and no whitespace. See the explanation below.

There is a general problem with IRI conversion for some protocols that violate the URI specification. Consider the following two IRIs::

    magnet:?xt=uri:whatever
    itms-services://?action=download-manifest

After parsing, we don't know whether the scheme requires the ``//``. It is dropped if empty, but its presence or absence changes the meaning of the final URL. In this case, you can use ``safe_conversion``, which returns the URL unchanged if it only contains ASCII characters and no whitespace. This can result in a URI with unquoted characters if it was not already quoted correctly, but it preserves the URL's semantics. Werkzeug uses this for the ``Location`` header for redirects.

.. versionchanged:: 0.15
    All reserved characters remain unquoted. Previously, only some reserved characters were left unquoted.

.. versionchanged:: 0.9.6
    The ``safe_conversion`` parameter was added.

.. versionadded:: 0.6
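The conversion described above can be approximated with the standard library alone. The following is a minimal sketch, not Werkzeug's implementation: `iri_to_uri_sketch` and `_SAFE` are hypothetical names, user info in the network location is ignored, and the set of characters left unquoted is an assumption chosen so that existing percent escapes are not quoted a second time.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unquoted (an assumption for this sketch). "%" is included
# so that escapes already present in the input are not quoted again.
_SAFE = "/%:@!$&'()*+,;=?~"


def iri_to_uri_sketch(iri: str) -> str:
    """Hypothetical stand-in for iri_to_uri built on urllib.parse."""
    parts = urlsplit(iri)
    # The built-in "idna" codec encodes each host label to Punycode.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((
        parts.scheme,
        host,
        quote(parts.path, safe=_SAFE),      # UTF-8 percent-encoding
        quote(parts.query, safe=_SAFE),
        quote(parts.fragment, safe=_SAFE),
    ))


print(iri_to_uri_sketch("http://\u2603.net/p\xe5th?q=\xe8ry%DF"))
```

The sketch reproduces the doctest result for that input, but it makes no attempt to cover the `safe_conversion` case.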

def uri_to_iri(uri: t.Union[str, t.Tuple[str, str, str, str, str]], charset: str = 'utf-8', errors: str = 'werkzeug.url_quote') -> str:

Convert a URI to an IRI. All valid UTF-8 characters are unquoted, leaving all reserved and invalid characters quoted. If the URL has a domain, it is decoded from Punycode.

>>> uri_to_iri("http://xn--n3h.net/p%C3%A5th?q=%C3%A8ry%DF")
'http://\u2603.net/p\xe5th?q=\xe8ry%DF'

:param uri: The URI to convert.
:param charset: The encoding to decode unquoted bytes with.
:param errors: Error handler to use during ``bytes.decode``. By default, invalid bytes are left quoted.

.. versionchanged:: 0.15
    All reserved and invalid characters remain quoted. Previously, only some reserved characters were preserved, and invalid bytes were replaced instead of left quoted.

.. versionadded:: 0.6

def url_decode(s: t.AnyStr, charset: str = 'utf-8', include_empty: bool = True, errors: str = 'replace', separator: str = '&', cls: t.Optional[t.Type[ds.MultiDict]] = None) -> ds.MultiDict[str, str]:

Parse a query string and return it as a :class:`MultiDict`.

:param s: The query string to parse.
:param charset: Decode bytes to string with this charset. If not given, bytes are returned as-is.
:param include_empty: Include keys with empty values in the dict.
:param errors: Error handling behavior when decoding bytes.
:param separator: Separator character between pairs.
:param cls: Container to hold result instead of :class:`MultiDict`.

.. versionchanged:: 2.0
    The ``decode_keys`` parameter is deprecated and will be removed in Werkzeug 2.1.

.. versionchanged:: 0.5
    In previous versions ";" and "&" could be used for url decoding. Now only "&" is supported. If you want to use ";", a different ``separator`` can be provided.

.. versionchanged:: 0.5
    The ``cls`` parameter was added.
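For comparison, the standard library's `urllib.parse.parse_qsl` covers the same parameters. The sketch below is an analog rather than `url_decode` itself: `decode_query` is a hypothetical helper, a plain dict of lists stands in for `MultiDict`, and the `separator` keyword assumes a Python release where `parse_qsl` accepts it.

```python
from collections import defaultdict
from urllib.parse import parse_qsl


def decode_query(query: str, include_empty: bool = True, separator: str = "&"):
    """Hypothetical helper: collect repeated keys into lists."""
    result = defaultdict(list)
    pairs = parse_qsl(
        query,
        keep_blank_values=include_empty,  # plays the role of include_empty
        separator=separator,
        encoding="utf-8",
        errors="replace",
    )
    for key, value in pairs:
        result[key].append(value)
    return dict(result)


print(decode_query("a=1&a=2&b="))
print(decode_query("a=1;b=2", separator=";"))
```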

def url_decode_stream(stream: t.IO[bytes], charset: str = 'utf-8', include_empty: bool = True, errors: str = 'replace', separator: bytes = b'&', cls: t.Optional[t.Type[ds.MultiDict]] = None, limit: t.Optional[int] = None) -> ds.MultiDict[str, str]:

Works like :func:`url_decode` but decodes a stream. The behavior of stream and limit follows functions like :func:`~werkzeug.wsgi.make_line_iter`. The generator of pairs is directly fed to the `cls` so you can consume the data while it's parsed.

:param stream: a stream with the encoded querystring.
:param charset: the charset of the query string. If set to `None` no decoding will take place.
:param include_empty: Set to `False` if you don't want empty values to appear in the dict.
:param errors: the decoding error behavior.
:param separator: the pair separator to be used, defaults to ``&``.
:param cls: an optional dict class to use. If this is not specified or `None` the default :class:`MultiDict` is used.
:param limit: the content length of the URL data. Not necessary if a limited stream is provided.

.. versionchanged:: 2.0
    The ``decode_keys`` and ``return_iterator`` parameters are deprecated and will be removed in Werkzeug 2.1.

.. versionadded:: 0.8
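To make the idea of consuming pairs while the stream is still being read more concrete, here is a rough standard-library sketch. It is not how Werkzeug implements the function: `iter_pairs`, `_decode_pair` and `chunk_size` are hypothetical, and the input is assumed to be percent-encoded ASCII.

```python
import io
from urllib.parse import unquote_plus


def _decode_pair(item: bytes):
    # Split "key=value" and decode "+" and percent escapes as UTF-8.
    key, _, value = item.partition(b"=")
    return (
        unquote_plus(key.decode("latin-1"), encoding="utf-8", errors="replace"),
        unquote_plus(value.decode("latin-1"), encoding="utf-8", errors="replace"),
    )


def iter_pairs(stream, separator=b"&", limit=None, chunk_size=8):
    """Yield decoded pairs as soon as a full pair has been read."""
    remaining = limit
    buffer = b""
    while remaining is None or remaining > 0:
        size = chunk_size if remaining is None else min(chunk_size, remaining)
        chunk = stream.read(size)
        if not chunk:
            break
        if remaining is not None:
            remaining -= len(chunk)
        buffer += chunk
        # Everything before the last separator is complete; keep the rest.
        *complete, buffer = buffer.split(separator)
        for item in complete:
            if item:
                yield _decode_pair(item)
    if buffer:
        yield _decode_pair(buffer)


data = b"a=1&b=hello+world&c=%C3%A5"
stream = io.BytesIO(data + b"&ignored=1")
pairs = list(iter_pairs(stream, limit=len(data)))  # reads only len(data) bytes
```

Because `limit` caps the number of bytes read, the trailing `&ignored=1` is never consumed, which mirrors the role of the content length described above.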

def url_encode(obj: t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]], charset: str = 'utf-8', sort: bool = False, key: t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]] = None, separator: str = '&') -> str:

URL encode a dict/`MultiDict`. If a value is `None` it will not appear in the result string. By default only values are encoded into the target charset strings.

:param obj: the object to encode into a query string.
:param charset: the charset of the query string.
:param sort: set to `True` if you want parameters to be sorted by `key`.
:param separator: the separator to be used for the pairs.
:param key: an optional function to be used for sorting. For more details check out the :func:`sorted` documentation.

.. versionchanged:: 2.0
    The ``encode_keys`` parameter is deprecated and will be removed in Werkzeug 2.1.

.. versionchanged:: 0.5
    Added the ``sort``, ``key``, and ``separator`` parameters.
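The two behaviors called out here, dropping `None` values and optional sorting, can be sketched on top of `urllib.parse.urlencode`. `encode_query` is a hypothetical helper for illustration, and only the default ``&`` separator is handled.

```python
from collections.abc import Mapping
from urllib.parse import urlencode


def encode_query(obj, sort: bool = False, key=None) -> str:
    """Hypothetical helper: skip None values, optionally sort the pairs."""
    pairs = obj.items() if isinstance(obj, Mapping) else obj
    items = [(k, v) for k, v in pairs if v is not None]
    if sort:
        items.sort(key=key)
    return urlencode(items, encoding="utf-8")


print(encode_query([("b", "x y"), ("a", None), ("c", "\xe5")]))
print(encode_query({"b": "2", "a": "1"}, sort=True))
```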

def url_encode_stream(obj: t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]], stream: t.Optional[t.IO[str]] = None, charset: str = 'utf-8', sort: bool = False, key: t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]] = None, separator: str = '&'):

Like :func:`url_encode` but writes the results to a stream object. If the stream is `None` a generator over all encoded pairs is returned.

:param obj: the object to encode into a query string.
:param stream: a stream to write the encoded object into or `None` if an iterator over the encoded pairs should be returned. In that case the separator argument is ignored.
:param charset: the charset of the query string.
:param sort: set to `True` if you want parameters to be sorted by `key`.
:param separator: the separator to be used for the pairs.
:param key: an optional function to be used for sorting. For more details check out the :func:`sorted` documentation.

.. versionchanged:: 2.0
    The ``encode_keys`` parameter is deprecated and will be removed in Werkzeug 2.1.

.. versionadded:: 0.8
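The stream-or-generator contract can be shown with a small standard-library sketch. The names `encode_pairs` and `encode_to_stream` are hypothetical, and sorting and charset handling are left out to keep the example short.

```python
import io
from urllib.parse import quote_plus


def encode_pairs(pairs):
    """Yield one "key=value" string per pair, skipping None values."""
    for key, value in pairs:
        if value is not None:
            yield f"{quote_plus(key)}={quote_plus(value)}"


def encode_to_stream(pairs, stream=None, separator="&"):
    """Write encoded pairs to stream, or return the generator if stream is None."""
    encoded = encode_pairs(pairs)
    if stream is None:
        return encoded  # the separator is not used in this case
    for index, chunk in enumerate(encoded):
        if index:
            stream.write(separator)
        stream.write(chunk)
    return None


buffer = io.StringIO()
encode_to_stream([("a", "1"), ("b", "x y")], buffer)
written = buffer.getvalue()
lazy = list(encode_to_stream([("a", "1"), ("skip", None)]))
```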

def url_fix(s: str, charset: str = 'utf-8') -> str:

Sometimes you get a URL from a user that just isn't a real URL because it contains unsafe characters like ' ' and so on. This function can fix some of the problems in a similar way browsers handle data entered by the user:

>>> url_fix('http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)')
'http://de.wikipedia.org/wiki/Elf%20(Begriffskl%C3%A4rung)'

:param s: the string with the URL to fix.
:param charset: The target charset for the URL if the url was given as a string.
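A rough equivalent for the simple case in the doctest can be written with `urllib.parse`. This is a sketch, not the library's algorithm: `url_fix_sketch` is a hypothetical name, the safe-character sets are assumptions, and the host is passed through untouched.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def url_fix_sketch(url: str, charset: str = "utf-8") -> str:
    """Hypothetical stand-in: quote unsafe characters in path and query."""
    parts = urlsplit(url)
    # "%" stays safe so that existing escapes are not quoted twice.
    path = quote(parts.path, safe="/%+$!*'(),:@=&;~", encoding=charset)
    query = quote(parts.query, safe="/%+$!*'(),:@=&;?~", encoding=charset)
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


print(url_fix_sketch("http://de.wikipedia.org/wiki/Elf (Begriffskl\xe4rung)"))
```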

def url_join(base: t.Union[str, t.Tuple[str, str, str, str, str]], url: t.Union[str, t.Tuple[str, str, str, str, str]], allow_fragments: bool = True) -> str:

Join a base URL and a possibly relative URL to form an absolute interpretation of the latter.

:param base: the base URL for the join operation.
:param url: the URL to join.
:param allow_fragments: indicates whether fragments should be allowed.
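The joining rules are the familiar ones from `urllib.parse.urljoin`, which is used here as a stand-in to show typical cases. The URLs are made up for illustration, and the sketch assumes `url_join` resolves these cases the same way as the standard library.

```python
from urllib.parse import urljoin

base = "http://example.com/docs/guide/intro.html"

# A bare name replaces the last path segment of the base.
sibling = urljoin(base, "setup.html")
# ".." climbs one directory before descending again.
parent = urljoin(base, "../api/")
# A leading "/" replaces the whole path but keeps scheme and host.
rooted = urljoin(base, "/search?q=urls")
# An absolute URL wins over the base entirely.
absolute = urljoin(base, "https://other.example/x")

for result in (sibling, parent, rooted, absolute):
    print(result)
```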

def url_parse(url: str, scheme: t.Optional[str] = None, allow_fragments: bool = True) -> BaseURL:

Parses a URL from a string into a :class:`URL` tuple. If the URL is lacking a scheme it can be provided as second argument. Otherwise, it is ignored. Optionally fragments can be stripped from the URL by setting `allow_fragments` to `False`.

The inverse of this function is :func:`url_unparse`.

:param url: the URL to parse.
:param scheme: the default scheme to use if the URL has no scheme.
:param allow_fragments: if set to `False` a fragment will be removed from the URL.
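The default-scheme rule is the same one `urllib.parse.urlsplit` follows, so the standard library serves as a quick illustration. The result is a five-item named tuple, comparable in shape to the `URL` tuple described above, though it is not the Werkzeug class.

```python
from urllib.parse import urlsplit

# No scheme in the input: the default is used.
with_default = urlsplit("//example.com/path?q=1#top", scheme="https")
# Scheme present in the input: the default is ignored.
explicit = urlsplit("http://example.com/path", scheme="https")

# The result unpacks like a regular tuple and also has named attributes.
scheme, netloc, path, query, fragment = with_default
print(scheme, netloc, path, query, fragment)
print(explicit.scheme, with_default.hostname)
```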

def url_quote(string: t.Union[str, bytes], charset: str = 'utf-8', errors: str = 'strict', safe: t.Union[str, bytes] = '/:', unsafe: t.Union[str, bytes] = '') -> str:

URL encode a single string with a given encoding.

:param string: the string to quote.
:param charset: the charset to be used.
:param safe: an optional sequence of safe characters.
:param unsafe: an optional sequence of unsafe characters.

.. versionadded:: 0.9.2
    The `unsafe` parameter was added.
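The interplay of `safe` and `unsafe` can be approximated with `urllib.parse.quote`. In this sketch, `quote_sketch` is a hypothetical helper and `unsafe` is modeled only as "remove these characters from `safe`"; characters that the standard library never quotes, such as letters and digits, cannot be forced to quote this way.

```python
from urllib.parse import quote


def quote_sketch(string: str, charset: str = "utf-8", errors: str = "strict",
                 safe: str = "/:", unsafe: str = "") -> str:
    """Hypothetical analog of url_quote for str input only."""
    effective_safe = "".join(ch for ch in safe if ch not in unsafe)
    return quote(string, safe=effective_safe, encoding=charset, errors=errors)


print(quote_sketch("a b/c:d"))               # "/" and ":" stay as they are
print(quote_sketch("a b/c:d", unsafe="/"))   # "/" is now quoted as well
print(quote_sketch("\xe5", charset="latin-1"))
```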

def url_quote_plus(string: str, charset: str = 'utf-8', errors: str = 'strict', safe: str = '') -> str:

URL encode a single string with the given encoding and convert whitespace to "+".

:param string: The string to quote.
:param charset: The charset to be used.
:param safe: An optional sequence of safe characters.
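The "+" convention is the same one `urllib.parse.quote_plus` applies, so a few standard-library calls illustrate it. This shows the stdlib analog, on the assumption that `url_quote_plus` treats these inputs the same way.

```python
from urllib.parse import quote_plus

# Spaces become "+", other reserved characters are percent-encoded.
spaces = quote_plus("hello world & more")
# A literal "+" must be escaped so it is not read back as a space.
literal_plus = quote_plus("1+1 = 2")
# Characters listed in safe are left alone.
keep_slash = quote_plus("a/b c", safe="/")

print(spaces, literal_plus, keep_slash)
```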

def url_unparse(components: t.Tuple[str, str, str, str, str]) -> str:

The reverse operation to :func:`url_parse`. This accepts arbitrary tuples as well as :class:`URL` tuples and returns a URL as a string.

:param components: the parsed URL as tuple which should be converted into a URL string.
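The parse/unparse round trip can be seen with the standard library's `urlsplit` and `urlunsplit`, used here as an analog. Both a plain five-item tuple and a parsed result are accepted.

```python
from urllib.parse import urlsplit, urlunsplit

# A plain tuple of (scheme, netloc, path, query, fragment).
plain = ("https", "example.com", "/path", "q=1", "top")
rebuilt = urlunsplit(plain)

# Parsing and unparsing an already well-formed URL gives it back unchanged.
original = "http://example.com:8080/a/b?x=1&y=2#frag"
round_trip = urlunsplit(urlsplit(original))

print(rebuilt)
print(round_trip == original)
```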

def url_unquote(s: t.Union[str, bytes], charset: str = 'utf-8', errors: str = 'replace', unsafe: str = '') -> str:

URL decode a single string with a given encoding. If the charset is set to `None` no decoding is performed and raw bytes are returned.

:param s: the string to unquote.
:param charset: the charset of the query string. If set to `None` no decoding will take place.
:param errors: the error handling for the charset decoding.
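The effect of `charset` and `errors` is easiest to see on a byte that is not valid UTF-8. The sketch uses `urllib.parse.unquote` and `unquote_to_bytes` as analogs; in particular, `unquote_to_bytes` stands in for the "charset set to `None`" case, which the standard library exposes as a separate function.

```python
from urllib.parse import unquote, unquote_to_bytes

# Valid UTF-8 escapes decode to the expected character.
text = unquote("p%C3%A5th")
# %E9 alone is not valid UTF-8; "replace" substitutes U+FFFD.
replaced = unquote("caf%E9", errors="replace")
# The same input decodes cleanly with a different charset.
latin = unquote("caf%E9", encoding="latin-1")
# No decoding at all: the raw bytes are returned.
raw = unquote_to_bytes("caf%E9")

print(text, replaced, latin, raw)
```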

def url_unquote_plus(s: t.Union[str, bytes], charset: str = 'utf-8', errors: str = 'replace') -> str:

URL decode a single string with the given `charset` and decode "+" to whitespace.

By default, decoding errors are replaced (``errors='replace'``). If you want a different behavior you can set `errors` to ``'ignore'`` or ``'strict'``.

:param s: The string to unquote.
:param charset: the charset of the query string. If set to `None` no decoding will take place.
:param errors: The error handling for the `charset` decoding.
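The difference from plain unquoting is only the treatment of "+", which the standard-library pair `unquote` and `unquote_plus` shows directly. These are analogs, on the assumption that the Werkzeug functions agree with them for these inputs.

```python
from urllib.parse import unquote, unquote_plus

# "+" is turned into a space, then percent escapes are decoded.
with_plus = unquote_plus("hello+world%21")
# Plain unquoting leaves "+" untouched.
without_plus = unquote("hello+world%21")
# An escaped plus sign survives as a literal "+".
escaped_plus = unquote_plus("1%2B1")

print(with_plus, without_plus, escaped_plus)
```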

def _codec_error_url_quote(e: UnicodeError) -> t.Tuple[str, int]:

Used in :func:`uri_to_iri` after unquoting to re-quote any invalid bytes.

def _fast_url_quote_plus(string: bytes) -> str:

Undocumented

def _make_fast_url_quote(charset: str = 'utf-8', errors: str = 'strict', safe: t.Union[str, bytes] = '/:', unsafe: t.Union[str, bytes] = '') -> t.Callable[[bytes], str]:

Precompile the translation table for a URL encoding function. Unlike :func:`url_quote`, the generated function only takes the string to quote.

:param charset: The charset to encode the result with.
:param errors: How to handle encoding errors.
:param safe: An optional sequence of safe characters to never encode.
:param unsafe: An optional sequence of unsafe characters to always encode.

def _unquote_to_bytes(string: t.Union[str, bytes], unsafe: t.Union[str, bytes] = '') -> bytes:

Undocumented

def _url_decode_impl(pair_iter: t.Iterable[t.AnyStr], charset: str, include_empty: bool, errors: str) -> t.Iterator[t.Tuple[str, str]]:

Undocumented

def _url_encode_impl(obj: t.Union[t.Mapping[str, str], t.Iterable[t.Tuple[str, str]]], charset: str, sort: bool, key: t.Optional[t.Callable[[t.Tuple[str, str]], t.Any]]) -> t.Iterator[str]:

Undocumented