package documentation

scrapy.linkextractors: This package contains a collection of Link Extractors. For more info, see docs/topics/link-extractors.rst

Module lxmlhtml Link extractor based on lxml.html

From __init__.py:

Constant IGNORED_EXTENSIONS Undocumented
Function _is_valid_url Undocumented
Function _matches Undocumented
Variable _re_type Undocumented
IGNORED_EXTENSIONS: list[str]

Undocumented

Value
['7z',
 '7zip',
 'bz2',
 'rar',
 'tar',
 'tar.gz',
 'xz',
...
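
The constant itself is undocumented, but its name and contents suggest a list of file extensions whose links should not be followed. The sketch below shows one way such a list could be applied. It assumes suffix matching on the lowercased URL path. The helper `has_ignored_extension` and the name `VISIBLE_IGNORED_EXTENSIONS` are hypothetical, and only the entries visible in the truncated listing are used.

```python
from urllib.parse import urlparse

# Only the entries visible in the truncated listing; the real list is longer.
VISIBLE_IGNORED_EXTENSIONS = ["7z", "7zip", "bz2", "rar", "tar", "tar.gz", "xz"]


def has_ignored_extension(url, extensions=VISIBLE_IGNORED_EXTENSIONS):
    """Hypothetical helper: True if the URL path ends with a listed extension."""
    # Compare against the path only, so query strings do not interfere.
    path = urlparse(url).path.lower()
    return any(path.endswith("." + ext) for ext in extensions)


urls = [
    "https://example.com/downloads/data.tar.gz",
    "https://example.com/articles/intro.html",
    "https://example.com/backup.RAR?mirror=2",
]

# Keep only the links that do not point at an ignored file type.
followable = [u for u in urls if not has_ignored_extension(u)]
print(followable)
```

Matching on the parsed path rather than the raw URL is a design choice of this sketch, not something the page documents.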
_re_type

Undocumented
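
The name suggests that this variable holds the type of compiled regular expressions, although the page does not say so. As a sketch under that assumption, the snippet below obtains that type from the standard `re` module and uses it to accept either pattern strings or precompiled patterns. `PatternType` and `ensure_compiled` are hypothetical names used only for illustration.

```python
import re

# Assumption: the type object shared by all compiled patterns.
PatternType = type(re.compile(""))


def ensure_compiled(values):
    """Hypothetical helper: compile plain strings, pass compiled patterns through."""
    return [v if isinstance(v, PatternType) else re.compile(v) for v in values]


compiled = ensure_compiled([r"/docs/", re.compile(r"\.pdf$", re.IGNORECASE)])
print(all(isinstance(p, PatternType) for p in compiled))
```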

def _matches(url, regexs):

Undocumented
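
The signature suggests a test of one URL against a collection of regular expressions. A minimal sketch of that idea follows, assuming that `regexs` is an iterable of compiled patterns and that a match anywhere in the URL counts. The function `matches_any` is an illustrative stand-in and is not taken from the package source.

```python
import re


def matches_any(url, regexs):
    """Illustrative stand-in: True if any compiled pattern is found in url."""
    return any(r.search(url) for r in regexs)


patterns = [re.compile(r"/category/"), re.compile(r"\.html$")]

print(matches_any("https://example.com/category/books", patterns))
print(matches_any("https://example.com/about", patterns))
```

With this approach an empty pattern list matches nothing, so a caller that treats "no patterns" as "allow everything" would need to handle that case separately.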

def _is_valid_url(url):

Undocumented
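
The page does not state what makes a URL valid here. One plausible reading is a check on the URL scheme, so that links such as `mailto:` or `javascript:` are skipped. The sketch below assumes that reading. The set `ALLOWED_SCHEMES` and the function `looks_like_valid_url` are hypothetical.

```python
# Assumed set of schemes a crawler would be willing to follow.
ALLOWED_SCHEMES = {"http", "https", "file", "ftp"}


def looks_like_valid_url(url):
    """Hypothetical check: True if the text before '://' is an allowed scheme."""
    return url.split("://", 1)[0] in ALLOWED_SCHEMES


for candidate in (
    "https://example.com/",
    "ftp://example.com/file.txt",
    "mailto:someone@example.com",
    "javascript:void(0)",
):
    print(candidate, looks_like_valid_url(candidate))
```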