class documentation

class LxmlLinkExtractor: (source)

Undocumented

Method __init__ Undocumented
Method extract_links Returns a list of :class:`~scrapy.link.Link` objects from the specified :class:`response <scrapy.http.Response>`.
Method matches Undocumented
Instance Variable allow_domains Undocumented
Instance Variable allow_res Undocumented
Instance Variable canonicalize Undocumented
Instance Variable deny_domains Undocumented
Instance Variable deny_extensions Undocumented
Instance Variable deny_res Undocumented
Instance Variable link_extractor Undocumented
Instance Variable restrict_text Undocumented
Instance Variable restrict_xpaths Undocumented
Method _extract_links Undocumented
Method _link_allowed Undocumented
Method _process_links Undocumented
Class Variable _csstranslator Undocumented
def __init__(self, allow=(), deny=(), allow_domains=(), deny_domains=(), restrict_xpaths=(), tags=('a', 'area'), attrs=('href',), canonicalize=False, unique=True, process_value=None, deny_extensions=None, restrict_css=(), strip=True, restrict_text=None): (source)

Undocumented

def extract_links(self, response): (source)

Returns a list of :class:`~scrapy.link.Link` objects from the specified :class:`response <scrapy.http.Response>`. Only links that match the settings passed to the ``__init__`` method of the link extractor are returned. Duplicate links are omitted if the ``unique`` attribute is set to ``True``; otherwise they are returned.
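
The behaviour described above (collect links, keep those that pass the configured filters, optionally drop duplicates) can be illustrated with a small standard-library sketch. This is not Scrapy's implementation: `AnchorCollector` and `sketch_extract_links` are hypothetical names, the example returns plain URL strings rather than `Link` objects, and only the `allow`, `deny` and `unique` settings are modelled.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class AnchorCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> and <area> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "area"):
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value.strip())


def sketch_extract_links(html, base_url, allow=(), deny=(), unique=True):
    """Hypothetical stand-in: return absolute URLs that pass allow/deny."""
    parser = AnchorCollector()
    parser.feed(html)
    allow_res = [re.compile(p) for p in allow]
    deny_res = [re.compile(p) for p in deny]

    links = []
    for href in parser.hrefs:
        url = urljoin(base_url, href)  # resolve relative links
        if allow_res and not any(r.search(url) for r in allow_res):
            continue  # an allow list was given and nothing matched
        if any(r.search(url) for r in deny_res):
            continue  # a deny pattern matched
        links.append(url)

    if unique:
        # dict.fromkeys drops duplicates while keeping first-seen order
        links = list(dict.fromkeys(links))
    return links


HTML = """
<a href="/item/1">One</a>
<a href="/item/1">One again</a>
<a href="/about">About</a>
<a href="/item/2?print=1">Two (print)</a>
"""

unique_links = sketch_extract_links(
    HTML, "https://example.com/", allow=[r"/item/"], deny=[r"print=1"]
)
all_links = sketch_extract_links(
    HTML, "https://example.com/", allow=[r"/item/"], deny=[r"print=1"], unique=False
)
print(unique_links)
print(all_links)
```

With `unique=True` the repeated `/item/1` link appears once; with `unique=False` both occurrences are kept.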

def matches(self, url): (source)

Undocumented
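
The method name, together with the instance variables ``allow_res``, ``deny_res``, ``allow_domains`` and ``deny_domains``, suggests a check of a single URL against the configured domain and pattern filters. A minimal sketch of such a check follows, under that assumption. `host_in_domains` and `sketch_matches` are hypothetical names, and the exact precedence rules of the real method should be confirmed against the Scrapy source.

```python
import re
from urllib.parse import urlparse


def host_in_domains(url, domains):
    """Hypothetical helper: True if the URL host is a listed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == d.lower() or host.endswith("." + d.lower()) for d in domains
    )


def sketch_matches(url, allow=(), deny=(), allow_domains=(), deny_domains=()):
    """Hypothetical stand-in for a URL filter check (not Scrapy's code)."""
    if allow_domains and not host_in_domains(url, allow_domains):
        return False
    if deny_domains and host_in_domains(url, deny_domains):
        return False
    # An empty allow list is treated as "allow everything".
    allowed = any(re.search(p, url) for p in allow) if allow else True
    denied = any(re.search(p, url) for p in deny)
    return allowed and not denied


print(sketch_matches("https://shop.example.com/item/1",
                     allow=[r"/item/"], allow_domains=["example.com"]))
print(sketch_matches("https://example.com/item/1?print=1",
                     allow=[r"/item/"], deny=[r"print=1"]))
```

In this sketch a deny pattern overrides an allow pattern, and an empty filter places no restriction on the URL.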

allow_domains = (source)

Undocumented

allow_res = (source)

Undocumented

canonicalize = (source)

Undocumented

deny_domains = (source)

Undocumented

deny_extensions = (source)

Undocumented
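
Judging by its name, this variable holds file extensions whose URLs should be skipped. The sketch below shows one way such a filter could work, assuming extensions are compared case-insensitively against the URL path only. `has_denied_extension` is a hypothetical helper, not part of Scrapy.

```python
import posixpath
from urllib.parse import urlparse


def has_denied_extension(url, deny_extensions):
    """Hypothetical helper: True if the URL path ends in a listed extension."""
    # Use only the path, so query strings and fragments are ignored.
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    # Accept entries written as "pdf" or ".pdf".
    denied = {"." + e.lower().lstrip(".") for e in deny_extensions}
    return ext in denied


print(has_denied_extension("https://example.com/report.PDF?dl=1", ["pdf", "zip"]))
print(has_denied_extension("https://example.com/page.html", ["pdf", "zip"]))
```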

deny_res = (source)

Undocumented

link_extractor = (source)

Undocumented

restrict_text = (source)

Undocumented

restrict_xpaths = (source)

Undocumented

def _extract_links(self, *args, **kwargs): (source)

Undocumented

def _link_allowed(self, link): (source)

Undocumented

def _process_links(self, links): (source)

Undocumented

_csstranslator = (source)

Undocumented