class PageElement(object):
Known subclasses: bs4.element.NavigableString, bs4.element.Tag
Contains the navigational information for some part of the page: that is, its current location in the parse tree. NavigableString, Tag, etc. are all subclasses of PageElement.
Method | append |
Appends the given PageElement to the contents of this one. |
Method | extend |
Appends the given PageElements to this one's contents. |
Method | extract |
Destructively rips this element out of the tree. |
Method | find |
Find all PageElements that match the given criteria and appear later in the document than this PageElement. |
Method | find |
Look backwards in the document from this PageElement and find all PageElements that match the given criteria. |
Method | find |
Find the first PageElement that matches the given criteria and appears later in the document than this PageElement. |
Method | find |
Find the closest sibling to this PageElement that matches the given criteria and appears later in the document. |
Method | find |
Find all siblings of this PageElement that match the given criteria and appear later in the document. |
Method | find |
Find the closest parent of this PageElement that matches the given criteria. |
Method | find |
Find all parents of this PageElement that match the given criteria. |
Method | find |
Look backwards in the document from this PageElement and find the first PageElement that matches the given criteria. |
Method | find |
Returns the closest sibling to this PageElement that matches the given criteria and appears earlier in the document. |
Method | find |
Returns all siblings to this PageElement that match the given criteria and appear earlier in the document. |
Method | format |
Format the given string using the given formatter. |
Method | formatter |
Look up or create a Formatter for the given identifier, if necessary. |
Method | get |
Get all child strings of this PageElement, concatenated using the given separator. |
Method | insert |
Insert a new PageElement in the list of this PageElement's children. |
Method | insert |
Makes the given element(s) the immediate successor of this one. |
Method | insert |
Makes the given element(s) the immediate predecessor of this one. |
Method | next |
Undocumented |
Method | next |
Undocumented |
Method | parent |
Undocumented |
Method | previous |
Undocumented |
Method | previous |
Undocumented |
Method | replace |
Replace this PageElement with one or more PageElements, keeping the rest of the tree the same. |
Method | setup |
Sets up the initial relations between this element and other elements. |
Method | unwrap |
Replace this PageElement with its contents. |
Method | wrap |
Wrap this PageElement inside another one. |
Class Variable | default |
Undocumented |
Class Variable | next |
Undocumented |
Class Variable | previous |
Undocumented |
Class Variable | text |
Undocumented |
Instance Variable | next |
Undocumented |
Instance Variable | next |
Undocumented |
Instance Variable | parent |
Undocumented |
Instance Variable | previous |
Undocumented |
Instance Variable | previous |
Undocumented |
Property | decomposed |
Check whether a PageElement has been decomposed. |
Property | next |
The PageElement, if any, that was parsed just after this one. |
Property | next |
All PageElements that were parsed after this one. |
Property | next |
All PageElements that are siblings of this one but were parsed later. |
Property | parents |
All PageElements that are parents of this PageElement. |
Property | previous |
The PageElement, if any, that was parsed just before this one. |
Property | previous |
All PageElements that were parsed before this one. |
Property | previous |
All PageElements that are siblings of this one but were parsed earlier. |
Property | stripped |
Yield all strings in this PageElement, stripping them first. |
Method | _all |
Yield all strings of certain classes, possibly stripping them. |
Method | _find |
Iterates over a generator looking for things that match. |
Method | _find |
Undocumented |
Method | _last |
Finds the last element beneath this object to be parsed. |
Property | _is |
Is this element part of an XML tree or an HTML tree? |
Appends the given PageElements to this one's contents. :param tags: A list of PageElements. If a single Tag is provided instead, this PageElement's contents will be extended with that Tag's contents.
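As a minimal sketch of the list form described above, the following appends two newly created tags to an existing one. The markup and variable names are illustrative assumptions, and `html.parser` is chosen only to avoid an external parser dependency.

```python
from bs4 import BeautifulSoup

# Illustrative markup; any parent tag with children would do.
soup = BeautifulSoup("<ul><li>one</li></ul>", "html.parser")
ul = soup.ul

# Build a list of new PageElements to add.
new_items = []
for label in ("two", "three"):
    li = soup.new_tag("li")
    li.string = label
    new_items.append(li)

# extend() appends each element of the list to ul's contents, in order.
ul.extend(new_items)

items = [li.get_text() for li in ul.find_all("li")]
print(items)
```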
Destructively rips this element out of the tree. :param _self_index: The location of this element in its parent's .contents, if known. Passing this in allows for a performance optimization. :return: `self`, no longer part of the tree.
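A short sketch of this behaviour, using made-up markup: the extracted element is returned and loses its parent, while the rest of the tree stays in place.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>Keep <b>drop</b> this</p>", "html.parser")

# extract() removes the <b> tag from the tree and returns it.
removed = soup.b.extract()

print(removed)         # the detached element
print(removed.parent)  # no longer attached to anything
print(soup)            # the tree without the <b> tag
```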
Find all PageElements that match the given criteria and appear later in the document than this PageElement. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :param limit: Stop looking after finding this many results. :kwargs: A dictionary of filters on attribute values. :return: A ResultSet containing PageElements.
Look backwards in the document from this PageElement and find all PageElements that match the given criteria. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :param limit: Stop looking after finding this many results. :kwargs: A dictionary of filters on attribute values. :return: A ResultSet of PageElements. :rtype: bs4.element.ResultSet
Find the first PageElement that matches the given criteria and appears later in the document than this PageElement. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :kwargs: A dictionary of filters on attribute values. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
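The forward searches above can be sketched as follows. The method names `find_all_next` and `find_next` are the Beautiful Soup calls these descriptions correspond to; the sample markup is an assumption for illustration.

```python
from bs4 import BeautifulSoup

html = "<div><h1>Title</h1><p>first</p><p>second</p></div><p>third</p>"
soup = BeautifulSoup(html, "html.parser")
h1 = soup.h1

# Every <p> that appears later in the document, regardless of nesting.
later = [p.get_text() for p in h1.find_all_next("p")]

# limit stops the search after that many matches.
limited = h1.find_all_next("p", limit=2)

# Only the first match that appears later in the document.
first = h1.find_next("p").get_text()

print(later, len(limited), first)
```

Note that these searches follow document order, so the `<p>` outside the `<div>` is still found.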
Find the closest sibling to this PageElement that matches the given criteria and appears later in the document. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :kwargs: A dictionary of filters on attribute values. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
Find all siblings of this PageElement that match the given criteria and appear later in the document. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :param limit: Stop looking after finding this many results. :kwargs: A dictionary of filters on attribute values. :return: A ResultSet of PageElements. :rtype: bs4.element.ResultSet
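A minimal sketch of the sibling searches, assuming the methods in question are `find_next_sibling` and `find_next_siblings`. Unlike the document-order searches, these look only at elements that share this element's parent. The markup is illustrative.

```python
from bs4 import BeautifulSoup

html = "<ul><li>a</li><li class='x'>b</li><li>c</li></ul><p>after</p>"
soup = BeautifulSoup(html, "html.parser")
first_li = soup.li

# The closest later sibling that matches.
closest = first_li.find_next_sibling("li").get_text()

# All later siblings that match; the <p> is not a sibling, so it is never considered.
all_later = [li.get_text() for li in first_li.find_next_siblings("li")]

# Attribute filters can be passed as keyword arguments.
by_class = first_li.find_next_sibling("li", class_="x").get_text()

print(closest, all_later, by_class)
```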
Find the closest parent of this PageElement that matches the given criteria. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :kwargs: A dictionary of filters on attribute values. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
Find all parents of this PageElement that match the given criteria. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param limit: Stop looking after finding this many results. :kwargs: A dictionary of filters on attribute values. :return: A ResultSet of PageElements. :rtype: bs4.element.ResultSet
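The two parent searches can be sketched like this, assuming they correspond to `find_parent` and `find_parents`. The markup and `id` value are made up for illustration.

```python
from bs4 import BeautifulSoup

html = "<html><body><div id='outer'><p><b>x</b></p></div></body></html>"
soup = BeautifulSoup(html, "html.parser")
b = soup.b

# The closest ancestor that matches: a single element (or None).
div = b.find_parent("div")

# All matching ancestors, closest first, returned as a list-like result.
ancestors = b.find_parents(["p", "div"])
names = [tag.name for tag in ancestors]

print(div["id"], names)
```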
Look backwards in the document from this PageElement and find the first PageElement that matches the given criteria. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :kwargs: A dictionary of filters on attribute values. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
Returns the closest sibling to this PageElement that matches the given criteria and appears earlier in the document. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :kwargs: A dictionary of filters on attribute values. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
Returns all siblings to this PageElement that match the given criteria and appear earlier in the document. All find_* methods take a common set of arguments. See the online documentation for detailed explanations. :param name: A filter on tag name. :param attrs: A dictionary of filters on attribute values. :param string: A filter for a NavigableString with specific text. :param limit: Stop looking after finding this many results. :kwargs: A dictionary of filters on attribute values. :return: A ResultSet of PageElements. :rtype: bs4.element.ResultSet
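The backward-looking searches mirror the forward ones. A hedged sketch, assuming the methods are `find_previous`, `find_all_previous`, `find_previous_sibling`, and `find_previous_siblings`, and using illustrative markup:

```python
from bs4 import BeautifulSoup

html = "<h2>A</h2><p>one</p><h2>B</h2><p>two</p>"
soup = BeautifulSoup(html, "html.parser")
last_p = soup.find_all("p")[-1]

# Nearest earlier match in document order.
nearest_heading = last_p.find_previous("h2").get_text()

# All earlier matches, nearest first.
headings = [h.get_text() for h in last_p.find_all_previous("h2")]

# Sibling-only variants: same parent, earlier in the document.
prev_p = last_p.find_previous_sibling("p").get_text()
prev_names = [t.name for t in last_p.find_previous_siblings()]

print(nearest_heading, headings, prev_p, prev_names)
```

Results come back in the order they are encountered while walking backwards, so the nearest match is first.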
Format the given string using the given formatter. :param s: A string. :param formatter: A Formatter object, or a string naming one of the standard formatters.
Look up or create a Formatter for the given identifier, if necessary. :param formatter: Can be a Formatter object (used as-is), a function (used as the entity substitution hook for an XMLFormatter or HTMLFormatter), or a string (used to look up an XMLFormatter or HTMLFormatter in the appropriate registry).
Get all child strings of this PageElement, concatenated using the given separator. :param separator: Strings will be concatenated using this separator. :param strip: If True, strings will be stripped before being concatenated. :param types: A tuple of NavigableString subclasses. Any strings of a subclass not found in this list will be ignored. Although there are exceptions, the default behavior in most cases is to consider only NavigableString and CData objects. That means no comments, processing instructions, etc. :return: A string.
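A small sketch of how the three parameters interact, assuming the method is `get_text`. The markup is illustrative; the comment node is included to show that it is skipped by default and only returned when `types` asks for it.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

soup = BeautifulSoup("<p> Hello <b> world </b><!-- note --></p>", "html.parser")
p = soup.p

# Default: strings are concatenated as-is and the comment is ignored.
raw = p.get_text()

# A separator plus strip=True gives a cleaner result.
clean = p.get_text(" | ", strip=True)

# Restricting types to Comment returns only the comment text.
comments = p.get_text(strip=True, types=(Comment,))

print(repr(raw), repr(clean), repr(comments))
```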
Insert a new PageElement in the list of this PageElement's children. This works the same way as `list.insert`. :param position: The numeric position that should be occupied in `self.children` by the new PageElement. :param new_child: A PageElement.
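As with `list.insert`, the position is a zero-based index among the existing children. A minimal sketch with illustrative markup:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<ol><li>a</li><li>c</li></ol>", "html.parser")

new_li = soup.new_tag("li")
new_li.string = "b"

# Position 1 places the new element between the two existing children.
soup.ol.insert(1, new_li)

print(soup.ol)
```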
Overridden in: bs4.BeautifulSoup
Makes the given element(s) the immediate successor of this one. The elements will have the same parent, and the given elements will be immediately after this one. :param args: One or more PageElements.
Overridden in: bs4.BeautifulSoup
Makes the given element(s) the immediate predecessor of this one. All the elements will have the same parent, and the given elements will be immediately before this one. :param args: One or more PageElements.
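The successor and predecessor insertions can be sketched together, assuming the methods are `insert_after` and `insert_before`. Plain strings are passed alongside a tag here on the assumption that they are converted to text nodes on insertion; the markup is illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>bold</b></p>", "html.parser")
b = soup.b

italic = soup.new_tag("i")
italic.string = "italic"

# Several elements can be given at once; they keep the order in which they are passed.
b.insert_after(" and ", italic)

# The new text node becomes the immediate predecessor of <b>.
b.insert_before("Very ")

print(soup.p)
```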
Replace this PageElement with one or more PageElements, keeping the rest of the tree the same. :param args: One or more PageElements. :return: `self`, no longer part of the tree.
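A brief sketch of a replacement, assuming the method is `replace_with`. The replaced element is returned detached from the tree, so it can be reused elsewhere. The markup is illustrative.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>Hello <b>world</b></p>", "html.parser")

em = soup.new_tag("em")
em.string = "there"

# Swap <b> for <em>; the old <b> comes back as the return value.
old = soup.b.replace_with(em)

print(soup.p)
print(old, old.parent)
```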
Sets up the initial relations between this element and other elements. :param parent: The parent of this element. :param previous_element: The element parsed immediately before this one. :param next_element: The element parsed immediately after this one. :param previous_sibling: The most recently encountered element on the same level of the parse tree as this one. :param next_sibling: The next element to be encountered on the same level of the parse tree as this one.
Wrap this PageElement inside another one. :param wrap_inside: A PageElement. :return: `wrap_inside`, occupying the position in the tree that used to be occupied by `self`, and with `self` inside it.
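A minimal sketch of wrapping, with an illustrative `<div>` wrapper created through `new_tag`:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>text</p>", "html.parser")

# The new <div> takes the place of <p> in the tree, and <p> moves inside it.
wrapper = soup.p.wrap(soup.new_tag("div"))

print(wrapper.name)
print(soup)
```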
The PageElement, if any, that was parsed just after this one. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
All PageElements that are siblings of this one but were parsed later. :yield: A sequence of PageElements.
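The difference between parse order and sibling order is easy to miss, so here is a hedged sketch that contrasts them. It assumes the attributes involved are `next_element`, `next_elements`, and `next_siblings`; the markup is illustrative.

```python
from bs4 import BeautifulSoup, Tag

html = "<p><b>one</b><i>two <u>three</u></i></p><p>four</p>"
soup = BeautifulSoup(html, "html.parser")
b = soup.b

# The element parsed just after <b> is its own text node, not the next tag.
after_b = b.next_element

# Siblings share a parent with <b>, so only <i> qualifies.
sibling_names = [s.name for s in b.next_siblings]

# Parse order continues into nested and later tags.
later_tag_names = [e.name for e in b.next_elements if isinstance(e, Tag)]

print(repr(after_b), sibling_names, later_tag_names)
```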
The PageElement, if any, that was parsed just before this one. :return: A PageElement. :rtype: bs4.element.Tag | bs4.element.NavigableString
All PageElements that are siblings of this one but were parsed earlier. :yield: A sequence of PageElements.
Yield all strings in this PageElement, stripping them first. :yield: A sequence of stripped strings.
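A short sketch, assuming the property is `stripped_strings`: whitespace-only strings between tags are dropped, and the remaining strings have surrounding whitespace removed. The markup is illustrative.

```python
from bs4 import BeautifulSoup

html = "<div>\n  <p> a </p>\n  <p>b</p>\n</div>"
soup = BeautifulSoup(html, "html.parser")

# The property is a generator, so wrap it in list() to materialise it.
strings = list(soup.div.stripped_strings)

print(strings)
```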
Overridden in: bs4.element.NavigableString, bs4.element.Tag
Yield all strings of certain classes, possibly stripping them. This is implemented differently in Tag and NavigableString.