class NavigableString(str, PageElement):
Known subclasses: bs4.element.PreformattedString, bs4.element.RubyParenthesisString, bs4.element.RubyTextString, bs4.element.Script, bs4.element.Stylesheet, bs4.element.TemplateString
A Python Unicode string that is part of a parse tree. When Beautiful Soup parses the markup <b>penguin</b>, it will create a NavigableString for the string "penguin".
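As a minimal sketch of the sentence above, the following parses that markup and inspects the resulting string. It assumes the `bs4` package is installed and uses Python's built-in `html.parser`; both are choices made for this illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Parse the markup from the description above.
soup = BeautifulSoup("<b>penguin</b>", "html.parser")
text = soup.b.string

# The string is a NavigableString, which is also an ordinary str.
is_navigable = isinstance(text, NavigableString)
is_plain_str = isinstance(text, str)

# Unlike a plain str, it knows where it sits in the parse tree.
parent_name = text.parent.name

print(text, is_navigable, is_plain_str, parent_name)
```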
Method | __copy__ | A copy of a NavigableString has the same contents and class as the original, but it is not connected to the parse tree.
Method | __getattr__ | text.string gives you text. This is for backwards compatibility for Navigable*String, but for CData* it lets you get the string without the CData wrapper.
Method | __getnewargs__ | Undocumented
Method | __new__ | Create a new NavigableString.
Method | name | Prevent NavigableString.name from ever being set.
Method | output | Run the string through the provided formatter.
Constant | PREFIX | Undocumented
Constant | SUFFIX | Undocumented
Class Variable | known | Undocumented
Class Variable | strings | Undocumented
Property | name | Since a NavigableString is not a Tag, it has no .name.
Method | _all | Yield all strings of certain classes, possibly stripping them.
Inherited from PageElement:
Method | append | Appends the given PageElement to the contents of this one.
Method | extend | Appends the given PageElements to this one's contents.
Method | extract | Destructively rips this element out of the tree.
Method | find | Find all PageElements that match the given criteria and appear later in the document than this PageElement.
Method | find | Look backwards in the document from this PageElement and find all PageElements that match the given criteria.
Method | find | Find the first PageElement that matches the given criteria and appears later in the document than this PageElement.
Method | find | Find the closest sibling to this PageElement that matches the given criteria and appears later in the document.
Method | find | Find all siblings of this PageElement that match the given criteria and appear later in the document.
Method | find | Find the closest parent of this PageElement that matches the given criteria.
Method | find | Find all parents of this PageElement that match the given criteria.
Method | find | Look backwards in the document from this PageElement and find the first PageElement that matches the given criteria.
Method | find | Returns the closest sibling to this PageElement that matches the given criteria and appears earlier in the document.
Method | find | Returns all siblings to this PageElement that match the given criteria and appear earlier in the document.
Method | format | Format the given string using the given formatter.
Method | formatter | Look up or create a Formatter for the given identifier, if necessary.
Method | get | Get all child strings of this PageElement, concatenated using the given separator.
Method | insert | Insert a new PageElement in the list of this PageElement's children.
Method | insert | Makes the given element(s) the immediate successor of this one.
Method | insert | Makes the given element(s) the immediate predecessor of this one.
Method | next | Undocumented
Method | next | Undocumented
Method | parent | Undocumented
Method | previous | Undocumented
Method | previous | Undocumented
Method | replace | Replace this PageElement with one or more PageElements, keeping the rest of the tree the same.
Method | setup | Sets up the initial relations between this element and other elements.
Method | unwrap | Replace this PageElement with its contents.
Method | wrap | Wrap this PageElement inside another one.
Class Variable | default | Undocumented
Class Variable | next | Undocumented
Class Variable | previous | Undocumented
Class Variable | text | Undocumented
Instance Variable | next | Undocumented
Instance Variable | next | Undocumented
Instance Variable | parent | Undocumented
Instance Variable | previous | Undocumented
Instance Variable | previous | Undocumented
Property | decomposed | Check whether a PageElement has been decomposed.
Property | next | The PageElement, if any, that was parsed just after this one.
Property | next | All PageElements that were parsed after this one.
Property | next | All PageElements that are siblings of this one but were parsed later.
Property | parents | All PageElements that are parents of this PageElement.
Property | previous | The PageElement, if any, that was parsed just before this one.
Property | previous | All PageElements that were parsed before this one.
Property | previous | All PageElements that are siblings of this one but were parsed earlier.
Property | stripped | Yield all strings in this PageElement, stripping them first.
Method | _find | Iterates over a generator looking for things that match.
Method | _find | Undocumented
Method | _last | Finds the last element beneath this object to be parsed.
Property | _is | Is this element part of an XML tree or an HTML tree?
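Several of the inherited members summarized above can be exercised directly on a string node. The sketch below is illustrative only: the method names in the listing are truncated, so the full names used here (`next_sibling`, `find_next`, `wrap`, `replace_with`, `extract`) are taken from Beautiful Soup's public API and are assumed to correspond to those entries.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>one <b>two</b> three</p>", "html.parser")
first = soup.p.contents[0]          # the string "one "

# Navigation: a string knows its parent and its neighbours.
parent_name = first.parent.name
sibling_name = first.next_sibling.name
found_text = first.find_next("b").string

# wrap(): put the string inside a newly created tag.
first.wrap(soup.new_tag("i"))
after_wrap = str(soup.p)

# replace_with(): swap one string for another, leaving the rest of the tree alone.
soup.b.string.replace_with("2")
after_replace = str(soup.p)

# extract(): remove the last string from the tree and get it back.
removed = soup.p.contents[-1].extract()
after_extract = str(soup.p)

print(after_wrap)
print(after_replace)
print(after_extract)
```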
Method __copy__:
A copy of a NavigableString has the same contents and class as the original, but it is not connected to the parse tree.
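A short sketch of the copy behaviour described above, assuming `bs4` is installed and using the standard library's `copy.copy`, which invokes `__copy__`:

```python
import copy

from bs4 import BeautifulSoup

soup = BeautifulSoup("<b>penguin</b>", "html.parser")
original = soup.b.string

# copy.copy() calls NavigableString.__copy__ under the hood.
duplicate = copy.copy(original)

same_contents = duplicate == original
same_class = type(duplicate) is type(original)

# The original is still attached to <b>; the copy has no parent.
original_parent = original.parent.name
duplicate_parent = duplicate.parent

print(same_contents, same_class, original_parent, duplicate_parent)
```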
Method __getattr__:
text.string gives you text. This is for backwards compatibility for Navigable*String, but for CData* it lets you get the string without the CData wrapper.
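The following sketch illustrates the `.string` shortcut described above. It assumes a Beautiful Soup version matching this page; the attribute name `no_such_attribute` is made up purely to show that other unknown attributes are not silently accepted.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<b>penguin</b>", "html.parser")
text = soup.b.string

# Asking a string for its .string hands back the same text.
same_text = text.string

# A made-up attribute name is not resolved.
has_unknown = hasattr(text, "no_such_attribute")

print(same_text, has_unknown)
```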
Method __new__:
Create a new NavigableString. When unpickling a NavigableString, this method is called with the string in DEFAULT_OUTPUT_ENCODING. That encoding needs to be passed in to the superclass's __new__ or the superclass won't know how to handle non-ASCII characters.
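The mechanism described above can be sketched with a plain `str` subclass. This is not Beautiful Soup's implementation: `SketchString` is a hypothetical stand-in, and the value `"utf-8"` for `DEFAULT_OUTPUT_ENCODING` is an assumption made for this example. The last step imitates what unpickling does, which is to call `__new__` with the arguments returned by `__getnewargs__`.

```python
DEFAULT_OUTPUT_ENCODING = "utf-8"  # assumed value for this sketch


class SketchString(str):
    """Hypothetical str subclass; illustrative only, not bs4's code."""

    def __new__(cls, value):
        if isinstance(value, str):
            return str.__new__(cls, value)
        # Bytes input: str.__new__ needs the encoding to decode
        # non-ASCII characters correctly.
        return str.__new__(cls, value, DEFAULT_OUTPUT_ENCODING)

    def __getnewargs__(self):
        # The arguments handed back to __new__ when the object is rebuilt.
        return (str(self),)


original = SketchString("pingüino")

# Constructing from encoded bytes yields the same text.
from_bytes = SketchString("pingüino".encode(DEFAULT_OUTPUT_ENCODING))

# Rebuild the object the way unpickling would: __new__(*__getnewargs__()).
restored = SketchString.__new__(SketchString, *original.__getnewargs__())

print(from_bytes, restored, type(restored).__name__)
```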
Method output:
Overridden in bs4.element.PreformattedString.
Run the string through the provided formatter.
:param formatter: A Formatter object, or a string naming one of the standard formatters.
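As a hedged illustration of running a string through a formatter, the sketch below assumes that the truncated `output` entry corresponds to the public `output_ready` method, and that `"minimal"` is one of the standard formatter names. Passing `None` is assumed to skip formatting altogether.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>a &lt; b &amp; c</p>", "html.parser")
text = soup.p.string                 # parsed text: "a < b & c"

# The "minimal" formatter escapes only what is needed for valid markup.
escaped = text.output_ready(formatter="minimal")

# With no formatter, the string is returned without entity substitution.
raw = text.output_ready(formatter=None)

print(escaped)
print(raw)
```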
Property name:
Since a NavigableString is not a Tag, it has no .name. This property is implemented so that code like this doesn't crash when run on a mixture of Tag and NavigableString objects:
    [x.name for x in tag.children]
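A runnable version of that comprehension, as a small sketch assuming a Beautiful Soup version matching this page and the built-in `html.parser`:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p>one <b>two</b> three</p>", "html.parser")
tag = soup.p

# The children are a mixture: string, Tag, string.
names = [x.name for x in tag.children]

# Strings report None, so real tags are easy to filter out.
tag_names = [name for name in names if name is not None]

print(names, tag_names)
```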
Method _all:
Overrides bs4.element.PageElement._all_strings.
Yield all strings of certain classes, possibly stripping them. This makes it easy for NavigableString to implement methods like get_text() as conveniences, creating a consistent text-extraction API across all PageElements.
:param strip: If True, all strings will be stripped before being yielded.
:param types: A tuple of NavigableString subclasses. If this NavigableString isn't one of those subclasses, the sequence will be empty. By default, the subclasses considered are NavigableString and CData objects. That means no comments, processing instructions, etc.
:yield: A sequence that either contains this string, or is empty.
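The effect of `strip` and `types` can be observed through the public `get_text()` convenience rather than the private method. The sketch below assumes a Beautiful Soup version matching this page, in which `get_text()` on a string accepts `strip` and `types` keyword arguments.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

soup = BeautifulSoup("<p>  penguin  <!-- note --></p>", "html.parser")
children = list(soup.p.children)
text = children[0]
comment = [c for c in children if isinstance(c, Comment)][0]

# A plain string yields itself, optionally stripped.
stripped = text.get_text(strip=True)

# Comments are not among the default types, so nothing is yielded.
comment_default = comment.get_text()

# Naming the Comment class explicitly includes it.
comment_explicit = comment.get_text(types=(Comment,))

print(repr(stripped), repr(comment_default), repr(comment_explicit))
```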