class EntitySubstitution(object):
Known subclasses: bs4.formatter.Formatter
The ability to substitute XML or HTML entities for certain characters.
Class Method | quoted |
Make a value into a quoted XML attribute, possibly escaping it. |
Class Method | substitute |
Replace certain Unicode characters with named HTML entities. |
Class Method | substitute |
Substitute XML entities for special XML characters. |
Class Method | substitute |
Substitute XML entities for special XML characters. |
Constant | AMPERSAND |
Undocumented |
Constant | BARE |
Undocumented |
Constant | CHARACTER |
Undocumented |
Constant | CHARACTER |
Undocumented |
Constant | CHARACTER |
Undocumented |
Constant | HTML |
Undocumented |
Class Method | _substitute |
Used with a regular expression to substitute the appropriate HTML entity for a special character string. |
Class Method | _substitute |
Used with a regular expression to substitute the appropriate XML entity for a special character string. |
Method | _populate |
Initialize variables used by this class to manage the plethora of HTML5 named entities. |
Make a value into a quoted XML attribute, possibly escaping it.
Most strings will be quoted using double quotes:
    Bob's Bar -> "Bob's Bar"
If a string contains double quotes, it will be quoted using single quotes:
    Welcome to "my bar" -> 'Welcome to "my bar"'
If a string contains both single and double quotes, the double quotes will be escaped, and the string will be quoted using double quotes:
    Welcome to "Bob's Bar" -> "Welcome to &quot;Bob's Bar&quot;"
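The three quoting rules can be sketched as a small standalone function. This is an illustration of the rules as described, not the library's own code; the name `quote_attribute_value` is hypothetical.

```python
def quote_attribute_value(value: str) -> str:
    """Hypothetical helper that follows the quoting rules described above."""
    quote_with = '"'
    if '"' in value:
        if "'" in value:
            # Both quote types are present: escape the double quotes
            # and keep double quotes as the delimiter.
            value = value.replace('"', "&quot;")
        else:
            # Only double quotes are present: delimit with single quotes.
            quote_with = "'"
    return quote_with + value + quote_with


plain = quote_attribute_value("Bob's Bar")
double_only = quote_attribute_value('Welcome to "my bar"')
both = quote_attribute_value('Welcome to "Bob\'s Bar"')
```

Escaping only the double quotes is enough in the third case because the single quotes are harmless inside a double-quoted attribute.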
Replace certain Unicode characters with named HTML entities. This differs from data.encode(encoding, 'xmlcharrefreplace') in that the goal is to make the result more readable (to those with ASCII displays) rather than to recover from errors. There's absolutely nothing wrong with a UTF-8 string containing a LATIN SMALL LETTER E WITH ACUTE, but replacing that character with "&eacute;" will make it more readable to some people. :param s: A Unicode string.
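To make the contrast with 'xmlcharrefreplace' concrete, here is a simplified sketch built only on the standard library's `html.entities.codepoint2name` table. The helper name `substitute_named_entities` is hypothetical, and the sketch touches only non-ASCII characters, which may be narrower than what the class method substitutes.

```python
import re
from html.entities import codepoint2name


def substitute_named_entities(s: str) -> str:
    """Hypothetical helper: swap non-ASCII characters for named entities."""

    def replace(match: "re.Match[str]") -> str:
        name = codepoint2name.get(ord(match.group(0)))
        # Characters without a known entity name are left untouched.
        return f"&{name};" if name else match.group(0)

    return re.sub(r"[^\x00-\x7f]", replace, s)


text = "caf\u00e9"  # "café", ending in LATIN SMALL LETTER E WITH ACUTE
named = substitute_named_entities(text)
numeric = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Here `named` holds the entity name form and `numeric` the character reference form; both are pure ASCII, but the named form is easier for a person to read.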
Substitute XML entities for special XML characters. :param value: A string to be substituted. The less-than sign will become &lt;, the greater-than sign will become &gt;, and any ampersands will become &amp;. If you want ampersands that appear to be part of an entity definition to be left alone, use substitute_xml_containing_entities() instead. :param make_quoted_attribute: If True, then the string will be quoted, as befits an attribute value.
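A minimal sketch of this unconditional escaping, using plain string replacement rather than the class's regular expression. The function name `escape_xml` is hypothetical. Note that the ampersand has to be handled first, otherwise the ampersands introduced by the other two replacements would be escaped again.

```python
def escape_xml(value: str) -> str:
    """Hypothetical helper: escape every &, < and > in the string."""
    return (
        value.replace("&", "&amp;")  # must run before the other two
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


escaped = escape_xml("AT&T <b>")
double_escaped = escape_xml("&lt;")  # an existing entity is escaped again
```

The second call shows why this variant is unsuitable for text that already contains entities.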
def substitute_xml_containing_entities(cls, value, make_quoted_attribute=False):
Substitute XML entities for special XML characters. :param value: A string to be substituted. The less-than sign will become &lt;, the greater-than sign will become &gt;, and any ampersands that are not part of an entity definition will become &amp;. :param make_quoted_attribute: If True, then the string will be quoted, as befits an attribute value.
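One way to leave existing entities alone is a negative lookahead that skips any ampersand already followed by something that looks like an entity. The pattern and the names `BARE_AMP_OR_BRACKET` and `escape_xml_keeping_entities` below are assumptions made for illustration, not necessarily what the class uses.

```python
import re

# Assumed pattern: an angle bracket, or an ampersand NOT followed by a
# decimal reference (&#233;), a hex reference (&#xE9;) or a name (&amp;).
BARE_AMP_OR_BRACKET = re.compile(r"[<>]|&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)")

XML_ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def escape_xml_keeping_entities(value: str) -> str:
    """Hypothetical helper: escape <, > and bare ampersands only."""
    return BARE_AMP_OR_BRACKET.sub(lambda m: XML_ENTITIES[m.group(0)], value)


mixed = escape_xml_keeping_entities("AT&T &amp; &#233; <b>")
```

In `mixed`, the ampersand in "AT&T" is escaped because no entity follows it, while "&amp;" and "&#233;" pass through unchanged.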
Used with a regular expression to substitute the appropriate HTML entity for a special character string.
Used with a regular expression to substitute the appropriate XML entity for a special character string.
Initialize variables used by this class to manage the plethora of HTML5 named entities. This function returns a 3-tuple containing two dictionaries and a regular expression:
unicode_to_name: A mapping of Unicode strings like "⦨" to entity names like "angmsdaa". When a single Unicode string has multiple entity names, we try to choose the most commonly-used name.
name_to_unicode: A mapping of entity names like "angmsdaa" to Unicode strings like "⦨".
named_entity_re: A regular expression matching (almost) any Unicode string that corresponds to an HTML5 named entity.
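The shape of that 3-tuple can be sketched from the standard library's `html.entities.html5` table. This is an approximation under stated assumptions: the function name `build_entity_tables` is hypothetical, and "shortest name wins" stands in for the "most commonly-used name" heuristic, which this sketch does not attempt to reproduce.

```python
import re
from html.entities import html5


def build_entity_tables():
    """Hypothetical sketch returning (unicode_to_name, name_to_unicode, regex)."""
    unicode_to_name = {}
    name_to_unicode = {}
    for key, characters in html5.items():
        if not key.endswith(";"):
            continue  # skip legacy spellings that lack the semicolon
        name = key[:-1]
        name_to_unicode[name] = characters
        current = unicode_to_name.get(characters)
        # Assumption: prefer the shortest name, ties broken alphabetically.
        if current is None or (len(name), name) < (len(current), current):
            unicode_to_name[characters] = name
    # Longest strings first, so multi-character entities match before
    # any single character they happen to start with.
    alternatives = sorted(unicode_to_name, key=len, reverse=True)
    named_entity_re = re.compile("|".join(re.escape(c) for c in alternatives))
    return unicode_to_name, name_to_unicode, named_entity_re


unicode_to_name, name_to_unicode, named_entity_re = build_entity_tables()
```

With these tables, `name_to_unicode["angmsdaa"]` yields "⦨", and the regular expression can drive a substitution callback that looks each match up in `unicode_to_name`.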