class documentation

A basic test of a treebuilder's competence. Any HTML treebuilder, present or future, should be able to pass these tests. With invalid markup, there's room for interpretation, and different parsers can handle it differently. But with the markup in these tests, there's not much room for interpretation.

Method assertDoctypeHandled Assert that a given doctype string is handled correctly.
Method test_ampersand_in_attribute_value_gets_escaped Undocumented
Method test_angle_brackets_in_attribute_values_are_escaped Undocumented
Method test_apos_entity Undocumented
Method test_attribute_values_with_double_nested_quotes_get_quoted Undocumented
Method test_attribute_values_with_nested_quotes_are_left_alone Undocumented
Method test_basic_namespaces Parsers don't need to *understand* namespaces, but at the very least they should not choke on namespaces or lose data.
Method test_br_is_always_empty_element_tag A <br> tag is designated as an empty-element tag.
Method test_can_parse_unicode_document Undocumented
Method test_closing_tag_with_no_opening_tag Undocumented
Method test_comment Undocumented
Method test_correctly_nested_tables One table can go inside another one.
Method test_deepcopy Make sure you can copy the tree builder.
Method test_deeply_nested_multivalued_attribute Undocumented
Method test_detect_xml_parsed_as_html Undocumented
Method test_double_head Undocumented
Method test_empty_doctype Undocumented
Method test_empty_element_tags Verify consistent handling of empty-element tags, no matter how they come in through the markup.
Method test_entities_converted_on_the_way_out Undocumented
Method test_entities_in_attributes_converted_to_unicode Undocumented
Method test_entities_in_foreign_document_encoding Undocumented
Method test_entities_in_strings_converted_during_parsing Undocumented
Method test_entities_in_text_converted_to_unicode Undocumented
Method test_escaped_ampersand_in_attribute_value_is_left_alone Undocumented
Method test_head_tag_between_head_and_body Prevent recurrence of a bug in the html5lib treebuilder.
Method test_html5_style_meta_tag_reflects_current_encoding Undocumented
Method test_meta_tag_reflects_current_encoding Undocumented
Method test_mixed_case_doctype Undocumented
Method test_multipart_strings Mostly to prevent a recurrence of a bug in the html5lib treebuilder.
Method test_multiple_copies_of_a_tag Prevent recurrence of a bug in the html5lib treebuilder.
Method test_multivalued_attribute_on_html Undocumented
Method test_multivalued_attribute_value_becomes_list Undocumented
Method test_multivalued_attribute_with_whitespace Undocumented
Method test_namespaced_html Undocumented
Method test_namespaced_public_doctype Undocumented
Method test_namespaced_system_doctype Undocumented
Method test_nested_block_level_elements Block elements can be nested.
Method test_nested_formatting_elements Undocumented
Method test_nested_inline_elements Inline elements can be nested indefinitely.
Method test_non_breaking_spaces_converted_on_the_way_in Undocumented
Method test_normal_doctypes Make sure normal, everyday HTML doctypes are handled correctly.
Method test_out_of_range_entity Undocumented
Method test_p_tag_is_never_empty_element A <p> tag is never designated as an empty-element tag.
Method test_pickle_and_unpickle_identity Undocumented
Method test_preserved_whitespace_in_pre_and_textarea Whitespace must be preserved in <pre> and <textarea> tags, even if that would mean not prettifying the markup.
Method test_processing_instruction Undocumented
Method test_public_doctype_with_url Undocumented
Method test_python_specific_encodings_not_used_in_charset Undocumented
Method test_quot_entity_converted_to_quotation_mark Undocumented
Method test_real_hebrew_document Undocumented
Method test_real_iso_8859_document Undocumented
Method test_real_shift_jis_document Undocumented
Method test_real_xhtml_document A real XHTML document should come out more or less the same as it went in.
Method test_single_quote_attribute_values_become_double_quotes Undocumented
Method test_smart_quotes_converted_on_the_way_in Undocumented
Method test_soupstrainer Parsers should be able to work with SoupStrainers.
Method test_special_string_containers Undocumented
Method test_strings_resembling_character_entity_references Undocumented
Method test_system_doctype Undocumented
Method test_tag_with_no_attributes_can_have_attributes_added Undocumented
Method test_unclosed_tags_get_closed A tag that's not closed by the end of the document should be closed.
Method test_worst_case Test the worst case (currently) for linking issues.
Method _document_with_doctype Generate and parse a document with the given doctype.

Inherited from TreeBuilderSmokeTest:

Method test_attribute_multi_valued Undocumented
Method test_attribute_not_multi_valued Undocumented
Method test_fuzzed_input Undocumented

def assertDoctypeHandled(self, doctype_fragment): (source)

Assert that a given doctype string is handled correctly.

def test_ampersand_in_attribute_value_gets_escaped(self): (source)

Undocumented

def test_angle_brackets_in_attribute_values_are_escaped(self): (source)

Undocumented

def test_apos_entity(self): (source)

Undocumented

def test_attribute_values_with_double_nested_quotes_get_quoted(self): (source)

Undocumented

def test_attribute_values_with_nested_quotes_are_left_alone(self): (source)

Undocumented

def test_basic_namespaces(self): (source)

Parsers don't need to *understand* namespaces, but at the very least they should not choke on namespaces or lose data.
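As a rough illustration of "not losing data", the sketch below uses the standard library's `html.parser` rather than a Beautiful Soup treebuilder. `NameCollector` and `collect_names` are hypothetical names introduced here; the point is only that prefixed tag and attribute names survive parsing intact.

```python
from html.parser import HTMLParser


class NameCollector(HTMLParser):
    """Hypothetical helper: records tag names and attributes as parsed."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # The prefix (e.g. "mathml:") is kept as part of the name.
        self.tags.append(tag)
        self.attrs.extend(attrs)


def collect_names(markup):
    """Parse markup and return (tag names, attribute pairs)."""
    parser = NameCollector()
    parser.feed(markup)
    parser.close()
    return parser.tags, parser.attrs


tags, attrs = collect_names(
    '<mathml:msqrt>4</mathml:msqrt><b svg:fill="red"></b>'
)
```

The parser here has no notion of what `mathml` or `svg` mean; it simply carries the prefixed names through unchanged.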

def test_br_is_always_empty_element_tag(self): (source)

A <br> tag is designated as an empty-element tag. Some parsers treat <br></br> as a single <br/> tag and others treat it as two tags, but in either case each resulting tag should be an empty-element tag.

def test_can_parse_unicode_document(self): (source)

Undocumented

def test_closing_tag_with_no_opening_tag(self): (source)

Undocumented

def test_comment(self): (source)

Undocumented

def test_correctly_nested_tables(self): (source)

One table can go inside another one.

def test_deepcopy(self): (source)

Make sure you can copy the tree builder. This is important because the builder is part of a BeautifulSoup object, and we want to be able to copy that.
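The requirement can be pictured with a toy model, assuming a document object that holds a reference to its builder. `ToyBuilder` and `ToyDocument` are hypothetical stand-ins, not Beautiful Soup classes; the sketch only shows why the builder itself must be deep-copyable.

```python
import copy


class ToyBuilder:
    """Hypothetical stand-in for a tree builder carrying configuration."""

    def __init__(self, multi_valued_attributes):
        self.multi_valued_attributes = multi_valued_attributes


class ToyDocument:
    """Hypothetical stand-in for a parsed document that owns a builder."""

    def __init__(self, builder):
        self.builder = builder


original = ToyDocument(ToyBuilder({"*": ["class"]}))

# deepcopy recurses into the builder, so the builder must be copyable too.
duplicate = copy.deepcopy(original)
```

If the builder held something that cannot be copied, the `deepcopy` call on the containing object would fail as well.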

def test_deeply_nested_multivalued_attribute(self): (source)

Undocumented

def test_detect_xml_parsed_as_html(self): (source)

Undocumented

def test_double_head(self): (source)

Undocumented

def test_empty_doctype(self): (source)

Undocumented

def test_empty_element_tags(self): (source)

Verify consistent handling of empty-element tags, no matter how they come in through the markup.
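One way to picture "consistent handling" is a normalizer that emits the same output whichever spelling the markup used. The sketch below is built on the standard library's `html.parser`, not on a Beautiful Soup treebuilder. `VoidNormalizer`, `normalize`, and the partial `VOID_ELEMENTS` set are assumptions made for illustration, and attributes are ignored for brevity.

```python
from html.parser import HTMLParser

# Partial, illustrative list of elements that never have content.
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class VoidNormalizer(HTMLParser):
    """Hypothetical helper: re-serializes tags in one consistent form."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Void elements are always written as empty-element tags.
        self.out.append(f"<{tag}/>" if tag in VOID_ELEMENTS else f"<{tag}>")

    def handle_endtag(self, tag):
        # A stray </br> is dropped; other elements get a real end tag.
        if tag not in VOID_ELEMENTS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)


def normalize(markup):
    """Return markup with empty-element tags written consistently."""
    parser = VoidNormalizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

In this sketch `<br>`, `<br/>`, and `<br></br>` all come out as a single `<br/>`, while `<p/>` comes out as `<p></p>`. A real treebuilder may legitimately turn `<br></br>` into two tags instead.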

def test_entities_converted_on_the_way_out(self): (source)

Undocumented

def test_entities_in_attributes_converted_to_unicode(self): (source)

Undocumented

def test_entities_in_foreign_document_encoding(self): (source)

Undocumented

def test_entities_in_strings_converted_during_parsing(self): (source)

Undocumented

def test_entities_in_text_converted_to_unicode(self): (source)

Undocumented

def test_escaped_ampersand_in_attribute_value_is_left_alone(self): (source)

Undocumented

def test_head_tag_between_head_and_body(self): (source)

Prevent recurrence of a bug in the html5lib treebuilder.

def test_html5_style_meta_tag_reflects_current_encoding(self): (source)

Undocumented

def test_meta_tag_reflects_current_encoding(self): (source)

Undocumented

def test_mixed_case_doctype(self): (source)

Undocumented

def test_multipart_strings(self): (source)

Mostly to prevent a recurrence of a bug in the html5lib treebuilder.

def test_multiple_copies_of_a_tag(self): (source)

Prevent recurrence of a bug in the html5lib treebuilder.

def test_multivalued_attribute_on_html(self): (source)

Undocumented

def test_multivalued_attribute_value_becomes_list(self): (source)

Undocumented

def test_multivalued_attribute_with_whitespace(self): (source)

Undocumented

def test_namespaced_html(self): (source)

Undocumented

def test_namespaced_public_doctype(self): (source)

Undocumented

def test_namespaced_system_doctype(self): (source)

Undocumented

def test_nested_block_level_elements(self): (source)

Block elements can be nested.

def test_nested_formatting_elements(self): (source)

Undocumented

def test_nested_inline_elements(self): (source)

Inline elements can be nested indefinitely.

def test_non_breaking_spaces_converted_on_the_way_in(self): (source)

Undocumented

def test_normal_doctypes(self): (source)

Make sure normal, everyday HTML doctypes are handled correctly.

def test_out_of_range_entity(self): (source)

Undocumented

def test_p_tag_is_never_empty_element(self): (source)

A <p> tag is never designated as an empty-element tag. Even if the markup shows it as an empty-element tag, it shouldn't be presented that way.

def test_pickle_and_unpickle_identity(self): (source)

Undocumented

def test_preserved_whitespace_in_pre_and_textarea(self): (source)

Whitespace must be preserved in <pre> and <textarea> tags, even if that would mean not prettifying the markup.
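The rule can be sketched with a small whitespace tidier that collapses runs of whitespace everywhere except inside `<pre>` and `<textarea>`. This is an illustration based on the standard library's `html.parser`, not Beautiful Soup's prettifier. `WhitespaceTidier` and `tidy` are hypothetical names, and attributes are ignored for brevity.

```python
from html.parser import HTMLParser

PRESERVE_WHITESPACE = {"pre", "textarea"}


class WhitespaceTidier(HTMLParser):
    """Hypothetical helper: collapses whitespace outside <pre>/<textarea>."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.preserve_depth = 0  # > 0 while inside a preserving element

    def handle_starttag(self, tag, attrs):
        if tag in PRESERVE_WHITESPACE:
            self.preserve_depth += 1
        self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in PRESERVE_WHITESPACE and self.preserve_depth:
            self.preserve_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.preserve_depth == 0:
            # Safe to tidy: whitespace is not significant here.
            data = " ".join(data.split())
        self.out.append(data)


def tidy(markup):
    """Return markup with insignificant whitespace collapsed."""
    parser = WhitespaceTidier()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

The depth counter is the design choice that matters: any formatting step consults it before touching text, so content inside the preserving elements passes through byte for byte.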

def test_processing_instruction(self): (source)

Undocumented

def test_public_doctype_with_url(self): (source)

Undocumented

def test_python_specific_encodings_not_used_in_charset(self): (source)

Undocumented

def test_quot_entity_converted_to_quotation_mark(self): (source)

Undocumented

def test_real_hebrew_document(self): (source)

Undocumented

def test_real_iso_8859_document(self): (source)

Undocumented

def test_real_shift_jis_document(self): (source)

Undocumented

def test_real_xhtml_document(self): (source)

A real XHTML document should come out more or less the same as it went in.

def test_single_quote_attribute_values_become_double_quotes(self): (source)

Undocumented

def test_smart_quotes_converted_on_the_way_in(self): (source)

Undocumented

def test_soupstrainer(self): (source)

Parsers should be able to work with SoupStrainers.

def test_special_string_containers(self): (source)

Undocumented

def test_strings_resembling_character_entity_references(self): (source)

Undocumented

def test_system_doctype(self): (source)

Undocumented

def test_tag_with_no_attributes_can_have_attributes_added(self): (source)

Undocumented

def test_unclosed_tags_get_closed(self): (source)

A tag that's not closed by the end of the document should be closed. This applies to all tags except empty-element tags.
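A minimal sketch of this behavior, assuming a stack of open tags that is flushed at end of input. It uses the standard library's `html.parser`; `AutoCloser`, `close_unclosed`, and the partial `VOID_ELEMENTS` set are hypothetical and do not reflect how any particular treebuilder is implemented. Attributes are ignored for brevity.

```python
from html.parser import HTMLParser

# Partial, illustrative list of empty-element tags, which are never "open".
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class AutoCloser(HTMLParser):
    """Hypothetical helper: tracks open tags so they can be closed later."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            self.out.append(f"<{tag}/>")
            return
        self.out.append(f"<{tag}>")
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags with no matching open tag.
        if tag not in self.stack:
            return
        # Close everything opened since the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(data)


def close_unclosed(markup):
    """Return markup with any tags still open at the end closed."""
    parser = AutoCloser()
    parser.feed(markup)
    parser.close()
    while parser.stack:
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)
```

For example, `close_unclosed("<p>Unclosed<a>tag")` closes the `<a>` first and then the `<p>`, while a lone `<br>` is written as `<br/>` and never enters the stack.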

def test_worst_case(self): (source)

Test the worst case (currently) for linking issues.

def _document_with_doctype(self, doctype_fragment, doctype_string='DOCTYPE'): (source)

Generate and parse a document with the given doctype.
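The pairing of `_document_with_doctype` and `assertDoctypeHandled` can be approximated as follows. This is a sketch using the standard library's `html.parser` in place of a treebuilder; `DeclCollector`, `document_with_doctype`, and `check_doctype_handled` are hypothetical names, and the `<p>foo</p>` body is an assumption chosen only to confirm that parsing continues past the doctype.

```python
from html.parser import HTMLParser


class DeclCollector(HTMLParser):
    """Hypothetical helper: records declarations and start tags."""

    def __init__(self):
        super().__init__()
        self.declarations = []
        self.tags = []

    def handle_decl(self, decl):
        # Called with the text between "<!" and ">".
        self.declarations.append(decl)

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def document_with_doctype(doctype_fragment, doctype_string="DOCTYPE"):
    """Build and parse a small document that starts with the doctype."""
    doctype = f"<!{doctype_string} {doctype_fragment}>"
    markup = doctype + "\n<p>foo</p>"  # assumed minimal body
    parser = DeclCollector()
    parser.feed(markup)
    parser.close()
    return doctype, parser


def check_doctype_handled(doctype_fragment):
    """Return True if the doctype was kept and the body still parsed."""
    doctype, parser = document_with_doctype(doctype_fragment)
    expected = doctype[2:-1]  # strip the "<!" and ">" delimiters
    return parser.declarations == [expected] and parser.tags == ["p"]
```

A caller would pass only the part after `DOCTYPE`, for example `check_doctype_handled("html")`. The optional `doctype_string` argument allows variants such as a mixed-case keyword.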