module documentation

External interface to the BeautifulSoup HTML parser.

Function convert_tree Convert a BeautifulSoup tree to a list of Element trees.
Function fromstring Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
Function parse Parse a file into an ElementTree using the BeautifulSoup parser.
Function unescape Undocumented
Variable handle_entities Undocumented
Class _PseudoTag Undocumented
Function _convert_tree Undocumented
Function _init_node_converters Undocumented
Function _parse Undocumented
Constant _DECLARATION_OR_DOCTYPE Undocumented
Variable _parse_doctype_declaration Undocumented
def convert_tree(beautiful_soup_tree, makeelement=None):

Convert a BeautifulSoup tree to a list of Element trees. Returns a list instead of a single root Element to support HTML-like soup with more than one root element. You can pass a different Element factory through the `makeelement` keyword.
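As a minimal sketch of the multi-root case described above, the following assumes that lxml and BeautifulSoup 4 (the `bs4` package) are installed. The helper name `fragment_roots` is hypothetical and only serves the illustration.

```python
try:
    from bs4 import BeautifulSoup
    from lxml.html.soupparser import convert_tree
except ImportError:
    # lxml or BeautifulSoup 4 is not installed; the demo below is skipped.
    convert_tree = None


def fragment_roots(markup):
    """Return (tag, text) pairs for the top-level elements of a fragment.

    Hypothetical helper, not part of the module.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # convert_tree returns a list, one Element per top-level node.
    return [(el.tag, el.text) for el in convert_tree(soup)]


if convert_tree is not None:
    for tag, text in fragment_roots("<p>one</p><p>two</p>"):
        print(tag, text)
```

Because the result is a list, the caller decides how to treat a fragment that has several top-level elements.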

def fromstring(data, beautifulsoup=None, makeelement=None, **bsargs):

Parse a string of HTML data into an Element tree using the BeautifulSoup parser. Returns the root ``<html>`` Element of the tree. You can pass a different BeautifulSoup parser through the `beautifulsoup` keyword, and a different Element factory function through the `makeelement` keyword. By default, the standard ``BeautifulSoup`` class and the default factory of `lxml.html` are used.
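A short usage sketch, assuming lxml and BeautifulSoup are both installed. The helper name `root_and_bold` is hypothetical. It relies only on the documented behaviour that the returned value is the root ``<html>`` Element, even when the input string has no ``<html>`` tag of its own.

```python
try:
    from lxml.html.soupparser import fromstring
except ImportError:
    # lxml or BeautifulSoup is not installed; the demo below is skipped.
    fromstring = None


def root_and_bold(markup):
    """Return the root tag and the text of the first <b> element.

    Hypothetical helper, not part of the module.
    """
    root = fromstring(markup)
    return root.tag, root.findtext(".//b")


if fromstring is not None:
    # The input has no <html> wrapper; the returned root still is one.
    print(root_and_bold("<p>Hello <b>world</b></p>"))
```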

def parse(file, beautifulsoup=None, makeelement=None, **bsargs):

Parse a file into an ElementTree using the BeautifulSoup parser. You can pass a different BeautifulSoup parser through the `beautifulsoup` keyword, and a different Element factory function through the `makeelement` keyword. By default, the standard ``BeautifulSoup`` class and the default factory of `lxml.html` are used.
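The sketch below illustrates the difference from `fromstring`: `parse` returns an ElementTree wrapper, so the root Element is reached through `getroot()`. It assumes lxml and BeautifulSoup are installed and uses an in-memory `io.StringIO` object in place of a real file. The helper name `title_of` is hypothetical.

```python
import io

try:
    from lxml.html import soupparser
except ImportError:
    # lxml or BeautifulSoup is not installed; the demo below is skipped.
    soupparser = None


def title_of(html_file):
    """Return the <title> text of a readable HTML file object, or None.

    Hypothetical helper, not part of the module.
    """
    tree = soupparser.parse(html_file)  # an ElementTree, not an Element
    return tree.getroot().findtext(".//title")


if soupparser is not None:
    page = io.StringIO(
        "<html><head><title>Demo</title></head>"
        "<body><p>text</p></body></html>"
    )
    print(title_of(page))
```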

def unescape(string):


handle_entities


def _convert_tree(beautiful_soup_tree, makeelement):


def _init_node_converters(makeelement):


def _parse(source, beautifulsoup, makeelement, **bsargs):




_DECLARATION_OR_DOCTYPE = (Declaration, Doctype)

_parse_doctype_declaration
