The BeautifulSoup class is oriented towards skipping over common HTML errors like unclosed tags. However, sometimes it makes errors of its own. For instance, consider this fragment:

    <b>Foo<b>Bar</b></b>

This is perfectly valid (if bizarre) HTML. However, the BeautifulSoup class will implicitly close the first 'b' tag when it encounters the second 'b'. It will think the author wrote "<b>Foo<b>Bar" and didn't close the first 'b' tag, because there's no real-world reason to bold something that's already bold. When it encounters '</b></b>' it will close two more 'b' tags, for a grand total of three tags closed instead of two. This can throw off the rest of your document structure. The same is true of a number of other tags, listed below.

It's much more common for someone to forget to close a 'b' tag than to actually use nested 'b' tags, and the BeautifulSoup class handles the common case. This class handles the not-so-common case: where you can't believe someone wrote what they did, but it's valid HTML and BeautifulSoup screwed up by assuming it wouldn't be.
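The difference between the two nesting policies described above can be illustrated with a small standard-library sketch. This is a toy model, not the library's implementation: the names `NestingSketch`, `parse`, and the `nestable` parameter are hypothetical, and the real classes apply more elaborate nesting rules. The sketch only records which tags are open when each piece of text is seen.

```python
from html.parser import HTMLParser


class NestingSketch(HTMLParser):
    """Toy tree builder (hypothetical): tracks the open tags around each text node."""

    def __init__(self, nestable=()):
        super().__init__()
        self.nestable = set(nestable)  # tags allowed to nest inside themselves
        self.stack = []                # currently open tags, outermost first
        self.text_paths = []           # (text, tuple of open tags) pairs

    def handle_starttag(self, tag, attrs):
        # Default policy: a repeated tag implicitly closes the one already open.
        if tag in self.stack and tag not in self.nestable:
            self._pop_to(tag)
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # An end tag with nothing left to close is ignored in this sketch.
        if tag in self.stack:
            self._pop_to(tag)

    def handle_data(self, data):
        self.text_paths.append((data, tuple(self.stack)))

    def _pop_to(self, tag):
        # Close everything up to and including the most recently opened `tag`.
        while self.stack.pop() != tag:
            pass


def parse(markup, nestable=()):
    parser = NestingSketch(nestable)
    parser.feed(markup)
    parser.close()
    return parser.text_paths


# Common-case policy: "Bar" ends up as a sibling of "Foo", one level deep.
print(parse("<b>Foo<b>Bar</b></b>"))
# Nestable policy: "Bar" stays inside both 'b' tags, as the author wrote it.
print(parse("<b>Foo<b>Bar</b></b>", nestable={"b"}))
```

Under the first policy the second `</b>` has nothing left to close, which is the mismatch the description warns about; treating 'b' as nestable keeps the open and close tags paired.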
I_CANT_BELIEVE_THEYRE_NESTABLE_INLINE_TAGS = 'em', 'big', 'i', ...
I_CANT_BELIEVE_THEYRE_NESTABLE_BLOCK_TAGS = 'noscript'
NESTABLE_TAGS = buildTagMap([], BeautifulSoup.NESTABLE_TAGS, I...
Generated by Epydoc 3.0.1 on Thu Sep 16 13:42:09 2010 | http://epydoc.sourceforge.net |