Package buildxml :: Package tools :: Module BeautifulSoup :: Class Tag

Class Tag


Represents a found HTML tag with its attributes and contents.

Instance Methods
 
_invert(h)
Cheap function to invert a hash.
 
_convertEntities(self, match)
Used in a call to re.sub to replace HTML, XML, and numeric entities with the appropriate Unicode characters.
 
__init__(self, parser, name, attrs=None, parent=None, previous=None)
Basic constructor.
 
getString(self)
 
setString(self, string)
Replace the contents of the tag with a string.
 
getText(self, separator=u"")
 
text(self, separator=u"")
 
get(self, key, default=None)
Returns the value of the 'key' attribute for the tag, or the value given for 'default' if it doesn't have that attribute.
 
clear(self)
Extract all children.
 
index(self, element)
 
has_key(self, key)
 
__getitem__(self, key)
tag[key] returns the value of the 'key' attribute for the tag, and raises an exception if the attribute is not there.
 
__iter__(self)
Iterating over a tag iterates over its contents.
 
__len__(self)
The length of a tag is the length of its list of contents.
 
__contains__(self, x)
 
__nonzero__(self)
A tag is non-None even if it has no contents.
 
__setitem__(self, key, value)
Setting tag[key] sets the value of the 'key' attribute for the tag.
 
__delitem__(self, key)
Deleting tag[key] deletes all 'key' attributes for the tag.
 
__call__(self, *args, **kwargs)
Calling a tag like a function is the same as calling its findAll() method.
 
__getattr__(self, tag)
 
__eq__(self, other)
Returns true iff this tag has the same name, the same attributes, and the same contents (recursively) as the given tag.
 
__ne__(self, other)
Returns true iff this tag is not identical to the other tag, as defined in __eq__.
 
__repr__(self, encoding=DEFAULT_OUTPUT_ENCODING)
Renders this tag as a string.
 
__unicode__(self)
 
_sub_entity(self, x)
Used with a regular expression to substitute the appropriate XML entity for an XML special character.
 
__str__(self, encoding=DEFAULT_OUTPUT_ENCODING, prettyPrint=False, indentLevel=0)
Returns a string or Unicode representation of this tag and its contents.
 
decompose(self)
Recursively destroys the contents of this tree.
 
prettify(self, encoding=DEFAULT_OUTPUT_ENCODING)
 
renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING, prettyPrint=False, indentLevel=0)
Renders the contents of this tag as a string in the given encoding.
 
find(self, name=None, attrs={}, recursive=True, text=None, **kwargs)
Return only the first child of this Tag matching the given criteria.
 
findChild(self, name=None, attrs={}, recursive=True, text=None, **kwargs)
Return only the first child of this Tag matching the given criteria.
 
findAll(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)
Extracts a list of Tag objects that match the given criteria.
 
findChildren(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)
Extracts a list of Tag objects that match the given criteria.
 
first(self, name=None, attrs={}, recursive=True, text=None, **kwargs)
Return only the first child of this Tag matching the given criteria.
 
fetch(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)
Extracts a list of Tag objects that match the given criteria.
 
fetchText(self, text=None, recursive=True, limit=None)
 
firstText(self, text=None, recursive=True)
 
_getAttrMap(self)
Initializes a map representation of this tag's attributes, if not already initialized.
 
childGenerator(self)
 
recursiveChildGenerator(self)

Inherited from PageElement: append, extract, fetchNextSiblings, fetchParents, fetchPrevious, fetchPreviousSiblings, findAllNext, findAllPrevious, findNext, findNextSibling, findNextSiblings, findParent, findParents, findPrevious, findPreviousSibling, findPreviousSiblings, insert, nextGenerator, nextSiblingGenerator, parentGenerator, previousGenerator, previousSiblingGenerator, replaceWith, replaceWithChildren, setup, substituteEncoding, toEncoding

Inherited from PageElement (private): _findAll, _findOne, _lastRecursiveChild

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__
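
The attribute and container behaviour summarised above can be sketched with a toy class. This is a minimal Python 3 illustration, not the library's implementation: `ToyTag` is a hypothetical stand-in, the list-of-pairs attribute storage is an assumption, and `__bool__` is the Python 3 spelling of `__nonzero__`.

```python
class ToyTag:
    """Hypothetical stand-in for Tag; not the library's implementation."""

    def __init__(self, name, attrs=None, contents=None):
        self.name = name
        # Assumption: attributes are kept as a list of (key, value) pairs,
        # so the same key may appear more than once.
        self.attrs = list(attrs or [])
        self.contents = list(contents or [])

    def get(self, key, default=None):
        # Return the attribute value, or the default if it is missing.
        for k, v in self.attrs:
            if k == key:
                return v
        return default

    def __getitem__(self, key):
        # tag[key] raises KeyError in this sketch when the attribute is missing.
        for k, v in self.attrs:
            if k == key:
                return v
        raise KeyError(key)

    def __setitem__(self, key, value):
        # Overwrite the first matching pair, or append a new one.
        for i, (k, _) in enumerate(self.attrs):
            if k == key:
                self.attrs[i] = (key, value)
                return
        self.attrs.append((key, value))

    def __delitem__(self, key):
        # Deletes every pair with this key, not just the first.
        self.attrs = [(k, v) for k, v in self.attrs if k != key]

    def __iter__(self):
        # Iterating over the tag iterates over its contents.
        return iter(self.contents)

    def __len__(self):
        # The length of the tag is the length of its contents list.
        return len(self.contents)

    def __bool__(self):
        # Without this, an empty tag would be falsy because len() == 0.
        return True


tag = ToyTag("a", [("href", "/x"), ("class", "one"), ("class", "two")])
missing = tag.get("id", "none")   # falls back to the default
href = tag["href"]
del tag["class"]                  # removes both 'class' pairs
```

Defining `__bool__` explicitly matters because Python otherwise falls back to `__len__`, which would make a tag with no contents evaluate as false.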

Class Variables
  XML_ENTITIES_TO_SPECIAL_CHARS = {"apos": "'", "quot": '"', "am...
  XML_SPECIAL_CHARS_TO_ENTITIES = _invert(XML_ENTITIES_TO_SPECIA...
  string = property(getString, setString)
  BARE_AMPERSAND_OR_BRACKET = re.compile("([<>]|"+ "&(?!#\d+;|#x...

Properties

Inherited from object: __class__

Method Details

_convertEntities(self, match)

Used in a call to re.sub to replace HTML, XML, and numeric entities with the appropriate Unicode characters. If HTML entities are being converted, any unrecognized entities are escaped.
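
The re.sub callback pattern described here can be sketched as follows. This is a minimal Python 3 illustration under stated assumptions, not the library's code: `ENTITY` and `convert_entity` are hypothetical names, and the standard-library `html.entities.name2codepoint` table stands in for whatever entity table the module actually uses.

```python
import re
from html.entities import name2codepoint

# Hypothetical pattern: decimal, hexadecimal, or named entity references.
ENTITY = re.compile(r"&(#\d+|#x[0-9a-fA-F]+|\w+);")


def convert_entity(match):
    """Callback for re.sub: map one entity reference to its character."""
    ref = match.group(1)
    if ref.startswith("#x"):
        return chr(int(ref[2:], 16))       # hexadecimal numeric entity
    if ref.startswith("#"):
        return chr(int(ref[1:]))           # decimal numeric entity
    if ref in name2codepoint:
        return chr(name2codepoint[ref])    # known named entity
    # Unrecognized entity: escape the ampersand so it survives as text.
    return "&amp;%s;" % ref


converted = ENTITY.sub(convert_entity, "caf&eacute; &#169; &#xA9; &bogus;")
```

Passing a function as the replacement argument lets re.sub decide per match whether to convert or escape.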

__init__(self, parser, name, attrs=None, parent=None, previous=None)
(Constructor)

Basic constructor.

Overrides: object.__init__

__call__(self, *args, **kwargs)
(Call operator)

Calling a tag like a function is the same as calling its findAll() method. For example, tag('a') returns a list of all the <a> tags found within this tag.
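
The delegation from the call operator to findAll() can be sketched with a toy tree. This is a minimal illustration, not the library's code: `ToyNode` is a hypothetical class, its `findAll` only matches on name, and modelling `find` as `findAll` with a limit of 1 is an assumption based on the method summaries.

```python
class ToyNode:
    """Hypothetical stand-in for Tag; only matches on tag name."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def findAll(self, name=None, limit=None):
        # Depth-first, document-order walk over all descendants.
        found = []
        stack = list(reversed(self.children))
        while stack:
            node = stack.pop()
            if name is None or node.name == name:
                found.append(node)
                if limit is not None and len(found) >= limit:
                    break
            stack.extend(reversed(node.children))
        return found

    def find(self, name=None):
        # Assumption: the first match is findAll() limited to one result.
        results = self.findAll(name, limit=1)
        return results[0] if results else None

    def __call__(self, *args, **kwargs):
        # Calling the node is the same as calling findAll().
        return self.findAll(*args, **kwargs)


link1 = ToyNode("a")
link2 = ToyNode("a")
body = ToyNode("body", [ToyNode("p", [link1]), link2])
links = body("a")   # same result as body.findAll("a")
```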

__eq__(self, other)
(Equality operator)

Returns true iff this tag has the same name, the same attributes, and the same contents (recursively) as the given tag.

NOTE: right now this will return false if two tags have the same attributes in a different order. Should this be fixed?
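
The order sensitivity mentioned in the note can be illustrated in isolation. This sketch assumes attributes are stored as a list of (key, value) pairs, which the note implies but this page does not state; the variable names are hypothetical.

```python
# Two attribute lists with the same pairs in a different order.
attrs_a = [("href", "/x"), ("class", "nav")]
attrs_b = [("class", "nav"), ("href", "/x")]

# List comparison is positional, so a reordering makes the lists unequal.
order_sensitive = attrs_a == attrs_b

# Comparing sorted copies ignores order and still keeps duplicate keys.
order_insensitive = sorted(attrs_a) == sorted(attrs_b)
```

Comparing sorted copies is one possible order-insensitive check; converting to a dict would also ignore order but would silently drop repeated keys.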

__repr__(self, encoding=DEFAULT_OUTPUT_ENCODING)
(Representation operator)

Renders this tag as a string.

Overrides: object.__repr__

__str__(self, encoding=DEFAULT_OUTPUT_ENCODING, prettyPrint=False, indentLevel=0)
(Informal representation operator)

Returns a string or Unicode representation of this tag and its contents. To get Unicode, pass None for encoding.

NOTE: since Python's HTML parser consumes whitespace, this method is not certain to reproduce the whitespace present in the original string.

Overrides: object.__str__

renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING, prettyPrint=False, indentLevel=0)

Renders the contents of this tag as a string in the given encoding. If encoding is None, returns a Unicode string.
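
The encoding convention (an encoding name yields an encoded byte string, None yields Unicode) can be sketched as below. This is a Python 3 illustration only: `render` is a hypothetical function, and the "utf-8" default is an assumption standing in for DEFAULT_OUTPUT_ENCODING, whose value is not shown on this page.

```python
def render(text, encoding="utf-8"):
    """Return text unchanged if encoding is None, otherwise encode it."""
    if encoding is None:
        return text                  # Unicode string
    return text.encode(encoding)     # byte string in the given encoding


as_unicode = render("caf\u00e9", None)
as_bytes = render("caf\u00e9")
```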

findAll(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)

Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.

The value of a key-value pair in the 'attrs' map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of 'matches'. The same is true of the tag name.
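
The four kinds of criteria listed above can be sketched as a single dispatch function. This is a minimal illustration, not the library's matching code: `matches` is a hypothetical helper, and using `search` rather than `match` for regular expressions is an assumption.

```python
import re


def matches(value, criterion):
    """Hypothetical helper: test one attribute value or tag name."""
    if callable(criterion):
        # Custom predicate: any callable taking the string.
        return bool(criterion(value))
    if hasattr(criterion, "search"):
        # Compiled regular expression object.
        return value is not None and criterion.search(value) is not None
    if isinstance(criterion, (list, tuple)):
        # List of acceptable strings.
        return value in criterion
    # Plain string: exact comparison.
    return value == criterion


checks = [
    matches("nav", "nav"),
    matches("nav", ["nav", "menu"]),
    matches("nav-main", re.compile("^nav")),
    matches("nav", lambda v: v is not None and len(v) == 3),
]
```

The callable check comes first so that a predicate is never mistaken for one of the other criterion types.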

findChildren(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)

Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.

The value of a key-value pair in the 'attrs' map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of 'matches'. The same is true of the tag name.

fetch(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)

Extracts a list of Tag objects that match the given criteria. You can specify the name of the Tag and any attributes you want the Tag to have.

The value of a key-value pair in the 'attrs' map can be a string, a list of strings, a regular expression object, or a callable that takes a string and returns whether or not the string matches for some custom definition of 'matches'. The same is true of the tag name.


Class Variable Details

XML_ENTITIES_TO_SPECIAL_CHARS

Value:
{"apos": "'", "quot": '"', "amp": "&", "lt": "<", "gt": ">"}

XML_SPECIAL_CHARS_TO_ENTITIES

Value:
_invert(XML_ENTITIES_TO_SPECIAL_CHARS)
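
How the inverted map is derived can be sketched as follows. The body of `invert` is an assumption about what "invert a hash" means here (swap keys and values); only the dictionary literal is taken from this page.

```python
def invert(mapping):
    """Return a new dict with keys and values swapped."""
    inverted = {}
    for key, value in mapping.items():
        inverted[value] = key
    return inverted


XML_ENTITIES_TO_SPECIAL_CHARS = {
    "apos": "'", "quot": '"', "amp": "&", "lt": "<", "gt": ">",
}
XML_SPECIAL_CHARS_TO_ENTITIES = invert(XML_ENTITIES_TO_SPECIAL_CHARS)
```

The inversion is lossless only because every value in the source map is unique; with repeated values, later keys would overwrite earlier ones.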

BARE_AMPERSAND_OR_BRACKET

Value:
re.compile("([<>]|"+ "&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)"+ ")")
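
The pattern matches angle brackets and any ampersand that does not already begin a numeric or named entity reference. A minimal sketch of how it might be combined with the character-to-entity map follows; `sub_entity` and `escape` are hypothetical names modelled on the _sub_entity description, and raw strings are used so the backslash escapes are unambiguous in Python 3.

```python
import re

XML_SPECIAL_CHARS_TO_ENTITIES = {
    "'": "apos", '"': "quot", "&": "amp", "<": "lt", ">": "gt",
}

# Same pattern as above, written with raw string literals.
BARE_AMPERSAND_OR_BRACKET = re.compile(
    r"([<>]|" + r"&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)" + r")"
)


def sub_entity(match):
    """Replace one matched special character with its XML entity."""
    char = match.group(0)[0]
    return "&" + XML_SPECIAL_CHARS_TO_ENTITIES[char] + ";"


def escape(text):
    """Escape bare ampersands and angle brackets, leaving entities alone."""
    return BARE_AMPERSAND_OR_BRACKET.sub(sub_entity, text)


escaped = escape("a < b & c")
untouched = escape("&amp; &#169; &#xA9;")
```

The negative lookahead is what prevents double escaping: an ampersand that already starts an entity reference is not matched.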