Class SyncPlugin_forschdb
Finds all publications in the Forschungsdatenbank from all 11
faculties and the Universitätsklinikum.
The plugin needs a template URL entry in config.PLUGINS:
...
{u'name': u'forschdb',
u'url':
(u'http://forschdb.verwaltung.uni-freiburg.de/servuni/'
u'forschdbuni.fdbfbr1?Fakultaet=${fac}&Dokumentart='
u'Publikation&Ausgabeart=xml&Jahr=1900-${to_year}')},
...
It will then replace ${to_year} with the current year and generate a
list of 13 URLs, replacing ${fac} once with 99 and the other 12 times
with the values 0 through 11. Each of these URLs is then queried,
yielding an XML document with all publication entries for the
faculty fac from the year 1900 until now.
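The URL expansion described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the helper name build_urls is hypothetical, string.Template is assumed as the substitution mechanism because the placeholders use its ${...} syntax, the range is read as 0 through 11 inclusive, and the order of the resulting URLs is an assumption.

```python
import datetime
from string import Template

# Template URL as given in the config.PLUGINS entry above.
URL_TEMPLATE = (u'http://forschdb.verwaltung.uni-freiburg.de/servuni/'
                u'forschdbuni.fdbfbr1?Fakultaet=${fac}&Dokumentart='
                u'Publikation&Ausgabeart=xml&Jahr=1900-${to_year}')


def build_urls(template, to_year=None):
    """Return 13 query URLs: one for fac=99 and one each for fac=0..11.

    Hypothetical helper; the URL order is an assumption.
    """
    if to_year is None:
        # Default to the current year, as the class description states.
        to_year = datetime.date.today().year
    tmpl = Template(template)
    fac_values = [99] + list(range(12))
    return [tmpl.substitute(fac=fac, to_year=to_year) for fac in fac_values]


# A fixed year keeps the example reproducible.
urls = build_urls(URL_TEMPLATE, to_year=2010)
```

Passing to_year explicitly, as in the last line, makes the result independent of the date on which the sketch is run.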
The content of each <publication> entry is then parsed with
BeautifulSoup and an XMLEntry is produced. The content of the XMLEntry
is formatted according to common citation rules, which presently
distinguish five different types of publications:
- "Buchbeitrag"
- "Monografie und Herausgeberschrift"
- "Edition und Uebersetzung"
- "Sonstiges"
- "Artikel"
"Artikel" will be the catch-all for unknown types of
publications, of which there are presently none.
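The catch-all behaviour can be illustrated with a small sketch. The helper name normalize_type is hypothetical and the whitespace stripping is an assumption; only the five type names and the fallback to "Artikel" come from the description above.

```python
# The five publication types named in the class description.
KNOWN_TYPES = (
    u'Buchbeitrag',
    u'Monografie und Herausgeberschrift',
    u'Edition und Uebersetzung',
    u'Sonstiges',
    u'Artikel',
)


def normalize_type(raw_type):
    """Map a raw type string to a known type, falling back to "Artikel".

    Hypothetical helper; stripping whitespace is an assumption.
    """
    cleaned = (raw_type or u'').strip()
    if cleaned in KNOWN_TYPES:
        return cleaned
    # Unknown or missing types are treated as articles.
    return u'Artikel'
```

With this approach an unexpected type in the XML data would still yield a citation, formatted as an article, instead of causing an error.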
To Do:
Have a look at memory consumption and optimize!
Notes:
- All authors of a publication are listed in the content of the
XMLEntry to make the entry easier to find. This does not, however,
conform to standard citation rules. Since the Forschungsdatenbank
also uses this citation style for authors, this increases coherence.
- The function _getData uses German variable names to be coherent
with the naming of the XML elements it processes.
Inherited from xmlgetter.plugin.BaseSyncPlugin (private):
_NO_NET, _base_url, _entries, _entries_written, _from_date,
_intermediate_temp_filename, _intermediate_xml_filename, _stats,
_temp_filename, _url, _xml_filename
|
Extracts data from a BeautifulSoup.Tag instance.
- Parameters:
tag (BeautifulSoup.Tag) - The root tag from which to search for the data.
tagname (string) - The name of the tag that contains the data. If
tagname is None, the data will be extracted from tag itself.
- Returns:
A string representing the found data, or None if no data could be
found.
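The lookup contract described here (search below a root tag, or read the tag itself when tagname is None) can be sketched as follows. This is not the plugin's code: it swaps in the standard library's xml.etree.ElementTree for BeautifulSoup so that it runs without extra packages, the function name get_tag_data and the element names in the sample XML are made up, and treating empty text as "no data" is an assumption.

```python
import xml.etree.ElementTree as ET


def get_tag_data(tag, tagname=None):
    """Return the stripped text of `tagname` below `tag`, or None.

    If tagname is None, the text of `tag` itself is used.
    Hypothetical stand-in built on ElementTree, not BeautifulSoup.
    """
    node = tag if tagname is None else tag.find('.//' + tagname)
    if node is None:
        return None
    text = (node.text or '').strip()
    # Assumption: an empty element counts as "no data found".
    return text or None


# Element names are invented for this example.
pub = ET.fromstring(
    '<publication><titel> Ein Titel </titel><jahr/></publication>')
title = get_tag_data(pub, 'titel')
```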
|
Extracts author data from a BeautifulSoup.Tag instance.
- Parameters:
tag (BeautifulSoup.Tag) - The root tag from which to search for the data.
- Returns:
All authors of the publication, concatenated and separated by
','. If no author could be found, None is returned.
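A minimal sketch of the described return value: all author names joined with ',' or None when there are none. It uses the standard library's xml.etree.ElementTree as a stand-in for BeautifulSoup; the function name get_authors, the element name autor and the sample XML are assumptions, since the actual XML layout is not shown here.

```python
import xml.etree.ElementTree as ET


def get_authors(tag, author_tagname='autor'):
    """Join all author names below `tag` with ','; None if there are none.

    Hypothetical stand-in; `author_tagname` is an assumed element name.
    """
    names = []
    for node in tag.iter(author_tagname):
        name = (node.text or '').strip()
        if name:
            names.append(name)
    return ','.join(names) if names else None


# Sample data invented for this example.
pub = ET.fromstring(
    '<publication><autor>Meier A</autor><autor>Schmidt B</autor>'
    '</publication>')
authors = get_authors(pub)
```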
|
Gets the data from the Forschungsdatenbank.
Retrieves all publications for each faculty and the university
hospital.
Uses German variable names to be coherent with the XML data
retrieved.
- Returns: bool
False if an error or warning occurred, True otherwise.
- Overrides:
xmlgetter.plugin.BaseSyncPlugin._getData
|