Package buildxml :: Module config

Module config

This is the configuration file for the package.

The variables PORTALS and PLUGINS together hold all sources to be included in the search index.


Author: Johannes Schwenk

Copyright: 2010, Johannes Schwenk

Version: 2.0

Date: 2010-09-15

Variables
string USER_AGENT = u'Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv...
The user agent string the client should use to identify itself to servers.
string PORTAL_PLUGIN_NAME = u'portal'
The name of the generic plugin module for Plone portals.
int PORTAL_RETRY_WAIT = 10
If the portal's server could not fulfill the request, wait this many seconds before retrying.
int MAX_PORTAL_RETRIES = 3
Number of retries before failure.
int PORTAL_REQUEST_INCREMENT = 3
Number of entries to get from the portal's server for each incremental request.
int REQUEST_TIMEOUT = 1200
Number of seconds to wait for the server's response.
datetime LAST_QUERY_DEFAULT = datetime(1970, 1, 1)
The date from which to start querying the portals if no last update is specified, e.g. on the first run.
list of dict PORTALS = [{u'url': u'http://cmsdev.rektorat.uni-freiburg.de:2...
List of portals to query.
string PLUGIN_DIR_NAME = u'plugins'
Name of the directory containing the plugins.
list of dict PLUGINS = [{u'name': u'stb', u'url': u'http://info.verwaltung....
List of plugins to load and query.
int LOG_LEVEL = logging.DEBUG
The log level to be used by, e.g., BaseLogger.
string LOG_FILE_DIR = u'./log'
The directory of the logfile.
string LOG_FILENAME = u'%s/getXML.log' % LOG_FILE_DIR
The full path and filename of the logfile.
int LOG_BACKUP_COUNT = 9
Number of logfile backups to keep.
int LOG_ROLLOVER_SIZE = 0
If the logfile exceeds this size (in bytes), the logger will start a new logfile and keep up to LOG_BACKUP_COUNT old logfiles around.
string STATE_FILE_DIR = u'./state'
The directory where to save the state for portals and plugins.
string STATE_FILE_EXT = u'dat'
The extension of the state files written to STATE_FILE_DIR.
string TEMP_DIR = u'./tmp'
Name of the directory for temporary data, e.g. retrieval data.
string TEMP_FILE_EXT = u'tmp'
The extension of temporary files.
string TEMPLATES_DIR = u'./templates'
Name of a directory where to find templates and text snippets.
string XML_FILENAME = u'unifr.xml'
The filename of the resulting XML document ready to be fed to the parser for search index generation.
string OUT_DIR = u'/home/schwenk/dipl/completesearch/databases/unifr'
The output file (XML_FILENAME) will be moved to this location once the retrieval process has finished successfully.
bool ALWAYS_OUTPUT_STATS_ON_EXIT = True
Whether to output the stats to stderr on exit of getXML.py, regardless of whether an error or warning has occurred.
string COMPLETION_SERVER_PROGRAM = u'./codebase/server/startCompletio...
Command to start the CompletionServer.
list COMPLETION_SERVER = [COMPLETION_SERVER_PROGRAM, u'-d', u'0d', ...
The program and its arguments for starting the CompletionServer, collected in a list so they can be passed to subprocess.call().
string COMPLETION_SERVER_START_DIR = u'/home/schwenk/dipl/completesea...
Working directory from which to start the CompletionServer.
string PARSER_DIR = u'/home/schwenk/dipl/completesearch/databases/unifr'
The directory where the parser is located.
list PARSER = [u'make', u'pall']
The command to start the parsing of the XML file.
Variables Details

USER_AGENT

The user agent string the client should use to identify itself to servers.

Type:
string
Value:
u'Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.1.5) '+ u'Gecko/20091102 Firefox/3.5.5'

MAX_PORTAL_RETRIES

Number of retries before failure. If the portal's server could not fulfill the request, wait for PORTAL_RETRY_WAIT seconds before retrying. Retry at most MAX_PORTAL_RETRIES times.

Type:
int
Value:
3
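
The interplay of MAX_PORTAL_RETRIES and PORTAL_RETRY_WAIT could look roughly like the following. This is a minimal sketch and not the package's actual code: fetch_with_retries and the fetch callable are hypothetical names, and the sketch assumes one initial attempt followed by at most MAX_PORTAL_RETRIES retries, with IOError standing in for whatever error a failed request raises.

```python
import time

PORTAL_RETRY_WAIT = 10   # seconds, value from this config
MAX_PORTAL_RETRIES = 3   # value from this config


def fetch_with_retries(fetch, retries=MAX_PORTAL_RETRIES,
                       wait=PORTAL_RETRY_WAIT, sleep=time.sleep):
    """Call fetch() and retry on IOError (hypothetical helper).

    Makes one initial attempt plus at most `retries` retries and
    sleeps `wait` seconds before each retry.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except IOError:
            if attempt == retries:
                raise  # all retries are used up, give up
            sleep(wait)
```

Passing sleep as a parameter keeps the helper testable without real waiting.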

LAST_QUERY_DEFAULT

The date from which to start the querying of portals if no last update is specified, e.g. on the first run.

Type:
datetime
Value:
datetime(1970, 1, 1)

PORTALS

List of portals to query. Each entry is a dictionary with the keys url and name, where url is the URL of the portal's remoteSyncQueryXML script and name is used in statistics and logging.

Type:
list of dict
Value:
[{u'url': u'http://cmsdev.rektorat.uni-freiburg.de:23456/'+ u'remoteSy\
ncQueryXML', u'name': u'cmsdev'}, {u'url': u'http://zope5.ruf.uni-frei\
burg.de:12285/exzellenz/'+ u'remoteSyncQueryXML', u'name': u'exzellenz\
'}, {u'url': u'http://zope5.ruf.uni-freiburg.de:12285/podcasts/'+ u're\
moteSyncQueryXML', u'name': u'podcasts'}, {u'url': u'http://zope5.ruf.\
uni-freiburg.de:12285/pr/remoteSyncQueryXML', u'name': u'pr'}, {u'url'\
: u'http://zope5.ruf.uni-freiburg.de:12285/mw/remoteSyncQueryXML', u'n\
ame': u'mw'}, {u'url': u'http://zope3.ruf.uni-freiburg.de:12281/uni/re\
...

PLUGINS

List of plugins to load and query. Each entry is a dictionary with the keys url and name, where url is passed to the plugin, usually as the starting point for the data retrieval process, and name is used in statistics and logging.

Type:
list of dict
Value:
[{u'name': u'stb', u'url': u'http://info.verwaltung.uni-freiburg.de/se\
rvuni/'+ u'stellenuni.abfr1?kategorieid=alle&layout=v3'+ u'&sprache=d&\
ausgabeart=xml'}, {u'name': u'vkal', u'url': u'http://info.verwaltung.\
uni-freiburg.de/servuni/'+ u'vkaluni.abfr1?layout=v3&ausgabeart=xml&'+\
 u'modus=2&zeitpunkt=4'}, {u'name': u'studentenwerk', u'url': u'http:/\
/www.studentenwerk.uni-freiburg.de/'+ u'index.php?id=272',}, {u'name':\
 u'forschdb', u'url':(u'http://forschdb.verwaltung.uni-freiburg.de/ser\
vuni/' u'forschdbuni.fdbfbr1?Fakultaet=${fac}&Dokumentart=' u'Publikat\
...
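
PORTALS and PLUGINS share the same entry shape, so one check can cover both lists. The sketch below is illustrative only: check_sources is a hypothetical helper, the example.org URLs and names are placeholders, and the uniqueness check rests on the assumption that names must not collide because they are used for logging, statistics and state files.

```python
# Placeholder entries in the documented shape (not the real config values).
PORTALS = [{u'url': u'http://portal.example.org/remoteSyncQueryXML',
            u'name': u'portal-a'}]
PLUGINS = [{u'name': u'plugin-b',
            u'url': u'http://plugin.example.org/start'}]


def check_sources(sources):
    """Return the names in `sources`; raise ValueError on a bad entry."""
    names = []
    for entry in sources:
        for key in (u'url', u'name'):
            if key not in entry:
                raise ValueError(u'missing %r in %r' % (key, entry))
        if entry[u'name'] in names:
            raise ValueError(u'duplicate name %r' % entry[u'name'])
        names.append(entry[u'name'])
    return names


names = check_sources(PORTALS + PLUGINS)
```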

LOG_LEVEL

The log level to be used by, e.g., BaseLogger. Can be one of DEBUG, INFO, WARNING, ERROR or CRITICAL.

Type:
int
Value:
logging.DEBUG

LOG_BACKUP_COUNT

Number of logfile backups to keep.

Type:
int

See Also: LOG_ROLLOVER_SIZE

Value:
9
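
The logging variables map naturally onto the standard library's size-based rotating handler. The sketch below assumes that logging.handlers.RotatingFileHandler is used, which this page does not confirm; make_logger is a hypothetical helper. If that handler is indeed used, note that a maxBytes of 0 (the configured LOG_ROLLOVER_SIZE) means rollover never occurs.

```python
import logging
import logging.handlers
import os

LOG_LEVEL = logging.DEBUG
LOG_FILE_DIR = u'./log'
LOG_FILENAME = u'%s/getXML.log' % LOG_FILE_DIR
LOG_BACKUP_COUNT = 9
LOG_ROLLOVER_SIZE = 0


def make_logger(name=u'getXML', filename=LOG_FILENAME):
    """Build a logger with a size-based rotating file handler (sketch)."""
    log_dir = os.path.dirname(filename)
    if log_dir and not os.path.isdir(log_dir):
        os.makedirs(log_dir)  # the handler does not create directories
    handler = logging.handlers.RotatingFileHandler(
        filename, maxBytes=LOG_ROLLOVER_SIZE, backupCount=LOG_BACKUP_COUNT)
    logger = logging.getLogger(name)
    logger.setLevel(LOG_LEVEL)
    logger.addHandler(handler)
    return logger
```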

STATE_FILE_DIR

The directory where to save the state for portals and plugins.

Type:
string

See Also: xmlgetter.state

Value:
u'./state'

STATE_FILE_EXT

The extension of the state files written to STATE_FILE_DIR. Each state file is named after the plugin or portal it belongs to, as defined in PORTALS or PLUGINS.

Type:
string

See Also: PortalSourceState

Value:
u'dat'
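
As an illustration of the naming scheme, a state file path could be assembled like this. The helper state_file_path is hypothetical, and joining name and extension with a dot is an assumption; the page only states that the file is named after the source.

```python
import os

STATE_FILE_DIR = u'./state'
STATE_FILE_EXT = u'dat'


def state_file_path(source_name):
    """Return the assumed state file path for a portal or plugin name."""
    filename = u'%s.%s' % (source_name, STATE_FILE_EXT)
    return os.path.join(STATE_FILE_DIR, filename)


path = state_file_path(u'cmsdev')
```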

TEMP_DIR

Name of the directory for temporary data, e.g. retrieval data.

Type:
string
Value:
u'./tmp'

XML_FILENAME

The filename of the resulting XML document, ready to be fed to the parser for search index generation. It is built in TEMP_DIR and, on successful generation, moved to OUT_DIR.

Type:
string
Value:
u'unifr.xml'

OUT_DIR

The output file (XML_FILENAME) will be moved to this location once the retrieval process has finished successfully. Must be an absolute path!

Type:
string
Value:
u'/home/schwenk/dipl/completesearch/databases/unifr'
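
The build-then-move behaviour described for XML_FILENAME, TEMP_DIR and OUT_DIR can be sketched as below. This is not the code of getXML.py: publish and the write_xml callable are hypothetical, and the explicit absolute-path check is an assumption based on the requirement stated above.

```python
import os
import shutil


def publish(write_xml, temp_dir, out_dir, xml_filename=u'unifr.xml'):
    """Write the XML file in temp_dir, then move it to out_dir (sketch)."""
    if not os.path.isabs(out_dir):
        raise ValueError(u'OUT_DIR must be an absolute path: %r' % out_dir)
    tmp_path = os.path.join(temp_dir, xml_filename)
    write_xml(tmp_path)  # if this raises, nothing is moved to out_dir
    final_path = os.path.join(out_dir, xml_filename)
    shutil.move(tmp_path, final_path)
    return final_path
```

Building in a temporary directory first means a failed run does not leave a half-written file in the output location.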

ALWAYS_OUTPUT_STATS_ON_EXIT

Whether to output the stats to stderr on exit of getXML.py, regardless of whether an error or warning has occurred. Useful if one wants to be notified about every completed acquisition process.

Type:
bool
Value:
True

COMPLETION_SERVER_PROGRAM

Command to start the CompletionServer.

Type:
string
Value:
u'./codebase/server/startCompletionServer'

COMPLETION_SERVER

The program and its arguments for starting the CompletionServer, collected in a list so they can be passed to subprocess.call().

Type:
list
Value:
[COMPLETION_SERVER_PROGRAM, u'-d', u'0d', u'-w', u'0d', u'-S', u'SSSS', u'-r', u'-p', u'12345', u'-l', u'unifr.log', u'unifr.hybrid']

COMPLETION_SERVER_START_DIR

Working directory from which to start the CompletionServer.

Type:
string
Value:
u'/home/schwenk/dipl/completesearch/databases' u'/unifr'
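
A command list and a start directory fit together as in the following sketch. The helper run_in_dir is hypothetical; how the package actually invokes the server is not shown on this page. The values are copied from the entries above.

```python
import subprocess

COMPLETION_SERVER_PROGRAM = u'./codebase/server/startCompletionServer'
COMPLETION_SERVER = [COMPLETION_SERVER_PROGRAM, u'-d', u'0d', u'-w', u'0d',
                     u'-S', u'SSSS', u'-r', u'-p', u'12345',
                     u'-l', u'unifr.log', u'unifr.hybrid']
COMPLETION_SERVER_START_DIR = (
    u'/home/schwenk/dipl/completesearch/databases/unifr')


def run_in_dir(command, start_dir):
    """Run an argument list in start_dir and return its exit code."""
    # The list is passed without shell=True, so no shell quoting is needed.
    return subprocess.call(command, cwd=start_dir)
```

A call such as run_in_dir(COMPLETION_SERVER, COMPLETION_SERVER_START_DIR) would then start the server from its working directory. The same pattern would apply to PARSER with PARSER_DIR.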

PARSER_DIR

The directory where the parser is located. Must be an absolute path!

Type:
string
Value:
u'/home/schwenk/dipl/completesearch/databases/unifr'

PARSER

The command to start the parsing of the XML file. As configured here (set to execute "make pall"), it also rebuilds the index.

Type:
list
Value:
[u'make', u'pall']