- GrabRequest
- MirrorGroup
  - MGRandomOrder
  - MGRandomStart
class GrabRequest

This is a dummy class used to hold information about a specific
request (for example, a single file). Maintaining this information
separately accomplishes two things:
1) it makes it a little easier to be threadsafe
2) it allows request-specific parameters

class MGRandomOrder(MirrorGroup)

A mirror group that uses mirrors in a random order.
The behavior of this class is identical to MirrorGroup, except that
it uses the mirrors in a random order. Note that the order is set at
initialization time and fixed thereafter. That is, it does not pick a
random mirror after each failure.
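As a rough illustration of "set at initialization time and fixed thereafter", the sketch below shuffles a copy of the mirror list once and then reuses that order. It is a standalone illustration, not the class's actual code; the helper name fixed_random_order and the example URLs are made up for this example.

```python
import random


def fixed_random_order(mirrors, rng=random):
    """Return a shuffled copy of mirrors; the caller's list is left untouched."""
    shuffled = list(mirrors)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical mirror URLs, used only for illustration.
mirrors = ["http://a.example/", "http://b.example/", "http://c.example/"]

# The order is decided once here; later failovers would walk this same
# sequence rather than picking a fresh random mirror each time.
order = fixed_random_order(mirrors, random.Random(0))
```

Passing a seeded random.Random instance makes the order reproducible, which can be handy in tests.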
Methods defined here:
- __init__(self, grabber, mirrors, **kwargs)
- Initialize the object
The arguments for initialization are the same as for MirrorGroup.
Methods inherited from MirrorGroup:
- increment_mirror(self, gr, action={})
- Tell the mirror object to increment the mirror index
This increments the mirror index, which amounts to telling the
mirror object to use a different mirror (for this and future
downloads).
This is a SEMI-public method. It will be called internally,
and you may never need to call it. However, it is provided
(and is made public) so that the calling program can increment
the mirror choice for methods like urlopen. For example, with
urlopen, there's no good way for the mirror group to know that
an error occurs mid-download (it's already returned and given
you the file object).
remove: can have several values
  0  do not remove the mirror from the list
  1  remove the mirror for this download only
  2  remove the mirror permanently
Beware of remove=0, as it can lead to infinite loops.
- urlgrab(self, url, filename=None, **kwargs)
- urlopen(self, url, **kwargs)
- urlread(self, url, limit=None, **kwargs)
Data and other attributes inherited from MirrorGroup:
- options = ['default_action', 'failure_callback']
class MGRandomStart(MirrorGroup)

A mirror group that starts at a random mirror in the list.
The behavior of this class is identical to MirrorGroup, except that
it starts at a random location in the mirror list.
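One way to picture "starts at a random location" is as a rotation of the mirror list: the relative order is preserved, only the starting point changes. The sketch below assumes that the index wraps around at the end of the list, which the text above does not state explicitly; the helper name order_from_random_start and the URLs are hypothetical.

```python
import random


def order_from_random_start(mirrors, rng=random):
    """Rotate mirrors so iteration begins at a randomly chosen index."""
    if not mirrors:
        return []
    start = rng.randrange(len(mirrors))
    # Assumption: after the last mirror, the sequence wraps to the first.
    return mirrors[start:] + mirrors[:start]


# Hypothetical mirror URLs, used only for illustration.
mirrors = ["http://a.example/", "http://b.example/", "http://c.example/"]
rotated = order_from_random_start(mirrors, random.Random(1))
```

Unlike a full shuffle, neighbouring mirrors stay neighbours here, so any preference ordering in the list is kept apart from the starting point.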
Methods defined here:
- __init__(self, grabber, mirrors, **kwargs)
- Initialize the object
The arguments for initialization are the same as for MirrorGroup.
Methods inherited from MirrorGroup:
- increment_mirror(self, gr, action={})
- Tell the mirror object to increment the mirror index
This increments the mirror index, which amounts to telling the
mirror object to use a different mirror (for this and future
downloads).
This is a SEMI-public method. It will be called internally,
and you may never need to call it. However, it is provided
(and is made public) so that the calling program can increment
the mirror choice for methods like urlopen. For example, with
urlopen, there's no good way for the mirror group to know that
an error occurs mid-download (it's already returned and given
you the file object).
remove: can have several values
  0  do not remove the mirror from the list
  1  remove the mirror for this download only
  2  remove the mirror permanently
Beware of remove=0, as it can lead to infinite loops.
- urlgrab(self, url, filename=None, **kwargs)
- urlopen(self, url, **kwargs)
- urlread(self, url, limit=None, **kwargs)
Data and other attributes inherited from MirrorGroup:
- options = ['default_action', 'failure_callback']
class MirrorGroup

Base Mirror class
Instances of this class are built with a grabber object and a list
of mirrors. All calls to the urlXXX methods should then be passed
relative URLs. The requested file will be searched for on the first
mirror. If the grabber raises an exception (possibly after some
retries), that mirror will be removed from the list and the next one
will be attempted. If all mirrors are exhausted, an exception will be
raised.
MirrorGroup has the following failover policy:
* downloads begin with the first mirror
* by default (see default_action below) a failure (after retries)
causes it to increment the local AND master indices. Also,
the current mirror is removed from the local list (but NOT the
master list - the mirror can potentially be used for other
files)
* if the local list is ever exhausted, a URLGrabError will be
raised (errno=256, no more mirrors)
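The failover policy above can be sketched as a simple loop over a per-download copy of the mirror list. This is a simplified stand-in, not the library's implementation: it ignores the master index and retries, uses a hypothetical NoMoreMirrors exception in place of URLGrabError, and takes the fetch function as a parameter so it runs without network access.

```python
class NoMoreMirrors(Exception):
    """Hypothetical stand-in for URLGrabError with errno=256."""


def fetch_with_failover(mirrors, relative_url, fetch):
    """Try each mirror in turn; drop a failed mirror from the local list only."""
    local = list(mirrors)  # per-download copy; the caller's list is untouched
    while local:
        mirror = local[0]
        try:
            return fetch(mirror + relative_url)
        except IOError:
            # Corresponds to 'remove': 1, so this download skips the mirror.
            local.pop(0)
    raise NoMoreMirrors(256, "no more mirrors")


def fake_fetch(url):
    """Pretend downloader: the 'bad' host always fails."""
    if url.startswith("http://bad.example/"):
        raise IOError("unreachable")
    return "contents of " + url


master = ["http://bad.example/", "http://good.example/"]
result = fetch_with_failover(master, "pkg/file.txt", fake_fetch)
```

Because only the local copy is modified, the failed mirror remains available in master for other files, which mirrors the "local list but NOT the master list" behavior described above.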
OPTIONS
In addition to the required arguments "grabber" and "mirrors",
MirrorGroup also takes the following optional arguments:
default_action
A dict that describes the actions to be taken upon failure
(after retries). default_action can contain any of the
following keys (shown here with their default values):
default_action = {'increment': 1,
                  'increment_master': 1,
                  'remove': 1,
                  'remove_master': 0,
                  'fail': 0}
In this context, 'increment' means "use the next mirror" and
'remove' means "never use this mirror again". The two
'master' values refer to the instance-level mirror list (used
for all files), whereas the non-master values refer to the
current download only.
The 'fail' option will cause immediate failure by re-raising
the exception and no further attempts to get the current
download.
This dict can be set at instantiation time,
    mg = MirrorGroup(grabber, mirrors, default_action={'fail':1})
at method-execution time (only applies to current fetch),
    filename = mg.urlgrab(url, default_action={'increment': 0})
or by returning an action dict from the failure_callback
    return {'fail':0}
in increasing precedence.
If all three of these were done, the net result would be:
{'increment': 0,         # set in method
 'increment_master': 1,  # class default
 'remove': 1,            # class default
 'remove_master': 0,     # class default
 'fail': 0}              # set at instantiation, reset
                         # from callback
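The precedence rule can be reproduced with plain dict updates applied from lowest to highest precedence. This is only a sketch of the merging logic described above, not the library's code; CLASS_DEFAULT and resolve_action are names invented for the example.

```python
# Class-level defaults, as listed above.
CLASS_DEFAULT = {'increment': 1,
                 'increment_master': 1,
                 'remove': 1,
                 'remove_master': 0,
                 'fail': 0}


def resolve_action(*overrides):
    """Merge action dicts; later arguments take precedence over earlier ones."""
    action = dict(CLASS_DEFAULT)
    for override in overrides:
        action.update(override or {})
    return action


net = resolve_action({'fail': 1},       # set at instantiation
                     {'increment': 0},  # set at method-execution time
                     {'fail': 0})       # returned by the failure callback
```

Here net ends up equal to the net result shown above: 'increment' comes from the method call, 'fail' is set at instantiation and then reset by the callback, and the remaining keys keep their class defaults.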
failure_callback
This is a callback that will be called when a mirror "fails",
meaning the grabber raises some URLGrabError. If this is a
tuple, it is interpreted to be of the form (cb, args, kwargs)
where cb is the actual callable object (function, method,
etc). Otherwise, it is assumed to be the callable object
itself. The callback will be passed a grabber.CallbackObject
instance along with args and kwargs (if present). The following
attributes are defined within the instance:
obj.exception = < exception that was raised >
obj.mirror = < the mirror that was tried >
obj.relative_url = < url relative to the mirror >
obj.url = < full url that failed >
# .url is just the combination of .mirror
# and .relative_url
The failure callback can return an action dict, as described
above.
Like default_action, the failure_callback can be set at
instantiation time or when the urlXXX method is called. In
the latter case, it applies only for that fetch.
The callback can re-raise the exception quite easily. For
example, this is a perfectly adequate callback function:
def callback(obj): raise obj.exception
WARNING: do not save the exception object (or the
CallbackObject instance). Because they contain stack frame
references, saving them can lead to circular references.
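The two accepted forms of failure_callback (a bare callable, or a (cb, args, kwargs) tuple) could be dispatched roughly as follows. This is an illustrative sketch only: call_failure_callback and log_and_continue are hypothetical names, and a SimpleNamespace stands in for grabber.CallbackObject with the attributes listed above.

```python
from types import SimpleNamespace


def call_failure_callback(failure_callback, obj):
    """Invoke the callback, unpacking the (cb, args, kwargs) form if given."""
    if isinstance(failure_callback, tuple):
        cb, args, kwargs = failure_callback
    else:
        cb, args, kwargs = failure_callback, (), {}
    return cb(obj, *args, **kwargs)


def log_and_continue(obj, log, prefix="failed"):
    """Example callback: record the URL as a string and keep trying mirrors."""
    # Only the URL string is stored, not obj or obj.exception (see WARNING).
    log.append("%s: %s" % (prefix, obj.url))
    return {'fail': 0}


# Stand-in for the CallbackObject instance passed to the callback.
obj = SimpleNamespace(exception=IOError("timeout"),
                      mirror="http://a.example/",
                      relative_url="f.txt",
                      url="http://a.example/f.txt")

log = []
action = call_failure_callback(
    (log_and_continue, (log,), {"prefix": "mirror failed"}), obj)
```

The returned action dict would then be merged on top of default_action, as described in the default_action section.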
Notes:
* The behavior can be customized by deriving and overriding the
'CONFIGURATION METHODS'
* The 'grabber' instance is kept as a reference, not copied.
Therefore, the grabber instance can be modified externally
and changes will take effect immediately.
Methods defined here:
- __init__(self, grabber, mirrors, **kwargs)
- Initialize the MirrorGroup object.
REQUIRED ARGUMENTS
grabber - URLGrabber instance
mirrors - a list of mirrors
OPTIONAL ARGUMENTS
failure_callback - callback to be used when a mirror fails
default_action - dict of failure actions
See the module-level and class-level documentation for more
details.
- increment_mirror(self, gr, action={})
- Tell the mirror object to increment the mirror index
This increments the mirror index, which amounts to telling the
mirror object to use a different mirror (for this and future
downloads).
This is a SEMI-public method. It will be called internally,
and you may never need to call it. However, it is provided
(and is made public) so that the calling program can increment
the mirror choice for methods like urlopen. For example, with
urlopen, there's no good way for the mirror group to know that
an error occurs mid-download (it's already returned and given
you the file object).
remove: can have several values
  0  do not remove the mirror from the list
  1  remove the mirror for this download only
  2  remove the mirror permanently
Beware of remove=0, as it can lead to infinite loops.
- urlgrab(self, url, filename=None, **kwargs)
- urlopen(self, url, **kwargs)
- urlread(self, url, limit=None, **kwargs)
Data and other attributes defined here:
- options = ['default_action', 'failure_callback']