- add defaults dictionary, chain dictionaries rather than copying
- allow custom classes to be loaded explicitly via yaml
- for LineReader, assume ungzipped if first decompress fails
- properly ignore bad local paths
- add optional reporter object
- don't store explicit static path, but allow it to be set in the insert
- store host_prefix, which is either server name or empty
- for archival mode, absolute_paths settings controls if using absolute paths,
- for proxy always use absolute_paths
- default static path is: /static/default/
- allow extension apps to provide custom /static/X/ path
Route overriding:
- ability to set Route class
- custom init method
Archival Relative Redirect:
- if starting with timestamp, drop timestamp and assume host-relative path
Integration Tests:
- test proxy mode by using REQUEST_URI
- test archival relative redirect!
adding StaticHandler and loading templates and static resources from current package
add default template and static data to be included in the pywb package
add test for custom static route
* proxy router for handling only proxy
* proxy/archival router for handling both archival and proxy mode,
togglable with 'enable_http_proxy' setting in config
* supports only most recent capture playback -- no support for
selecting replay date/calendar view yet
* not testable with WebTest -- need better way to unit test proxy mode
wrapping previous WbResponse
overhaul yaml config to be much simpler, move best resolver and
best index reader to respective classes
add config_utils for sharing config, standard non-yaml config
provides defaults for testing
fix bug in query.html
Changes WbUrl forms:
/2013/im_/example.com -> 2013/im_/example.com
/*/example.com -> */example.com
/example.com -> example.com
* also simplify scheme-agnostic url (//) handling by just eating up extra
slashes
* add additional doctests on route, with and w/o custom SCRIPT_NAME
supports /cdx?url=... and other params including
filter=<regex>
collapse_time=<0-14>
resolve_revisits=<true|false>
reverse=<true|false>
closest=<timestamp>
* Refactor views class to support more Jinja2 views (J2Template)
* Add a home page, collection search page, and error pages, all optional
* all exceptions appear on error page
* wbrequest supports a request with an empty or / wb_url
- pywb_init module inits from ./test directory
misc:
- router has lookahead for '/'
- dechunk even for transparent/binary
- 'text' query mode displays cdx
fix html_rewriter missing ; on entities
js rewriter: only rewrite full document.domain
PathIndexPrefixResolver using binsearch on path index, for #9
resolvers moved to replay_resolvers.py
improve path-resolver logic: each resolver returns an array of possible
files (could be from primary or secondary storage).
then, iterate over all possible files from all resolvers until
a successful load, or raise exception if all failed
All rewriters can support either buffered or streaming mode.
In buffered mode, the full text content is written into a buffer
and served with a Content-Length
in streaming mode, text is streamed as it is rewritten and
no Content-Length is written
Default is to stream the response
Rename archivalrouter.MatchRegex -> archivalrouter.Route, supporting regex/prefix matching
add redir_to_exact to turn off redirect to exact timestamp in RewritingReplayHandler
update README
* Instead of relying on REQUEST_URI, pywb constructs a
REL_REQUEST_URI, from PATH_INFO + QUERY_STRING.
SCRIPT_NAME auto-added to prefix
* MatchPrefix is now superceded by MatchRegex, which
can match a plain string -- collId defaults to the full match
* Added optional archivalurl_class to router to allow for customized
ArchivalUrl implementations to be specified
* run.sh can test on a non-root mountpoint, eg. ./run.sh "/approot"
- Manually set env[‘REQUEST_URI’] (which is nonstandard) the same way
it’s set by uwsgi.
- Include HTTP error code reasons in error response. (wsgiref checks
that error code is at least 4 characters, i.e. includes reason)