mirror of https://github.com/webrecorder/pywb.git synced 2025-03-31 11:14:10 +02:00

16 Commits

Author SHA1 Message Date
Jack Cushman
903583c3d7 Handle ArchivalUrl subclasses. 2014-01-20 14:13:16 -05:00
Ilya Kreymer
9ff3fc300b Fix #5, bringing back customParams optional params sent to cdx server
Rename archivalrouter.MatchRegex -> archivalrouter.Route, supporting regex/prefix matching
add redir_to_exact to turn off redirect to exact timestamp in RewritingReplayHandler
update README
2014-01-20 10:50:06 -08:00
Ilya Kreymer
6cb1743163 Merge branch 'master' of github.com:ikreymer/pywb into work 2014-01-19 12:31:53 -08:00
Ilya Kreymer
354040a7e0 support for url-agnostic dedup, eg loading payload from a different url
than the revisit
2014-01-19 12:31:19 -08:00
Jack Cushman
c9d0b0ba7b Handle transfer-encoding:chunked; misc. replay bugs.
- Add a ChunkedLineReader to deal with replays with the
transfer-encoding: chunked header.
- Catch UnicodeDecodeErrors caused by multibyte characters getting
split during buffering.
- A couple of tiny bugs in replay.py
2014-01-18 21:32:49 -05:00
Ilya Kreymer
7ce6d0d22b first pass on html rendering via jinja, support for query (cdx) rendering 2014-01-17 16:24:36 -08:00
Ilya Kreymer
bcc9588c00 * archivalrouter: to take a list of handlers,
currently MatchPrefix and MatchRegex. handler returns a single response
(no chaining for now)
* rewriting: don't rewrite anchor only urls
* perf: add a very basic profiler in WBHandler for testing
2014-01-16 20:33:51 -08:00
Ilya Kreymer
c4457abc4c Update README
Rename FullHandler -> WBHandler
Add additional comments!
2014-01-03 21:44:20 -08:00
Ilya Kreymer
d820a8c06a add some comments, make charset parsing lower() 2014-01-03 17:40:20 -08:00
Ilya Kreymer
c3767cd31b fix css url parsing typo
always default to utf-8 if chardet thinks ascii
tweak banner
2014-01-03 21:38:18 +00:00
Ilya Kreymer
2357f108a3 rename rewriters
header_rewriter added!
support for encoding detection
various fixes
xmlrewriter
2014-01-03 13:03:03 -08:00
Ilya Kreymer
cca9071c53 minor tweaks, increase num closest searched, upper case url check
css remove fixed pos
2013-12-31 21:01:18 +00:00
Ilya Kreymer
d9930322f1 support utf-8 (so far)
support protocol-agnostic prefix //
failedFile list for warc loading
2013-12-31 00:18:12 +00:00
Ilya Kreymer
997dc5df0f fixes! Fix typos, in html parsing, fix base, support attrs w/o values 2013-12-30 03:03:33 +00:00
Ilya Kreymer
a84ec2abc7 first iteration of archival mode working w/ banner insertion!! 2013-12-28 17:39:43 -08:00
Ilya Kreymer
16f458d5ec archiveloader: Support for loading warc/arc records using hanzo parser (for record header parsing only)
ReplayHandler: load replay from query response, find best option
basic support for matching url, checking self-redirects!
2013-12-28 05:00:06 -08:00