Ilya Kreymer
b785cd6f08
memento: use mp_ modifier to support memento with frame or non-frame replay
...
change memento test to use frame replay
2014-07-20 15:43:39 -07:00
Ilya Kreymer
96fcaab521
live-rewrite-server: add ability to specify http/https proxy for live fetching
...
(for example, for use with a recording proxy)
2014-07-19 14:43:28 -07:00
Ilya Kreymer
f80c27ec00
cookie: add test for 'document.cookie' rewriting
2014-07-15 12:57:02 -07:00
Ilya Kreymer
fa52e0126d
cookies: support client side rewriting of document.cookie -> WB_wombat_cookie to rewrite cookie path, if present
2014-07-15 12:52:42 -07:00
Ilya Kreymer
e858b8faae
rewrite: better fix for multiple ../ in urls, additional tests
2014-07-14 20:50:45 -07:00
Ilya Kreymer
7032160cf9
rewrite: fix rel url resolution to better handle parent rel path.
...
Explicitly resolve path when possible, remove only if at root level
2014-07-14 19:13:19 -07:00
Ilya Kreymer
1b1a1f8115
proxy: add 'proxy_coll_select' config which will require a proxy-auth to select a collection for proxy mode.
...
Otherwise, defaults to first available collection, though proxy-auth can still be sent to specify different collection
2014-07-14 19:12:30 -07:00
Ilya Kreymer
1317b2b10f
route selection via proxy auth!
...
refactor route request parsing to happen in the actual router class instead of in the route
in proxy mode, add support for picking a route via proxy-auth
improve test for 'top' rewriting
2014-07-10 21:54:23 -07:00
Ilya Kreymer
daffc7ff5d
header rewrite: pass through 'content-range' header
2014-07-07 17:02:44 -07:00
Ilya Kreymer
02326a2b12
bump dev version to 0.4.8
2014-07-07 17:02:28 -07:00
Ilya Kreymer
7694bf0678
update README.rst for master 0.4.7
2014-07-01 16:22:38 -07:00
Ilya Kreymer
46b16c61d5
update changelist, version to 0.4.7
2014-07-01 16:15:25 -07:00
Ilya Kreymer
2a2240a23a
fix 'bad.cdx' sorting order
2014-07-01 15:36:13 -07:00
Ilya Kreymer
1a42331e69
Merge branch 'develop' into binary-parse
2014-07-01 10:00:05 -07:00
Ilya Kreymer
1980b66127
warc indexing: in include_all mode, pass 'warcinfo' records to writer, allowing it the option to handle or ignore them
2014-07-01 09:59:16 -07:00
Ilya Kreymer
57a38dedce
Merge branch 'develop' into binary-parse
2014-06-28 11:53:50 -07:00
Ilya Kreymer
377ea33bc8
tests: add test for wombat top
2014-06-28 11:53:23 -07:00
Ilya Kreymer
b0f7fdbed8
regexrewrite: fix rewrite for 'top'
2014-06-28 11:50:11 -07:00
Ilya Kreymer
f2bfc96002
Merge branch 'develop' into binary-parse
2014-06-28 11:04:43 -07:00
Ilya Kreymer
83b69e8447
indexing: don't include records of type 'application/warc-fields' unless all records are being included
2014-06-28 11:03:44 -07:00
Ilya Kreymer
70b7e29b36
pass raw bytes to htmlparser, assuming ascii-compatibility
...
(todo: add tests for non-ascii compatible encodings)
improved rendering of certain pages, needs more testing
lxml: remove lxml and complexity associated with having the parser,
as it's too unpredictable for older html and does its own decoding.
2014-06-27 19:03:06 -07:00
Ilya Kreymer
dd9f138bab
disable decoding, by default, of content for html parser
2014-06-27 16:53:33 -07:00
Ilya Kreymer
fb07775d38
tests: add 'bad.cdx' for testing cdx lines with missing original for revisit,
...
missing/non-existent warc
2014-06-25 12:32:57 -07:00
Ilya Kreymer
913a1e9f31
warc: simplify recordloader a bit more, only response and request records
...
get parsed as http (excluding dns: and whois: uris)
All others have a '-' status and no header parsing
tests: add test for zero-length revisits
2014-06-25 12:11:26 -07:00
Ilya Kreymer
6761f5697f
indexing: refactor cdxindexer interface to better allow custom writers
...
record loader: skip whois: and dns: records, better skipping of arc headers
(todo: need more unit tests)
2014-06-24 17:08:10 -07:00
Ilya Kreymer
3965fad4dd
cdx indexing: add support for 9-field cdx output,
...
request merge: store referer if available, check for record id matching
2014-06-19 16:51:23 -07:00
Ilya Kreymer
694b97e67f
archive indexing: Refactor, split into ArchiveIterator generic iteration and cdx-indexer,
...
which writes out CDX specifically
recordloader: always load request, limit stream before headers are loaded
2014-06-19 13:37:42 -07:00
Ilya Kreymer
de65b68edc
rules: additions to rules for FB
2014-06-18 16:45:54 -07:00
Ilya Kreymer
22a2da6e0c
rewrite: for WB_wombat_top rewriting, select next-to-top instead of self
2014-06-16 19:42:15 -07:00
Ilya Kreymer
e1c1d23a9f
framed replay: improved url update support, ensure update url is actually
...
the url of the frame (ignore ajax requests)
2014-06-16 18:46:01 -07:00
Ilya Kreymer
ac3efec4bc
update develop to 0.4.6
...
improved regex for top -> WB_wombat_top rewriting
2014-06-16 15:57:22 -07:00
Ilya Kreymer
f26b0ddbe4
update setup.py version
2014-06-15 12:35:20 -07:00
Ilya Kreymer
987a9ee58f
update README for master
2014-06-15 12:34:14 -07:00
Ilya Kreymer
c4e3f25f9a
Merge branch 'develop' for 0.4.5 release
2014-06-15 12:32:47 -07:00
Ilya Kreymer
4767ab0fdd
Update CHANGES.rst to 0.4.5
2014-06-15 12:09:10 -07:00
Ilya Kreymer
88d3e94b36
fixes for pep8, name fixes
2014-06-15 11:57:48 -07:00
Ilya Kreymer
073f1e142e
test_config: test lxml parser still
2014-06-14 21:33:08 -07:00
Ilya Kreymer
80e80e97d3
replay: support 'framed_replay' option in config for both replay and live rewrite
...
split replay view into BaseContentView and ReplayView
refactor RewriteLiveHandler into RewriteLiveView
add additional tests for framed and non-framed mode
default to framed replay!
2014-06-14 18:26:19 -07:00
Ilya Kreymer
d21f8079ca
cookie rewrite: remove max-age, add test
2014-06-14 10:04:31 -07:00
Ilya Kreymer
ceeb25a899
rewrite: fix unit tests, add extra closed check for 2.6 (not sure why it's needed now)
2014-06-14 01:02:00 -07:00
Ilya Kreymer
028e274b22
rewrite tests: improve POST test, only add header if not empty
2014-06-14 00:18:35 -07:00
Ilya Kreymer
d7516f4cd7
rewrite: fix <base> rewriting, urlrewriter replacement
...
turn off lxml rewriter by default
2014-06-13 16:44:37 -07:00
Ilya Kreymer
0d3f663ef1
rewrite: disable refer-redirect in case of POST, handle request w/o redirect
...
(can't use 307 because of FF)
2014-06-13 16:23:11 -07:00
Ilya Kreymer
dfef05a74d
rewrite: live rewrite: switch to including all headers rather than a whitelist for proxying
2014-06-13 16:22:18 -07:00
Ilya Kreymer
41e1809039
update wombat.js (support for write override, fill in WB_wombat_location on new iframe)
...
disable 307 redirects as FF always displays modal confirmation for these, even for same host
2014-06-11 20:12:05 -07:00
Ilya Kreymer
bdafe0938d
remove accidental debug commits
2014-06-11 12:44:49 -07:00
Ilya Kreymer
14ed6c5898
remove accidental changes
2014-06-11 12:42:44 -07:00
Ilya Kreymer
0c9d88f032
POST replay: treat POST form data same as get query, no '&&&' marker
...
additional testing POST
2014-06-11 11:17:06 -07:00
Ilya Kreymer
e2349a74e2
replay: better POST support via post query append!
...
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
028cdaa22e
bump version to 0.4.1
2014-06-05 14:10:30 -07:00