mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 00:03:28 +01:00

384 Commits

Author SHA1 Message Date
Ilya Kreymer
913a1e9f31 warc: simplify recordloader a bit more, only response and request records
get parsed as http (excluding dns: and whois: uris)
All others have an '-' status and no headers parsing
tests: add test for zero-length revisits
2014-06-25 12:11:26 -07:00
Ilya Kreymer
6761f5697f indexing: refactor cdxindexer interface to better allow custom writers
record loader: skip whois: and dns: records, better skipping of arc headers
(todo: need more unit tests)
2014-06-24 17:08:10 -07:00
Ilya Kreymer
3965fad4dd cdx indexing: add support for 9-field cdx output,
request merge: store referer if available, check for record id matching
2014-06-19 16:51:23 -07:00
Ilya Kreymer
694b97e67f archive indexing: Refactor, split into ArchiveIterator generic iteration and cdx-indexer,
which writes out CDX specifically
recordloader: always load request, limit stream before headers are loaded
2014-06-19 13:37:42 -07:00
Ilya Kreymer
de65b68edc rules: additions to rules for FB 2014-06-18 16:45:54 -07:00
Ilya Kreymer
22a2da6e0c rewrite: for WB_wombat_top rewriting, select next-to-top instead of self 2014-06-16 19:42:15 -07:00
Ilya Kreymer
e1c1d23a9f framed replay: improved url update support, ensure update url is actually
the url of the frame (ignore ajax requests)
2014-06-16 18:46:01 -07:00
Ilya Kreymer
ac3efec4bc update develop to 0.4.6
improved regex for top -> WB_wombat_top rewriting
2014-06-16 15:57:22 -07:00
Ilya Kreymer
f26b0ddbe4 update setup.py version 2014-06-15 12:35:20 -07:00
Ilya Kreymer
987a9ee58f update README for master 2014-06-15 12:34:14 -07:00
Ilya Kreymer
c4e3f25f9a Merge branch 'develop' for 0.4.5 release 2014-06-15 12:32:47 -07:00
Ilya Kreymer
4767ab0fdd Update CHANGES.rst to 4.5 2014-06-15 12:09:10 -07:00
Ilya Kreymer
88d3e94b36 fixes for pep8, name fixes 2014-06-15 11:57:48 -07:00
Ilya Kreymer
073f1e142e test_config: test lxml parser still 2014-06-14 21:33:08 -07:00
Ilya Kreymer
80e80e97d3 replay: support 'framed_replay' option in config for both replay and live rewrite
split replay view into BaseContentView and ReplayView
refactor RewriteLiveHandler into RewriteLiveView
add additional tests for framed and non-framed mode
default to framed replay!
2014-06-14 18:26:19 -07:00
Ilya Kreymer
d21f8079ca cookie rewrite: remove max-age, add test 2014-06-14 10:04:31 -07:00
Ilya Kreymer
ceeb25a899 rewrite: fix unit tests, add extra closed check for 2.6 (not sure why it's needed now) 2014-06-14 01:02:00 -07:00
Ilya Kreymer
028e274b22 rewrite tests: improve POST test, only add header if not empty 2014-06-14 00:18:35 -07:00
Ilya Kreymer
d7516f4cd7 rewrite: fix <base> rewriting, urlrewriter replacement
turn off lxml rewriter by default
2014-06-13 16:44:37 -07:00
Ilya Kreymer
0d3f663ef1 rewrite: disable refer-redirect in case of POST, handle request w/o redirect
(can't use 307 because of FF)
2014-06-13 16:23:11 -07:00
Ilya Kreymer
dfef05a74d rewrite: live rewrite: switch to including all headers rather than a whitelist for proxying 2014-06-13 16:22:18 -07:00
Ilya Kreymer
41e1809039 update wombat.js (support for write override, fill in WB_wombat_location on new iframe)
disable 307 redirects as FF always displays modal confirmation for these, even for same host
2014-06-11 20:12:05 -07:00
Ilya Kreymer
bdafe0938d remove accidental debug commits 2014-06-11 12:44:49 -07:00
Ilya Kreymer
14ed6c5898 remove accidental changes 2014-06-11 12:42:44 -07:00
Ilya Kreymer
0c9d88f032 POST replay: treat POST form data same as get query, no '&&&' marker
additional testing POST
2014-06-11 11:17:06 -07:00
Ilya Kreymer
e2349a74e2 replay: better POST support via post query append!
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
028cdaa22e bump version to 0.4.1 2014-06-05 14:10:30 -07:00
Ilya Kreymer
cf119174ea rewrite: for rewriting purposes, use original cdx url, not the request url
(significance if trailing '/' is present)
2014-06-05 14:09:30 -07:00
Ilya Kreymer
2c65521ea3 final README.rst edits 0.4.0 2014-05-30 12:52:43 -07:00
Ilya Kreymer
18f7031423 add bullet points to README! 2014-05-30 12:45:59 -07:00
Ilya Kreymer
e3bbf95280 merge develop for 0.4.0, update paths to master branch 2014-05-30 12:39:37 -07:00
Ilya Kreymer
05812060c0 Merge branch 'develop' 2014-05-30 12:37:59 -07:00
Ilya Kreymer
6d6f2452fc update README and CHANGES for release 2014-05-30 12:37:30 -07:00
Ilya Kreymer
9519e8d6f1 Update CHANGES.rst 2014-05-30 12:27:20 -07:00
Ilya Kreymer
f9710d033c fix integration test for 307
update head_insert for new wombat
remove redundant host jinja func, use 'urlsplit' instead
2014-05-30 11:17:12 -07:00
Ilya Kreymer
52040127b3 update wombat.js to latest
rewrite live: add another rewrite live header,
use 307 for archival referer based redirects
2014-05-30 11:03:22 -07:00
Ilya Kreymer
de69372b9f Update CHANGES.rst 2014-05-30 10:54:17 -07:00
Ilya Kreymer
9340165014 Changes for 0.4.0 2014-05-30 10:52:59 -07:00
Ilya Kreymer
eaf9cce261 Update README.rst
update for 0.4.0
2014-05-30 10:29:22 -07:00
Ilya Kreymer
9b732def93 cookie_rewriting: if domain is specified, apply cookie to coll root
rather than rewritten path.. needed in order for subdomain cookies to be
detected properly
2014-05-18 21:51:07 -07:00
Ilya Kreymer
8c15ac16fd search page template: add 'prefix' to search page template 2014-05-18 21:27:53 -07:00
Ilya Kreymer
1d674d97d8 pep8 pass! 2014-05-16 22:44:26 -07:00
Ilya Kreymer
923421d637 rewrite_content: add a few tests for cs_, js_, remove redundant except 2014-05-16 22:43:53 -07:00
Ilya Kreymer
2600d870d7 improved test: dsrules remove redundant check
static: check invalid static paths and file_wrapper
memento: check non-memento paths
test debug handlers and custom '-cdx' suffix
2014-05-16 22:17:51 -07:00
Ilya Kreymer
ca33287051 test: move non-surt-cdx sample to non-surt-cdx/ dir for clarity / avoid confusion
when bulk loading cdx/ dir (surt and non-surt cdx should NOT be mixed)
2014-05-16 21:21:14 -07:00
Ilya Kreymer
7d236af7d7 cdx: fix creation and add test for non-surt cdx (pywb-nonsurt/ test)
archiveindexer: -u option to generate non-surt cdx
tests: full test coverage for cdxdomainspecific (fuzzy and custom canon)
2014-05-16 21:16:50 -07:00
Ilya Kreymer
8758e60590 update to latest wombat.js 2014-05-16 09:58:07 -07:00
Ilya Kreymer
5285723ccf cookie_rewriter: catch CookieError and ignore erroring cookies 2014-05-15 22:37:08 -07:00
Ilya Kreymer
1d8c68b745 rewrite: only translate non-empty header values 2014-05-13 17:42:55 -07:00
Ilya Kreymer
871cc26fa4 rewrite: add optional cookie_rewriter, created by urlrewriter and called from header_rewriter
cookie_rewriter works correctly with a concatenated set-cookie list, returns a list of rewritten 'set-cookie' headers
rewrite_live: add proxying of Host, Origin, additional headers
split header rewriter tests into test_header_rewriter, add test_cookie_rewriter
bump version to 0.4.0!
2014-05-13 17:07:41 -07:00