Ilya Kreymer
14ed6c5898
remove accidental changes
2014-06-11 12:42:44 -07:00
Ilya Kreymer
0c9d88f032
POST replay: treat POST form data same as get query, no '&&&' marker
...
additional testing POST
2014-06-11 11:17:06 -07:00
Ilya Kreymer
e2349a74e2
replay: better POST support via post query append!
...
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
028cdaa22e
bump version to 0.4.1
2014-06-05 14:10:30 -07:00
Ilya Kreymer
cf119174ea
rewrite: for rewriting purposes, use original cdx url, not the request url
...
(significant if trailing '/' is present)
2014-06-05 14:09:30 -07:00
Ilya Kreymer
18f7031423
add bullet points to README!
2014-05-30 12:45:59 -07:00
Ilya Kreymer
e3bbf95280
merge develop for 0.4.0, update paths to master branch
2014-05-30 12:39:37 -07:00
Ilya Kreymer
05812060c0
Merge branch 'develop'
2014-05-30 12:37:59 -07:00
Ilya Kreymer
6d6f2452fc
update README and CHANGES for release
2014-05-30 12:37:30 -07:00
Ilya Kreymer
9519e8d6f1
Update CHANGES.rst
2014-05-30 12:27:20 -07:00
Ilya Kreymer
f9710d033c
fix integration test for 307
...
update head_insert for new wombat
remove redundant host jinja func, use 'urlsplit' instead
2014-05-30 11:17:12 -07:00
Ilya Kreymer
52040127b3
update wombat.js to latest
...
rewrite live: add another rewrite live header,
use 307 for archival referer based redirects
2014-05-30 11:03:22 -07:00
Ilya Kreymer
de69372b9f
Update CHANGES.rst
2014-05-30 10:54:17 -07:00
Ilya Kreymer
9340165014
Changes for 0.4.0
2014-05-30 10:52:59 -07:00
Ilya Kreymer
eaf9cce261
Update README.rst
...
update for 0.4.0
2014-05-30 10:29:22 -07:00
Ilya Kreymer
9b732def93
cookie_rewriting: if domain is specified, apply cookie to coll root
...
rather than rewritten path.. needed in order for subdomain cookies to be
detected properly
2014-05-18 21:51:07 -07:00
Ilya Kreymer
8c15ac16fd
search page template: add 'prefix' to search page template
2014-05-18 21:27:53 -07:00
Ilya Kreymer
1d674d97d8
pep8 pass!
2014-05-16 22:44:26 -07:00
Ilya Kreymer
923421d637
rewrite_content: add a few tests for cs_, js_, remove redundant except
2014-05-16 22:43:53 -07:00
Ilya Kreymer
2600d870d7
improved test: dsrules remove redundant check
...
static: check invalid static paths and file_wrapper
memento: check non-memento paths
test debug handlers and custom '-cdx' suffix
2014-05-16 22:17:51 -07:00
Ilya Kreymer
ca33287051
test: move non-surt-cdx sample to non-surt-cdx/ dir for clarity / avoid confusion
...
when bulk loading cdx/ dir (surt and non-surt cdx should NOT be mixed)
2014-05-16 21:21:14 -07:00
Ilya Kreymer
7d236af7d7
cdx: fix creation and add test for non-surt cdx (pywb-nonsurt/ test)
...
archiveindexer: -u option to generate non-surt cdx
tests: full test coverage for cdxdomainspecific (fuzzy and custom canon)
2014-05-16 21:16:50 -07:00
Ilya Kreymer
8758e60590
update to latest wombat.js
2014-05-16 09:58:07 -07:00
Ilya Kreymer
5285723ccf
cookie_rewriter: catch CookieError and ignore erroring cookies
2014-05-15 22:37:08 -07:00
Ilya Kreymer
1d8c68b745
rewrite: only translate non-empty header values
2014-05-13 17:42:55 -07:00
Ilya Kreymer
871cc26fa4
rewrite: add optional cookie_rewriter, created by urlrewriter and called from header_rewriter
...
cookie_rewriter works correctly with a concatenated set-cookie list, returns a list of rewritten 'set-cookie' headers
rewrite_live: add proxying of Host, Origin, additional headers
split header rewriter tests into test_header_rewriter, add test_cookie_rewriter
bump version to 0.4.0!
2014-05-13 17:07:41 -07:00
Ilya Kreymer
89da165467
exceptions: add optional url param to WbException, move handler_exception()
...
into WSGIApp for easier customization
2014-05-13 01:54:12 -07:00
Ilya Kreymer
e7957a5cae
remove SeekableTextFileReader, replaced with standard file-like objects
...
and seek(0, 2) and tell() to get file length
2014-05-06 20:54:42 -07:00
Ilya Kreymer
46449ac188
rewrite: pass wburl mod to rewriter, so that css/js rewriting
...
rules may override default content-type (in cases where it is incorrect)
allows for rule-based customization (to be added later)
2014-05-05 22:12:45 -07:00
Ilya Kreymer
d2795dfdaa
minor cleanup:
...
wburl: add is_url_query() check
views: add kwargs to J2HtmlCapturesView for better extensibility
query_handler: simplify make_cdx_response() arguments
2014-05-01 11:58:34 -07:00
Ilya Kreymer
4c075d14af
views: actually encode template result as utf-8!
2014-04-30 21:16:05 -07:00
Ilya Kreymer
9cf5327e88
bufferedreader cleanup:
...
* BufferedReader defaults to no decompression
* DecompressingBufferedReader defaults to gzip decomp
* ChunkedDataReader defaults to no gzip decomp, but decomp
can be set later via set_decomp().
This allows chunked responses to be de-chunked but not decompressed
(eg for non-text responses)
2014-04-28 20:15:31 -07:00
Ilya Kreymer
53ad67eb9c
rewrite: disable one 'top' rewriting rule (should move to separate mixin)
...
views: add urlsplit jinja2 filter
2014-04-27 01:04:20 -07:00
Ilya Kreymer
09653cf77e
rewrite: more nuanced 'top' rewriting, fix wombat frame mode detection
2014-04-26 18:43:25 -07:00
Ilya Kreymer
58f261fda4
cdx redis: disable new test until fakeredis supports zrangebylex()
2014-04-25 11:00:49 -07:00
Ilya Kreymer
2b8bea616e
when given a redis path of redis://<host>/<db>/<key>, use <key> as a
...
sorted cdx file with zrangebylex!
modified tests but need zrangebylex() support in fakeredis to finish
2014-04-25 10:52:35 -07:00
Ilya Kreymer
e4262502b0
fix ChunkedDataReader chunked + gzip decomp: if reading one chunk yields no data
...
(due to more data being needed for gzip decomp), keep reading more blocks until there is data
or last block is reached (or error). Ensure a single read() call will return some data if there is any
2014-04-25 10:30:22 -07:00
Ilya Kreymer
53f0cb540f
url rewriter: add optional 'full prefix', check and don't rewrite urls
...
if starting with prefix or full prefix
wbrequest: if no scheme present (shouldn't happen with wsgi) default to http
2014-04-24 10:44:08 -07:00
Ilya Kreymer
cd017669ae
bugfix: ChunkedDataReader handles zero-length chunk properly, add test
2014-04-23 10:00:25 -07:00
Ilya Kreymer
48e8e8eb1c
allow passing optional kwargs to render search page
...
add configurable 'default_mod' param
2014-04-22 16:33:47 -07:00
Ilya Kreymer
2ad41e2b94
rewrite: rewrite data-* attributes if they look like links (http, https, //)
2014-04-22 16:32:36 -07:00
Ilya Kreymer
6eef0afb86
add new custom rewriting rule (flickr)
2014-04-20 21:40:27 -07:00
Ilya Kreymer
e1e55ac061
minor tweaks: rewrite 'crossorigin' -> '_crossorigin' param to disable
...
crossorigin as it may interfere with loading rewritten content, add
tests for html and lxml parsers
add server_cls as optional param to QueryHandler.init_from_config()
for easier customization
views: don't create template if empty template file specified
2014-04-19 12:04:43 -07:00
Ilya Kreymer
23bb5bd175
rewrite: wombat update 2.0! Using Object.defineProperty() to better
...
override .href and .hash properties when possible.
.href returns original url, but on assignment rewrites before redirecting
.hash proxies to location.hash
Also added:
- window.top -> window.WB_wombat_top
- document.referrer -> document.WB_wombat_referrer
- <source> html tag rewriting
2014-04-18 19:30:48 -07:00
Ilya Kreymer
e011da43f2
live rewrite: use custom REL_REFERER field, don't override HTTP_REFERER
...
if REL_REFERER not set, don't send any referrer
2014-04-15 16:44:02 -07:00
Ilya Kreymer
85593696fa
remove rfc3987 validation, was rejecting valid urls
...
add extract_referer_wburl_str() to extract WbUrl str, if any,
from the referrer. Use that for live_rewrite_handler to override
default referrer
2014-04-15 16:38:53 -07:00
Ilya Kreymer
611b9093bd
html insert: add include_ts option to optionally not add timestamp
2014-04-13 18:17:31 -07:00
Ilya Kreymer
d8c9a803f6
add support for optional proxies (verify set to false for now)
2014-04-13 17:50:26 -07:00
Ilya Kreymer
7636c9d3f7
fix: when reading response, only readline() if previous read()
...
was non-empty
2014-04-09 16:44:45 -07:00
Ilya Kreymer
bfc2e63793
live rewriter: integrate handler with rewrite_live.py module,
...
clean up css, add unit and integration tests
clean up cli server now known as 'live-rewrite-server', which performs live rewrite using
iframe paradigm
2014-04-09 15:49:55 -07:00