Ilya Kreymer
1d8c68b745
rewrite: only translate non-empty header values
2014-05-13 17:42:55 -07:00
Ilya Kreymer
871cc26fa4
rewrite: add optional cookie_rewriter, created by urlrewriter and called from header_rewriter
...
cookie_rewriter works correctly with a concatenated set-cookie list, returns a list of rewritten 'set-cookie' headers
rewrite_live: add proxying of Host, Origin, additional headers
split header rewriter tests into test_header_rewriter, add test_cookie_rewriter
bump version to 0.4.0!
2014-05-13 17:07:41 -07:00
Ilya Kreymer
89da165467
exceptions: add optional url param to WbException, move handler_exception()
...
into WSGIApp for easier customization
2014-05-13 01:54:12 -07:00
Ilya Kreymer
e7957a5cae
remove SeekableTextFileReader, replaced with standard file-like objects
...
and seek(0, 2) and tell() to get file length
2014-05-06 20:54:42 -07:00
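The seek(0, 2) and tell() technique named in the commit above can be sketched roughly as follows. This is only an illustration with a standard file-like object, not the project's actual code; the helper name file_length is hypothetical.

```python
import io
import os


def file_length(fh):
    """Return the total length of a seekable file-like object.

    Hypothetical helper: seeks to the end (whence=2), reads the offset
    with tell(), then restores the caller's original position.
    """
    pos = fh.tell()
    fh.seek(0, os.SEEK_END)  # os.SEEK_END == 2, i.e. seek(0, 2)
    length = fh.tell()
    fh.seek(pos)
    return length


# Usage with an in-memory stream standing in for an open file
buff = io.BytesIO(b'example cdx line\n')
buff.read(4)
print(file_length(buff), buff.tell())
```

Restoring the position afterwards is a design choice of this sketch, so the length check does not disturb a read already in progress.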
Ilya Kreymer
46449ac188
rewrite: pass wburl mod to rewriter, so that css/js rewriting
...
rules may override default content-type (in cases where it is incorrect)
allows for rule-based customization (to be added later)
2014-05-05 22:12:45 -07:00
Ilya Kreymer
d2795dfdaa
minor cleanup:
...
wburl: add is_url_query() check
views: add kwargs to J2HtmlCapturesView for better extensibility
query_handler: simplify make_cdx_response() arguments
2014-05-01 11:58:34 -07:00
Ilya Kreymer
4c075d14af
views: actually encode template result as utf-8!
2014-04-30 21:16:05 -07:00
Ilya Kreymer
9cf5327e88
bufferedreader cleanup:
...
* BufferedReader defaults to no decompression
* DecompressingBufferedReader defaults to gzip decomp
* ChunkedDataReader defaults to no gzip decomp, but decomp
can be set later via set_decomp().
This allows chunked responses to be de-chunked but not decompressed
(eg for non-text responses)
2014-04-28 20:15:31 -07:00
Ilya Kreymer
53ad67eb9c
rewrite: disable one 'top' rewriting rule (should move to separate mixin)
...
views: add urlsplit jinja2 filter
2014-04-27 01:04:20 -07:00
Ilya Kreymer
09653cf77e
rewrite: more nuanced 'top' rewriting, fix wombat frame mode detection
2014-04-26 18:43:25 -07:00
Ilya Kreymer
58f261fda4
cdx redis: disable new test until fakeredis supports zrangebylex()
2014-04-25 11:00:49 -07:00
Ilya Kreymer
2b8bea616e
when given a redis path of redis://<host>/<db>/<key>, use <key> as a
...
sorted cdx file with zrangebylex!
modified tests but need zrangebylex() support in fakeredis to finish
2014-04-25 10:52:35 -07:00
Ilya Kreymer
e4262502b0
fix ChunkedDataReader chunked + gzip decomp: if reading one chunk yields no data
...
(due to more data being needed for gzip decomp), keep reading more blocks until there is data
or last block is reached (or error). Ensure a single read() call will return some data if there is any
2014-04-25 10:30:22 -07:00
Ilya Kreymer
53f0cb540f
url rewriter: add optional 'full prefix', check and don't rewrite urls
...
if starting with prefix or full prefix
wbrequest: if no scheme present (shouldn't happen with wsgi) default to http
2014-04-24 10:44:08 -07:00
Ilya Kreymer
cd017669ae
bugfix: ChunkedDataReader handles zero-length chunk properly, add test
2014-04-23 10:00:25 -07:00
Ilya Kreymer
48e8e8eb1c
allow passing optional kwargs to render search page
...
add configurable 'default_mod' param
2014-04-22 16:33:47 -07:00
Ilya Kreymer
2ad41e2b94
rewrite: rewrite data-* attributes if they look like links (http, https, //)
2014-04-22 16:32:36 -07:00
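The "looks like a link" test described in the commit above might be sketched like this. The function name and the exact prefix strings are assumptions made for illustration, based only on the (http, https, //) hint in the message; the real rewriter may check differently.

```python
# Assumed prefixes, inferred from the commit message "(http, https, //)"
LINK_PREFIXES = ('http://', 'https://', '//')


def looks_like_link(value):
    """Hypothetical check: True if an attribute value starts like a url."""
    return value.strip().startswith(LINK_PREFIXES)


def rewritable_data_attrs(attrs):
    """Return names of data-* attributes whose values look like links.

    attrs is a list of (name, value) pairs, as an html parser would supply.
    """
    return [name for name, value in attrs
            if name.startswith('data-') and value and looks_like_link(value)]


print(rewritable_data_attrs([('data-src', '//example.com/a.png'),
                             ('data-count', '5'),
                             ('href', 'http://example.com/')]))
```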
Ilya Kreymer
6eef0afb86
add new custom rewriting rule (flickr)
2014-04-20 21:40:27 -07:00
Ilya Kreymer
e1e55ac061
minor tweaks: rewrite 'crossorigin' -> '_crossorigin' param to disable
...
crossorigin as it may interfere with loading rewritten content, add
tests for html and lxml parsers
add server_cls as optional param to QueryHandler.init_from_config()
for easier customization
views: don't create template if empty template file specified
2014-04-19 12:04:43 -07:00
Ilya Kreymer
23bb5bd175
rewrite: wombat update 2.0! Using Object.defineProperty() to better
...
override .href and .hash properties when possible.
.href returns original url, but on assignment rewrites before redirecting
.hash proxies to location.hash
Also added:
- window.top -> window.WB_wombat_top
- document.referrer -> document.WB_wombat_referrer
- <source> html tag rewriting
2014-04-18 19:30:48 -07:00
Ilya Kreymer
e011da43f2
live rewrite: use custom REL_REFERER field, don't override HTTP_REFERER
...
if REL_REFERER not set, don't send any referrer
2014-04-15 16:44:02 -07:00
Ilya Kreymer
85593696fa
remove rfc3987 validation, was rejecting valid urls
...
add extract_referer_wburl_str() to extract WbUrl str, if any,
from the referrer. Use that for live_rewrite_handler to override
default referrer
2014-04-15 16:38:53 -07:00
Ilya Kreymer
611b9093bd
html insert: add include_ts option to optionally not add timestamp
2014-04-13 18:17:31 -07:00
Ilya Kreymer
d8c9a803f6
add support for optional proxies (verify set to false for now)
2014-04-13 17:50:26 -07:00
Ilya Kreymer
7636c9d3f7
fix: when reading response, only readline() if previous read()
...
was non-empty
2014-04-09 16:44:45 -07:00
Ilya Kreymer
bfc2e63793
live rewriter: integrate handler with rewrite_live.py module,
...
clean up css, add unit and integration tests
clean up cli server now known as 'live-rewrite-server', which performs live rewrite using
iframe paradigm
2014-04-09 15:49:55 -07:00
Ilya Kreymer
11202c462f
support both frames and non-frames mode
...
add automatic framing when in framed mode
2014-04-09 15:49:55 -07:00
Ilya Kreymer
b4f30a770f
ChunkedDataReader: if determined to be non-chunked, read full buffer
...
unchunked
2014-04-09 15:49:55 -07:00
Ilya Kreymer
19f2df4717
refactor:
...
- move is_identity(), is_embed() to wburl from wbrequest
- add is_mainpage() predicate
- add create_template() to each J2TemplateView to create itself
- add HeadInsertView to create a reusable head insert for
RewriteContent
- add 'mp_' as modifier for frames mode to be used as possible
modifier with HTMLRewriter
2014-04-09 15:49:55 -07:00
Ilya Kreymer
1fb6f5eff7
add rewriter_handler, frame wrapper support!
2014-04-09 15:49:55 -07:00
Ilya Kreymer
8897a0a7c9
decompressingbufferedreader: default to 'gzip' decompression instead of
...
none. ChunkedDataReader also automatically attempts decompression, by default
Add tests to verify
2014-04-08 21:49:04 -07:00
Ilya Kreymer
02fe78cb0b
update changes, add more tests
2014-04-07 17:41:14 -07:00
Ilya Kreymer
a331061691
minor tweaks: add default static_path for jinja,
...
remove unused import
2014-04-07 17:19:07 -07:00
Ilya Kreymer
c23dd7bda4
wombat update:
...
- support scheme-relative (//) urls
- override dom manipulation (appendChild, insertBefore, replaceChild)
- disable Worker() interface for now
2014-04-07 17:17:08 -07:00
Ilya Kreymer
2a318527df
lxml: use lxml's parse interface instead of feed interface to allow
...
xml to handle decoding unicode data, better address #36
2014-04-07 17:13:43 -07:00
Ilya Kreymer
890c323617
update bad.arc with empty record example
2014-04-07 17:12:33 -07:00
Ilya Kreymer
64eef7063d
record reading: better handling of empty arc (or warc) records
...
for indexing, index empty/invalid length as '-' status code
for reading, serve as 204 no content.
ensure that StatusAndHeaders has a valid statusline when serving
if http content-length is valid, limit stream to that content-length
as well as record content-length (whichever is smaller)
replace content-length when buffering
2014-04-07 17:08:39 -07:00
Ilya Kreymer
d8c20a59cf
update to version 0.3.1
2014-04-06 11:46:43 -07:00
Ilya Kreymer
d6006acdc3
rewrite: when using lxml parser, just pass raw stream to lxml
...
without decoding. lxml parser expects to have raw bytes and will determine
encoding on its own. then serve back as utf-8 if no encoding specified.
should address #36
2014-04-06 09:47:34 -07:00
Ilya Kreymer
e077c23de7
fuzzy match: modify existing params to ensure any custom params are preserved
...
templates: add ability to set custom global vars, such as 'static_path'
for all templates
2014-04-04 12:20:54 -07:00
Ilya Kreymer
b0b0adb043
refactor: rename pywb.core -> pywb.webapp
...
move perms/test/test_perms_policy -> tests/perms_fixture
for rules file, use single DEFAULT_RULES_FILE import
2014-04-04 10:09:26 -07:00
Ilya Kreymer
3aa4a4da7a
rewrite: ensure lxml parser closes gracefully on no input
2014-04-03 13:00:22 -07:00
Ilya Kreymer
5388a0b03b
Merge branch 'develop' of https://github.com/ikreymer/pywb into develop
2014-04-03 12:45:54 -07:00
Ilya Kreymer
5dd586cf07
refactor: simplify rewrite_content and replay_views, remove
...
redundant code.. everything goes through rewrite_content(),
is sanitized (for transfer encoding) if needed
additional testing for decode_buff
fix failed_files bug in resolvingloader, add tests
2014-04-03 12:44:00 -07:00
Ilya Kreymer
5155a5c842
fix README headings
2014-04-03 09:25:10 -07:00
Ilya Kreymer
bd21fec6d4
update run-uwsgi.sh and add run-gunicorn.sh
...
update README and INSTALL, fix typo
only list wb handlers on home page by default
pep8 fixes
2014-04-03 08:56:18 -07:00
Ilya Kreymer
1e7ecb901a
tweak README, add no cover pragmas to blocking cli apps (for now)
2014-04-02 21:43:09 -07:00
Ilya Kreymer
80f2da9548
refactor: move configs/config.yaml to root again
...
remove cdx-server specific config, instead make cdx server api-only
path configurable from regular config
2014-04-02 21:26:53 -07:00
Ilya Kreymer
8bdafeb040
Update README.rst
...
move changes, installation to separate files.. add simplified install guide
2014-04-02 20:29:00 -07:00
Ilya Kreymer
05eba0194a
add CHANGES.rst changelist
2014-04-02 20:19:17 -07:00