mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

731 Commits

Author SHA1 Message Date
Ilya Kreymer
f9452bf48e rewrite: refactor IDN support: instead of returning IRI, return utf-8 %-encoded url
remove support for returning IRI, as that requires detecting charset, instead just use %-encoded form
and let browser decode. Should address #66

Add rewrite option 'punycode_links_only' (default to false) to skip the %-encoded conversion of host, and just return punycode.

wombat: use getAttribute('href') on <a> tag to get original url, not punycode version

replay: add extra sanity check on Location header to ensure utf-8
2015-02-14 17:26:39 -08:00
Ilya Kreymer
79cfdd6a08 framework/urlrewriter: allow overriding UrlRewriter with optional urlrewriter_class param,
easier to override create_rebased_rewriter() with custom rewriter as well
2015-02-12 10:34:04 -08:00
Ilya Kreymer
dcf3688dc3 wombat: also override frameElement when changing window.parent for top-level replay frame 2015-02-11 19:26:45 -08:00
Ilya Kreymer
0b72bfe911 add 'none' js regex rewriter, which does not rewrite urls or location regexes
add test for none rewriter in test rule
2015-02-11 15:01:29 -08:00
Ilya Kreymer
f068186e37 wombat: replace window.self -> window for clarity 2015-02-11 15:01:04 -08:00
Ilya Kreymer
78bd89b4cb rewrite: simplify deprefix, url already unquoted now so remove extra unquote 2015-02-11 14:28:45 -08:00
Ilya Kreymer
4e7f95081f url_rewriter: catch exception when encoding to utf-8, may not be properly encoded, in which
case treat as bytes
2015-02-10 15:05:15 -08:00
Ilya Kreymer
90aba00ca0 not_found: catch NotFoundException from any part of handle_request, not just indexing. allows for more flexible
usage with cdx iterators that are lazily evaluated on replay
2015-02-10 15:03:21 -08:00
Ilya Kreymer
148651680a wombat fix: use __orig_parent when referencing top-frame, since window.parent is being overridden 2015-02-10 15:02:08 -08:00
Ilya Kreymer
78ae86b6b6 Merge branch 'master' for 0.7.8 into develop 2015-02-05 08:45:55 -08:00
Ilya Kreymer
384e68c84b bump version to 0.7.8 for latest fix 2015-02-04 21:46:57 -08:00
Ilya Kreymer
cc144fdead rewrite: add basic test for X-Forwarded-Proto #57 2015-02-04 21:44:18 -08:00
Ilya Kreymer
78812c8085 rewrite: more conservative change, only rewrite the X-Forwarded-Proto
header for now, #57
2015-02-04 15:17:23 -08:00
Ilya Kreymer
cdb3dcc3d2 rewrite_live: don't forward via or https_x headers, only standard (for
now) possible fix for #57
2015-02-04 14:19:37 -08:00
Ilya Kreymer
40fba3c27b cdx-indexer: minor cleanup, add custom writer override to
write_multi_cdx_index
2015-02-04 11:17:26 -08:00
Ilya Kreymer
ef98716bd8 bump version to 0.7.7 in prep for release 2015-02-03 11:23:12 -08:00
Ilya Kreymer
c47d3ca925 wombat: add mutation observers, addressing #71 and maybe #67
rules: fix regex for yt, add rx for wikimedia
2015-02-03 11:19:41 -08:00
Ilya Kreymer
734ee4471b frame ui: pass timestamp to frame banner, fix typo in html
banner: allow overriding of banner id by returning custom id
2015-02-02 09:41:49 -08:00
Ilya Kreymer
29c6a36dac cdx api query: pass query timestamp mod to index query via 'query_closest'
field, to avoid confusion with 'closest'
2015-01-31 17:45:46 -08:00
Ilya Kreymer
55426e7619 memento: fix headers to be more consistent for framed replay. when using
frames, the outer frame 'mirrors' mementos of the inner frame so they are
discoverable by client side memento tools, tracked via #70
2015-01-29 22:27:15 -08:00
Ilya Kreymer
757345d317 replay api: make ReplayView overridable in WBHandler subclass,
allow custom content loader callable
2015-01-29 20:10:41 -08:00
Ilya Kreymer
7e017fd85e rewrite fixes: don't rewrite window.parent as it is overridable directly
html rewriter: ensure style is rewritten for all elements, add test!
wombat: cleanup and additional checks for assign(), setAttribute()
2015-01-29 20:08:00 -08:00
Ilya Kreymer
043ad5c860 wombat: improve createElementNS override to set prototype, just assign
window.parent directly
2015-01-29 10:13:32 -08:00
Ilya Kreymer
bf3d256a51 rewrite: add css-in-js rewrite rule for wikimedia, tracking via #67 for
perhaps a more general solution
2015-01-28 09:20:42 -08:00
Ilya Kreymer
ccedb2d60e regex_rewrite: add 'parent' rewrite in addition to 'top' for frames, add
WB_wombat_parent to wombat, add test for WB_wombat_parent
2015-01-27 19:57:56 -08:00
Ilya Kreymer
976decb3f1 wombat: ensure document.write override handles elements that go into
head as well as body
2015-01-27 18:02:14 -08:00
Ilya Kreymer
59630c08f6 bump version to 0.8.0! 2015-01-26 11:08:08 -08:00
Ilya Kreymer
695245d9e8 wburl idn: more complete support for idn urls (#66)
add distinct to_iri() and to_uri() functions in WbUrl
internal representation is always as ascii uri
for rewriting, defaults to iri representation unless
'rewrite_ascii_only_urls' is set to true per collection
add wbrequest.get_url() to get url as either iri or uri to be passed
to templates
2015-01-26 11:07:59 -08:00
Ilya Kreymer
edff3f17fb wburl: convert %-encoded hostnames or unicode urls to punycode for
better IDN support (#66)
2015-01-26 11:07:58 -08:00
Ilya Kreymer
933343fa01 update README for 0.7.2 master 2015-01-26 11:07:58 -08:00
Ilya Kreymer
8b5a6be956 Merge branch 'develop' for 0.7.6 2015-01-26 10:38:35 -08:00
Ilya Kreymer
8567b3fa76 CHANGELIST tweaks 2015-01-26 10:37:51 -08:00
Ilya Kreymer
5acd1164ab update CHANGELIST for 0.7.6 2015-01-26 10:31:24 -08:00
Ilya Kreymer
38e3bbbaef templates: add new 'not_found.html' template, which will be called for any missing replay request
instead of default error.html
'not_found_html' settable in the config per collection, as per #65
for not found index query, still use query.html but add condition to check for 0 results
add more query and replay not found
remove unused conditional (for search_view -- always exists)
2015-01-24 12:32:50 -08:00
Ilya Kreymer
80fd47ba3e add rules for vine (#62) 2015-01-22 16:45:09 -05:00
Ilya Kreymer
c9b2e3e69e wombat 2.2 improvements:
* for postMessage, add receive message overrides which uses original origin
to fix message passing tests that check for origin

* for createElementNS, ensure that the namespace url is not rewritten

* add equals_any() method, add "poster" attr to attr rewriting list

(solves several issues for vine replay, #62)
2015-01-22 16:43:52 -05:00
Ilya Kreymer
48b7751f80 bump version to 0.7.6
jinja2: allow adding multiple packages to search path
2015-01-19 21:54:11 -05:00
Ilya Kreymer
c935aa5ec9 Merge branch 'develop' for 0.7.5 0.7.5 2015-01-12 00:50:16 -08:00
Ilya Kreymer
71d9e58d7c fixup changes for 0.7.5 2015-01-12 00:38:51 -08:00
Ilya Kreymer
43805c67ef view: fix format_ts, use existing utc timestamp_to_sec conversion for %s 2015-01-12 00:28:06 -08:00
Ilya Kreymer
7ece05d022 bump version to 0.7.5
update CHANGES
fix .gitattributes to use standard flags
2015-01-12 00:09:02 -08:00
Ilya Kreymer
ac525b0937 tests: add tests for extract_post_query()
add test for HttpsUrlRewriter, remove unnecessary check in
bufferedreader
2015-01-11 23:54:29 -08:00
Ilya Kreymer
8449647c5f wbexception: remove unused status in WbException, set default error for
any uncaught exception to 500, instead of 400
2015-01-11 23:53:34 -08:00
Ilya Kreymer
7610d9deb7 views: cleanup view filters, remove obsolete, add tests for format_ts
and is_wb_handler
2015-01-11 23:02:48 -08:00
Ilya Kreymer
438f9c3e5c git: add gitattributes to ensure consistent line endings for warc, arc and
cdx
2015-01-11 19:09:01 -08:00
Ilya Kreymer
db75bda736 file open() pass: convert all read and write to ensure binary 'b' flag is set (#56) 2015-01-11 18:54:11 -08:00
Ilya Kreymer
fb4bf817f7 rangecache: use 'b' for file open 2015-01-11 18:34:32 -08:00
Ilya Kreymer
14657fbe15 certauth: fix max cert duration to avoid int overflow 2015-01-11 15:04:19 -08:00
Ilya Kreymer
7ae0ff86d2 test certauth: fix paths 2015-01-11 13:10:14 -08:00
Ilya Kreymer
cf0a21509b loaders: add to_file_url() for converting between filename and file://,
used in live rewrite and tests
2015-01-11 13:05:48 -08:00
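
Several commits above (edff3f17fb, 695245d9e8, f9452bf48e) deal with IDN hostnames: converting unicode or %-encoded hosts to punycode, and returning a UTF-8 %-encoded url for rewriting. The sketch below illustrates that round trip with the Python 3 standard library only. It is not pywb's code: the function names host_to_punycode() and host_to_percent_encoded() are hypothetical, userinfo in the netloc is ignored, and the second function assumes its input is already an ASCII url.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def _rebuild(parts, host):
    """Reassemble a url from split parts with a replacement host."""
    netloc = host
    if parts.port:
        netloc += ':' + str(parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


def host_to_punycode(url):
    """Hypothetical helper: return url with its host in punycode (ASCII) form.

    The host may be raw unicode or UTF-8 %-encoded; both are handled by
    unquoting first and then applying the stdlib 'idna' codec.
    """
    parts = urlsplit(url)
    host = unquote(parts.hostname or '')
    ascii_host = host.encode('idna').decode('ascii')
    return _rebuild(parts, ascii_host)


def host_to_percent_encoded(url):
    """Hypothetical helper: return url with a punycode host as UTF-8 %-encoded.

    Assumes the input url is pure ASCII (host already in punycode form).
    """
    parts = urlsplit(url)
    unicode_host = (parts.hostname or '').encode('ascii').decode('idna')
    return _rebuild(parts, quote(unicode_host))


if __name__ == '__main__':
    ascii_url = host_to_punycode('http://b%C3%BCcher.de/path')
    print(ascii_url)
    print(host_to_percent_encoded(ascii_url))
```

Keeping the ASCII form as the single internal representation, as 695245d9e8 describes, means the %-encoded form only has to be produced at output time and no charset detection is needed.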