mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 00:03:28 +01:00

574 Commits

Author SHA1 Message Date
Ilya Kreymer
d31a4df3a6 add changelist for 0.6.5 2014-12-04 23:10:51 -08:00
Ilya Kreymer
ea89702701 static handler: add default 'application/octet-stream' and only set
guessed mime if not none
2014-12-04 23:02:30 -08:00
Ilya Kreymer
c996e70a6e wburl: detect and decode partially encoded schemes in url, such as http%3A//,
https%A2F2F// before handling further
add additional tests for wburl
2014-11-29 11:13:57 -08:00
Ilya Kreymer
d7eb40af20 rewrite: properly rewrite scheme relative JS-escaped urls:
'\/\/example.com', '\\/\\/example.com/', treat same as '//example.com'
adding http: prefix
2014-11-23 18:56:49 -08:00
Ilya Kreymer
b8b8c30573 cookie_rewriter: add tests for exact cookie rewriter 2014-11-13 09:43:50 -08:00
Ilya Kreymer
20070e95b6 cookie_rewriter: add 'exact' cookie rewriter which never changes the
path/domain
2014-11-13 09:24:34 -08:00
Ilya Kreymer
388f31e08f rewrite: don't rewrite rel=canonical links, need to make rewriting more
configurable (#50)
2014-11-11 15:34:14 -08:00
Ilya Kreymer
49e98e0cdc archiveiterator/cdxindexer: cleaner load path for compressed and
uncompressed, ability to distinguish between chunked and non-chunked
warcs/arcs
Raise error for non-chunked gzip warcs as they can not be indexed for
replay, addressing #48
add 'bad' non-chunked gzip file for testing, using custom ext
2014-11-06 01:32:42 -08:00
Ilya Kreymer
044792f99f bump version to 0.6.5! 2014-11-06 01:28:56 -08:00
Ilya Kreymer
f6053a977b Update changes for 0.6.4 2014-11-05 21:59:54 -08:00
Ilya Kreymer
00121aa165 statusandheaders parsing: properly skip multiline bad headers (missing
header name and ':'), fixes #49
2014-11-05 20:26:23 -08:00
Ilya Kreymer
e4bcef1c8b rewrite: default HTMLParser entityref and charref are treated as plain
data for HTMLRewriter, since they are never rewritten, and to avoid
semicolon ambiguity, since no way to determine if there is a ; or not
at end. Addresses #43
2014-11-04 12:14:00 -08:00
Ilya Kreymer
5e4b830fa7 cdx: ensure cdx file is closed when iterator is done, since cdx files
are opened per-lookup, related to #45
2014-11-04 09:42:53 -08:00
Ilya Kreymer
a3b931b45e regex rewrite: fix js regex (dashes), add additional test case 2014-11-01 15:39:51 -07:00
Ilya Kreymer
841fd3f7b4 warc: add ability to set read block size (def 16384) in archiveiterator 2014-11-01 13:29:37 -07:00
Ilya Kreymer
5be65f2945 rules: better rule def, cleanup spacing 2014-10-30 00:10:39 -07:00
Ilya Kreymer
f14f37d5b1 tests: use httpbin for redirect tests 2014-10-29 09:47:32 -07:00
Ilya Kreymer
61ce53a0e0 warc/cdx: include metadata and resource records in default cdx index
emit 200 and 204 responses for metadata and resource, though write '-'
to cdx (for compatibility for now)
include content-length in resource/metadata records
2014-10-28 10:29:50 -07:00
Ilya Kreymer
c9273ee5ed rewrite: add 'deprefix' support to remove wburl prefix from any query
params
2014-10-26 12:12:37 -07:00
Ilya Kreymer
037cf35eb8 wsgi_wrapper: check for str before decoding err msg 2014-10-25 11:42:44 -07:00
Ilya Kreymer
8441b54192 head_insert: add mod to wombat 2014-10-24 14:13:59 -07:00
Ilya Kreymer
67e94d13f4 handlers/wombat: pass in mod to wombat, ability to customize modifier
for embeds
2014-10-24 12:45:41 -07:00
Ilya Kreymer
9b64194342 bump version to 0.6.4 2014-10-24 12:44:52 -07:00
Ilya Kreymer
f394e26cf1 update CHANGES.rst 2014-10-21 19:21:15 -07:00
Ilya Kreymer
05995ad9cf Merge branch 'master' into develop, just README changes 2014-10-21 19:09:31 -07:00
Ilya Kreymer
e8d3965269 pep8 style fixes, remove unused methods 2014-10-21 19:06:16 -07:00
Ilya Kreymer
0a1c053507 Add badge 2014-10-19 08:33:26 -07:00
Ilya Kreymer
dfae25da01 Update README with News! 2014-10-19 08:32:11 -07:00
Ilya Kreymer
1a78fffa22 refactor handlers: simplify handling methods: handle_request() called
for all requests, handle_query() only for url query/calendar, and
handle_replay() only for replay. Improves extensibility of the handling
path
2014-10-19 00:33:32 -07:00
Ilya Kreymer
d99f7f996c urlrewriter refactor: replace get_abs_url and get_timestamp_url with
get_new_url() which just calls wburl.to_str and applies rewriter prefix
allows creating a new wburl with any component(s) changed
2014-10-19 00:24:00 -07:00
Ilya Kreymer
d01275335b bump version to 0.6.3 2014-10-19 00:19:07 -07:00
Ilya Kreymer
c9c9e9d7ed Add Gratipay link 2014-10-18 17:00:33 -07:00
Ilya Kreymer
e4befd0d85 update README.rst 0.6.2 2014-10-18 15:27:58 -07:00
Ilya Kreymer
729320393a update license statement in js files with github link 2014-10-18 15:18:40 -07:00
Ilya Kreymer
268861b2ea Update README with UI Customization info 2014-10-18 15:14:43 -07:00
Ilya Kreymer
b7d23e4736 Update CHANGES.rst with latest 2014-10-18 14:51:21 -07:00
Ilya Kreymer
7f378c9aab move wb.css include into banner.html for easier overridability 2014-10-18 12:40:02 -07:00
Ilya Kreymer
4a1cc46fa3 framed replay: invert framed replay paradigm, replay always uses
canonical, no-modifier archival url (instead of mp_).
When using frames, the page redirects to a 'tf_' page, which then uses
replaceHistory() to change url back to canonical form.
memento: support for framed replay, include memento headers in top frame
bump version to 0.6.2
2014-10-18 11:21:07 -07:00
Ilya Kreymer
b99dcb41f0 banner: support rel and abs paths for banner_html, relative to current
dir or system absolute
2014-10-17 09:24:16 -07:00
Ilya Kreymer
cede54f0c1 self-redir: remove referrer-based self-redirect check, as it may be
triggered incorrectly during refresh.. (will need to investigate more if
there's an edge-case to test against)
2014-10-17 08:54:03 -07:00
Ilya Kreymer
1c23e12c06 banner: fixes for framed replay with new default banner 2014-10-17 08:40:57 -07:00
Ilya Kreymer
0efa2dc0ad rewrite/banner: add a separate 'banner_html' setting which allows
overriding just the banner (and not the entire head_insert). Setting
banner_html: False will disable the banner, or setting to a custom
template will insert that template. Default template loads
default_banner.js which does the actual initialization.
2014-10-17 08:28:06 -07:00
Ilya Kreymer
b7a098a9a7 update rules for additional sites 2014-10-17 08:27:56 -07:00
Ilya Kreymer
614938479b jinja2 replay: use ChoiceLoader to properly load either local file
system or package templates
2014-10-16 20:33:17 -07:00
Ilya Kreymer
aecc847ec1 rewrite: separate stream_to_gen and text_rewriting_stream_to_gen
The regular stream_to_gen is much simpler and specifically for
binary/unrewritten content. text_rewriting_stream_to_gen() performs
rewriting. Use fixed buffer of 16384 for read size, allows for better
streaming when using live rewrite
2014-10-16 20:13:53 -07:00
Ilya Kreymer
50bf7d2634 rewrite: move extract_client_cookie to utils for access at rewrite
root cookie_rewriter: keep max-age
add csrf token copying (experimental)
update tests
2014-10-12 03:07:54 -07:00
Ilya Kreymer
498a864441 rewriting: support setting cookie_scope at collection level
js rewriting: add custom url rewrite option to per-url rewrite rules
2014-10-06 10:14:45 -07:00
Ilya Kreymer
f1b3f8c76f cookie rewriter work: ability to set a custom 'root scope' rewriter,
which sets the path of all cookies to pywb root.
Option to enable per url-prefix in rules, still more testing, other
options needed
2014-09-30 12:42:11 -07:00
Ilya Kreymer
7feb0893eb rewrite: add 'application/json' to a separate 'json' regex rewriter type (rewrite links only, no
http), can be customized via rules
wombat: add rewrite_style for rewriting style attrs
query: don't include any filter in latest, custom filter can be used
without any other filters
tests: fix typos in tests
2014-09-30 10:57:25 -07:00
Ilya Kreymer
00efe33870 Merge branch 'master' into develop 2014-09-22 21:15:18 -07:00
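
Illustration for commit ea89702701 (static handler: default 'application/octet-stream', guessed mime only set if not none). This is a minimal sketch of that behavior using the standard library, not the pywb implementation; the names DEFAULT_MIME and content_type_for are hypothetical and do not come from the repository.

```python
import mimetypes

# Hypothetical constant: fallback used when no type can be guessed.
DEFAULT_MIME = 'application/octet-stream'


def content_type_for(path):
    """Return a Content-Type for a static file path.

    The guessed type is used only when it is not None; otherwise
    the generic binary default is returned.
    """
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is not None:
        return guessed
    return DEFAULT_MIME
```

A path with a well-known extension such as 'style.css' gets its guessed type, while a path with an unrecognized extension falls back to the default instead of producing a missing Content-Type.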