mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 16:14:48 +01:00

1750 Commits

Author SHA1 Message Date
Ilya Kreymer
197ed5be98 loader: profile urls: ensure the profile prefix is removed from url before passing to loader, #180 2016-06-04 14:09:18 -04:00
chdorner
b54347f8d1 Allow rewriting of empty srcset attributes
Strictly speaking, a `srcset` attribute must consist of one or more strings
(http://w3c.github.io/html/semantics-embedded-content.html#element-attrdef-img-srcset).
However, there are websites out there that specify an empty string as the
value.

This commit makes sure that the rewriting does not break and just
returns an empty string.
2016-06-01 11:31:26 +02:00
Ilya Kreymer
d7c74b68de video loader support: add VideoLoader, which uses youtube-dl to create a metadata record
of video info. Activated with explicit content_type param 'application/vnd.youtube-dl_formats+json'
2016-05-28 15:01:33 -07:00
Ilya Kreymer
30f9d0aca7 recorder put custom record: add support for put/post of a custom record. If put_record= param is included, the request body
is written to the specified record type.
move record creation functions to the warcwriter
add tests for custom record
2016-05-26 20:49:40 -07:00
Ilya Kreymer
ea3efdf84d responseloader: use PreparedRequest() to ensure url properly formatted
tests: update tests for latest, live data
2016-05-24 18:01:44 -07:00
Ilya Kreymer
e28f294302 wombat: ensure window.open() rewrite happens even if open is not in prototype
rewrite mod: allow empty "" as set mod, check for undefined
2016-05-24 17:55:17 -07:00
Ilya Kreymer
f858be4d7d Merge branch 'frame-postMessage' into develop 2016-05-24 15:40:51 -07:00
Ilya Kreymer
84c829467b framed replay: use postMessage() instead of custom function to notify of replay frame changing url, include different type of change, eg. load, replaceState, pushState, #181 2016-05-23 12:10:10 -07:00
Ilya Kreymer
8ef6eb97b8 cdx: encoding: use to_native_str() consistently for better py2 compat 2016-05-23 11:47:44 -07:00
Ilya Kreymer
80d9805a58 webagg: tests: flush fakeredis for reentrancy
utils: add load_config() with option for main and override configs
2016-05-19 17:01:09 -07:00
Ilya Kreymer
8ad66249c7 blockloader: support for loader profiles, specified via 'profile+scheme://...' urls. Profiles specify additional settings (eg. credentials) that are not included in the url. To enable custom profiles, BlockLoader.set_profile_loader(callable) to a callable that will return custom config, addresses #180
2016-05-18 16:34:58 -07:00
Ilya Kreymer
d11bd444ad s3 loader: unurlencode username/password 2016-05-17 19:24:14 -07:00
Ilya Kreymer
119074e0ee s3 loader improvements: support AWS cred in username and password part of url, stream s3 response directly 2016-05-17 18:55:10 -07:00
Ilya Kreymer
94afab0bb2 wombat rewrite: don't add duplicate slash in rel-url resolve 2016-05-17 18:53:00 -07:00
Ilya Kreymer
10d8e4b3be bump version to 0.31.0 2016-05-17 18:38:57 -07:00
Ilya Kreymer
45c8fcddbd recorder: add max_idle_secs / close_idle_files() to close any open files that have not been modified longer than set threshold, in prep for webrecorder/webrecorder#92
indexer: add 'full_warc_prefix' for setting full path prefix in add_warc_file() (eg. for http load) for webrecorder/webrecorder#95
2016-05-11 21:40:02 -07:00
Ilya Kreymer
94d6098238 app: separate json_encode() func
compat: py2 fixes
2016-05-11 11:38:59 -07:00
Ilya Kreymer
c45f5cb749 webagg: use werkzeug routing instead of wrapping Bottle app 2016-05-10 16:31:44 -07:00
Ilya Kreymer
464eca2fa0 test apps: enable debugging for test apps
test recorder: write to a temp dir for each run
2016-05-06 16:33:18 -07:00
Ilya Kreymer
e64ae780c6 urlrewrite: improve POST request support for ikreymer/pywb#178 2016-05-06 16:32:13 -07:00
Ilya Kreymer
87da25c703 post request mapping improvements: work on #178, including:
- mapping multipart/form-data same as x-www-form-urlencoded
- parsing application/x-amf with pyamf
- RewriteContentAMF for rewriting AMF response to match request
- default encoding of other POST data as base64 encoded __wb_post_data param
2016-05-06 10:19:08 -07:00
Ilya Kreymer
e5e7c5a7df wombat: ensure Math.random() overrides use the current window 2016-05-06 09:48:38 -07:00
Ilya Kreymer
1e7d4d27e3 bump version to 0.30.2 2016-05-06 09:43:11 -07:00
Ilya Kreymer
ab3af90df2 cookie_tracker: add support for redis-based subdomain cookie tracker, which temp caches cookies with Domain= set in redis and passes them upstream
when rewriting. addresses webrecorder/webrecorder#79
2016-05-04 16:39:47 -07:00
Ilya Kreymer
8e473f01fa add changelist for 0.30.1 2016-05-04 11:33:43 -07:00
Ilya Kreymer
2795802c77 recordloader: for request/response/revisit records, only parse urls starting with http:/https: as http 2016-05-04 11:20:38 -07:00
Ilya Kreymer
af920d77a0 rules: add fuzzy rules for TW video 2016-05-03 17:33:13 -07:00
Ilya Kreymer
07cc4fae0b bump version to 0.30.1 2016-05-03 17:32:35 -07:00
Ilya Kreymer
3a3110efdb fix README typo 2016-05-01 11:57:37 -07:00
Ilya Kreymer
e458bdcc77 CHANGES tweaks 2016-05-01 11:53:23 -07:00
Ilya Kreymer
033909efe0 wombat: set version to 1.12
return 'null' for frameElement override instead of undefined
2016-05-01 11:46:36 -07:00
Ilya Kreymer
4df45b4338 Update CHANGES for 0.30.0! 2016-05-01 11:45:01 -07:00
Ilya Kreymer
dd8ac42f2c encoding: ensure cdx fields are in the native encoding, except filename, which should stay as unicode in py2 for further use 2016-04-30 16:08:43 -07:00
Ilya Kreymer
e8c77c0538 encoding: encode before quote
setup: enable zip_safe=True again
2016-04-30 15:15:35 -07:00
Ilya Kreymer
ab8b4efaec encoding: cdx: only quote-encode 'url'
warc: ensure path index loads are utf-8 decoded
2016-04-30 14:38:48 -07:00
Ilya Kreymer
228ca58c5b recorder: actually fix content-type on warcinfo, add to test! 2016-04-30 13:07:53 -07:00
Ilya Kreymer
0fbae1c7f8 recorder: ensure warcinfo record has a content-type 2016-04-30 10:19:20 -07:00
Ilya Kreymer
67a02613e7 remove: remove unused/extraneous __iter__ 2016-04-30 01:43:53 -07:00
Ilya Kreymer
1c97a67763 rewrite client-side improvements:
add WB_wombat_frameElement Object prototype property to support frameElement rewriting
document.domain: allow changing to higher-level domain
rewrite_elem: also rewrite <form> action and <input> value, if they are absolute urls
2016-04-30 01:43:40 -07:00
Ilya Kreymer
1bea9d73ed rewrite: rewrite .frameElement -> WB_wombat_frameElement server-side to handle cases when default frameElement can not be overridden 2016-04-30 01:36:26 -07:00
Ilya Kreymer
37609ebdc9 rewrite: support custom cookie_rewriter passed to 'rewrite_content' 2016-04-30 01:35:55 -07:00
Ilya Kreymer
e669ecba15 rewrite: html rewrite fix such that head insert is placed before other <script> tags even if no head 2016-04-30 01:32:16 -07:00
Ilya Kreymer
7a0dd463cd webagg: responseloader: use urllib3 directly instead of requests to
take advantage of connection pooling w/o storing/sharing cookies
2016-04-27 10:16:54 -07:00
Ilya Kreymer
9010e52663 urlrewrite: refactor simpleapp to support live/record/replay 2016-04-27 10:15:48 -07:00
Ilya Kreymer
f119d05724 recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options) 2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44 tests: fix cookie rewriter tests to exclude 2.6 2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577 cookie rewrite improvements: #177
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
Ilya Kreymer
a82e2785c7 tests: add basic test for rewriterapp 2016-04-25 14:29:28 -07:00