Ilya Kreymer
d2c37f7d91
html parser: attr_value can now be None -- default to '' for string ops, write attr w/o assignment
2016-06-12 01:38:03 -04:00
Ilya Kreymer
0f530a3e0e
dependencies: remove pyamf, update to latest surt (0.3.0)
2016-06-12 00:44:52 -04:00
Ilya Kreymer
9f299eb8e9
amf rewriting: move to separate file, mark as experimental, and don't include as default (for now)
2016-06-12 00:40:35 -04:00
Ilya Kreymer
527a3bc89c
bufferedreader: be lenient of partially decompressed data: return what was decompressed, rather than just throw exception
...
esp. useful if record was decompressed, but an error in crc check
may add additional options for toggling 'leniency' if needed
2016-06-12 00:37:14 -04:00
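The lenient behaviour described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual bufferedreader code: the function name `decompress_lenient` and its `chunk_size` parameter are hypothetical, and the input is assumed to be gzip data held in memory.

```python
import zlib


def decompress_lenient(data, chunk_size=1024):
    """Decompress gzip bytes, keeping whatever was decoded before an error.

    Hypothetical helper: returns (decompressed_bytes, error_or_None)
    instead of raising when the stream turns out to be damaged.
    """
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS = gzip container
    out = []
    try:
        # Feed the input in chunks so output from earlier chunks survives
        # an error raised by a later chunk (eg. a failed crc check).
        for start in range(0, len(data), chunk_size):
            out.append(decomp.decompress(data[start:start + chunk_size]))
        out.append(decomp.flush())
    except zlib.error as err:
        return b''.join(out), err
    return b''.join(out), None
```

Output from the chunk that triggers the error is lost, so a smaller `chunk_size` recovers more data at the cost of more calls.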
Ilya Kreymer
197ed5be98
loader: profile urls: ensure the profile prefix is removed from url before passing to loader, #180
2016-06-04 14:09:18 -04:00
chdorner
b54347f8d1
Allow rewriting of empty srcset attributes
...
Strictly speaking a `srcset` attribute must consist of one or more
strings
(http://w3c.github.io/html/semantics-embedded-content.html#element-attrdef-img-srcset )
However, there are websites out there that specify an empty string as the
value.
This commit makes sure that the rewriting does not break and just
returns an empty string.
2016-06-01 11:31:26 +02:00
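A rough sketch of the empty-`srcset` handling described in the commit above. The names `rewrite_srcset` and `rewrite_url` are hypothetical, and splitting on commas is a simplification (real `srcset` parsing is more involved); the point is only that an empty or missing value returns an empty string instead of failing.

```python
def rewrite_srcset(value, rewrite_url):
    """Rewrite each url in a srcset value; hypothetical helper.

    An empty (or None) value is returned as '' rather than raising.
    """
    if not value:
        return ''

    rewritten = []
    for candidate in value.split(','):
        candidate = candidate.strip()
        if not candidate:
            continue
        # A candidate is a url optionally followed by a descriptor (eg. "2x").
        parts = candidate.split(None, 1)
        parts[0] = rewrite_url(parts[0])
        rewritten.append(' '.join(parts))
    return ', '.join(rewritten)
```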
Ilya Kreymer
e28f294302
wombat: ensure window.open() rewrite happens even if open is not in prototype
...
rewrite mod: allow empty "" as set mod, check for undefined
2016-05-24 17:55:17 -07:00
Ilya Kreymer
f858be4d7d
Merge branch 'frame-postMessage' into develop
2016-05-24 15:40:51 -07:00
Ilya Kreymer
84c829467b
framed replay: use postMessage() instead of custom function to notify of replay frame changing url, include different type of change, eg. load, replaceState, pushState, #181
2016-05-23 12:10:10 -07:00
Ilya Kreymer
8ef6eb97b8
cdx: encoding: use to_native_str() consistently for better py2 compat
2016-05-23 11:47:44 -07:00
Ilya Kreymer
8ad66249c7
blockloader: support for loader profiles, specified via 'profile+scheme://...' urls. Profiles specify additional settings (eg. credentials) that are not included in the url.
...
To enable custom profiles, pass BlockLoader.set_profile_loader(callable) a callable that will return custom config, addresses #180
2016-05-18 16:34:58 -07:00
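For illustration, splitting a 'profile+scheme://...' url into its profile name and the plain url might look like the sketch below. `split_profile` is a hypothetical helper, not part of BlockLoader, and it assumes the profile name itself contains no '+'.

```python
def split_profile(url):
    """Split 'profile+scheme://rest' into (profile, 'scheme://rest').

    Hypothetical helper; returns (None, url) when no profile prefix is present.
    """
    scheme, sep, rest = url.partition('://')
    if not sep or '+' not in scheme:
        return None, url

    profile, _, real_scheme = scheme.partition('+')
    # The loader should only ever see the url without the profile prefix.
    return profile, real_scheme + '://' + rest
```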
Ilya Kreymer
d11bd444ad
s3 loader: unurlencode username/password
2016-05-17 19:24:14 -07:00
Ilya Kreymer
119074e0ee
s3 loader improvements: support AWS cred in username and password part of url, stream s3 response directly
2016-05-17 18:55:10 -07:00
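The two s3 loader commits above (credentials in the url, then url-decoding them) could be sketched like this. `parse_s3_url` and the returned dict keys are hypothetical names chosen for the example; only the standard-library `urlsplit` and `unquote` calls are real.

```python
from urllib.parse import urlsplit, unquote


def parse_s3_url(url):
    """Extract bucket, key and optional credentials from an s3:// url.

    Hypothetical helper: the username/password parts are url-decoded,
    since secret keys may contain characters such as '/' or '+'.
    """
    parts = urlsplit(url)
    return {
        'bucket': parts.hostname,
        'key': parts.path.lstrip('/'),
        'access_key': unquote(parts.username) if parts.username else None,
        'secret_key': unquote(parts.password) if parts.password else None,
    }
```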
Ilya Kreymer
94afab0bb2
wombat rewrite: don't add duplicate slash in rel-url resolve
2016-05-17 18:53:00 -07:00
Ilya Kreymer
10d8e4b3be
bump version to 0.31.0
2016-05-17 18:38:57 -07:00
Ilya Kreymer
87da25c703
post request mapping improvements: work on #178, including:
...
- mapping multipart/form-data same as x-www-form-urlencoded
- parsing application/x-amf with pyamf
- RewriteContentAMF for rewriting AMF response to match request
- default encoding of other POST data as base64 encoded __wb_post_data param
2016-05-06 10:19:08 -07:00
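The last item in the list above (base64-encoding other POST data into a `__wb_post_data` param) might look roughly like the sketch below. The param name comes from the commit message; the function name `append_post_query` and the exact query layout are assumptions for illustration.

```python
import base64
from urllib.parse import urlencode


def append_post_query(url, post_data):
    """Append POST bytes to a url as a base64-encoded query param.

    Hypothetical helper: the layout of the resulting query is an assumption.
    """
    encoded = base64.b64encode(post_data).decode('ascii')
    sep = '&' if '?' in url else '?'
    return url + sep + urlencode({'__wb_post_data': encoded})
```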
Ilya Kreymer
e5e7c5a7df
wombat: ensure Math.random() overrides use the current window
2016-05-06 09:48:38 -07:00
Ilya Kreymer
1e7d4d27e3
bump version to 0.30.2
2016-05-06 09:43:11 -07:00
Ilya Kreymer
8e473f01fa
add changelist for 0.30.1
2016-05-04 11:33:43 -07:00
Ilya Kreymer
2795802c77
recordloader: for request/response/revisit records, only parse urls starting with http:/https: as http
2016-05-04 11:20:38 -07:00
Ilya Kreymer
af920d77a0
rules: add fuzzy rules for TW video
2016-05-03 17:33:13 -07:00
Ilya Kreymer
07cc4fae0b
bump version to 0.30.1
2016-05-03 17:32:35 -07:00
Ilya Kreymer
3a3110efdb
fix README typo
2016-05-01 11:57:37 -07:00
Ilya Kreymer
e458bdcc77
CHANGES tweaks
2016-05-01 11:53:23 -07:00
Ilya Kreymer
033909efe0
wombat: set version to 1.12
...
return 'null' for frameElement override instead of undefined
2016-05-01 11:46:36 -07:00
Ilya Kreymer
4df45b4338
Update CHANGES for 0.30.0!
2016-05-01 11:45:01 -07:00
Ilya Kreymer
dd8ac42f2c
encoding: ensure cdx fields are in the native encoding, except filename, which should stay as unicode in py2 for further use
2016-04-30 16:08:43 -07:00
Ilya Kreymer
e8c77c0538
encoding: encode before quote
...
setup: enable zip_safe=True again
2016-04-30 15:15:35 -07:00
Ilya Kreymer
ab8b4efaec
encoding: cdx: only quote-encode 'url'
...
warc: ensure path index loads are utf-8 decoded
2016-04-30 14:38:48 -07:00
Ilya Kreymer
67a02613e7
remove: remove unused/extraneous __iter__
2016-04-30 01:43:53 -07:00
Ilya Kreymer
1c97a67763
rewrite client-side improvements:
...
add WB_wombat_frameElement Object prototype property to support frameElement rewriting
document.domain: allow changing to higher-level domain
rewrite_elem: also rewrite <form> action and <input> value, if they are absolute urls
2016-04-30 01:43:40 -07:00
Ilya Kreymer
1bea9d73ed
rewrite: rewrite .frameElement -> WB_wombat_frameElement server-side to handle cases when default frameElement can not be overridden
2016-04-30 01:36:26 -07:00
Ilya Kreymer
37609ebdc9
rewrite: support custom cookie_rewriter passed to 'rewrite_content'
2016-04-30 01:35:55 -07:00
Ilya Kreymer
e669ecba15
rewrite: html rewrite fix such that head insert is placed before other <script> tags even if no head
2016-04-30 01:32:16 -07:00
Ilya Kreymer
a1e0c29a85
rules: add rule for twitter timeline
2016-04-26 17:02:54 -07:00
Ilya Kreymer
658303caad
rewrite headers: undo not rewriting x- headers, needs more research and exclusions (eg. x-frame-options)
2016-04-26 13:11:08 -07:00
Ilya Kreymer
cf6cfc0c44
tests: fix cookie rewriter tests to exclude 2.6
2016-04-26 10:32:43 -07:00
Ilya Kreymer
4a60e15577
cookie rewrite improvements: #177
...
- don't remove max-age and expires if in 'live' rewrite mode (flag set on urlrewriter)
- remove secure only if replay prefix is not https
- fix expires UTC->GMT as cookie parsing chokes on UTC
- other rewriting: don't append rewrite prefix to x- headers
tests: add more cookie rewriting tests
2016-04-26 09:45:23 -07:00
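As an illustration of the "expires UTC->GMT" item in the list above, a string-level fix could be sketched as follows. `normalize_expires` is a hypothetical name and the regex is an assumption about how such a fix might be written, not the actual cookie rewriter code.

```python
import re

# Matches the 'expires' attribute up to a trailing 'UTC' timezone token.
_EXPIRES_UTC = re.compile(r'(expires\s*=[^;]*?)\bUTC\b', re.IGNORECASE)


def normalize_expires(header_value):
    """Replace 'UTC' with 'GMT' inside the expires attribute only.

    Hypothetical helper; other parts of the cookie string are left untouched.
    """
    return _EXPIRES_UTC.sub(r'\1GMT', header_value)
```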
Ilya Kreymer
61381fcac6
wombat rewrite: remove cookie domain if hostname is an IP address
2016-04-07 15:53:26 -07:00
Ilya Kreymer
95a212ed79
wombat rewrite: add custom X-Pywb-Requested-With header which turns off rewriting and is never sent upstream
2016-04-06 12:05:53 -07:00
Ilya Kreymer
4b753d2612
Merge branch '0.11.5' into develop
2016-03-31 13:16:53 -07:00
Ilya Kreymer
9381acdaaf
Merge branch 'zip-loc-fix' into develop
2016-03-31 13:14:39 -07:00
Ilya Kreymer
b901343067
update CHANGES.rst
2016-03-31 13:14:04 -07:00
Ilya Kreymer
e5ef51363c
zipnum: backport fix for #173, paths specified in a zipnum .loc file are relative to the .loc file, not to the working dir of the application
...
warnings: don't warn on .gz cdx files
2016-03-31 13:09:57 -07:00
Ilya Kreymer
ba7ac56230
release: bump to 0.11.5, update version and changelist
2016-03-31 12:45:16 -07:00
Ilya Kreymer
b5cf79072d
loaders: ensure loader stream closed in load_yaml_config()
2016-03-31 12:42:23 -07:00
Ilya Kreymer
8e51ddc544
archiveiterator: don't reuse entries when post-append, as they may be cached for merge -- can break if records do not alternate request/response
...
fixes #175
2016-03-31 12:42:23 -07:00
Ilya Kreymer
f8f0c3a76e
loader: ensure file closed in load_yaml_config()
2016-03-27 13:56:19 -04:00
Ilya Kreymer
3eac9be00b
warc: ArchiveLoadFailed: add space in exception string
2016-03-26 22:28:38 -04:00
Ilya Kreymer
c5a166f601
tests: use httpbin.org instead of example.com/ for range-request test
2016-03-26 22:28:04 -04:00