mirror of https://github.com/webrecorder/pywb.git synced 2025-03-24 23:19:52 +01:00

99 Commits

Author SHA1 Message Date
Ilya Kreymer
aa80532987 rules: actual disqus fixes.. 2015-06-18 02:40:11 -04:00
Ilya Kreymer
07c2093020 rules: disqus comments work 2015-06-18 02:33:03 -04:00
Ilya Kreymer
2b9e1b97c3 rules: disable tw rewrite rule as it was causing page reloads 2015-06-04 17:31:16 -07:00
Ilya Kreymer
c4dad56681 rules: add custom js for resizing poster on twitter video images 2015-05-26 15:15:11 -07:00
Ilya Kreymer
ee20ac66d6 rules: tw video player rules, disable rewriting
rewrite: tweak location rule
wombat: add getAttribute() override, but disabled for now
store default getAttribute()/setAttribute() to refer internally
2015-05-25 17:52:03 -07:00
Ilya Kreymer
adb9448f27 rules: improved rules for googleplus! 2015-05-22 18:45:50 -07:00
Ilya Kreymer
d5b92dbb3c rules: update rules for yt comments 2015-05-21 17:20:40 -07:00
Ilya Kreymer
179f11198b fuzzy match: look at first occurrence, not last, of match separator
rules: add new rule for yt comments
2015-05-21 23:52:09 +00:00
Ilya Kreymer
40f15cf6ea rules: add location rewrite only rule for disqus
wombat: ensure _orig_setAttribute is still set even if setAttribute rewriting disabled!
2015-05-14 22:32:09 -07:00
Ilya Kreymer
3a0a18b4e4 rules: update rules for yt 2015-05-14 22:32:07 -07:00
Ilya Kreymer
026873e308 rules: add extra fb rule 2015-04-27 00:44:24 -07:00
Ilya Kreymer
dcc2139fc8 fuzzy: add fuzzy match for vine 2015-04-13 13:02:55 -07:00
Ilya Kreymer
0b72bfe911 add 'none' js regex rewriter, which does not rewrite urls or location regexs
add test for none rewriter in test rule
2015-02-11 15:01:29 -08:00
Ilya Kreymer
c47d3ca925 wombat: add mutation observers, addressing #71 and maybe #67
rules: fix regex for yt, add rx for wikimedia
2015-02-03 11:19:41 -08:00
Ilya Kreymer
80fd47ba3e add rules for vine (#62) 2015-01-22 16:45:09 -05:00
Ilya Kreymer
d9c5345d3c rewrite: add support for Cookie request header rewrite to support sites
which require a cookie to be set. req_cookie_rewrite directive can be
set in rules.yaml per url prefix with a list of match/replace regexs
2015-01-03 12:51:09 -08:00
Ilya Kreymer
1684c14cda bump version to 0.7.2
video: disable yt DASH for better proxy and replay (experiment)
2014-12-28 16:34:48 -08:00
Ilya Kreymer
8d6845a552 fuzzy match: add support for specifying regex and args separately for
fuzzy_lookup match
2014-12-26 14:29:51 -08:00
Ilya Kreymer
4c08a6a064 video work: improved yt handling:
- disable yt using yt api, for forced html/flash, disable on load
- use yt error event to detect error
- better fallback on recorded video
use separate cache for range and video info tracking
fix yt rules query to account for & and ?
2014-12-26 13:02:47 -08:00
Ilya Kreymer
e68c0413d1 video rules: update rules for vimeo 2014-12-11 00:20:43 -08:00
Ilya Kreymer
87d7635f6f video: update rules to use new location-only rewriter for YT comments
support
2014-12-07 21:21:51 -08:00
Ilya Kreymer
ab087afa4e Merge branch 'develop' into video, JS rewriter refactoring 2014-12-07 21:11:20 -08:00
Ilya Kreymer
5a11714b41 rewrite: refactor JS rewriters into separate mixins, allowing for
link only, location only, and link + location JS rewriters.
location-only rewriter is new
js_rewrite_location options: all, location, urls (for now)
2014-12-07 21:09:37 -08:00
Ilya Kreymer
c10df57e07 rules: add support for customizing matchType prefix, adding multiple
filters
2014-11-24 11:10:49 -08:00
Ilya Kreymer
fcb90fde86 rules: work on yt rules 2014-11-23 18:39:58 -08:00
Ilya Kreymer
0d191b338f rules: fix rules typo 2014-11-22 18:39:17 -08:00
Ilya Kreymer
c6a2c83b66 rangecache: always bound range, set default bound of 16384
wombat: work on date override, disable for now
head_insert: check for wombat not being inited to avoid undef error
2014-11-05 10:55:46 -08:00
Ilya Kreymer
88f553dce7 video work: live rewrite pings proxy with full rewrite, proxies direct
range request
reorg rangecache to support is_range() check, yt-specific logic
(experimental)
wombat: add date override (experimental)
bump tentative version to 0.7.0!
yt replays work with native player! (though still issues remain)
2014-11-04 22:11:25 -08:00
Ilya Kreymer
72aa921ce5 video: work on domain-specific range cache rewrites 2014-11-04 08:44:45 -08:00
Ilya Kreymer
5b9dcba15f video: add video rewriting using vidrw client side and youtube-dl on the server
add vi_ modifier:
-on record, gets video_info from youtube-dl, sends to proxy,
if any, via PUTMETA to create metadata record
-on playback, fetches special metadata record with video info and
returns to client as json
-vidrw script: fetches video info, if any, and attempts to replace
iframe and embed tags (so far) which are videos
wombat: export extract_url function, fix spaces and use object instance
semantics
2014-11-01 15:41:00 -07:00
Ilya Kreymer
5be65f2945 rules: better rule def, cleanup spacing 2014-10-30 00:10:39 -07:00
Ilya Kreymer
b7a098a9a7 update rules for additional sites 2014-10-17 08:27:56 -07:00
Ilya Kreymer
498a864441 rewriting: support setting cookie_scope at collection level
js rewriting: add custom url rewrite option to per-url rewrite rules
2014-10-06 10:14:45 -07:00
Ilya Kreymer
f1b3f8c76f cookie rewriter work: ability to set a custom 'root scope' rewriter,
which sets the path of all cookies to pywb root.
Option to enable per url-prefix in rules, still more testing, other
options needed
2014-09-30 12:42:11 -07:00
Ilya Kreymer
4c5a7d6bcd rules: use yaml lists in fuzzy rules, update CHANGES.rst 2014-09-21 19:48:14 -07:00
Ilya Kreymer
ec27ccfbb6 fuzzy match rules: to simplify custom fuzzy match use cases, add support
for matching fuzzy match query params as a list
2014-09-21 14:46:10 -07:00
Ilya Kreymer
7ac98fbfe2 cookie rewriter: use relative path for cookie path rewriting, pass
relative path to urlrewriter
rules: add more rules
2014-09-21 13:23:19 -07:00
Ilya Kreymer
4f9310fe4d rewrite: add support for js rewriting ';http:\\/' urls
add 'parse_comments' rule options for parsing comment contents via regex
banner: simplify banner insertion check, only insert for top frame, and check
for canon_url matching current href at top before redirecting to top
replace em_ -> mp_ as default embedded mod
2014-08-05 01:47:52 -07:00
Ilya Kreymer
de65b68edc rules: additions to rules for FB 2014-06-18 16:45:54 -07:00
Ilya Kreymer
e2349a74e2 replay: better POST support via post query append!
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
7d236af7d7 cdx: fix creation and add test for non-surt cdx (pywb-nonsurt/ test)
archiveindexer: -u option to generate non-surt cdx
tests: full test coverage for cdxdomainspecific (fuzzy and custom canon)
2014-05-16 21:16:50 -07:00
Ilya Kreymer
6eef0afb86 add new custom rewriting rule (flickr) 2014-04-20 21:40:27 -07:00
Ilya Kreymer
2c74ea9f23 fuzzy match: make filter string optionally overridable
setup.py: unset PYWB_CONFIG_ENV
2014-03-27 21:43:30 -07:00
Ilya Kreymer
68878fa72a update domain-specific rules to make flickr replay work better! 2014-03-08 15:53:52 -08:00
Ilya Kreymer
4e71a0b772 better rules.yaml fix 2014-03-06 02:51:54 -08:00
Ilya Kreymer
3718e1d21b rewrite fixes: html_rewriter do not unescape attrs!
rules: don't rewrite past end of block or line
2014-03-06 02:29:52 -08:00
Ilya Kreymer
1e3ef6ec5c cdx: add basic test for CustomUrlCanonicalizer for now
(will likely refactor this configuration)
2014-02-28 09:40:51 -08:00
Ilya Kreymer
22f1f78fca cdx: clean up filters, add '~' modifier for contains
rules: fix regex to be lazy not greedy, turn off unneeded custom
canonicalizer (need tests for custom canon)
cleanup fuzzy match query
fix data package in setup.py
2014-02-27 18:22:10 +00:00
Ilya Kreymer
5a41f59f39 new unified config system, via rules.yaml!
contains configs for cdx canon, fuzzy matching and rewriting!
rewriting: ability to add custom regexs per domain
also, ability to toggle js rewriting and custom rewriting file
(default is wombat.js)
2014-02-26 18:02:01 -08:00