mirror of https://github.com/webrecorder/pywb.git synced 2025-03-22 03:21:42 +01:00

74 Commits

Author SHA1 Message Date
Ilya Kreymer
b2f3a580c2 wombat work:
- for prototype override, ensure object exists
- for domain setter, ensure location exists, default to window
rules: expand facebook rule to match fbid also
2017-08-22 13:51:10 -07:00
Ilya Kreymer
1360723f95 Fuzzy Rules Improvements (#231)
* separate default rules config for query matching: 'not_exts', 'mimes', and new 'url_normalize'
- regexes in 'url_normalize' applied on each cdx entry to see if there's a match with requested url
- jsonp: allow for '/* */' comments prefix in jsonp (experimental)
- fuzzy rule: add rule for '\w+=jquery[\d]+' collapsing, supports any callback name
- fuzzy rule: add rule for more generic 'cache busting' params, 'bust' in name, possible timestamp in value (experimental)
- fuzzy rule: add ga utm_* rule & tests
tests: improve fuzzy matcher tests to use indexing system, test all new rules
tests: add jsonp_rewriter tests
config: use_js_obj_proxy=true in default config.yaml, setting added to each collection's metadata
2017-08-21 11:01:31 -07:00
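The 'url_normalize' idea described in the commit above can be illustrated with a minimal sketch. This is not pywb's actual implementation: the pattern list, `normalize_url`, and `fuzzy_equal` are hypothetical names, and the regexes are only loosely modeled on the rules named in the commit message (jQuery-style callback collapsing and ga utm_* params).

```python
import re

# Hypothetical patterns, loosely modeled on the commit message above.
# These are assumptions for illustration, not pywb's shipped rules.
NORMALIZE_PATTERNS = [
    re.compile(r"[?&]\w+=jquery\d+", re.IGNORECASE),  # any callback name, jquery + digits
    re.compile(r"[?&]utm_\w+=[^&]*"),                 # ga campaign params (utm_*)
]


def normalize_url(url):
    """Return a comparison key with volatile query params stripped.

    The result is only meant for matching and may not be a valid URL.
    """
    for pattern in NORMALIZE_PATTERNS:
        url = pattern.sub("", url)
    return url


def fuzzy_equal(requested, candidate):
    """True if the two URLs are the same once volatile params are ignored."""
    return normalize_url(requested) == normalize_url(candidate)
```

In this sketch, the requested URL and each candidate entry are reduced to the same kind of key before comparison, so two captures that differ only in a generated callback name or tracking parameter are treated as a match.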
Ilya Kreymer
d8f035642b fuzzymatching: add new ext based rule. fuzzy match if url has an ext except those on the 'not_ext' list (#218) 2017-05-19 10:53:09 -07:00
Ilya Kreymer
762f669d13 rules: fuzzy match update:
- ignore all query args for flash files
- ignore cb= param for all urls
2017-05-12 08:55:03 -07:00
Ilya Kreymer
15ad56c024 rewrite dash: support for using custom rewriting function (for FB)
rewrite_fb_dash() added for rewriting dash xml, embedded in js, embedded in html
todo: refactor to make more general support for custom rewriting functions
regex_rewriter: add ':' to exclude from rewrite again
2017-03-21 11:18:53 -07:00
Ilya Kreymer
a82cfc1ab2 rewriter: add rewrite_dash for rewriting DASH and HLS manifests!
rewriter: refactor to use mixins to extend base rewriter (todo: more refactoring)
fuzzy-matcher: support for additional 'match_filters' to filter fuzzy results via optional regexes by mime type,
eg. allow more lenient fuzzy matching on DASH manifests than other resources (for now)
fuzzy-matching: add WebAgg-Fuzzy-Match response header if response is fuzzy matched, redirect to exact match in rewriterapp
2017-03-20 14:41:12 -07:00
Ilya Kreymer
0f0c20a03a fuzzy matching: new, clean fuzzy matcher implementation for webagg
rules: default rule: fuzzy match urls ignoring prefix match (needs more testing)
tests: update tests for new broad fuzzy match rule
2017-03-14 11:44:15 -07:00
Ilya Kreymer
cec0db1bdd rules: instagram rules tweak, ignore query args 2016-11-14 13:19:26 -08:00
Ilya Kreymer
41f6ca9bb6 rules: update rules for medium, instagram
bump version to 0.33.1
2016-11-13 22:50:53 -08:00
Ilya Kreymer
86cbb366f3 rules: undo yt rules change (will revisit later) 2016-09-15 10:01:36 -07:00
Ilya Kreymer
70fdaae2b3 rules: rewrite location string for periscope js 2016-09-12 20:07:14 -07:00
Ilya Kreymer
782f95fa97 rules: rules for yt video info update 2016-07-24 19:39:43 -04:00
Ilya Kreymer
af920d77a0 rules: add fuzzy rules for TW video 2016-05-03 17:33:13 -07:00
Ilya Kreymer
a1e0c29a85 rules: add rule for twitter timeline 2016-04-26 17:02:54 -07:00
Ilya Kreymer
93045fb39f rules: fuzzy rule for fastly.. 2015-10-16 09:43:22 -07:00
Ilya Kreymer
6efff4cd8f rules: cleanup, remove obsolete rules 2015-10-11 23:50:38 -07:00
Ilya Kreymer
84f49e3291 rule customization: add calendar search fuzzy match for all blogspot.com 2015-10-06 00:05:20 -07:00
Ilya Kreymer
efc690ec97 rules: improve yt rules! disable dash directly html5player 2015-09-14 19:25:32 -07:00
Ilya Kreymer
8ab342c4ca wombat: actually enable style overrides, use CSS2Declaration for FF, keep old rule in place for now 2015-08-09 00:14:26 -07:00
Ilya Kreymer
4b4d7bbc27 wombat: improved style rewriting: override CSSStyleDeclaration params directly to avoid mutation observers,
document.write: override text content of <style> elements, and newly appended Text content added as children
rules: disable special cases rules no longer needed due to improved css rewriting
2015-08-08 23:19:43 -07:00
Ilya Kreymer
6bf6a02868 tests: add explicit 'js_rewrite_location: all' rule for testing all-rewrite (as not default anymore) 2015-08-07 12:02:48 -07:00
Ilya Kreymer
a3c8698cc3 rewrite: disable server-side url rewriting in JS by default! now handled by client-side rewriting 2015-08-07 11:37:43 -07:00
Ilya Kreymer
cee3c8cb61 new wombat! refactor of rewriting:
- use defineProperty overrides on element prototypes
- postMessage() rework: store actual origin with helper function __WB_pmw(window), from
server side rewrite
- Use window.URL (or external jsurl script) to override all properties of HTMLAnchorElement,
override getAttribute() to return original
- rename window -> $wbwindow
2015-07-30 15:10:13 -07:00
Ilya Kreymer
9333ebc843 rules: tweak better twitter rules, more limited custom rules, hopefully fix inline video 2015-07-03 11:53:45 -07:00
Ilya Kreymer
69f6354934 fix typo in rules 2015-06-18 02:49:26 -04:00
Ilya Kreymer
aa80532987 rules: actual disqus fixes.. 2015-06-18 02:40:11 -04:00
Ilya Kreymer
07c2093020 rules: disqus comments work 2015-06-18 02:33:03 -04:00
Ilya Kreymer
2b9e1b97c3 rules: disable tw rewrite rule as it was causing page reloads 2015-06-04 17:31:16 -07:00
Ilya Kreymer
c4dad56681 rules: add custom js for resizing poster on twitter video images 2015-05-26 15:15:11 -07:00
Ilya Kreymer
ee20ac66d6 rules: tw video player rules, disable rewriting
rewrite: tweak location rule
wombat: add getAttribute() override, but disabled for now
store default getAttribute()/setAttribute() to refer internally
2015-05-25 17:52:03 -07:00
Ilya Kreymer
adb9448f27 rules: improved rules for googleplus! 2015-05-22 18:45:50 -07:00
Ilya Kreymer
d5b92dbb3c rules: update rules for yt comments 2015-05-21 17:20:40 -07:00
Ilya Kreymer
179f11198b fuzzy match: look at first occurrence, not last, of match separator
rules: add new rule for yt comments
2015-05-21 23:52:09 +00:00
Ilya Kreymer
40f15cf6ea rules: add location rewrite only rule for disqus
wombat: ensure _orig_setAttribute is still set even if setAttribute rewriting disabled!
2015-05-14 22:32:09 -07:00
Ilya Kreymer
3a0a18b4e4 rules: update rules for yt 2015-05-14 22:32:07 -07:00
Ilya Kreymer
026873e308 rules: add extra fb rule 2015-04-27 00:44:24 -07:00
Ilya Kreymer
dcc2139fc8 fuzzy: add fuzzy match for vine 2015-04-13 13:02:55 -07:00
Ilya Kreymer
0b72bfe911 add 'none' js regex rewriter, which does not rewrite urls or location regexes
add test for none rewriter in test rule
2015-02-11 15:01:29 -08:00
Ilya Kreymer
c47d3ca925 wombat: add mutation observers, addressing #71 and maybe #67
rules: fix regex for yt, add rx for wikimedia
2015-02-03 11:19:41 -08:00
Ilya Kreymer
80fd47ba3e add rules for vine (#62) 2015-01-22 16:45:09 -05:00
Ilya Kreymer
d9c5345d3c rewrite: add support for Cookie request header rewrite to support sites
which require a cookie to be set. req_cookie_rewrite directive can be
set in rules.yaml per url prefix with a list of match/replace regexes
2015-01-03 12:51:09 -08:00
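A rough sketch of what applying a list of match/replace regexes to a Cookie request header might look like. The rule layout (dicts with 'match' and 'replace' keys), the example cookie name, and the function name `rewrite_cookie_header` are assumptions for illustration only; the commit message names the req_cookie_rewrite directive but does not show its format.

```python
import re

# Hypothetical rule list: the 'match'/'replace' key names and the
# sessionid cookie are assumptions, not taken from pywb's rules.yaml.
COOKIE_RULES = [
    {"match": r"sessionid=[^;]*", "replace": "sessionid=replay"},
]


def rewrite_cookie_header(cookie, rules):
    """Apply each match/replace regex pair, in order, to a Cookie header value."""
    for rule in rules:
        cookie = re.sub(rule["match"], rule["replace"], cookie)
    return cookie
```

Rules are applied in list order, so a later rule sees the output of an earlier one.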
Ilya Kreymer
1684c14cda bump version to 0.7.2
video: disable yt DASH for better proxy and replay (experiment)
2014-12-28 16:34:48 -08:00
Ilya Kreymer
8d6845a552 fuzzy match: add support for specifying regex and args separately for
fuzzy_lookup match
2014-12-26 14:29:51 -08:00
Ilya Kreymer
4c08a6a064 video work: improved yt handling:
- disable yt using yt api, for forced html/flash, disable on load
- use yt error event to detect error
- better fallback on recorded video
use separate cache for range and video info tracking
fix yt rules query to account for & and ?
2014-12-26 13:02:47 -08:00
Ilya Kreymer
e68c0413d1 video rules: update rules for vimeo 2014-12-11 00:20:43 -08:00
Ilya Kreymer
87d7635f6f video: update rules to use new location-only rewriter for YT comments
support
2014-12-07 21:21:51 -08:00
Ilya Kreymer
ab087afa4e Merge branch 'develop' into video, JS rewriter refactoring 2014-12-07 21:11:20 -08:00
Ilya Kreymer
5a11714b41 rewrite: refactor JS rewriters into separate mixins, allowing for
link only, location only, and link + location JS rewriters.
location-only rewriter is new
js_rewrite_location options: all, location, urls (for now)
2014-12-07 21:09:37 -08:00
Ilya Kreymer
c10df57e07 rules: add support for customizing matchType prefix, adding multiple
filters
2014-11-24 11:10:49 -08:00
Ilya Kreymer
fcb90fde86 rules: work on yt rules 2014-11-23 18:39:58 -08:00