backup/pywb - pywb - Source code and issue tracker for Open Eggbert

mirror of https://github.com/webrecorder/pywb.git synced 2025-03-29 00:52:29 +01:00

Author	SHA1	Message	Date
Ilya Kreymer	e4bcef1c8b	rewrite: default HTMLParser entityref and charref are treated as plain data for HTMLRewriter, since they are never rewritten, and to avoid semicolon ambiguity, since no way to determine if there is a ; or not at end. Addresses #43	2014-11-04 12:14:00 -08:00
Ilya Kreymer	4f9310fe4d	rewrite: add support for js rewriting ';http:\\/' urls add 'parse_comments' rule options for parsing comment contents via regex banner: simplify banner insertion check, only insert for top frame, and check for canon_url matching current href at top before redirecting to top replace em_ -> mp_ as default embedded mod	2014-08-05 01:47:52 -07:00
Ilya Kreymer	6e6688beb3	rewrite/testing: add additional test for live rewrite post, invalid post htmlrewrite: annotate untestable sections (unimplemented, 2.6 only exceptions)	2014-08-04 22:51:43 -07:00
Ilya Kreymer	dd9f138bab	disable decoding, by default, of content for html parser	2014-06-27 16:53:33 -07:00
Ilya Kreymer	d7516f4cd7	rewrite: fix <base> rewriting, urlrewriter replacement turn off lxml rewriter by default	2014-06-13 16:44:37 -07:00
Ilya Kreymer	1d674d97d8	pep8 pass!	2014-05-16 22:44:26 -07:00
Ilya Kreymer	2ad41e2b94	rewrite: rewrite data-* attributes if they look like links (http, https, //)	2014-04-22 16:32:36 -07:00
Ilya Kreymer	e1e55ac061	minor tweaks: rewrite 'crossorigin' -> '_crossorigin' param to disable crossorigin as it may interfere with loading rewritten content, add tests for html and lxml parsers add server_cls as optional param to QueryHandler.init_from_config() for easier customization views: dont create template if empty template file specified	2014-04-19 12:04:43 -07:00
Ilya Kreymer	23bb5bd175	rewrite: wombat update 2.0! Using Object.defineProperty() to better override .href and .hash properties when possible. .href returns original url, but on assignment rewrites before redirecting .hash proxies to location.hash Also added: - window.top -> window.WB_wombat_top - document.referrer -> document.WB_wombat_referrer - <source> html tag rewriting	2014-04-18 19:30:48 -07:00
Ilya Kreymer	bfc2e63793	live rewriter: integrate handler with rewrite_live.py module, clean up css, add unit and integration tests clean up cli server now known as 'live-rewrite-server', which performs live rewrite using iframe paradigm	2014-04-09 15:49:55 -07:00
Ilya Kreymer	19f2df4717	refactor: - move is_identity(), is_embed() to wburl from wbrequest - add is_mainpage() predicate - add create_template() to each J2TemplateView to create itself - add HeadInsertView to create a reusable head insert for RewriteContent - add 'mp_' as modifier for frames mode to be used as possible modifier with HTMLRewriter	2014-04-09 15:49:55 -07:00
Ilya Kreymer	d1ad9b5e69	refactor: cleanup HTMLRewriter/LXMLHTMLRewriter close path, single close in base class delegating to _internal_close() Also, HTMLRewriter auto-terminates <script> and <style> tags for consistency with lxml	2014-03-17 20:50:35 -07:00
Ilya Kreymer	f35e82a4d5	ensure final output from close() is encoded! add config option to 'use_lxml_parser' if available, if not, will default to regular parser testing on travis with lxml (not adding to dep yet)	2014-03-17 13:19:51 -07:00
Ilya Kreymer	1404177c6f	fixes for unicode (doctests) remove explicit </html> since lxml does not parse past the </html> tag and adds one anyway (not ideal but only workaround for html after closing tag)	2014-03-17 11:55:45 -07:00
Ilya Kreymer	bd10c6c2d2	first pass -- lxml parser!	2014-03-16 23:12:04 -07:00
Ilya Kreymer	a69d565af5	make pywb.rewrite package pep8-compatible move doctests to test subdir	2014-03-14 16:44:23 -07:00
Ilya Kreymer	584d826f05	rewrite: fix html rewriting, if forcing end </script>, </style>, don't actually output to preserve original wombat: copy over all Location settings wburl: convert :/ -> :// if 2nd slash missing, only check for <scheme>:/ and ignore subsequent slashes	2014-03-08 15:10:35 -08:00
Ilya Kreymer	3718e1d21b	rewrite fixes: html_rewriter do not unescape attrs! rules: don't rewrite past end of block or line	2014-03-06 02:29:52 -08:00
Ilya Kreymer	cc22448cc5	fixes for 2.6 and pypy	2014-03-04 19:11:17 -08:00
Ilya Kreymer	5345459298	pywb 0.2! move to distinct packages: pywb.utils, pywb.cdx, pywb.warc, pywb.util, pywb.rewrite! each package will have its own README and tests shared sample_data and install	2014-02-17 10:01:09 -08:00

20 Commits