mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

451 Commits

Author SHA1 Message Date
Ilya Kreymer
d1ad9b5e69 refactor: cleanup HTMLRewriter/LXMLHTMLRewriter close path,
single close in base class delegating to _internal_close()
Also, HTMLRewriter auto-terminates <script> and <style> tags
for consistency with lxml
2014-03-17 20:50:35 -07:00
Ilya Kreymer
10c84d8354 embed rewriting: add 'em_' flag for all regex-based rewrites
(js, css, xml) to be able to distinguish between embeds and non-embeds
more conclusively
wbrequest: add is_embed(), is_identity() properties
update tests
don't insert html banner if detected as an embed
2014-03-17 19:36:25 -07:00
Ilya Kreymer
52d99aef57 misc fixes: RemoteCDXServer throws NotFoundException on 404
fix typo in handlers
make WBHandler overridable in pywb_init
make perms_policy optional in IndexReader
2014-03-17 17:35:10 -07:00
Ilya Kreymer
2e7b17ed56 cleanup: move lxml tests to separate test dir, separate html, lxml html and regex
tests into separate files
fix lxml toggle in rewriterrules
2014-03-17 15:30:45 -07:00
Ilya Kreymer
f35e82a4d5 ensure final output from close() is encoded!
add config option 'use_lxml_parser' to use lxml if available; if not,
will default to regular parser
testing on travis with lxml (not adding to dep yet)
2014-03-17 13:19:51 -07:00
Ilya Kreymer
1404177c6f fixes for unicode (doctests)
remove explicit </html> since lxml does not parse past the </html>
tag and adds one anyway (not ideal but only workaround for html after closing tag)
2014-03-17 11:55:45 -07:00
Ilya Kreymer
23d60b0bb8 more work on lxml parser.. always write
start/end tags..
rewriterules: experiment defaulting to lxml if possible!
2014-03-17 09:48:31 -07:00
Ilya Kreymer
bd10c6c2d2 first pass -- lxml parser! 2014-03-16 23:12:04 -07:00
Ilya Kreymer
b0a7cafe6d update default static path to: pywb/static/ 2014-03-14 18:37:03 -07:00
Ilya Kreymer
6461af030b refactoring: clean up handlers and replay_views for pep8
use BlockLoader().load for StaticHandler static file resolving
update static paths to point to pywb/static instead of static
2014-03-14 18:17:22 -07:00
Ilya Kreymer
a69d565af5 make pywb.rewrite package pep8-compatible
move doctests to test subdir
2014-03-14 16:44:23 -07:00
Ilya Kreymer
bfffac45b0 remove reference to deleted file wbexceptions.py 2014-03-14 11:22:50 -07:00
Ilya Kreymer
cb244a8c25 more readme tweaks 2014-03-14 11:19:05 -07:00
Ilya Kreymer
535cbc6dde update README 2014-03-14 11:05:05 -07:00
Ilya Kreymer
14a12f95b2 pep8 fixes, improve docs for proxy
move CaptureException into replay_views
2014-03-14 11:02:03 -07:00
Ilya Kreymer
bdcda1df6f add test config for memento #10 2014-03-14 11:01:47 -07:00
Ilya Kreymer
a1ab54c340 first pass at memento support #10!
memento support enabled by default, togglable via 'enable_memento' config property
supporting timegate and memento apis, no timemap yet
supporting pattern 2.3 for archival and pattern 1.3 for proxy modes
also:
simplify exception hierarchy a bit more, move down to utils
make WbRequest and WbResponse extensible with mixins (eg for memento)
2014-03-14 10:46:20 -07:00
Ilya Kreymer
dd9a2c635f disable pypy travis builds for now 2014-03-13 22:38:56 -07:00
Ilya Kreymer
29ecadee54 update README, fix setup.py typo 2014-03-12 18:35:21 -07:00
Ilya Kreymer
3222f3ee08 update setup, remove markdown readme 2014-03-12 17:57:54 -07:00
ikreymer
78af82d6b1 Fixes for README.rst
badges, tables finally working.
2014-03-12 17:50:47 -07:00
ikreymer
681fd79974 Update README.rst
Fix badges
2014-03-10 19:25:14 -07:00
Ilya Kreymer
1c85aebbf0 fix setup.py 2014-03-10 19:19:41 -07:00
Ilya Kreymer
fe9eaea006 update setup.py classifiers 2014-03-10 19:11:19 -07:00
Ilya Kreymer
f578bdf5de add README.rst 2014-03-10 19:01:20 -07:00
Ilya Kreymer
45972df6c4 minor fixes, copyright update 2014-03-10 18:45:45 -07:00
Ilya Kreymer
3322fb233f fixup wb and wombat.js:
fix formatting to 4-tab snake_case, remove obsolete code
2014-03-10 00:55:41 -07:00
Ilya Kreymer
e3d700a50f wombat improvements: override history, ajax and use
seeded random number gen (with seed from capture timestamp)
2014-03-10 00:10:20 -07:00
Ilya Kreymer
e346dfb024 remove accidental logging 2014-03-09 23:03:55 -07:00
Ilya Kreymer
e384425d48 proxy cleanup: move HttpsUrlRewriter to url_rewriter module,
move strip_scheme to replay_views where it is used
regex rewriters: use url rewriter for rewriting http:// in JS,
instead of just prefix, to support custom rewriters (such as
https->http rewriter in proxy mode)
2014-03-09 14:21:32 -07:00
Ilya Kreymer
68878fa72a update domain-specific rules to make flickr replay work better! 2014-03-08 15:53:52 -08:00
archiveit
4fdcdc98ae replay: ignore 304 captures 2014-03-08 23:46:59 +00:00
Ilya Kreymer
584d826f05 rewrite: fix html rewriting, if forcing end </script>, </style>,
don't actually output to preserve original
wombat: copy over all Location settings
wburl: convert :/ -> :// if 2nd slash missing, only check for <scheme>:/
and ignore subsequent slashes
2014-03-08 15:10:35 -08:00
Ilya Kreymer
541c076b77 setup: add cli scripts for wayback, cdx-server
fix logging of app name, make most logging debug
2014-03-08 15:09:53 -08:00
Ilya Kreymer
40b7a8e921 move pytest args to pytest.ini 2014-03-08 09:30:56 -08:00
Ilya Kreymer
3b1afc3e3d replace StringIO with BytesIO 2014-03-08 09:30:19 -08:00
ikreymer
1a6f2e2fe1 Merge pull request #32 from kngenie/add-api-docs
add api doc pages for all modules with sphinx-apidoc
2014-03-07 16:02:36 -08:00
Kenji Nagahashi
4e7ec4dede Merge remote-tracking branch 'origin/master' into add-api-docs 2014-03-07 20:14:40 +00:00
Kenji Nagahashi
1829de2123 add API doc for all packages with sphinx-apidoc 2014-03-07 20:09:33 +00:00
Ilya Kreymer
e3618871c8 proxy: support setting hostname via env variable 2014-03-07 11:42:09 -08:00
Ilya Kreymer
a60ab1f118 routing/proxy: pass in hostpaths to proxy routing
add PYWB_HOST_NAME env var to allow overriding default hostname
add request_hostname jinja filter
2014-03-07 10:29:11 -08:00
Ilya Kreymer
702e5e0143 perms test: moved test perms policy to perms/test/test_perms_policy.py
all perms related configs exist within perms package
2014-03-06 18:24:53 -08:00
Ilya Kreymer
681c2fd8d5 perms: refactor perms config to make interface much clearer
'perms_policy' is a callback which returns a Perms object, which may
filter cdx lines from the response
2014-03-06 18:06:05 -08:00
Ilya Kreymer
7b5cbaa878 cdx: clean up closest, reverse ops
closest takes precedence over reverse
'reverse closest' not supported, add test to reflect that
2014-03-06 16:11:46 -08:00
Ilya Kreymer
c42a96386f cdx: fix the 'yield nothing' case when limit==1
add additional test case for limit==1 and reverse=True,
as limit is optimized out
2014-03-06 16:01:49 -08:00
Ilya Kreymer
4e71a0b772 better rules.yaml fix 2014-03-06 02:51:54 -08:00
Ilya Kreymer
3718e1d21b rewrite fixes: html_rewriter do not unescape attrs!
rules: don't rewrite past end of block or line
2014-03-06 02:29:52 -08:00
Ilya Kreymer
673ff35d15 minor fixes: wombat add document.WB_wombat_location
loaders: file 'urls' starting with . and / are always file paths
pep8 fixes for cdx, utils packages
2014-03-05 17:13:14 -08:00
Kenji Nagahashi
28b49f9aeb add doc directory for Sphinx documentation 2014-03-05 23:03:04 +00:00
ikreymer
03ebca47c0 Merge pull request #29 from kngenie/just-a-cleanup
clean up docstrings: fix reST formatting issues.
2014-03-05 14:36:07 -08:00