Ilya Kreymer
ef8d910d01
banner: remove client side 'capture_str' formatting, just output wbinfo.timestamp,
...
allow js to format as needed, also helps with #41
update tests to only look at timestamp
2014-08-04 22:51:42 -07:00
Ilya Kreymer
8d54153326
refactoring for better extensibility:
...
remove BaseContentView, move top-frame functionality to SearchPageWbUrlHandler
remove RewriteLiveView, fold functionality into the handler
move default mod setting into RewriteContent
2014-08-04 22:51:42 -07:00
Ilya Kreymer
160182ec48
rewrite: add 'bn_' banner only rewrite
...
cleanup rewrite_content/fetch_request api to take a full wb_url
add content-length to responses whenever possible (WbResponse) and static files
bump version to 0.5.2
2014-08-04 22:51:42 -07:00
Ilya Kreymer
0b8a8f0ae2
live rewrite: catch errors from live rewrite and raise a new LiveResourceError with a 400 error code,
...
indicating bad request for live resource. Add test for invalid live rewrite requests
2014-07-21 22:43:34 -07:00
Ilya Kreymer
7c57345363
proxy: add 'unaltered_replay' option to proxy_options to replay
...
all content unaltered (no rewriting html, no banner, no wombat)
use 'proxy_options' instead of 'routing_options', add additional
tests for proxy mode
2014-07-21 16:42:14 -07:00
Ilya Kreymer
e4297ddabe
tests: add integration tests for $liveweb rewrite handler and replay
...
with fallback
2014-07-20 18:25:47 -07:00
Ilya Kreymer
b785cd6f08
memento: use mp_ modifier to support memento with frame or non-frame replay
...
change memento test to use frame replay
2014-07-20 15:43:39 -07:00
Ilya Kreymer
96fcaab521
live-rewrite-server: add ability to specify http/https proxy for live fetching
...
(for example, for use with a recording proxy)
2014-07-19 14:43:28 -07:00
Ilya Kreymer
1317b2b10f
route selection via proxy auth!
...
refactor route request parsing to happen in the actual router class instead of in the route
in proxy mode, add support for picking a route via proxy-auth
improve test for 'top' rewriting
2014-07-10 21:54:23 -07:00
Ilya Kreymer
70b7e29b36
pass raw bytes to htmlparser, assuming ascii-compatibility
...
(todo: add tests for non-ascii compatible encodings)
improved rendering of certain pages, needs more testing
lxml: remove lxml and the complexity associated with having the parser,
as it's too unpredictable for older html and does its own decoding.
2014-06-27 19:03:06 -07:00
Ilya Kreymer
fb07775d38
tests: add 'bad.cdx' for testing cdx lines with missing original for revisit,
...
missing/non-existent warc
2014-06-25 12:32:57 -07:00
Ilya Kreymer
913a1e9f31
warc: simplify recordloader a bit more, only response and request records
...
get parsed as http (excluding dns: and whois: uris)
All others have an '-' status and no headers parsing
tests: add test for zero-length revisits
2014-06-25 12:11:26 -07:00
Ilya Kreymer
073f1e142e
test_config: test lxml parser still
2014-06-14 21:33:08 -07:00
Ilya Kreymer
80e80e97d3
replay: support 'framed_replay' option in config for both replay and live rewrite
...
split replay view into BaseContentView and ReplayView
refactor RewriteLiveHandler into RewriteLiveView
add additional tests for framed and non-framed mode
default to framed replay!
2014-06-14 18:26:19 -07:00
Ilya Kreymer
0d3f663ef1
rewrite: disable refer-redirect in case of POST, handle request w/o redirect
...
(can't use 307 because of FF)
2014-06-13 16:23:11 -07:00
Ilya Kreymer
41e1809039
update wombat.js (support for write override, fill in WB_wombat_location on new iframe)
...
disable 307 redirects as FF always displays modal confirmation for these, even for same host
2014-06-11 20:12:05 -07:00
Ilya Kreymer
0c9d88f032
POST replay: treat POST form data same as get query, no '&&&' marker
...
additional testing of POST
2014-06-11 11:17:06 -07:00
Ilya Kreymer
e2349a74e2
replay: better POST support via post query append!
...
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
f9710d033c
fix integration test for 307
...
update head_insert for new wombat
remove redundant host jinja func, use 'urlsplit' instead
2014-05-30 11:17:12 -07:00
Ilya Kreymer
923421d637
rewrite_content: add a few tests for cs_, js_, remove redundant except
2014-05-16 22:43:53 -07:00
Ilya Kreymer
2600d870d7
improved test: dsrules remove redundant check
...
static: check invalid static paths and file_wrapper
memento: check non-memento paths
test debug handlers and custom '-cdx' suffix
2014-05-16 22:17:51 -07:00
Ilya Kreymer
ca33287051
test: move non-surt-cdx sample to non-surt-cdx/ dir for clarity / avoid confusion
...
when bulk loading cdx/ dir (surt and non-surt cdx should NOT be mixed)
2014-05-16 21:21:14 -07:00
Ilya Kreymer
7d236af7d7
cdx: fix creation and add test for non-surt cdx (pywb-nonsurt/ test)
...
archiveindexer: -u option to generate non-surt cdx
tests: full test coverage for cdxdomainspecific (fuzzy and custom canon)
2014-05-16 21:16:50 -07:00
Ilya Kreymer
85593696fa
remove rfc3987 validation, was rejecting valid urls
...
add extract_referer_wburl_str() to extract WbUrl str, if any,
from the referrer. Use that for live_rewrite_handler to override
default referrer
2014-04-15 16:38:53 -07:00
Ilya Kreymer
bfc2e63793
live rewriter: integrate handler with rewrite_live.py module,
...
clean up css, add unit and integration tests
clean up cli server now known as 'live-rewrite-server', which performs live rewrite using
iframe paradigm
2014-04-09 15:49:55 -07:00
Ilya Kreymer
b0b0adb043
refactor: rename pywb.core -> pywb.webapp
...
move perms/test/test_perms_policy -> tests/perms_fixture
for rules file, use single DEFAULT_RULES_FILE import
2014-04-04 10:09:26 -07:00
Ilya Kreymer
80f2da9548
refactor: move configs/config.yaml to root again
...
remove cdx-server specific config, instead make cdx server api-only
path configurable from regular config
2014-04-02 21:26:53 -07:00
Ilya Kreymer
91184426b7
test coverage pass:
...
refactor and cleanup to improve coverage for corner cases
2014-04-02 13:16:54 -07:00
Ilya Kreymer
41d51a6427
ensure 'cdx_' modifier is working
2014-03-27 14:46:59 -07:00
Ilya Kreymer
c6c9fe680a
memento: add original link to timemap #10
2014-03-24 14:57:41 -07:00
Ilya Kreymer
2a605652c6
add memento timemap support (for archival mode only)
...
add timemap Link headers to timegate and memento responses
timemap accessible via /timemap/*/ path
2014-03-24 14:00:06 -07:00
Ilya Kreymer
79da12348f
limit stream by warc/arc record length instead of
...
http content length.
track length of StatusAndHeaders also.
add tests to verify content length correct for identity
arc and arcgz replays as well
2014-03-22 11:30:51 -07:00
Ilya Kreymer
6461af030b
refactoring: clean up handlers and replay_views for pep8
...
use BlockLoader().load for StaticHandler static file resolving
update static paths to point to pywb/static instead of static
2014-03-14 18:17:22 -07:00
Ilya Kreymer
bdcda1df6f
add test config for memento #10
2014-03-14 11:01:47 -07:00
Ilya Kreymer
a1ab54c340
first pass at memento support #10 !
...
memento support enabled by default, togglable via 'enable_memento' config property
supporting timegate and memento apis, no timemap yet
supporting pattern 2.3 for archival and pattern 1.3 for proxy modes
also:
simplify exception hierarchy a bit more, move down to utils
make WbRequest and WbResponse extensible with mixins (eg for memento)
2014-03-14 10:46:20 -07:00
Ilya Kreymer
702e5e0143
perms test: moved test perms policy to perms/test/test_perms_policy.py
...
all perms related configs exist within perms package
2014-03-06 18:24:53 -08:00
Ilya Kreymer
681c2fd8d5
perms: refactor perms config to make interface much clearer
...
'perms_policy' is a callback which returns a Perms object, which may
filter cdx lines from the response
2014-03-06 18:06:05 -08:00
Ilya Kreymer
d702a98bbc
url-agnostic revisit testing!
...
add sample warc and cdx for url-agnostic revisits
add unit test and integration test
resolvingloader: pass callback instead of full cdx server,
used for loading cdx in case of url-agnostic revisit
2014-03-04 20:12:09 +00:00
Ilya Kreymer
cf5aaf5de4
add new perms_handler for supporting direct permissions api
...
currently just returning ["allow"] or ["block"] for a single url
2014-03-03 19:37:37 -08:00
Ilya Kreymer
577c74be49
cdx: move perms related handling to pywb.perms package, support
...
custom processing ops, of which perms is a specific type
add lazy_ops test to ensure all cdx processing ops are lazy
perms: set up a 'perms policy' factory and perms policy implementation
perms policy setting results in a custom processing op
update tests to work with new config
IndexReader handles both cdx server + perms policy
2014-03-03 18:27:04 -08:00
Ilya Kreymer
331976748e
cdxops: make sure sort reverse and closest are lazy (create generators)
...
perms: allow_url_lookup() only takes key param for simplicity
2014-03-03 12:16:07 -08:00
Ilya Kreymer
0bf651c2e3
add cdx_server app!
...
port wsgi cdx server tests to test new app!
move base handlers to basehandlers in framework pkg
(remove werkzeug dependency)
2014-03-02 23:41:44 -08:00
Ilya Kreymer
f0a0976038
more refactoring!
...
create 'framework' subpackage for general purpose components!
contains routing, request/response, exceptions and wsgi wrappers
update framework package for pep8
dsrules: using load_config_yaml() (pushed to utils)
to init default config
2014-03-02 21:42:05 -08:00
Ilya Kreymer
f1acad53fc
wsgi wrapper reorg!
...
support pluggable wsgi apps
utils: BlockLoader() supports loading from package
exceptions: base WbException moved to utils
2014-03-02 19:26:06 -08:00
Ilya Kreymer
19f86305bf
update pkg-reorg with changes from master, including
...
CDXQuery configuration
2014-03-02 00:26:29 -08:00
Ilya Kreymer
06a22c845b
ensure cdx loading happens lazily
...
add perms test to ensure 'short-circuiting' in case of
permission exception
2014-03-01 18:40:16 -08:00
Kenji Nagahashi
1f65eff828
Merge remote-tracking branch 'origin/master' into cdx-server
...
Conflicts:
pywb/cdx/cdxdomainspecific.py
pywb/cdx/cdxserver.py
pywb/cdx/test/cdxserver_test.py
setup.py
tests/test_integration.py
2014-02-28 19:47:24 +00:00
Ilya Kreymer
c084b45298
Merge master into pkg-reorg
2014-02-28 10:25:36 -08:00
Ilya Kreymer
304a33aa5b
add coverage badge
2014-02-27 18:52:41 -08:00
Ilya Kreymer
921b2eb2e1
improve testing and a few fixes:
...
archivalrouter: support empty collection, with and without SCRIPT_NAME
cdx: remove cdx source test, including access denied
replay: when content-type present, limit the decompressed stream to content-length
(this ensures last 4 bytes in warc/arc record are not read)
integration tests for identity replay
2014-02-27 18:43:55 -08:00