mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

99 Commits

Author SHA1 Message Date
Ilya Kreymer
8d54153326 refactoring for better extensibility:
remove BaseContentView, move top-frame functionality to SearchPageWbUrlHandler
remove RewriteLiveView, fold functionality into the handler
move default mod setting into RewriteContent
2014-08-04 22:51:42 -07:00
Ilya Kreymer
160182ec48 rewrite: add 'bn_' banner only rewrite
cleanup rewrite_content/fetch_request api to take a full wb_url
add content-length to responses whenever possible (WbResponse) and static files
bump version to 0.5.2
2014-08-04 22:51:42 -07:00
Ilya Kreymer
a2d86fa495 Merge branch 'develop' into https-proxy 2014-08-04 22:01:16 -07:00
Ilya Kreymer
e1e8f679b2 rewrite/testing: add additional test for live rewrite post, invalid post
htmlrewrite: annotate untestable sections (unimplemented, 2.6 only exceptions)
2014-08-04 21:59:46 -07:00
Ilya Kreymer
924f71a4cc Merge branch 'develop' into https-proxy 2014-08-04 18:44:01 -07:00
Ilya Kreymer
86bc2f17ba banner: remove client side 'capture_str' formatting, just output wbinfo.timestamp,
allow js to format as needed, also helps with #41
update tests to only look at timestamp
2014-08-04 18:19:28 -07:00
Ilya Kreymer
492aaa4a01 Merge branch 'develop' into https-proxy 2014-08-04 13:00:25 -07:00
Ilya Kreymer
95028ab692 refactoring for better extensibility:
remove BaseContentView, move top-frame functionality to SearchPageWbUrlHandler
remove RewriteLiveView, fold functionality into the handler
move default mod setting into RewriteContent
2014-08-04 01:18:46 -07:00
Ilya Kreymer
2ca4757599 fix integration test for proxy_pac 2014-07-31 18:03:18 -07:00
Ilya Kreymer
b92eda77f6 rewrite: add 'bn_' banner only rewrite
cleanup rewrite_content/fetch_request api to take a full wb_url
add content-length to responses whenever possible (WbResponse) and static files
bump version to 0.5.2
2014-07-29 12:20:22 -07:00
Ilya Kreymer
7c57345363 proxy: add 'unaltered_replay' option to proxy_options to replay
all content unaltered (no rewriting html, no banner, no wombat)
use 'proxy_options' instead of 'routing_options', add additional
tests for proxy mode
2014-07-21 16:42:14 -07:00
Ilya Kreymer
e4297ddabe tests: add integration tests for $liveweb rewrite handler and replay
with fallback
2014-07-20 18:25:47 -07:00
Ilya Kreymer
1317b2b10f route selection via proxy auth!
refactor route request parsing to happen in the actual router class instead of in the route
in proxy mode, add support for picking a route via proxy-auth
improve test for 'top' rewriting
2014-07-10 21:54:23 -07:00
Ilya Kreymer
fb07775d38 tests: add 'bad.cdx' for testing cdx lines with missing original for revisit,
missing/non-existent warc
2014-06-25 12:32:57 -07:00
Ilya Kreymer
913a1e9f31 warc: simplify recordloader a bit more, only response and request records
get parsed as http (excluding dns: and whois: uris)
All others have a '-' status and no header parsing
tests: add test for zero-length revisits
2014-06-25 12:11:26 -07:00
Ilya Kreymer
80e80e97d3 replay: support 'framed_replay' option in config for both replay and live rewrite
split replay view into BaseContentView and ReplayView
refactor RewriteLiveHandler into RewriteLiveView
add additional tests for framed and non-framed mode
default to framed replay!
2014-06-14 18:26:19 -07:00
Ilya Kreymer
0d3f663ef1 rewrite: disable refer-redirect in case of POST, handle request w/o redirect
(can't use 307 because of FF)
2014-06-13 16:23:11 -07:00
Ilya Kreymer
41e1809039 update wombat.js (support for write override, fill in WB_wombat_location on new iframe)
disable 307 redirects as FF always displays modal confirmation for these, even for same host
2014-06-11 20:12:05 -07:00
Ilya Kreymer
0c9d88f032 POST replay: treat POST form data same as get query, no '&&&' marker
additional testing POST
2014-06-11 11:17:06 -07:00
Ilya Kreymer
e2349a74e2 replay: better POST support via post query append!
record_loader can optionally parse 'request' records
archiveindexer has -a flag to write all records ('request' included),
-p flag to append post query
post-test.warc.gz and cdx
POST redirects using 307
2014-06-10 19:21:46 -07:00
Ilya Kreymer
f9710d033c fix integration test for 307
update head_insert for new wombat
remove redundant host jinja func, use 'urlsplit' instead
2014-05-30 11:17:12 -07:00
Ilya Kreymer
923421d637 rewrite_content: add a few tests for cs_, js_, remove redundant except 2014-05-16 22:43:53 -07:00
Ilya Kreymer
2600d870d7 improved test: dsrules remove redundant check
static: check invalid static paths and file_wrapper
memento: check non-memento paths
test debug handlers and custom '-cdx' suffix
2014-05-16 22:17:51 -07:00
Ilya Kreymer
ca33287051 test: move non-surt-cdx sample to non-surt-cdx/ dir for clarity / avoid confusion
when bulk loading cdx/ dir (surt and non-surt cdx should NOT be mixed)
2014-05-16 21:21:14 -07:00
Ilya Kreymer
7d236af7d7 cdx: fix creation and add test for non-surt cdx (pywb-nonsurt/ test)
archiveindexer: -u option to generate non-surt cdx
tests: full test coverage for cdxdomainspecific (fuzzy and custom canon)
2014-05-16 21:16:50 -07:00
Ilya Kreymer
b0b0adb043 refactor: rename pywb.core -> pywb.webapp
move perms/test/test_perms_policy -> tests/perms_fixture
for rules file, use single DEFAULT_RULES_FILE import
2014-04-04 10:09:26 -07:00
Ilya Kreymer
91184426b7 test coverage pass:
refactor and cleanup to improve coverage for corner cases
2014-04-02 13:16:54 -07:00
Ilya Kreymer
41d51a6427 ensure 'cdx_' modifier is working 2014-03-27 14:46:59 -07:00
Ilya Kreymer
79da12348f limit stream by warc/arc record length instead of
http content length.
track length of StatusAndHeaders also.
add tests to verify content length correct for identity
arc and arcgz replays as well
2014-03-22 11:30:51 -07:00
Ilya Kreymer
a1ab54c340 first pass at memento support #10!
memento support enabled by default, togglable via 'enable_memento' config property
supporting timegate and memento apis, no timemap yet
supporting pattern 2.3 for archival and pattern 1.3 for proxy modes
also:
simplify exception hierarchy a bit more, move down to utils
make WbRequest and WbResponse extensible with mixins (eg for memento)
2014-03-14 10:46:20 -07:00
Ilya Kreymer
702e5e0143 perms test: moved test perms policy to perms/test/test_perms_policy.py
all perms related configs exist within perms package
2014-03-06 18:24:53 -08:00
Ilya Kreymer
d702a98bbc url-agnostic revisit testing!
add sample warc and cdx for url-agnostic revisits
add unit test and integration test
resolvingloader: pass callback instead of full cdx server
for use for loading cdx in case of url-agnostic revisit
2014-03-04 20:12:09 +00:00
Ilya Kreymer
f0a0976038 more refactoring!
create 'framework' subpackage for general purpose components!
contains routing, request/response, exceptions and wsgi wrappers
update framework package for pep8
dsrules: using load_config_yaml() (pushed to utils)
to init default config
2014-03-02 21:42:05 -08:00
Ilya Kreymer
f1acad53fc wsgi wrapper reorg!
support pluggable wsgi apps
utils: BlockLoader() supports loading from package
exceptions: base WbException moved to utils
2014-03-02 19:26:06 -08:00
Ilya Kreymer
19f86305bf update pkg-reorg with changes from master, including
CDXQuery configuration
2014-03-02 00:26:29 -08:00
Ilya Kreymer
06a22c845b ensure cdx loading happens lazily
add perms test to ensure 'short-circuiting' in case of
permission exception
2014-03-01 18:40:16 -08:00
Kenji Nagahashi
1f65eff828 Merge remote-tracking branch 'origin/master' into cdx-server
Conflicts:
	pywb/cdx/cdxdomainspecific.py
	pywb/cdx/cdxserver.py
	pywb/cdx/test/cdxserver_test.py
	setup.py
	tests/test_integration.py
2014-02-28 19:47:24 +00:00
Ilya Kreymer
c084b45298 Merge master into pkg-reorg 2014-02-28 10:25:36 -08:00
Ilya Kreymer
304a33aa5b add coverage badge 2014-02-27 18:52:41 -08:00
Ilya Kreymer
921b2eb2e1 improve testing and a few fixes:
archivalrouter: support empty collection, with and without SCRIPT_NAME
cdx: remove cdx source test, including access denied
replay: when content-type present, limit the decompressed stream to content-length
(this ensures last 4 bytes in warc/arc record are not read)
integration tests for identity replay
2014-02-27 18:43:55 -08:00
Kenji Nagahashi
9eda5ad97e address test cases broken by previous commit.
move py.test fixture and fixture classes (TestExclusionPerms, PrintReporter)
  to tests.fixture module. update test_config.yaml accordingly.
2014-02-28 01:39:04 +00:00
Ilya Kreymer
51d61a8738 package reorg!
split up remaining parts of pywb root pkg
into core, dispatch and bootstrap
2014-02-24 03:00:01 -08:00
Ilya Kreymer
9194e867ea - add referrer self-redirect check and test case
- dispatching: cleanup wbrequestresponse, move tests to a separate file
- wbrequest: store both rel_prefix and host_prefix, with wb_prefix either full
or rel path as needed, so that full and relative paths are
both available in wbrequest
- create WbUrlHandler to differentiate handlers which
support WbUrl (timestamp[mod]/url) semantic vs other request handlers.
2014-02-23 23:31:54 -08:00
Ilya Kreymer
922917a631 rename BufferedReader -> DecompressingBufferedReader
remove max_len from DecompressingBufferedReader as it applied to
the compressed size, not original size.
Add integration test for verifying content length of larger file
2014-02-20 11:53:08 -08:00
Ilya Kreymer
ff428ed43e exclusions: add AllAllowPerms and refactor exclusions interface
add TestExclusionPerms and a sample exclusion integration test
refactor cdx server init params into **kwargs
convert all cdx params to use camelCase
2014-02-19 20:20:31 -08:00
Ilya Kreymer
a09dec4b3e cdx: add domain-specific rules at cdx layer for custom canonicalization!
and 'fuzzy' matching when not found
handled via cdxdomainspecific.py
BaseCDXServer contains a canonicalizer object and a fuzzy query
canonicalizer abstracted to separate class (in canonicalizer.py)
clean up cdx related exceptions
default rules read from cdx/rules.yaml
filename configurable via 'domain_specific_rules' setting in config.yaml
fix typo in pywb/rewrite
2014-02-18 14:56:13 -08:00
Ilya Kreymer
94f1dc3be5 cleanup wbexceptions, remove unused 2014-02-17 10:23:37 -08:00
Ilya Kreymer
5345459298 pywb 0.2!
move to distinct packages: pywb.utils, pywb.cdx, pywb.warc, pywb.util, pywb.rewrite!
each package will have its own README and tests
shared sample_data and install
2014-02-17 10:01:09 -08:00
Ilya Kreymer
2528ee0a7c refactoring of binsearch and cdxserver into separate packages
also move complicated doctests and integration tests to tests/
2014-02-12 13:16:07 -08:00