Ilya Kreymer
writing: add write_stream_to_file() function to write an existing input stream to a WARC
refactor _do_write_req_resp to pass a callback that does the actual writing (e.g. _write_to_file)
2016-07-31 00:49:57 -04:00
Ilya Kreymer
recorder: split up _open_file() into get_new_filename() and allow_new_file(); recording can be skipped by returning false from allow_new_file()
create_warcinfo_record() - switch to dict args over kwargs, update tests
2016-07-30 13:11:12 -04:00
Ilya Kreymer
add Dockerfile to git!
2016-07-26 19:42:59 -04:00
Ilya Kreymer
rewriter: range massage for patch as well as record
2016-07-26 19:42:32 -04:00
Ilya Kreymer
custom record: don't override WARC-Date if provided in request header,
return chosen WARC-Date in json response
2016-07-26 19:41:47 -04:00
Ilya Kreymer
custom response: add utf-8 encoding, unless framed replay
2016-07-24 00:14:43 -04:00
Ilya Kreymer
responseloader: quote/unquote Webagg-Source-Coll header as source may contain unicode chars
2016-07-23 21:57:24 -04:00
Ilya Kreymer
temp cookie store: add add_cookie() function for explicitly adding cookie, make expiry configurable
related to webrecorder/webrecorder#79
2016-07-01 10:15:59 -04:00
Ilya Kreymer
rewriter: update for moved RewriterAMF in pywb
2016-06-14 00:14:29 -04:00
Ilya Kreymer
webagg: store original 'source' value in cdx for properly mapping in WARC file resolver
error handling: ensure 'last_exc' is a string
2016-06-14 00:13:01 -04:00
Ilya Kreymer
recorder: support overriding get_params() in subclass
multiwarcwriter: support multiple warcs in same dir, support random component in path, and a custom
key template for selecting current warc file, not related to current directory
2016-06-07 12:55:04 -04:00
Ilya Kreymer
webagg: redis lookup: if url contains wildcard, scan redis keys to check multiple keys until one is found
webagg tests: fix test to include mime in live cdx
2016-06-07 12:54:28 -04:00
Ilya Kreymer
video loader support: add VideoLoader, which uses youtube-dl to create a metadata record
of video info. Activated with explicit content_type param 'application/'
2016-05-28 15:01:33 -07:00
Ilya Kreymer
recorder put custom record: add support for put/post of a custom record. If the put_record= param is included,
the request body is written as a record of the specified type.
move record creation functions to the warcwriter
add tests for custom record
2016-05-26 20:49:40 -07:00
Ilya Kreymer
responseloader: use PreparedRequest() to ensure url properly formatted
tests: update tests for latest, live data
2016-05-24 18:01:44 -07:00
Ilya Kreymer
webagg: tests: flush fakeredis for reentrancy
utils: add load_config() with option for main and override configs
2016-05-19 17:01:09 -07:00
Ilya Kreymer
recorder: add max_idle_secs / close_idle_files() to close any open files that have not been modified for longer than the set threshold, in prep for webrecorder/webrecorder#92
indexer: add 'full_warc_prefix' for setting full path prefix in add_warc_file() (eg. for http load) for webrecorder/webrecorder#95
2016-05-11 21:40:02 -07:00
Ilya Kreymer
app: separate json_encode() func
compat: py2 fixes
2016-05-11 11:38:59 -07:00
Ilya Kreymer
webagg: use werkzeug routing instead of wrapping Bottle app
2016-05-10 16:31:44 -07:00
Ilya Kreymer
test apps: enable debugging for test apps
test recorder: write to a temp dir for each run
2016-05-06 16:33:18 -07:00
Ilya Kreymer
urlrewrite: improve POST request support for ikreymer/pywb#178
2016-05-06 16:32:13 -07:00
Ilya Kreymer
cookie_tracker: add support for redis-based subdomain cookie tracker, which temp caches cookies with Domain= set in redis and passes them upstream
when rewriting. addresses webrecorder/webrecorder#79
2016-05-04 16:39:47 -07:00
Ilya Kreymer
recorder: actually fix content-type on warcinfo, add to test!
2016-04-30 13:07:53 -07:00
Ilya Kreymer
recorder: ensure warcinfo record has a content-type
2016-04-30 10:19:20 -07:00
Ilya Kreymer
webagg: responseloader: use urllib3 directly instead of requests to
take advantage of connection pooling w/o storing/sharing cookies
2016-04-27 10:16:54 -07:00
Ilya Kreymer
urlrewrite: refactor simpleapp to support live/record/replay
2016-04-27 10:15:48 -07:00
Ilya Kreymer
recorder: fix simplerec init
tests: improve tests for skipping request and response headers
2016-04-27 09:52:56 -07:00
Ilya Kreymer
tests: add basic test for rewriterapp
2016-04-25 14:29:28 -07:00
Ilya Kreymer
urlrewrite: remove dependency on bottle from rewriterapp,
add overridable error and query views, with extensible get_query_params() and process_cdx_query()
to extend cdx for query view
add get_top_url() for adding custom top_url for frame insert
add call_with_params() for adding custom params to environ
2016-04-25 12:05:43 -07:00
Ilya Kreymer
urlrewrite: add support for index query
2016-04-15 04:01:36 +00:00
Ilya Kreymer
urlrewrite: http range: support skipping record for range requests not starting at 0-
and performing async request,
support converting unbounded 0- to non-ranged and back
2016-04-15 02:21:39 +00:00
Ilya Kreymer
recorder warcwriter: allow skipping writing of only request or only response by overriding _is_write_req and _is_write_resp in subclass
(todo: rethink the interface)
2016-04-15 02:19:34 +00:00
Ilya Kreymer
webagg: add preliminary 'fuzzy matching' fallback support, currently enabled for all sources
(todo: need to only include sources that support it)
2016-04-15 02:18:20 +00:00
Ilya Kreymer
recorder: SkipDupePolicy only skips if url is an exact match (not just by urlkey)
2016-04-07 10:44:05 -07:00
Ilya Kreymer
urlrewrite: generalize support for overridable handle_custom_response() callback for handling modifiers (default support top-frame)
pass headers to add_custom_params, include error message on error if available
headers: use add_header() to support multiple headers with same name
is_ajax(): check for X-Pywb-Requested-With header to mark as ajax and not pass to upstream
2016-04-07 10:39:12 -07:00
Ilya Kreymer
urlrewrite templates: add get_top_frame_params() callback for adding custom params for top frame,
also inject env['webrec.template_params'] if set
2016-04-05 02:45:00 -07:00
Ilya Kreymer
warcwriter: add create_warcinfo_record() for creating a warcinfo and a SimpleTempWARCWriter for writing records to temp buff/file
2016-04-03 12:19:54 -07:00
Ilya Kreymer
urlrewriter: allow passing in existing jinja_env wrapper
2016-04-02 21:36:54 -07:00
Ilya Kreymer
recorder: redis indexer accepts arg list, supports separate redis and key_template args
add length param to add_urls_to_index() in redis indexer, return cdx list
2016-04-02 21:36:36 -07:00
Ilya Kreymer
testutils: when mock patching FakeStrictRedis, use a subclass with a shared pubsub (to match real redis)
2016-04-02 21:33:39 -07:00
Ilya Kreymer
webagg: rename key_prefix -> key_template
2016-04-02 21:33:23 -07:00
Ilya Kreymer
urlrewrite: fix typos, add full package paths
2016-03-28 22:59:22 -07:00
Ilya Kreymer
urlrewrite app: add bottle-based app, templateview separate from pywb webapp framework
2016-03-27 17:34:45 -04:00
Ilya Kreymer
tests: fix fakeredis patch not running on test_handlers,
use exc str instead of repr for error message for consistency
all tests pass on py2 and py3 again!
2016-03-26 22:32:21 -04:00
Ilya Kreymer
webagg app: support bottle debug properly as opt param
2016-03-26 22:30:47 -04:00
Ilya Kreymer
recorder: close_file() by params rather than exact path, update tests
2016-03-26 13:07:53 -04:00
Ilya Kreymer
add urlrewrite pywb-adapter PlatformHandler for using traditional pywb
setup with webrecorder components recorder and webagg
2016-03-24 16:33:03 -04:00
Ilya Kreymer
inputreq: only use REQUEST_URI if no SCRIPT_NAME is set (otherwise reconstruct the path)
2016-03-24 16:17:46 -04:00
Ilya Kreymer
self-redirect: if 'status' is a 3xx, call raise_on_self_redirect() to check Location for exact url redirect.
supports both WARC and live loaders, addresses #1
2016-03-24 16:08:29 -04:00
Ilya Kreymer
tests: add FakeRedisTests class mixin for patching in FakeRedis for tests
2016-03-24 10:45:48 -04:00