mirror of https://github.com/webrecorder/pywb.git synced 2025-04-02 03:56:13 +02:00

14 Commits

Author SHA1 Message Date
Ilya Kreymer
52ca95eba5 redis: redisindexsource and pathresolver:
- for wildcard/multi-key lookup, support redis hashmap as well as redis set to be used as member lookup key
- if using hashmap, the property names are used for lookup
- track type of redis key in RedisIndexSource
tests: add tests with set and hashmap member keys
2018-01-28 18:17:51 -08:00
Ilya Kreymer
8fea623c52 optimization: redisindexsource scan_keys: use cached key list, if available
bump requirements to gevent 1.2.2
2017-08-21 22:30:25 -07:00
Ilya Kreymer
c6d196c9fe misc test improvements:
- add tests for WBMementoIndexSource, member-list based RedisIndexSource
- convert redis aggregator and index source tests to use testutils BaseTestClass system
- rename configwarcserver -> warcserver
2017-08-09 12:17:50 -07:00
Ilya Kreymer
36abd032ce warcserver: logging: use 'warcserver' logger for index and response load errors
wbmementoindexsource: use timegate_url for initial head query to allow for different urls (proxy, etc.)
2017-07-03 23:25:25 -07:00
Ilya Kreymer
f3487a1922 indexsource: use logging for failure reports
don't add connection: close by default now that better pooling is in place
2017-07-02 17:09:01 +00:00
Ilya Kreymer
84eb070938 warcserver: support different default adapters, for live web and remote sources
warcserver.http.DefaultAdapters.live_adapter used if is_live, else DefaultAdapters.remote_adapter
tests: fix test to ignore order in dir listing
2017-07-02 03:58:55 +00:00
Ilya Kreymer
324a36b5b7 indexsource: if filtering enabled, live index source can check status and mime (excluding fuzzy match)
cdxops: cleanup filtering, move class to CDXFilter, avoid ambiguous naming
2017-06-30 17:57:07 -07:00
Ilya Kreymer
dd7c1bd752 warcserver: define default HTTPAdapter in warcserver.http.default_adapter, for use with both index sources and responseloader
responseloader uses existing pool from shared HTTPAdapter
fix tests: call_release_conn() checks if release_conn() exists before calling, else default to close()
2017-06-29 22:33:16 -07:00
Ilya Kreymer
1bd8a85a4d mementoindexsource: add 'connection: close' to ensure connection closed after memento timegate query!
io utils: StreamIter() supports custom closer
responseloader: use release_conn() instead of close() to recycle urllib3 connections!
2017-06-29 20:03:42 -07:00
Ilya Kreymer
9bda61cab5 mementoindexsource improvements:
- use shared session for timegate/timemap queries
- catch timegate query exceptions and treat as not found
- skip fuzzy match queries (ensure 'is_fuzzy' is set on params)
wbmementoindexsource improvements:
- fix errors related to exception handling
- hook up 'wb-memento' config, add tests
jsonp_rewriter: fix typo
2017-06-29 19:08:44 -07:00
Ilya Kreymer
d12f715d81 refactor: split warcserver.utils into utils package:
- utils.io for stream/compression related utils
- utils.format for string formatting
- utils.memento for memento
- load_config -> utils.loaders.load_overlay_config
- also: use warcio.utils.to_native_str instead of utils.loaders.to_native_str
2017-06-05 17:43:46 -07:00
Ilya Kreymer
3bd682e3d3 Merge branch 'aggregator-improvements' into refactor2
2017-06-05 16:22:49 -07:00
Ilya Kreymer
dbc56b864b Merge branch 'aggregator-improvements' into refactor2
2017-06-02 21:33:23 -07:00
Ilya Kreymer
ad33dc6728 refactor: webagg -> warcserver rename
- ResAggApp -> BaseWarcServer
- AutoApp -> WarcServer
- move index related files to warcserver.index package, tests to warcserver.index.test
- move resource loading related files to warcserver.resource package, tests to warcserver.resource.test
- pywb.cdx -> pywb.warcserver.index
- split pywb.warc -> pywb.warcserver.resource or pywb.indexer (for cdx generation)
- bump to 0.51.0 for now!
- tests for pywb.warcserver should be working
2017-05-23 09:21:43 -07:00