mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 08:04:49 +01:00

Commit Graph

  • 10d9a6ac9a jinja templates: add 'templates' as default lookup dir, allow specifying custom dir via config. when specifying custom paths, need not use full dir as per usual paradigm Ilya Kreymer 2015-04-04 12:53:07 -07:00
  • 1844597889 wburl to_uri: catch error on idna encode with very invalid urls Ilya Kreymer 2015-04-04 12:51:49 -07:00
  • f9bd2ba55a jinja template: use shared template in J2Template, init on first use Ilya Kreymer 2015-04-03 10:42:47 -07:00
  • 4a85869427 cli refactor: use classes in cli to allow custom options get rid of custom init for live_rewrite_handler, just use create_wb_router() with custom config for consistent init Ilya Kreymer 2015-04-03 10:13:27 -07:00
  • 6ba5163e72 jinja template: refactor jinja template setup, use a shared jinja env instead of a new env for every template can pass in an existing env via the config Ilya Kreymer 2015-04-03 10:03:12 -07:00
  • fcb6e94736 framework refactor: move rel_request_uri() call down to the routers, for easier reuse each router now calls ensure_rel_uri_set() to ensure that REL_REQUEST_URI field is set before use. allows router to be called directly without setup. add optional fallback_app to allow acting as middleware Ilya Kreymer 2015-04-03 08:45:18 -07:00
  • a34607764e manager: validate name on collection init: must start with wordchar and can contain wordchar or - Ilya Kreymer 2015-04-03 01:18:35 -07:00
  • 134b90eca5 bump version to 0.9.4-dev Ilya Kreymer 2015-04-03 00:45:12 -07:00
  • b379d15066 changelist fixes Ilya Kreymer 2015-04-01 17:07:27 -07:00
  • 6064b45bd4 set version to 0.9.3 Ilya Kreymer 2015-04-01 17:05:29 -07:00
  • b0773ca8b8 pywb_init: ability to override DirectoryCollsLoader with custom class Ilya Kreymer 2015-04-01 17:03:36 -07:00
  • 7a8a0e5244 update .gitignore Ilya Kreymer 2015-04-01 16:32:49 -07:00
  • 4d74d65b61 update CHANGES for 0.9.3 Ilya Kreymer 2015-04-01 16:32:13 -07:00
  • 8bd6787595 'inverse' framed replay: ensure memento headers point to actual memento in inverse framed replay add additional test for inverse framed replay, #92 fix framed replay url replace slash Ilya Kreymer 2015-04-01 16:21:44 -07:00
  • bd21480db9 framed replay: add support for 'inverting' frame and replay modifiers, setting default mod to be top-frame and inner frame to be 'mp_' #92 can enable this mode by setting framed_replay: inverse instead of true modifiers passed to client side script via wbinfo as well Ilya Kreymer 2015-04-01 10:13:56 -07:00
  • 546cd8ac3a frame redirect: only attempt redirect if in 'framed' mode (add flag to wbinfo) ensure both uris are decoded before comparing for top-frame redirect Ilya Kreymer 2015-04-01 09:13:55 -07:00
  • c378cb5188 rewrite: check for closed before any use of readline() (2.6 may throw if closed), only use readline() if line alignment needed (non-html), related to #86 work Ilya Kreymer 2015-04-01 07:54:17 -07:00
  • e806a33289 add unclosed script sample Ilya Kreymer 2015-04-01 07:13:51 -07:00
  • 8e60a6464c chunkeddatareader: read(): catch ValueError when attempting to read again in case stream is already closed Ilya Kreymer 2015-03-31 23:31:49 -07:00
  • 990af5ee79 rewrite: add extra test for rewriting html with <script> tag that's never closed Ilya Kreymer 2015-03-31 23:30:56 -07:00
  • c137dd30b8 misc fixes: remove extra debug logging add --framed option to 'live-rewrite-server' cli app Ilya Kreymer 2015-03-31 23:08:56 -07:00
  • 199f552f73 rewrite: if no charset specified, attempt to read first 1024 bytes and set charset in header, to avoid charset warning if head insert exceeds 1024 bytes (#86) also encode head insert with detected charset, if possible chunkeddatareader: add read() function to ensure read will read up to specified length across chunks Ilya Kreymer 2015-03-31 22:38:20 -07:00
  • 30ab27bb1c indexing: support indexing (and even replay of) records where target-uri is a 'urn:' identifier (#91) for canonicalization, treat urns as is, already canonical for wburl, don't add http:// prefix if urn: prefix is present add example-wpull warc for testing Ilya Kreymer 2015-03-30 17:21:17 -07:00
  • 002fe6a338 certauth: change 'get_cert_for_host' -> 'cert_for_host' Ilya Kreymer 2015-03-30 15:47:53 -07:00
  • dd30e3f2a7 refactor: fixes for compat with latest certauth>=1.1.0 Ilya Kreymer 2015-03-30 09:38:42 -07:00
  • cda7705075 split and refactor: remove certauth.py / test_certauth.py and instead use this functionality from 'certauth' package. Also remove proxy-cert-auth cli as the 'certauth' tool supersedes this functionality. (#90). To use https proxy mode, 'pip install certauth' is required. (update travis config) Ilya Kreymer 2015-03-29 17:38:57 -07:00
  • 273176bce5 cdx: when reading cdxj, and run into non-ascii chars in url, utf-8 encode and %-encode Ilya Kreymer 2015-03-29 09:21:50 -07:00
  • fc9d659b5d loaders: switch BlockLoader to use requests instead of urllib2 Ilya Kreymer 2015-03-28 16:41:52 -07:00
  • f3a066f58b cdx-server query & zipnum: fixes for showNumPages query: - if query contained in <1 secondary index block, must read first line of cdx to determine if any matches - if no matches, don't throw 404 exception but always return json info with 0 pages Ilya Kreymer 2015-03-28 16:15:24 -07:00
  • 313a2efeac bump version to 0.9.3-dev Ilya Kreymer 2015-03-28 16:12:28 -07:00
  • c3a108b169 minor readme tweaks Ilya Kreymer 2015-03-27 09:26:53 -07:00
  • d2be90d4a1 test case tweak Ilya Kreymer 2015-03-27 08:56:43 -07:00
  • 41487dd9d4 update changelist for 0.9.2 cdx: include match type in cdx query error Ilya Kreymer 2015-03-27 07:58:51 -07:00
  • 8d686a4a98 README typos fix Ilya Kreymer 2015-03-26 19:56:09 -07:00
  • 6bbbb51f6e manager: relax template requirements, allow any collection template to also be added to shared dir Ilya Kreymer 2015-03-26 19:40:43 -07:00
  • 753300d5ed manager: use absolute path when adding warcs, (#84) Ilya Kreymer 2015-03-26 19:18:55 -07:00
  • 6ce75f80f5 replay: remove restricting to provided http Content-Length (in addition to record content-length) as it may be incorrect for variety of reasons Ilya Kreymer 2015-03-26 17:12:38 -07:00
  • 0a4e97baa1 revisit resolving: if cdx digest is missing, attempt to resolve revisits based on url + timestamp only, if warc-refers-to-target-uri and warc-refers-to-date are available, even if warc-refers-to-target-uri == target-uri (see #88 for more info) Ilya Kreymer 2015-03-26 14:20:08 -07:00
  • 85082e46bf cdxj: ensure revisit resolve is skipped if the digest is missing, as may be case in cdxj (#85) Ilya Kreymer 2015-03-26 11:11:10 -07:00
  • 2dbde35d74 bump version to 0.9.2 Ilya Kreymer 2015-03-26 09:14:27 -07:00
  • cf4b5c50dd more README.rst fixes Ilya Kreymer 2015-03-25 22:08:53 -07:00
  • e8b6a1af88 README typo fixes Ilya Kreymer 2015-03-25 21:52:38 -07:00
  • 1cfe73c9db zipnum: fix block count off-by-1 error in showNumPages query Ilya Kreymer 2015-03-25 20:43:59 -07:00
  • 72ddb54f82 Minor README tweaks Ilya Kreymer 2015-03-25 15:01:12 -07:00
  • 3efbfaa8c8 pywb_init: simplify DictChain usage, remove unused methods Ilya Kreymer 2015-03-25 13:28:45 -07:00
  • f808f34ba7 Update CHANGES for 0.9.1 Ilya Kreymer 2015-03-25 12:16:26 -07:00
  • 0e8b305adc Update README to 0.9.1, add cdx api link, fix typo Ilya Kreymer 2015-03-25 12:06:05 -07:00
  • 15d1aea5ec Update README, improve existing collection instructions. Ilya Kreymer 2015-03-25 12:02:57 -07:00
  • a6c24c2882 autoindex: undo stop/join call for indexing, breaks os x unit test.. (autoindex test may need more improvements on windows) Ilya Kreymer 2015-03-25 11:09:17 -07:00
  • 90eee03cdb fixes for windows: indexing: ensure '/' always written to cdx autoindex: improved test case, ensure threads exit with join style: fix long lines Ilya Kreymer 2015-03-25 10:56:53 -07:00
  • a7307a6d98 pywb_init: auto-collections init: inherit shared archive_paths, if any are set in main config.yaml Ilya Kreymer 2015-03-25 09:36:00 -07:00
  • 6a3ca566db zipnum: cleanup shared location resolution, in addition to .loc file, support a prefix resolver, which can be a regex replacement on the index path (default is unchanged index path) (#83) Ilya Kreymer 2015-03-25 09:07:54 -07:00
  • 1a8211d752 cdx server: add simplified matchType notation, using host* for prefix and *.host for domain matchType (#34) Ilya Kreymer 2015-03-24 19:49:54 -07:00
  • 2af5a25009 zipnum: support for pagination api! #34 and #83. cdx server now bounded by pageSize (default 10 blocks), showNumPages=true returns json indicating num pages, page=N can be set to page number 0-numPages - 1 loaders: add read_last_line() to read last line of a seekable file, used to read last line of index file when at end tests: additional test for binsearch boundary conditions zipnum: secondary index output supports json also Ilya Kreymer 2015-03-24 18:56:13 -07:00
  • 872607c07d README: move new features towards the top Ilya Kreymer 2015-03-24 10:56:56 -07:00
  • 3dd600c530 wombat: improve document.write override to write each elem at a time for body as well as head, #82 Ilya Kreymer 2015-03-24 10:46:10 -07:00
  • e5f321e32f bump version to 0.9.1 for further dev Ilya Kreymer 2015-03-23 20:21:09 -07:00
  • 57be9ca7bc tweak CHANGES.rst and INSTALL.rst for release 0.9.0 Ilya Kreymer 2015-03-23 17:38:22 -07:00
  • cda9f435a3 update README for final 0.9.0 release Ilya Kreymer 2015-03-23 17:34:16 -07:00
  • c93501e16d more changes.rst updates Ilya Kreymer 2015-03-23 16:29:18 -07:00
  • 500a441ea9 README tweaks and edits from Dragan (@despens) Ilya Kreymer 2015-03-23 16:16:16 -07:00
  • ec7a29a3ba static paths: ensure consistent renaming of static/default -> static/__pywb for bundled static path Ilya Kreymer 2015-03-23 16:15:37 -07:00
  • 5b4d12eb05 wombat: fix wombat_location.href assign when url is already rewritten, compare against current url not passed in url fixes ikreymer/pywb-webrecorder#9 Ilya Kreymer 2015-03-23 16:12:58 -07:00
  • 5020a09004 more CHANGES.rst updates Ilya Kreymer 2015-03-23 15:43:05 -07:00
  • 4aa6512b05 rewrite: fix WbUrl parsing for urls that start with a digit, eg. 1234.example.com split latest replay url from timestamped replay regex add additional rewrite tests Ilya Kreymer 2015-03-23 15:38:10 -07:00
  • 6acac67d3c rewrite: fix js rewrite again to ensure '// comments' are not rewritten as scheme-rel urls add tests Ilya Kreymer 2015-03-23 11:49:24 -07:00
  • bf0996c27a uwsgi: run with gevent loop by default, install gevent in run script Ilya Kreymer 2015-03-23 11:05:17 -07:00
  • da7532a1f8 wb-manager: rename 'migrate' to 'cdx-convert' for clarity Ilya Kreymer 2015-03-23 11:05:02 -07:00
  • 0faa6aac3e setup: set version in pywb __init__.py Ilya Kreymer 2015-03-23 11:04:41 -07:00
  • ced0ed208e Update CHANGELIST for 0.9.0 Ilya Kreymer 2015-03-23 10:48:58 -07:00
  • 7681b4a634 Update INSTALL.rst Ilya Kreymer 2015-03-23 10:36:37 -07:00
  • 317a6c6e8e Update INSTALL.rst Ilya Kreymer 2015-03-23 10:31:59 -07:00
  • 6d879c10bb README work Ilya Kreymer 2015-03-23 10:18:46 -07:00
  • 4cfeb6d958 More README tweaks Ilya Kreymer 2015-03-23 10:15:33 -07:00
  • e2623ed149 Update README.rst for latest update Ilya Kreymer 2015-03-23 09:52:07 -07:00
  • df76bc3500 cli: change cdx-server and live-rewrite-server to go through shared cli entry point Ilya Kreymer 2015-03-23 09:08:09 -07:00
  • ae363ad368 autoindex and cli: add autoindex to cli with 'wayback -a' option, #81 Ilya Kreymer 2015-03-22 23:03:39 -07:00
  • e8db31d066 cli: improve wayback cli to take optional port, threads and working dir arguments switch to waitress as default WSGI server instead of wsgiref Ilya Kreymer 2015-03-22 21:50:56 -07:00
  • 6a9a09d602 setup: add 'watchdog' as a dependency Ilya Kreymer 2015-03-22 18:24:56 -07:00
  • 733642551d manager: support autoindexing! (#91) wb-manager autoindex will use watchdog library to detect creation/updates to any warc/arc in specified collection or across all and update autoindex cdx cdx indexing: add --dir-root option to specify custom relative root dir for filenames used in cdx Ilya Kreymer 2015-03-22 17:55:38 -07:00
  • cc068f8ee8 init/import path: move DEFAULT_CONFIG to __init__ for faster shared import proxy: move certauth/openssl init to only happen if enable_https_proxy is set to make slow openssl import run only when used Ilya Kreymer 2015-03-22 17:52:07 -07:00
  • aa427bd6d0 rewrite: js regex: fix js rewrite regex to only match beginning of url for rewriting, since rewrite just adding prefix for abs urls in js use case. (avoid dealing with any invalid chars that may occur later in url) Ilya Kreymer 2015-03-21 13:58:36 -07:00
  • d31ff68b93 auto_init: resolve rel paths only on init only if not http (though should support other protocols eventually) Ilya Kreymer 2015-03-20 20:14:21 +00:00
  • b43a7f94f3 manager: add cdx -> cdxj migration tool #80, which will convert all cdxs in a directory to cdxj, removing original files migration will also recanonicalize the urlkey to surt form add migration test using non-surt, 9-field cdx (created from samples) cdxindexer: fix multi warc->multi cdx indexing options Ilya Kreymer 2015-03-19 20:52:00 -07:00
  • c5b5c8ee4b manager: fix index path to index.cdxj Ilya Kreymer 2015-03-19 13:41:48 -07:00
  • ea460bb0f0 cdxj: support cdx json output from cdx server with output='json' (not yet default) cdx field renaming: canonical cdx field name changes statuscode -> status mimetype -> mime original -> url old names still accepted for query/filtering, however, cdx json will use new names ensures consistency between .cdxj field names and names used by cdx server json output collections manager now creates .cdxj by default bump version to 0.9.0b2! Ilya Kreymer 2015-03-19 13:29:29 -07:00
  • 5221cbc64a add cdxj sample Ilya Kreymer 2015-03-19 12:49:46 -07:00
  • fe1c32c8f7 cdxj: support loading cdxj (#76) cdx obj: allow alt field names to be used (eg. mime, mimetype, m) (status/statuscode/s) in querying and reading cdx cdx minimal: (#75) now implies cdxj to avoid more formats minimal includes digest always and mime when warc/revisit tests for cdxj loading indexing optimization: reuse same entry obj for records of same type Ilya Kreymer 2015-03-19 11:20:40 -07:00
  • 73f24f5a2b manager: fixes for windows: use shutil.move instead of os.rename to allow move to existing file tests: reset workdir before deleting temp dir Ilya Kreymer 2015-03-18 13:14:05 -07:00
  • 3f084625b0 indexing: cdx json support (#76): use OrderedDict when indexing json to ensure consistent ordering skip empty or '-' fields add tests for cdx json Ilya Kreymer 2015-03-17 21:11:35 -07:00
  • 6f9808f090 indexing: refactor ArchiveIndexEntry to be a dict instead of adding attrib. Allows for better track of indexed properties. Add json-based cdx! (cdxj) output where all fields except url + key are in json dict. Support for both minimal and full json cdx, tracked via #76 Ilya Kreymer 2015-03-08 12:01:24 -07:00
  • bfe590996b auto-config: add support for loading from root ./static/ directory, available under /static/__shared/ path default path changed from /static/default -> /static/__pywb/ rename wayback-manager to wb-manager Ilya Kreymer 2015-03-17 19:05:39 -07:00
  • 0b8fd1e82e fix readme typos Ilya Kreymer 2015-03-17 09:28:46 -07:00
  • 0345e36daa readme: improve samples section Ilya Kreymer 2015-03-17 01:13:10 -07:00
  • 5b7215a6b1 readme tweaks and typo fixes Ilya Kreymer 2015-03-17 01:06:06 -07:00
  • 32ed176988 Update CHANGELIST for 0.9.0b1 Ilya Kreymer 2015-03-17 00:39:24 -07:00
  • e9e0412e1d More README tweaks Ilya Kreymer 2015-03-17 00:28:14 -07:00
  • a60a735bd0 Update INSTALL.rst for 0.9.0 Ilya Kreymer 2015-03-17 00:14:10 -07:00
  • ab89ecd445 Brand new README for 0.9.0! Ilya Kreymer 2015-03-17 00:01:32 -07:00
  • 4b45e789df templates: ensure shared templates are loaded from root templates/ subdir manager: add shared templates to templates subdir, not root dir #55 and #74 Ilya Kreymer 2015-03-16 19:57:28 -07:00