mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 00:03:28 +01:00

Commit Graph

  • 304ddbec84 Support for new UI, as per #16 * Refactor views class to support more Jinja2 views (J2Template) * Add a home page, collection search page, and error pages, all optional * all exceptions appear on error page * wbrequest supports a request with an empty or / wb_url Ilya Kreymer 2014-01-31 10:04:21 -08:00
  • 57fe9515db - support for running uwsgi with virtualenv - text changes in banner - some info about testing in README Ilya Kreymer 2014-01-29 17:23:19 -08:00
  • 467d880681 update README Ilya Kreymer 2014-01-29 15:15:39 -08:00
  • 53eb5072ec more README tweaks Ilya Kreymer 2014-01-29 15:12:57 -08:00
  • 28618c69c6 update query.html, listing unique timestamps update README Ilya Kreymer 2014-01-29 15:07:45 -08:00
  • e7b70ae496 fix links in README Ilya Kreymer 2014-01-29 12:08:51 -08:00
  • f45234f39b README tweaks Ilya Kreymer 2014-01-29 12:07:33 -08:00
  • a6cfe9a87b update README.md Ilya Kreymer 2014-01-29 12:01:03 -08:00
  • 937fc7229e update README, fix typo Ilya Kreymer 2014-01-29 02:12:58 -08:00
  • 9cde058ccf check for osx uwsgi path and use that, otherwise run 'uwsgi' Ilya Kreymer 2014-01-29 02:12:54 -08:00
  • 84ffec9b8d update README.md Ilya Kreymer 2014-01-29 01:52:30 -08:00
  • eb9cef9e28 update README.md for beta!!! Ilya Kreymer 2014-01-29 01:36:31 -08:00
  • 7a20d26d5f support non-surt ordered cdx add unsurt() util func and surt_ordered init param to LocalCDXServer test make_best_resolver() Ilya Kreymer 2014-01-29 00:58:37 -08:00
  • 9a3449dfd5 add pyyaml to dependency Ilya Kreymer 2014-01-29 00:04:54 -08:00
  • 411e7fe8a3 cleanup pywb_init, work on documenting config.yaml! Ilya Kreymer 2014-01-29 00:03:24 -08:00
  • 43a46b373d move sample/test data to ./sample_archive/warcs and ./sample_archive/cdx pywb_init now driven by config.yaml! (#14) Ilya Kreymer 2014-01-28 22:03:01 -08:00
  • 35f7cb0477 new-feature: support jinja2 template generated banner template receives cdx and wbrequest default template inserts capture time into banner Ilya Kreymer 2014-01-28 20:18:47 -08:00
  • 6de794a4e1 style fixes: convert camelCase func and var names to 'not_camel_case' WbHtml -> HTMLRewriter ArchivalUrl -> WbUrl Ilya Kreymer 2014-01-28 19:37:37 -08:00
  • c0f8edf517 more refactoring: separate top-level handlers (WBHandler) from views (html, text) Add CDXHandler for interfacing with cdx server directly, #12 Ilya Kreymer 2014-01-28 17:23:44 -08:00
  • 1a234f2953 refactor: remove intermediate query object. rename query -> views wbhandler queries index, replayer and renders via view Ilya Kreymer 2014-01-28 16:41:19 -08:00
  • a83d527702 add surt to dependency list Ilya Kreymer 2014-01-27 22:07:27 -08:00
  • a6458b056f some tweaks on transfer-encoding: always remove and serve unchunked (should allow the front-end server to rechunk as needed) Ilya Kreymer 2014-01-27 22:05:49 -08:00
  • 8732499dd5 - cdx server bootstrap configured, #12 - pywb_init module inits from ./test directory Ilya Kreymer 2014-01-27 21:46:38 -08:00
  • c55bdf0e1f -binsearch: add tests, support both prefix and exact loading, for #11 -cdx server first pass for #12: implement cdx parsing and transforming -operations supported: merge sort, regex filter, resolve revisits, closest sort, reverse sort, timestamp collapse timestamp parsing utils Ilya Kreymer 2014-01-27 17:02:48 -08:00
  • e1b669fdea improved customization: can setup pywb_init.pywb_config() config, or specify custom init module <initmodule>.py_config() by setting PYWB_INIT=<initmodule> fix run.sh to support testing with custom mount point Ilya Kreymer 2014-01-24 12:25:27 -08:00
  • 44f68158a9 update README and comments Ilya Kreymer 2014-01-24 01:17:18 -08:00
  • 1033feb2f8 use sample settings if driver file not found Ilya Kreymer 2014-01-24 00:59:15 -08:00
  • 391f3bf81d remove pycdx_server pkg for now, move binsearch into pywb package, update setup.py Ilya Kreymer 2014-01-24 00:54:48 -08:00
  • 03b6938b9c referer fallback: check for non empty SCRIPT_NAME when parsing referrer Ilya Kreymer 2014-01-24 00:53:55 -08:00
  • 94326dafc1 html_rewriter: default attrs without value to empty str value, instead of no value Ilya Kreymer 2014-01-24 00:49:51 -08:00
  • 5987a0c047 update README.md! Ilya Kreymer 2014-01-23 16:30:37 -08:00
  • cbf0e23ad9 add .travis.yml for Travis CI! Ilya Kreymer 2014-01-23 16:20:51 -08:00
  • e95e17b9e6 pycdx_server initial binsearch module, with support exact match iterator! fix html_rewriter missing ; on entities js rewriter: only rewrite full document.domain PathIndexPrefixResolver using binsearch on path index, for #9 resolvers moved to replay_resolvers.py Ilya Kreymer 2014-01-23 01:38:09 -08:00
  • b237b144ff further refactor streaming of responses related to #13: always create a generator from response stream, and if buffering, read entire generator into temp buffer remove duplicate reading logic Ilya Kreymer 2014-01-22 17:55:55 -08:00
  • 2d0cb5745d enable bulk doctest testing via nosetests --with-doctest as well as individual doctests add utils.enable_doctests() func which checks if executing app is nosetests (is there a better way?) Ilya Kreymer 2014-01-22 15:28:01 -08:00
  • 7722014a96 Cleanup rewrite interfaces to address #13 All rewriters can support either buffered or streaming mode. In buffered mode, the full text content is written into a buffer and served with a Content-Length in streaming mode, text is streamed as it is rewritten and no Content-Length is written Default is to stream the response Ilya Kreymer 2014-01-22 14:03:41 -08:00
  • 33c135b337 Merge pull request #7 from jcushman/master ikreymer 2014-01-21 19:23:03 -08:00
  • 6581f54fad Robust chunked data exception handling. Jack Cushman 2014-01-21 20:00:52 -05:00
  • a1cd40fba1 support replay of records that have Transfer-Encoding: chunked, but were not actually rewritten to the warc as chunked. Attempt to parse chunk length, and if failed, fallback to treating record as not chunked Ilya Kreymer 2014-01-20 23:06:45 -08:00
  • 8fd10673e8 refactor: cleanup the revisit resolving logic in replay also, update documented logic on wiki at: https://github.com/ikreymer/pywb/wiki/PyWb-Record-Lookup-and-Revisits Ilya Kreymer 2014-01-20 17:52:14 -08:00
  • 9a28a2ec6e Merge pull request #6 from jcushman/master ikreymer 2014-01-20 13:08:35 -08:00
  • 903583c3d7 Handle ArchivalUrl subclasses. Jack Cushman 2014-01-20 14:12:59 -05:00
  • 9ff3fc300b Fix #5, bringing back customParams optional params sent to cdx server Rename archivalrouter.MatchRegex -> archivalrouter.Route, supporting regex/prefix matching add redir_to_exact to turn off redirect to exact timestamp in RewritingReplayHandler update README Ilya Kreymer 2014-01-20 10:50:06 -08:00
  • 2056f56112 Merge 81d34c6423dee940d5663053f0ad62e0ae0c7e1a into 80b2585d22a3362d00d69273e2ca4d3381d442cd jcushman 2014-01-20 10:32:38 -08:00
  • 80b2585d22 Should resolve #4 -- supports pywb running as a non-root app * Instead of relying on REQUEST_URI, pywb constructs a REL_REQUEST_URI, from PATH_INFO + QUERY_STRING. SCRIPT_NAME auto-added to prefix * MatchPrefix is now superseded by MatchRegex, which can match a plain string -- collId defaults to the full match * Added optional archivalurl_class to router to allow for customized ArchivalUrl implementations to be specified * run.sh can test on a non-root mountpoint, eg. ./run.sh "/approot" Ilya Kreymer 2014-01-19 21:13:48 -08:00
  • 2e4d78d079 request_uri: only generate REQUEST_URI manually if not provided by wsgi framework only encode chars that are not allowed in path segment, per http://tools.ietf.org/html/rfc3986#section-3.3 Ilya Kreymer 2014-01-19 16:51:17 -08:00
  • 628c130261 Merge pull request #3 from jcushman/master ikreymer 2014-01-19 16:00:13 -08:00
  • 81d34c6423 Re-enable customParams. Jack Cushman 2014-01-19 16:56:09 -05:00
  • d8c47415c0 Merge branch 'master' of https://github.com/jcushman/pywb Jack Cushman 2014-01-19 16:25:17 -05:00
  • 595c9b0c3c wsgiref compatibility fixes. Jack Cushman 2014-01-19 15:08:14 -05:00
  • 6cb1743163 Merge branch 'master' of github.com:ikreymer/pywb into work Ilya Kreymer 2014-01-19 12:31:53 -08:00
  • 354040a7e0 support for url-agnostic dedup, eg loading payload from a different url than the revisit Ilya Kreymer 2014-01-19 12:31:19 -08:00
  • 3f04f63a3f wsgiref compatibility fixes. Jack Cushman 2014-01-19 15:08:14 -05:00
  • ab955c411b Merge pull request #2 from jcushman/master ikreymer 2014-01-19 12:05:57 -08:00
  • c9d0b0ba7b Handle transfer-encoding:chunked; misc. replay bugs. Jack Cushman 2014-01-18 21:32:49 -05:00
  • 7ce6d0d22b first pass on html rendering via jinja, support for query (cdx) rendering Ilya Kreymer 2014-01-17 16:24:36 -08:00
  • bcc9588c00 * archivalrouter: to take a list of handlers, currently MatchPrefix and MatchRegex. handler returns a single response (no chaining for now) * rewriting: don't rewrite anchor only urls * perf: add a very basic profiler in WBHandler for testing Ilya Kreymer 2014-01-16 20:33:51 -08:00
  • 8ff2f2fc0c update gitignore Ilya Kreymer 2014-01-06 21:57:33 -10:00
  • bc104321c4 Update README.md Ilya Kreymer 2014-01-04 06:12:27 +00:00
  • c60493bfdc update README.md Ilya Kreymer 2014-01-04 05:55:17 +00:00
  • c4457abc4c Update README Rename FullHandler -> WBHandler Add additional comments! Ilya Kreymer 2014-01-03 21:44:20 -08:00
  • d820a8c06a add some comments, make charset parsing lower() Ilya Kreymer 2014-01-03 17:40:20 -08:00
  • c255f4e47f fix typos Ilya Kreymer 2014-01-03 17:04:15 -08:00
  • 246b3fba43 cleanup, setup runnable testwb, or pluggable 'globalwb' Ilya Kreymer 2014-01-04 00:21:52 +00:00
  • c3767cd31b fix css url parsing typo always default to utf-8 if chardet thinks ascii tweak banner Ilya Kreymer 2014-01-03 21:38:18 +00:00
  • 1e03cad25c update setup.py, static files Ilya Kreymer 2014-01-03 13:06:27 -08:00
  • 2357f108a3 rename rewriters header_rewriter added! support for encoding detection various fixes xmlrewriter Ilya Kreymer 2014-01-03 13:03:03 -08:00
  • edbcaaf108 big update: refactor archiveloader, StatusAndHeaders obj and StatusAndHeaders parser remove dependency on hanzo Add sample example.warc.gz for very basic unit testing Ilya Kreymer 2014-01-02 20:21:18 -08:00
  • cca9071c53 minor tweaks, increase num closest searched, upper case url check css remove fixed pos Ilya Kreymer 2013-12-31 21:01:18 +00:00
  • d9930322f1 support utf-8 (so far) support protocol-agnostic prefix // failedFile list for warc loading Ilya Kreymer 2013-12-31 00:18:12 +00:00
  • b8c4a453c9 wbhtml: add utf-8 tests Ilya Kreymer 2013-12-29 22:42:29 -08:00
  • 997dc5df0f fixes! Fix typos, in html parsing, fix base, support attrs w/o values Ilya Kreymer 2013-12-30 03:03:33 +00:00
  • a84ec2abc7 first iteration of archival mode working w/ banner insertion!! Ilya Kreymer 2013-12-28 17:39:43 -08:00
  • 16f458d5ec archiveloader: Support for loading warc/arc records using hanzo parser (for record header parsing only) ReplayHandler: load replay from query response, find best option basic support for matching url, checking self-redirects! Ilya Kreymer 2013-12-28 05:00:06 -08:00
  • 787dfc136e wbhtml: add script and style doctests override close() to handle open <script> and <style> tags by forcing an end tag, otherwise parser does not process the remainder Ilya Kreymer 2013-12-24 22:51:33 -08:00
  • 6050ea1ffa standard JS and CSS rewriting working, with generic regex rewriter which supports extensions! Ilya Kreymer 2013-12-23 23:57:13 -08:00
  • 3a896f7cd3 move norewrite prefixes down to ArchivalUrlRewriter (was in html parser) Add new general regex match work, (several attempts, though last one is simplest/best!) Ilya Kreymer 2013-12-23 15:52:33 -08:00
  • 37e57f7013 html parser fleshed out! Ilya Kreymer 2013-12-22 18:12:05 -08:00
  • fbf29e80d6 add html parser! urlrewriter support for changing modifier Ilya Kreymer 2013-12-20 19:11:52 -08:00
  • 072befe3c8 archivalrouter: support handler chaining, using call convention and pass prev response Ilya Kreymer 2013-12-20 15:10:12 -08:00
  • 4cf4bf3bbb add wburlrewriter, ReferRedirect uses the rewriter more refactoring, ReferRedirect moved into archivalrouter module wbrequest: parses from uri directly, keeps track of wburl and prefix Ilya Kreymer 2013-12-20 14:54:41 -08:00
  • 0a2b16407d better exception handling, specific status codes for exceptions, detect access control and not found exceptions more consistently Ilya Kreymer 2013-12-19 12:06:47 -08:00
  • ebc76c0791 update readme Ilya Kreymer 2013-12-18 18:57:55 -08:00
  • c8d2271e8a archiveurl: add support for url_query, format modifier for more unit tests archivalrouter: flesh out router separately indexreader: RemoteCDXServer reader unit tests for req/resp wbapp -- cdx output for query, urlquery, replay and latest_replay! Ilya Kreymer 2013-12-18 18:52:52 -08:00
  • 5d42cc0cac rename aurl -> archiveurl, add default scheme, test for empty url Ilya Kreymer 2013-12-13 15:43:07 -08:00
  • 6b78f59e49 Merge branch 'master' of github.com:ikreymer/pywb Ilya Kreymer 2013-12-13 15:22:06 -08:00
  • 27b35f31e8 add basic wsgi app for parsing archivalurls, fallback on a referrer based redirect Ilya Kreymer 2013-12-13 15:20:13 -08:00
  • d546fcc82c Merge pull request #1 from nlevitt/setup.py ikreymer 2013-12-09 20:26:14 -08:00
  • 89481f162e setuptools config Noah Levitt 2013-12-09 11:58:50 -08:00
  • b10f0cd041 switch to IRI Ilya Kreymer 2013-12-08 19:44:14 -08:00
  • 10bf465367 add aurl.py with a few tests Ilya Kreymer 2013-12-08 19:31:58 -08:00
  • 0dc56ee074 Initial commit ikreymer 2013-12-08 19:30:31 -08:00