1
0
mirror of https://github.com/webrecorder/pywb.git synced 2025-03-15 00:03:28 +01:00

161 Commits

Author SHA1 Message Date
Hellseher
b44c93bf6e
requirements: Adjust installation of Py3AMF module. (#920)
Move Py3AMF from setup.py load_requirements to requirements.txt

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-11-07 12:09:35 -05:00
Ed Summers
b4955cca66
Upgrade dependencies (#839)
- Update and pin dependencies to specific versions that support Python 3.7-3.11
- Replace deprecated werkzeug.pop_path_info with wsgiref.shift_path_info
- Use the latest httpbin from psf/httpbin
- Remove unused flask test dependency
- Drop Python 2 and Python <3.7 support
- Ensure greenlet 2 is used for now, as psf/httpbin doesn't yet work with greenlet 3

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-04-02 17:16:50 -04:00
Ilya Kreymer
2ccd8eb2c3
tests run improvements: update from python setup.py test -> tox (#754)
* tests cleanup:
- move test requirements to test_requirements.txt to share between setup.py and tox.ini
- README: update to recommend using 'tox --current-env' for running tests locally
- replaces #741

* test tweaks:
- don't require i18n to import locmanager, instead set flag on load (to avoid breaking tox / pytest)
- don't add werkzeug to test requirements
2022-08-31 16:04:55 -07:00
Ilya Kreymer
cff2a9efc5
more locale fixes: (#653)
* more locale fixes:
- fix running wb-manager w/o i18n dependencies
- dependencies: move babel to extra_requires, show warning if locale used or 'wb-manager i18n' called and i18n are not installed
- not found page: don't language switch header banner on nested content frame
2021-06-18 14:58:21 -07:00
Ilya Kreymer
f7bd84cdac
Localization / doc fixes (#650)
* localization / doc fixes:
- add missing header.html
- docs: support 'i18n' extra, mention in docs
- use 'default_locale' for html lang tag
- access control docs: fix documentation for adding user with acl command

* localization: add compile_catalog after extract as well to simplify updates for identity (en) locale

* ui: 
- include locale in home page collection listing
- keep locale on error page home link

* autoescape:
- ensure jinja2 templates are autoescaped to prevent xss issues (thanks @sebastian-nagel for suggested fix)
- ensure banner inserts are not double-escaped
- update tests for template autoescaping

* update CHANGES.rst

* bump version to 2.6.0b1
2021-06-14 17:09:00 -07:00
Ilya Kreymer
0eedd1502f remove fakeredis from tests_require, fixes #644 2021-06-09 12:41:08 -07:00
Jon Betts
ad9b431eaf
Update the classifiers to match the Python factors in tox.ini (#634)
This advertises the Python support that is already in place.
2021-04-26 21:00:18 -07:00
Ilya Kreymer
54d8bccf4a setup/pypi: drop 'text/rst' as pypi doesn't like it 2020-07-10 20:54:08 -07:00
Ilya Kreymer
bb1c2a3ec9
Fix logo (#575)
* pypi fixes: fix README to use logo w/o raw to work on pypi, resize logo

* update changelist, use abs path
2020-07-10 20:46:18 -07:00
Ilya Kreymer
f0b9d5b8e8
Rewriting fix for DASH FB and document.write (#529)
* rewrite fixes:
- dash rewrite fix for fb: when rewriting, match quoted '"dash_prefetched_representation_ids"' as well as w/o quotes,
update tests to ensure rewriting both old and new formats
- wombat update to fix #527: ensure document.write() doesn't accidentally remove end-tag if end-tag was not lowercase (see webrecorder/wombat#21)

* tests: fix recorder cookie filtering test, use https://www.google.com/ for testing

* appveyor: fix appveyor builds
2020-01-11 10:44:49 -08:00
John Berlin
61b6ff21e1
added missing comma to setup.py's tests_require list
removed package.json from project as it is no longer required
removed npm install command from .travis/install.sh
2019-09-04 13:41:56 -04:00
Ilya Kreymer
b8124e3931
lxml query parsing fix: (addressing part of ukwa/ukwa-pywb#38)
- ensure lxml-enabled parsing in XmlQueryIndexSource works by passing the raw bytestring instead of unicode text to the parser
- tests: add lxml and non-lxml parsing tests to test_xmlquery_indexsource.py, add lxml to test install
- misc fixes: fix typo in banner.html, update gevent api to support latest gevent
2019-09-03 18:13:19 -04:00
Ilya Kreymer
11610f6e04
2.3 Changelist + Docs Update (#487)
* docs: update changelist and add docs about new wombat

* update to latest wombat

* update wombat, fix pytest cmdline in setup
2019-07-09 17:50:57 -07:00
Ilya Kreymer
b90ee427cf
Docker Improvements (#446)
* Misc improvements, including fixes from @funkyfuture:
- Dockerfile: Reduces number of created layers and source contents
- Support for automatic collection creation if INIT_COLLECTION is defined
- Add entry point script docker-entrypoint.sh
- update to latest python (3.7.2 currently)
- additions to .dockerignore
- setup.py and requirements cleanup (just use plain 'gevent' requirement)

* docker-entrypoint.sh improvements:
- before running cmd, match uid/gid to that of volume dir (specified via $VOLUME_DIR, defaulting to /webarchive)
- if volume is owned by root (default if none mounted), just run as root
- if volume is owned by different user, create/update user 'archivist' to match the uid/gid of $VOLUME_DIR, then run cmd as 'su archivist'
2019-02-27 09:13:38 -08:00
Ilya Kreymer
c86add9b40 setup: use 'fakeredis<1.0' until fully ported to new fakeredis version 2019-01-27 14:26:50 -05:00
John Berlin
ec0df7b9ae Refactor of auto-fetch worker system with support for proxy mode, fixes https://github.com/webrecorder/pywb/issues/371: (#379)
- Split wombat and auto-fetch worker into two files (proxy mode and non-proxy mode)
- Renamed preservationWorker to autoFetchWorker in order to better convey what it does
- Root config file control over including wombat and auto-fetch worker in proxy or non-proxy mode
- Added additional proxy mode + auto-fetch worker only route for fetching the auto-fetch worker code nicely for CORS
- templateview: add 'tobool' formatter to more cleanly format python bools to JS 'true'/'false'
- proxy options: config and command line: 
  'use_auto_fetch_worker' and '--proxy-with-auto-fetch'
  'use_wombat' and '--proxy-with-wombat'
- head_insert.html: only include wombat in proxy mode when use_wombat or use_auto_fetch_worker are set.
- wombatProxyMode.js: slimmed down wombat for proxy mode only including auto-fetch support.
- more consistent naming: rename 'preserveWorker' and 'autoArchive' to 'auto-fetch'

Updated tests:
- test_wbrequestresponse.py: added tests covering constructor defaults, _init_derived, options_response, json_response, encode_stream, text_stream
- test_auto_colls.py: fixed broken test test_more_custom_templates, reason using ujson now not json so spacing was off
- test_proxy.py: updated existing tests to reflect splitting wombat into proxy and non-proxy mode, added tests covering auto-fetch worker specific endpoints in proxy mode
removed duplicate addons key in .travis.yml
- test_cli.py: updated to properly test the cli with these changes
added ultrajon dep to tests_require in setup.py to reflect its usage by wbrequestresponse.py

Fully documented:
- cli.py
- frontendapp.py
- templateview.py
- wbrequestresponse.py

Removed duplicate addons key in .travis.yml
Added ultrajson dependency to tests_require in setup.py to reflect its usage by wbrequestresponse.py

Fixes #371
2018-10-03 16:27:49 -04:00
humberthardy
dc883ec708 Handle amf requests (#321)
* Add representation for Amf requests to index them correctly

* rewind the stream in case of an error append during amf decoding. (pyamf seems to have a problem supporting multi-bytes utf8)

* fix python 2.7 retrocompatibility

* update inputrequest.py

* reorganize import and for appveyor to retest
2018-05-21 19:29:33 -07:00
Ilya Kreymer
bef63b4c6c
Local httpbin tests + LiveIndexSource improvement (#318)
tests and LiveIndexSource improvements:
- run local instance of httpbin in separate gevent server for any httpbin.org requests
- LiveIndexSource: has overridable get_load_url(), also use 'load_url' for HEAD check, remove unused proxy_url
- test update: add HttpBinLiveTests which patches LiveIndexSource.get_load_url() to redirect httpbin requests to local instance
- test update: just use httpbin.org/get instead of httpbin.org/anything, unsupported in older version (0.5.0) require for windows support
- setup: add 'httpbin==0.5.0' to test requires, remove jinja2 pin to old version
2018-04-28 18:20:37 -07:00
Ilya Kreymer
e2fa14bc2d tests: add 'importorskip' for tests that require 'extra' dependencies, (youtube-dl, socks), addresses #270
setup: remove 2.6 classifier, update repo path
bump to 2.0.1
2018-01-30 18:26:53 -08:00
Ilya Kreymer
610b6414a3 setup: fix for pypi build: remove obsolete 'provides=', package init from tests_disabled 2018-01-30 08:37:51 -08:00
Ilya Kreymer
8304762cf5 setup.py: exclude tests_disabled from install 2018-01-30 08:17:02 -08:00
Ilya Kreymer
34902df80c docs work and misc:
- set depth in main toc to 3
- add info on cli apps in apps.rst
- fix typos, update links
setup: add 'pywb' cli script to be same as 'wayback'
appveyor: remove coveralls
2018-01-29 18:36:14 -08:00
Ilya Kreymer
85f093e356
Fix Query UI (#278)
* query fix:
setup: ensure all static files included in package_data recursively to add new query assets
test: add test for nested static asset
query: correctly display 0 captures, 'capture' and 'captures' text moved to Text block
2018-01-15 19:54:15 -08:00
Ilya Kreymer
61f825330c Docs Update (#256)
* docs work:
- write warcserver and beginnings of recorder docs!
- add cdx api docs!
- add indexing docs
- refactor architecture section, remove readme
- update readme with better new features list, work-in-progress list
- add placeholder docs for apps, indexing
- remove unused readme
- update README with better docs link, features
2017-10-18 10:12:44 -07:00
Ilya Kreymer
903fa6c6a2 renaming pass:
- webagg->warcserver
- setup.py: packages and entry points
- templateview param: 'webrec.template_params' -> 'pywb.template_params'
2017-10-01 10:09:17 -07:00
Ilya Kreymer
582966bb2f rewriterapp: add 'matchType=exact' to avoid edge case issues
setup: fix cdx-indexer cli entry point
2017-06-20 20:42:03 -04:00
Ilya Kreymer
58f39f0558 setup: update to warcio==1.2
add ensure_http_headers=True when reading WARC records
tests: fix pytest warnings, use webtest.TestApp instead of TestApp
2017-04-29 13:47:54 -07:00
Ilya Kreymer
26662f7df3 setup: generate current git_hash into autogenerated 'pywb.git_hash' file, add to .gitignore 2017-03-28 10:31:43 -07:00
Ilya Kreymer
544df71302 setup: use latest webtest again
tests: use geventwebserver for LiveServerTests instead of separate process
2017-03-10 11:19:27 -08:00
Ilya Kreymer
d4321792b7 tests: convert test_inputreq to use werkzeug (same as the app), remove bottle from test dependencies 2017-03-08 23:09:19 -08:00
Ilya Kreymer
e86e3e6d32 build process: simplify build process by moving essential deps to requirements.txt, and extras to extra_requirements.txt
setup.py just loads from requirements.txt
Dockerfile pip installs requirements, then extra requirements for improved cacheing
travis runs setup install, then installs extra requirements
2017-03-08 17:05:29 -08:00
Ilya Kreymer
0784e4e5aa spin-off warcio!
update imports to point to warcio
warcio rename fixes:
- ArcWarcRecord.stream -> raw_stream
- ArcWarcRecord.status_headers -> http_headers
- ArchiveLoadFailed single param init
2017-03-07 10:58:00 -08:00
Ilya Kreymer
2a7da54be9 Merge branch 'master' into new-pywb 2017-02-17 18:49:01 -08:00
Ilya Kreymer
31bf7a47f1 new-wayback cli script, using new FrontEndApp (rewriting) + AutoConfigApp (config-driven aggregator)
support for dynamic collections: check all .cdxj files in /<coll>/indexes/*.cdxj when accessing /<coll>
support for fixed routes: specified in config.yaml as per https://github.com/ikreymer/pywb/wiki/Distributed-Archive-Config
werkzeug routing in FrontEndApp: default query, replay, search pages working
route listing: /_coll_info.json for listing fixed + dynamic routes
autoindexing enabled, indexing WARCs added to archives directory to .cdxj index
Addresses #196
2017-02-17 18:04:07 -08:00
Ilya Kreymer
60f3c0a213 setup: further update trove classifiers, add python 2 and 3, switch to production stable, closes #208 2017-02-16 15:12:22 -08:00
Ilya Kreymer
77960c1311 setup: bound webtest for 2.6 support 2017-02-16 14:32:14 -08:00
Ilya Kreymer
7f95396be0 setup: use jinja2<2.9 for now 2017-02-16 11:29:30 -08:00
Ilya Kreymer
58b141bd53 add python 3 classifiers (#208) 2017-02-16 11:02:53 -08:00
Ilya Kreymer
9773eba47d setup reqs: use webassets==0.12.1 (with pyinstaller support), remove dependency on custom branch 2017-01-26 01:37:57 -08:00
Ilya Kreymer
84796ba810 setup req: fix jinja<2.9 for now due to issues in 2.9+ 2017-01-26 01:14:10 -08:00
Ilya Kreymer
2d54bb87be setup.py: ensure gevent monkey-patch is called before running tests with python setup.py test 2017-01-26 00:37:35 -08:00
raffaele messuti
524d9bfd26 portalocker for file locking check instead of fcntl. more portable on windows 2016-12-26 10:27:20 +01:00
Ilya Kreymer
0e414acfda setup: remove pyamf as default dep for now 2016-12-21 17:15:41 -08:00
Ilya Kreymer
3b82416ad3 setup: add specific dependencies for webassets, pyamf 2016-12-21 16:11:48 -08:00
Ilya Kreymer
50a3353da3 wsgi server: default to gevent-based wsgi server for all cmd line server apps, add -s command for specifying server #201
cli: add 'webagg-server' cli command for running new webagg system
tests: fix cli test for gevent server
2016-12-09 16:46:33 -08:00
Ilya Kreymer
e37900b9c6 tests: add test dependency, remove 2.6 from travis 2016-11-11 11:03:16 -08:00
Ilya Kreymer
8765de4fe7 refactor: updated dependencies, remove watchdog, add gevent and webassets
update tests, tests should pass for python 2 and 3!
2016-11-11 10:32:19 -08:00
Ilya Kreymer
6b4b038471 refactor: fix pywb.webagg package paths
all webagg tests working!
move testdata cdxj into sample_archive, remove rest (duplicates) #200
2016-11-08 14:30:09 -08:00
Ilya Kreymer
457a1a564c bufferedreader: support brotli decompression
rewrite: handle Content-Encoding: br using brotli decompressor
setup: add brotlipy as dependency
2016-06-15 01:37:29 -04:00
Ilya Kreymer
0f530a3e0e dependencies: remove pyamf, update to latest surt (0.3.0) 2016-06-12 00:44:52 -04:00