Noah Levitt
|
320df0565e
|
support "soft limits" which result in a different response code (430) than regular (hard) limits (which result in a 420)
|
2016-06-27 16:07:20 -05:00 |
|
Noah Levitt
|
9df2ce0fbe
|
convert command-line executables to entry_points console_scripts, best practice according to Python Packaging Authority (eases testing, etc)
|
2016-06-27 14:46:42 -05:00 |
|
Noah Levitt
|
84767af0f6
|
check if already started/stopped in WarcproxController.{start,shutdown}, fix bugs
|
2016-06-27 14:36:06 -05:00 |
|
Noah Levitt
|
6410e4c8c7
|
reorganize WarcproxController.run_until_shutdown, moving parts of it into new start() and shutdown() methods, for easier integration into a separate python program
|
2016-06-27 14:18:21 -05:00 |
|
Noah Levitt
|
2fe0c2f25b
|
support for tallying substats of a configured bucket by host, and enforcing host limits using those stats, with tests
|
2016-06-24 20:04:27 -05:00 |
|
Noah Levitt
|
4bb3556709
|
implement enforcement of Warcprox-Meta header block rules; includes automated tests
|
2016-05-10 23:11:47 +00:00 |
|
Noah Levitt
|
4fd17be339
|
started adding some docstrings, and moved some of the more general man-in-the-middle recording proxy code from warcproxy.py into mitmproxy.py
|
2016-05-10 01:11:17 -07:00 |
|
Noah Levitt
|
0809c78486
|
add Strict-Transport-Security to list of http response headers to swallow, to avoid some problems with HSTS when browsing through warcprox (doesn't solve the case of preloaded HSTS though)
|
2016-04-08 23:26:20 -07:00 |
|
Noah Levitt
|
6f10e2708d
|
disable tor test to give travis build a chance to pass tests (waiting on https://github.com/travis-ci/apt-package-whitelist/issues/1753)
|
2016-04-06 19:39:28 -07:00 |
|
Noah Levitt
|
2c65ff89fa
|
add license headers
|
2016-04-06 19:37:55 -07:00 |
|
Noah Levitt
|
6490583dd0
|
this brozzler branch will be warcprox 2.0, today it's 2.0.dev4
|
2016-03-18 02:07:29 +00:00 |
|
Noah Levitt
|
42a81d8f8f
|
fix bug where two warc-payload-digest headers were written to revisit records
|
2016-03-15 06:27:21 +00:00 |
|
Noah Levitt
|
910cd062ee
|
bump version number
|
2016-03-08 22:55:42 +00:00 |
|
Noah Levitt
|
89f965d1d3
|
use kafka-python 1.0 recommended api; use kafka capture feed specified in warcprox-meta header, if any
|
2016-03-03 18:58:52 +00:00 |
|
Noah Levitt
|
ee3ee5d621
|
call this 1.5.0.dev1 for now
|
2016-02-25 01:36:36 +00:00 |
|
Noah Levitt
|
00dc9eed84
|
new option --onion-tor-socks-proxy, host:port of tor socks proxy, used only to connect to .onion sites
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
2ecd2facd9
|
surt 0.3b2 is in pypi now, no need for devpi
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
95e611a5d0
|
update stats in RethinkDb asynchronously, since profiling shows this to be a bottleneck in WarcWriterThread (which in turn makes it a bottleneck for the whole app)
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
a41c426b0a
|
giving up on using git revision in version number :( latest issue is when installing a package that calls git to compute a version number, but cwd is some other git project, you get the wrong thing
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
97a30eb319
|
back to setup.py now that we have devpi
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
c430f81883
|
some refactoring to prep for big rethinkdb capture table
|
2016-01-26 18:47:08 -08:00 |
|
Noah Levitt
|
e66dc3a9fb
|
rethinkdb dedup
|
2016-01-26 18:46:13 -08:00 |
|
Noah Levitt
|
274a2f6b1d
|
refactor warc writing, deduplication for somewhat cleaner separation of concerns
|
2016-01-26 18:45:36 -08:00 |
|
Ilya Kreymer
|
574f1f3f52
|
remove certauth.py and use the separate certauth package release
|
2015-03-30 09:32:10 -07:00 |
|
Noah Levitt
|
016749a822
|
bump version since api has changed as a result of reorganization
|
2015-03-18 16:33:07 -07:00 |
|
Noah Levitt
|
5f84b061f3
|
make it work with python 2.7 again
|
2015-03-18 16:29:44 -07:00 |
|
Noah Levitt
|
b34edf8fb1
|
split into multiple files
|
2014-11-15 03:20:05 -08:00 |
|
Noah Levitt
|
16f21b2e76
|
https://github.com/internetarchive/warcprox/issues/9 record warcprox version in warcinfo metadata, and add --version command line option
|
2014-08-08 12:10:45 -07:00 |
|
Noah Levitt
|
b434e33fdd
|
bump version number for updated submission to pypi
|
2014-08-05 19:04:07 -07:00 |
|
Noah Levitt
|
ccbe3522c5
|
timestamps in utc!
|
2014-08-01 16:00:53 -07:00 |
|
Kelsey Hawley
|
ae3a039d95
|
updated setup.py to use pytest (for compatibility with dump-anydbm tests)
|
2014-01-17 12:13:39 -08:00 |
|
Noah Levitt
|
3bc4294227
|
oops, adding missing comma
|
2013-12-19 17:08:13 -08:00 |
|
Noah Levitt
|
115b7c03ee
|
add some classifiers
|
2013-12-19 17:03:40 -08:00 |
|
Noah Levitt
|
f07437f64d
|
since we depend on warctools trunk now, update the readme, and update the version number, so we can push latest to pypi
|
2013-12-19 16:48:28 -08:00 |
|
Noah Levitt
|
e880deddb6
|
oops, fix dependency_links warctools github url
|
2013-12-13 06:02:14 +00:00 |
|
Noah Levitt
|
81974bb014
|
warctools mainline has the good stuff now
|
2013-12-12 21:28:25 -08:00 |
|
Noah Levitt
|
313bc62bf1
|
gdbm not in pip, can't be listed as a requirement
|
2013-12-09 17:45:00 -08:00 |
|
Noah Levitt
|
e9e152ca7d
|
tox (and travis ci?) were hiding the fact that the gdbm dependency was the problem
|
2013-12-07 00:27:59 -08:00 |
|
Noah Levitt
|
b6774da603
|
more fiddling trying to get test runs to work with various invocation methods, esp travis
|
2013-12-06 16:50:02 -08:00 |
|
Noah Levitt
|
9c6c18d274
|
nose.collector wasn't working
|
2013-12-06 15:22:29 -08:00 |
|
Noah Levitt
|
2dd9ecb718
|
not sure why tox wasn't working, but this fixes it
|
2013-12-04 17:50:55 -08:00 |
|
Noah Levitt
|
dc9fdc3412
|
tests pass with python2.7 and 3.2! (tox fails though oddly)
|
2013-12-04 17:25:45 -08:00 |
|
Noah Levitt
|
8ae164f8ca
|
finish switch from README.md to README.rst
|
2013-11-28 01:28:59 -08:00 |
|
Noah Levitt
|
9c53f1b2d3
|
spec warctools dependency more precisely
|
2013-11-28 00:40:30 -08:00 |
|
Noah Levitt
|
0237a00f3f
|
test_require requests>=2.0.1 for https://github.com/kennethreitz/requests/pull/1636
|
2013-11-20 16:28:34 -08:00 |
|
Noah Levitt
|
25464dee80
|
test_archive_and_playback_http_url
|
2013-11-20 12:06:29 -08:00 |
|
Noah Levitt
|
555517ab78
|
WarcproxController to ease use of warcprox as a module
|
2013-11-19 17:12:58 -08:00 |
|
Noah Levitt
|
b8ad8abffe
|
working on packaging
|
2013-11-15 22:35:32 -08:00 |
|
Noah Levitt
|
556e969465
|
for now warcprox.py is just a command, not a module
|
2013-10-15 15:57:14 -07:00 |
|
Noah Levitt
|
a950d199d5
|
progress towards warc writing
|
2013-10-15 10:54:18 -07:00 |
|