Noah Levitt | 734b2f5396 | limit max number of threads to 500; make sure connection with proxy client has a timeout; log errors from connection with proxy client | 2016-01-26 18:47:08 -08:00
Noah Levitt | e3a5717446 | hidden --profile option to enable profiling of warc writer thread and periodic logging of memory usage info; at shutdown, close stats db and unregister from service registry; logging improvements | 2016-01-26 18:47:08 -08:00
Noah Levitt | a9fc550453 | oops, argparse.SUPPRESS isn't supposed to be in quotes | 2016-01-26 18:47:08 -08:00
Noah Levitt | 3e2696525b | make sure svcreg is set | 2016-01-26 18:47:08 -08:00
Noah Levitt | d7d992731c | register self for service discovery | 2016-01-26 18:47:08 -08:00
Noah Levitt | 2169369dab | working on benchmarking code... so far they seem to reveal that warcprox behaves poorly under load (perhaps timeouts are configured too short?) | 2016-01-26 18:47:08 -08:00
Noah Levitt | a41c426b0a | giving up on using git revision in version number :( latest issue is when installing a package that calls git to compute a version number, but cwd is some other git project, you get the wrong thing | 2016-01-26 18:47:08 -08:00
Noah Levitt | 3b9345e7d7 | use nicer rethinkdbstuff.Rethinker api | 2016-01-26 18:47:08 -08:00
Noah Levitt | d98f03012b | kafka capture feed, for druid | 2016-01-26 18:47:08 -08:00
Noah Levitt | 44a62111fb | support for deduplication buckets specified in warcprox-meta header {"captures-bucket":...,...} | 2016-01-26 18:47:08 -08:00
Noah Levitt | 6d673ee35f | tests pass with big rethinkdb captures table | 2016-01-26 18:47:08 -08:00
Noah Levitt | c430f81883 | some refactoring to prep for big rethinkdb capture table | 2016-01-26 18:47:08 -08:00
Noah Levitt | df38cf856d | rethinkdb for stats | 2016-01-26 18:46:13 -08:00
Noah Levitt | e66dc3a9fb | rethinkdb dedup | 2016-01-26 18:46:13 -08:00
Noah Levitt | a876152026 | fix exception, make some tweaks | 2016-01-26 18:46:13 -08:00
Noah Levitt | 4ce89e6d03 | basic limits enforcement is working | 2016-01-26 18:46:13 -08:00
Noah Levitt | 274a2f6b1d | refactor warc writing, deduplication for somewhat cleaner separation of concerns | 2016-01-26 18:45:36 -08:00
Noah Levitt | 771383d0a6 | refactor proxy handler to use do_* methods for custom http verbs; refactor warc writer thread to use new WarcWriterPool class | 2016-01-26 18:45:36 -08:00
Noah Levitt | 084bd75ed6 | dump thread tracebacks on sigquit, more logging and exception handling tweaks | 2016-01-26 18:45:12 -08:00
Noah Levitt | 0647c0c76d | support for writing to different warcs based on Warcprox-Meta http request header warc-prefix setting | 2016-01-26 18:44:16 -08:00
Ilya Kreymer | 574f1f3f52 | remove certauth.py and use the separate certauth package release | 2015-03-30 09:32:10 -07:00
Noah Levitt | a2c25d4242 | split into even more source files | 2014-11-20 00:04:43 -08:00