Barbara Miller
848c089afa
Merge pull request #194 from vbanos/socksproxy
...
Thank you, @vbanos!
2023-10-17 09:18:11 -07:00
Vangelis Banos
9fd5a22502
fix typo
2023-10-17 06:12:28 +00:00
Vangelis Banos
3d653e023c
Add SOCKS proxy options
...
Add options `--socks-proxy`, `--socks-proxy-username`, and
`--socks-proxy-password`.
If enabled, all traffic is routed through the SOCKS proxy.
2023-10-16 18:33:42 +00:00
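A rough sketch of how the three options named in the commit above could be declared with `argparse`. Only the option names come from the commit message; the parser itself, the `dest` names, the `host:port` value format, and the `socks_enabled` helper are assumptions for illustration, not warcprox's actual argument handling.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the three option names come from the commit message.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--socks-proxy", dest="socks_proxy", default=None,
        help="SOCKS proxy address (host:port format is an assumption)")
    parser.add_argument(
        "--socks-proxy-username", dest="socks_proxy_username", default=None)
    parser.add_argument(
        "--socks-proxy-password", dest="socks_proxy_password", default=None)
    return parser


def socks_enabled(args):
    # Assumed rule: the proxy is used only when --socks-proxy is given.
    return args.socks_proxy is not None
```

Treating the username and password as optional companions of `--socks-proxy` keeps unauthenticated proxies usable with a single flag.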
Barbara Miller
4cb8e0d5dc
Merge pull request #192 from internetarchive/Py311
...
updates for 3.11 (and back to 3.8)
@vbanos and @avdempsey have agreed this PR is ok to merge
2023-09-27 12:03:26 -07:00
Barbara Miller
a20ad226cb
update version to 2.5, for Python version updates
2023-09-27 11:58:39 -07:00
Barbara Miller
bc0da12c48
bump version for Py311
2023-09-20 10:57:54 -07:00
Barbara Miller
8f0039de02
internetarchive/doublethink.git@Py311
2023-09-19 13:57:34 -07:00
Barbara Miller
c620d7dd19
use galgeek for now
2023-09-13 18:03:38 -07:00
Barbara Miller
4fbf523a3e
get doublethink from github.com/internetarchive
2023-09-12 16:05:23 -07:00
Barbara Miller
3b5d9d8ef0
update rethinkdb import
2023-09-12 14:39:09 -07:00
Barbara Miller
5e779af2e9
trough and doublethink updates
2023-09-11 17:38:10 -07:00
Barbara Miller
a90c9c3dd4
trough 0.20 maybe
2023-09-11 17:01:02 -07:00
Barbara Miller
99a825c055
initial commit, trying trough branch jammy+focal
2023-09-11 16:40:39 -07:00
Barbara Miller
c01d58df78
Merge pull request #189 from vbanos/idna-update
...
Thank you, @vbanos!
2023-07-11 14:13:47 -07:00
Vangelis Banos
6eb2bd1265
Drop idna==2.10 version lock
...
There is no need to use such an old `idna` version.
The latest works with py35+ and all tests pass.
Newer `idna` supports the latest Unicode standard and latest python
versions.
https://github.com/kjd/idna/blob/master/HISTORY.rst
2023-07-09 10:02:13 +00:00
Barbara Miller
d864ea91ee
Merge pull request #187 from vbanos/cryptography-limit
...
Thanks, @vbanos!
2023-06-22 08:55:33 -07:00
Vangelis Banos
83c109bc9b
Change cryptography version limit to >=2.3,<40
2023-06-22 12:22:24 +00:00
Vangelis Banos
1cc08233d6
Limit dependency version cryptography>=2.3,<=39.0.0
...
cryptography 41.0.0 crashes warcprox with the following exception:
```
File "/opt/spn2/lib/python3.8/site-packages/warcprox/main.py", line 317, in main
cryptography.hazmat.backends.openssl.backend.activate_builtin_random()
AttributeError: 'Backend' object has no attribute 'activate_builtin_random'
```
Also, cryptography==40.0.0 is not usable; trying to use it produces:
```
pyopenssl 23.2.0 requires cryptography!=40.0.0,!=40.0.1,<42,>=38.0.0, but you have cryptography 40.0.0 which is incompatible.
```
So, the version should be <=39.0.0
2023-06-18 09:09:07 +00:00
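The bound chosen in the commit above (`cryptography>=2.3,<=39.0.0`) can be checked with a small sketch. The helper names are hypothetical, and the parser handles only plain dotted numeric versions; real requirement strings are better handled by a packaging library.

```python
def parse_version(text):
    # Turn "39.0.0" into (39, 0, 0); plain numeric versions only.
    return tuple(int(part) for part in text.split("."))


def allowed(version, lower="2.3", upper="39.0.0"):
    # True when lower <= version <= upper, mirroring >=2.3,<=39.0.0.
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


# Versions mentioned in the commit message.
for candidate in ("39.0.0", "40.0.0", "41.0.0"):
    print(candidate, allowed(candidate))
```

With this bound, 39.0.0 is accepted while 40.0.0 and 41.0.0 are rejected, which matches the two failures described in the message.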
Barbara Miller
ca02c22ff7
Merge pull request #180 from cclauss/patch-1
...
Thanks, @cclauss!
2023-04-12 11:45:41 -07:00
Barbara Miller
1fd3b2c7a1
update readme — rm travis
2023-04-12 11:44:01 -07:00
Christian Clauss
ba14480a2d
Delete .travis.yml
2023-04-12 11:37:56 +02:00
Barbara Miller
50a4f35e5f
Merge pull request #177 from internetarchive/blocks-shrink
...
@adam-miller ok'd this elsewhere
2022-08-05 15:44:05 -07:00
Barbara Miller
9973d28de9
bump version
2022-08-04 17:28:33 -07:00
Barbara Miller
ee9e375560
zlib decompression
2022-08-04 11:14:33 -07:00
Barbara Miller
c008c2eca7
bump version
2022-07-01 14:18:17 -07:00
Barbara Miller
7958921053
Merge pull request #175 from vbanos/random-tls-fingerprint
...
Thanks, @vbanos!
2022-07-01 14:16:05 -07:00
Vangelis Banos
329fef31a8
Randomize TLS fingerprint
...
Create a random TLS fingerprint per HTTPS connection to avoid TLS
fingerprinting.
2022-07-01 17:39:49 +00:00
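One way to vary a TLS fingerprint per connection is to shuffle the order of the offered cipher suites, since that order is part of what fingerprinting observes. The sketch below uses only the standard `ssl` module. The commit message does not say how warcprox does this, so the approach and both function names are assumptions, and other fingerprint inputs (extensions, curves) are left untouched.

```python
import random
import ssl


def shuffled_cipher_string(names, rng=random):
    # Return the same cipher names in random order, joined as an OpenSSL cipher list.
    names = list(names)
    rng.shuffle(names)
    return ":".join(names)


def randomized_context():
    # Build a client context whose TLS 1.2-and-below cipher order is shuffled.
    # TLS 1.3 suites are skipped because set_ciphers() does not configure them.
    ctx = ssl.create_default_context()
    names = [c["name"] for c in ctx.get_ciphers() if c["protocol"] != "TLSv1.3"]
    ctx.set_ciphers(shuffled_cipher_string(names))
    return ctx
```

Creating a fresh context per outgoing HTTPS connection would give each connection its own cipher order, at the cost of building a context every time.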
Barbara Miller
d253ea85c3
Merge pull request #173 from internetarchive/increase_batch_sec
...
tune MIN_BATCH_SEC, MAX_BATCH_SEC for fewer dedup errors
2022-06-24 11:13:18 -07:00
Barbara Miller
8418fe10ba
add explanatory comment
2022-06-24 11:07:35 -07:00
Adam Miller
fcd9b2b3bd
Merge pull request #172 from internetarchive/adds-canonicalization-tests
...
Adding url canonicalization tests and handling of edge cases to reduc…
2022-04-27 09:57:03 -07:00
Adam Miller
731cfe80cc
Adding url canonicalization tests and handling of edge cases to reduce log noise
2022-04-26 23:48:54 +00:00
Adam Miller
9521042a23
Merge pull request #171 from internetarchive/adds-hop-path-logging
...
Adds hop path logging
2022-04-26 12:11:11 -07:00
Adam Miller
daa925db17
Bump version
2022-04-26 09:55:48 -07:00
Adam Miller
d96dd5d842
Adjust rfc3986 package version for deployment across more versions
2022-04-21 18:37:27 +00:00
Adam Miller
1e3d22aba4
Better handle non-ascii urls for crawl log hop info
2022-04-20 22:48:28 +00:00
Adam Miller
5ae1291e37
Refactor of hop path referer logic
2022-03-24 21:40:55 +00:00
Barbara Miller
05daafa19e
increase MIN_BATCH_SEC, MAX_BATCH_SEC
2022-03-03 18:46:20 -08:00
Adam Miller
ade2373711
Fixing referer on request with null hop path
2022-03-04 02:01:55 +00:00
Adam Miller
3a234d0cec
Refactor hop_path metadata
2022-03-03 00:18:16 +00:00
Adam Miller
366ed5155f
Merge branch 'master' into adds-hop-path-logging
2022-02-09 18:18:32 +00:00
Barbara Miller
c027659001
Merge pull request #167 from galgeek/WT-31
...
fix logging buglet iii
2021-12-29 12:14:56 -08:00
Barbara Miller
9e8ea5bb45
fix logging buglet iii
2021-12-29 12:06:18 -08:00
Barbara Miller
bc3d1e6d00
fix logging buglet ii
2021-12-29 11:55:39 -08:00
Barbara Miller
6b372e2f3f
Merge pull request #166 from galgeek/WT-31
...
fix logging buglet
2021-12-29 11:04:03 -08:00
Barbara Miller
5d8fbf7038
fix logging buglet
2021-12-29 10:25:04 -08:00
Barbara Miller
a969430b37
Merge pull request #163 from internetarchive/idna2_10
...
idna==2.10
2021-12-28 13:50:23 -08:00
Barbara Miller
aeecb6515f
bump version
2021-12-28 11:58:30 -08:00
Adam Miller
e1eddb8fa7
Merge pull request #165 from galgeek/WT-31
...
in-batch dedup
2021-12-28 11:52:41 -08:00
Barbara Miller
d7aec77597
faster, likely
2021-12-16 18:36:00 -08:00
Barbara Miller
bcaf293081
better logging
2021-12-09 12:19:45 -08:00