Information
ID: 102
PHID: PHID-TASK-buo4lmumyhknvqpjbj7e
Author: Patrick
Status at Migration Time: resolved
Priority at Migration Time: Normal
Description
Requirements:
Not using libcurl (because that's a C binding; then we'd have no security enhancement and could just stick to curl).
Force use of TLSv1, not SSL.
Download head only. (Similar to curl's --head.)
Features, with an example:
/usr/lib/sdwdate/url_to_unixtime \
--max-time 180 \
--socks5-hostname 10.152.152.10:9108 \
--tls true \
https://check.torproject.org
Expected output, unixtime, example:
1413814230
Bonus:
--max-file-size-bytes 2097152
--user-agent
--verbose
SSL:
Depending on the outcome of this we might not need SSL support.
date_to_unixtime:
The code for date to unixtime is already done:
https://github.com/Whonix/sdwdate/blob/master/usr/lib/sdwdate/date_to_unixtime
python-requests:
Implementing this using the python-requests library was trivial, but unfortunately python-requests does not support socks proxies yet, which is a deal breaker for Whonix.
urllib3:
Has no socks proxy support yet either.
TODO:
So we have to find some python library that has socks proxy as well as TLSv1 support, that is installable from Debian repository. Does this exist?
Comments
troubadour
2015-01-23 23:33:38 UTC
Might be some progress on this one. It’s not really my domain of competence, so…
In Whonix-Gateway, install the python-socksipy and python-openssl packages.
Try this script.
#!/usr/bin/python
import socks, ssl

s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", port=9050)
s.connect(('check.torproject.org', 443))
ss = ssl.wrap_socket(s)
print ss.cipher()
s.close()
ss.close()

s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", port=9050)
s.connect(('whonix.org', 443))
ss = ssl.wrap_socket(s)
print ss.cipher()
s.close()
ss.close()
If that looks sound, we'll have to find a way to get the time from the header. That must be possible, because the shell openssl command returns (in Whonix-Workstation):
$ openssl s_client -connect whonix.org:443
CONNECTED(00000003)
depth=1 C = FR, O = GANDI SAS, CN = Gandi Standard SSL CA
verify error:num=20:unable to get local issuer certificate
verify return:0
Certificate chain
0 s:/OU=Domain Control Validated/OU=Gandi Standard SSL/CN=whonix.org
i:/C=FR/O=GANDI SAS/CN=Gandi Standard SSL CA
1 s:/C=FR/O=GANDI SAS/CN=Gandi Standard SSL CA
i:/C=US/ST=UT/L=Salt Lake City/O=The USERTRUST Network/OU=http://www.usertrust.com
Server certificate
-----BEGIN CERTIFICATE-----
MIIE1zCCA7+gAwIBAgIRAKuuBHjHjvVq22Vahgk4QZYwDQYJKoZIhvcNAQEFBQAw
QTELMAkGA1UEBhMCRlIxEjAQBgNVBAoTCUdBTkRJIFNBUzEeMBwGA1UEAxMVR2Fu
~~
A1UW8F3H49PDn/FmBM0qOXhiWY9O0wcyZcOVUiBkw6Phq163lqkeleDlqA==
-----END CERTIFICATE-----
subject=/OU=Domain Control Validated/OU=Gandi Standard SSL/CN=whonix.org
issuer=/C=FR/O=GANDI SAS/CN=Gandi Standard SSL CA
No client certificate CA names sent
SSL handshake has read 3128 bytes and written 431 bytes
New, TLSv1/SSLv3, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES256-GCM-SHA384
~~
Start Time: 1422055124
Timeout : 300 (sec)
Verify return code: 20 (unable to get local issuer certificate)
closed
If we pipe a command into it
$ ls | openssl s_client -connect whonix.org:443
the connection closes immediately (no 10-20 second wait).
~~
Start Time: 1422055553
Timeout : 300 (sec)
Verify return code: 20 (unable to get local issuer certificate)
DONE
Patrick
2015-01-24 02:46:18 UTC
Will try soon.
If that looks sound, we'll have to find a way to get the time from the header. That must be possible, because the shell openssl command returns (in Whonix-Workstation):
We don't want to extract from SSL. (Because it is likely unreliable in the long term, and it doesn't work for .onion domains.) We want to extract the time from the HTTP headers.
Similar to this:
curl --head check.torproject.org
curl --silent --head check.torproject.org | grep "Date:"
Patrick
2015-01-24 03:00:27 UTC
troubadour
2015-01-24 09:01:05 UTC
I was misled by this:
So we have to find some python library that has socks proxy as well as TLSv1 support,
Perhaps this is more acceptable (actually simpler, since we can use the socket directly).
Script 'socks_socket.py':
#!/usr/bin/python
import sys, socks

site = sys.argv[1]
s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9050)
s.connect((site, 80))
s.send('HEAD / HTTP/1.1\r\n\r\n')
data = ''
buf = s.recv(1024)
while len(buf):
    data += buf
    buf = s.recv(1024)
s.close()
print('HTTP header from "%s:":\n\n%s' % (site, data))
These commands have (almost) the same output as curl:
python socks_socket.py "check.torproject.org"
python socks_socket.py "check.torproject.org" | grep "Date:"
It's just an example (no error handling, no timeout…). If you try 'google.com', it hangs indefinitely.
troubadour
2015-01-24 14:36:36 UTC
The example above supports socks proxies, but for TLSv1, it looks like we would need python-socksipychain, which is available in jessie, not in wheezy.
Patrick
2015-01-24 17:44:41 UTC
We switch to jessie sooner or later anyhow. So that should not be considered a blocker.
If it helps a lot to leave out the timeout, then leave out the timeout. Because, after thinking again, the timeout isn't that important inside the python script. The python script can be called using timeout, which is apparently a reliable tool to enforce timeouts from "the outside" ("from bash"). Maybe that's easiest/best and should be done in any case anyhow.
Do you think you can re-use python-requests' code to parse the HTTP header? I.e. to extract the date field from the HTTP header.
date = response.headers.get('date')
Similar to:
sdwdate/usr/lib/sdwdate/url_to_unixtime at 5a6c1729df8caaaddcabc8caeb2108faca199ca7 · Kicksecure/sdwdate · GitHub
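For illustration, the date-field-to-unixtime step can be sketched with the Python standard library alone (this is not the code python-requests or sdwdate uses, just a minimal stand-in; the Date value is taken from the verbose example later in this thread):

```python
from email.utils import parsedate_to_datetime

# Example Date header value, as seen in the verbose output below
date_header = "Mon, 02 Feb 2015 19:56:01 GMT"

# parsedate_to_datetime understands the RFC 2822 date format used
# by HTTP Date: headers and returns a timezone-aware datetime
dt = parsedate_to_datetime(date_header)

# An aware datetime converts directly to a unix timestamp
print(int(dt.timestamp()))  # 1422906961
```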
Patrick
2015-01-24 17:49:09 UTC
Good news: the last script you posted works while transparent dns and transparent tcp is disabled! So stream isolation is apparently functional.
Curl gets a HTTP/1.1 302 Found response while the script gets a HTTP/1.1 400 Bad Request response.
troubadour
2015-01-24 19:40:44 UTC
We switch to jessie sooner or later anyhow. So that should not be considered a blocker.
Yes, I have downloaded the package from Debian (not available from backports) and checked it with sha256sum. Currently testing; it is working, even if still rather obscure.
Regarding the timeout: if the script hangs, sdwdate won't be able to call it again. Or is the external timeout managing the process?
Do you think you can re-use python-request’s code to parse the http header? I.e. to extract the date field from the http header.
That's the next logical step. The script could take the URL as an argument and return unixtime.
Curl gets a HTTP/1.1 302 Found response while the script gets a HTTP/1.1 400 Bad Request response.
Yes, and the location field is missing with the script too. Is that a problem?
troubadour
2015-01-24 19:45:04 UTC
Patrick
2015-01-24 20:08:30 UTC
Yes, the external timeout command is managing the process. It sends signal SIGTERM after a configurable amount of time, optionally followed (using this) by signal SIGKILL after another configurable amount of time. See also:
timeout(1) — coreutils — Debian bookworm — Debian Manpages
It’s a GNU coreutil and quite old already. Expected to be very reliable.
The timeout would need to be configurable via the command line if you wish to add it.
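The "enforce the timeout from the outside" pattern can also be sketched in Python itself, using subprocess; run_with_timeout here is a hypothetical helper, not sdwdate's actual caller, and is analogous to wrapping the call in coreutils' timeout(1):

```python
import subprocess

def run_with_timeout(cmd, seconds):
    """Run cmd, killing it if it exceeds the deadline, like timeout(1)."""
    try:
        # subprocess.run raises TimeoutExpired after killing the child
        subprocess.run(cmd, timeout=seconds)
        return "finished"
    except subprocess.TimeoutExpired:
        return "timed out"

print(run_with_timeout(["sleep", "10"], 2))  # timed out
print(run_with_timeout(["true"], 2))         # finished
```

This mirrors the SIGTERM-after-a-deadline behavior; the SIGKILL follow-up would correspond to timeout's --kill-after option.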
Is that a problem?
Not sure. Better to look similar to curl, similar to a usual request, so that sysadmins don't get scared and add some magic to block it.
troubadour
2015-01-24 20:47:14 UTC
troubadour
2015-01-25 09:41:13 UTC
Before starting the next stage (on github, I guess), an update to the script.
#!/usr/bin/python
import sys, socks

site = sys.argv[1]
s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', 9050)
try:
    s.connect((site, 80))
except IOError as e:
    print e
    sys.exit(1)
s.send('HEAD / HTTP/1.0\r\n\r\n')
data = ''
buf = s.recv(1024)
while len(buf):
    data += buf
    buf = s.recv(1024)
s.close()
print('HTTP header from "%s:":\n\n%s' % (site, data))
Setting aside the exception handling, the significant change is the request with 'HTTP/1.0' instead of 'HTTP/1.1'. check.torproject.org returns '200 OK', which looks like an improvement, less likely to be flagged by sysadmins.
troubadour
2015-01-25 09:46:40 UTC
Patrick
2015-01-25 16:18:03 UTC
The best answer is "as similar to a mainstream browser as possible", but that is kinda impossible. The realistic answer is "similar to curl or wget".
Getting this.
HTTP header from "check.torproject.org:":
HTTP/1.1 504 Gateway Time-out
Date: Sun, 25 Jan 2015 16:15:39 GMT
Connection: close
troubadour
2015-01-26 14:14:26 UTC
It's kinda impossible, too, to have a reply consistently similar to curl or wget. Depending on the URL, it can be identical, differ by one line, or the reply can be entirely different ("200 OK" instead of "302 Found").
Have integrated date_to_unixtime.
#lang=
#!/usr/bin/python
import sys, socks
from dateutil.parser import parse

try:
    socket_ip = sys.argv[1]
    socket_port = int(sys.argv[2])
    url = sys.argv[3]
except IndexError as e:
    print >> sys.stderr, "Parsing command line parameter failed. | e: %s" % (e)
    sys.exit(1)

s = socks.socksocket()
s.setproxy(socks.PROXY_TYPE_SOCKS5, socket_ip, socket_port)
try:
    s.connect((url, 80))
except IOError as e:
    print >> sys.stderr, e
    sys.exit(1)

s.send('HEAD / HTTP/1.0\r\n\r\n')
data = ''
buf = s.recv(1024)
while len(buf):
    data += buf
    buf = s.recv(1024)
s.close()

date = ''
date_pos = data.find('Date:') + 6
date = data[date_pos:date_pos + 30].strip()
if date == '':
    print >> sys.stderr, 'Parsing HTTP header date failed.'
    sys.exit(2)

try:
    ## Thanks to:
    ## eumiro
    ## http://stackoverflow.com/a/3894047/2605155
    unixtime = parse(date).strftime('%s')
except ValueError as e:
    print >> sys.stderr, ('Parsing date from server failed. | date: %s \
| dateutil ValueError: %s' % (date, e))
    sys.exit(3)

#print data
#print date
print "%s" % unixtime
Example:
python url_to_unixtime.py 127.0.0.1 9100 whonix.org
If this is OK, is it necessary to create a separate package? It could be bundled in sdwdate.
Patrick
2015-01-26 21:44:17 UTC
Please remove the #lang= (unless that's good for something) and the trailing spaces.
Yeah. Unless we get a feature request to maintain this in a separate repository, it's fine to keep this in the sdwdate package. /usr/lib/sdwdate/url_to_unixtime?
troubadour
2015-01-27 13:24:11 UTC
The #lang= was for syntax highlighting.
Tested with the different pools from /usr/bin/sdwdate. The replies range from 200 OK to 403 Forbidden, which looks OK.
Do you want me to add it to sdwdate?
Patrick
2015-01-27 15:21:50 UTC
troubadour
2015-01-28 10:20:42 UTC
Patrick
2015-01-29 01:02:26 UTC
Patrick
2015-01-29 01:04:45 UTC
Patrick
2015-01-29 01:11:26 UTC
Patrick
2015-01-29 01:19:46 UTC
Patrick
2015-01-29 01:44:41 UTC
Just noticed, that file needs a license header. Could you add it please?
Do my changes look good so far?
Can you make data.find case-insensitive please? Some servers in the wild indeed used date:. We could use tolower, but I don't know what provides the best performance. (Imagine faulty or even malicious replies.) If that makes sense at all.
And would it make sense to check the return code of data.find? To abort if it didn't find such a line? I am trying to imagine all sorts of invalid input.
Also, would it make sense to check whether the size of data is reasonable before processing it further? (Having performance in mind here again: if a server replies with a super long string, we would abort earlier and waste less processing power.)
troubadour
2015-01-30 20:50:16 UTC
All the checks should be in the last two commits.
max data length = 1024
data.find searches for uppercase and lowercase in the header, returns an error if not found.
date string length: max length = date string length, min length = max length, returns an error if too short.
your unixtime sanity checks (minor modification).
extra: time offset: checks the local time/HTTP header time difference, max offset value hard-coded, returns an error if outside.
Not sure the last check is relevant, as I have to check which local time python is returning.
Patrick
2015-01-30 22:25:50 UTC
Not sure the last check is relevant, as I have to check which local time python is returning.
The last check is overkill. That’s something sdwdate should do itself.
http_time = http_time(data)
I am not much of a python coder, but is it a good or common way to have a variable that has the same name as a function?
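Rebinding a name that used to refer to a function is legal in Python, but afterwards the function is no longer reachable under that name, which is why the pattern reads oddly (toy names, for illustration only):

```python
def greet(name):
    return "hi " + name

# Legal, but the name "greet" now holds the string result,
# shadowing the function object it used to refer to.
greet = greet("bob")
print(greet)  # hi bob

# A second call like greet("alice") would now raise
# TypeError: 'str' object is not callable.
```

So it works once, but a distinct variable name (e.g. parsed_http_time) is the more common style.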
## "Date:" not found.
print >> sys.stderr, 'Parsing HTTP header date failed.'
Does anything speak against writing data to stderr? In case this happens at some time somewhere and some user reports it, it would be interesting to see what went wrong instead of needing to add more debug code by then.
if unixtime_sanity_check(unixtime_http):
Wondering if we could simplify the code by either using if not ... (non-ideal) to make the indent shorter, or by just running unixtime_sanity_check(unixtime_http) and leaving it at that - because that function is supposed to exit anyhow if some sanity check goes wrong. (You did it that way with the function http_time already.)
troubadour
2015-01-31 15:12:17 UTC
Patrick
2015-01-31 17:21:17 UTC
Patrick
2015-01-31 17:45:53 UTC
Done for now. Please review and merge.
Manually tested all the functions with bogus input to see if they correctly report and exit if something goes wrong.
One exception remains.
./usr/lib/sdwdate/url_to_unixtime 127.0.0.1 9050 "nonexisting"
Traceback (most recent call last):
File "./usr/lib/sdwdate/url_to_unixtime", line 81, in <module>
s.connect((url, 80))
File "/usr/lib/python2.7/dist-packages/socks.py", line 369, in connect
self.__negotiatesocks5(destpair[0],destpair[1])
File "/usr/lib/python2.7/dist-packages/socks.py", line 236, in __negotiatesocks5
raise Socks5Error(ord(resp[1]),_generalerrors[ord(resp[1])])
TypeError: __init__() takes exactly 2 arguments (3 given)
Can you look into it please?
troubadour
2015-02-01 10:02:15 UTC
Patrick
2015-02-01 13:48:34 UTC
Merged.
Okay. Because that error message sounds more like a python syntax error than "url not found".
I would also like to make the port configurable (easy, can do).
Do you have experience creating unit tests for python scripts yet? While I am at it: now that everything is inside functions, I would be motivated to write unit tests that test the functions using valid and invalid input to see if they output as expected. If you get that started, I could finish it. Or you do it. And if you don't know, I can also research the whole thing. I don't really mind.
troubadour
2015-02-02 16:44:00 UTC
Done the translation in python "__init__ error" -> "URL not found" · troubadoour/sdwdate@c34a933 · GitHub .
I have never created unit tests. A quick research tells me that it might be a bit late for that script (using python-test or mock, to name a few). It makes sense to create the unit tests BEFORE starting to write the code.
url-to-unixtime has been extensively tested with good and bad input, first by me, then by you (spotting the bad URL). And I made another round of bogus inputs before pushing the last commit.
troubadour
2015-02-02 16:47:23 UTC
Patrick
2015-02-02 16:58:06 UTC
The test suite Cucumber is different. It is more like a simulated user who boots up the system, clicks here and there, runs several tests and checks the results. It's more for testing the interaction with the whole system. For example, see:
Cucumber (software) - Wikipedia
Unit tests work on a lower level, testing functions' input/output. Useful when functions are later changed/refactored/whatever, to see if they still work as expected. I agree it's not super important, just nice to have unit tests for this.
One isn't supposed to be a replacement for the other. Unit tests are more for us devs, who have enough RAM, dependencies installed and so forth, checking functions. A test suite is more about testing whether the whole thing is functional or whether some interaction breaks it.
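A minimal unit test along those lines might look like this sketch; unixtime_sanity_check here is a hypothetical stand-in mirroring the idea of the script's check, not its actual code:

```python
import unittest

def unixtime_sanity_check(unixtime):
    # Stand-in: accept only plausible positive 32-bit integer timestamps.
    return isinstance(unixtime, int) and 0 < unixtime < 2**32

class TestUnixtimeSanityCheck(unittest.TestCase):
    def test_valid_timestamp(self):
        self.assertTrue(unixtime_sanity_check(1422906961))

    def test_rejects_negative(self):
        self.assertFalse(unixtime_sanity_check(-1))

    def test_rejects_non_integer(self):
        self.assertFalse(unixtime_sanity_check("1422906961"))

if __name__ == "__main__":
    unittest.main()
```

Run with `python -m unittest` from the package directory; each function of the script would get its own small test case feeding valid and invalid input.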
Patrick
2015-02-02 17:09:25 UTC
Patrick
2015-02-02 19:46:00 UTC
Made remote port configurable.
Added usage example comment on top.
I am wondering, if these comments could be improved?
## max accepted string length.
http_time = data[date_string_start_position:date_string_start_position + 29].strip()
## min string length = max string length.
if http_time_string_length < 29:
Patrick
2015-02-02 20:02:22 UTC
Added verbose output mode. In verbose mode, variables will be written to stderr.
Usage (also as per comment at the top of the script):
usr/lib/sdwdate/url_to_unixtime 127.0.0.1 9050 check.torproject.org 80 true
This is useful to make the script show what the server replied, and whether the conversion back and forth is fully functional.
date --date "@$(usr/lib/sdwdate/url_to_unixtime 127.0.0.1 9050 check.torproject.org 80 true)"
data: HTTP/1.1 200 OK
Date: Mon, 02 Feb 2015 19:56:01 GMT
Server: Apache
Last-Modified: Thu, 23 Feb 2012 18:45:14 GMT
ETag: "211-1e8-4b9a60b6ecade"
Accept-Ranges: bytes
Content-Length: 488
Vary: Accept-Encoding
Connection: close
Content-Type: text/html
X-Pad: avoid browser bug
http_time: Mon, 02 Feb 2015 19:56:01 GMT
parsed_unixtime: 1422906961
Mon Feb 2 19:56:01 UTC 2015
The last line shows how date converted url_to_unixtime's stdout back to a human-readable date. We see that the value of the Date: field matches exactly (besides formatting, but it's the unixtime that matters, because that is what will be used by sdwdate).
Patrick
2015-02-04 20:37:58 UTC
Patrick
2015-02-05 13:52:04 UTC
troubadour
2015-02-15 20:35:05 UTC
It returns a pure python exception from socks.py, "__init__() takes exactly 2 arguments (3 given)", which seems to mean that the URL was not found. Perhaps we could translate it into something more meaningful.
An update on this issue. In jessie, python-pysocks is an upgrade of python-socksipy, which fixes some bugs, among them this situation.
Examples, without code modification:
connect error: 0x04: Host unreachable

$ ./url_to_unixtime 127.0.0.1 9050 whonix+org 80 true
connect error: 0x01: General SOCKS server failure
That looks better.
troubadour
2015-02-15 21:39:59 UTC
Patrick
2015-02-16 07:57:37 UTC
Good catch. Although that change made it incompatible with wheezy. (Because python-pysocks is not available in wheezy.) Therefore I added a commit on top to make it compatible with wheezy and jessie:
wheezy + jessie compatibility · Kicksecure/sdwdate@b015142 · GitHub
At the current rate of progress, and the few tickets left, I think chances are good that Whonix 10 will be ready before jessie becomes the new Debian stable. So Whonix 10 might still be a wheezy-based release. (Although hopefully ready to be upgraded to jessie without hassle.) (The related "maybe wait for jessie" ticket: T24)
Patrick
2015-02-16 09:20:40 UTC