Information
ID: 274
PHID: PHID-TASK-ahfvo6wysi42ai7hwrwc
Author: Patrick
Status at Migration Time: resolved
Priority at Migration Time: Normal
Description
! In T273#3852, @nrgaway wrote (edited):
There are a few deficiencies we may want to consider for tor-controlport-filter
for the future:
1. cpfp.py should daemonize itself and write the PID to the proper location (done)
/usr/lib/tor-controlport-filter should implement a watchdog
(sd_notify
) function
The watchdog feature would be nice to have, which would restart the service if it appeared to be running, but became unresponsive.
python-daemon looks interesting. In Debian:
with sd-notify:
python3-sdnotify:
python3-systemd:
https://packages.debian.org/stretch/python3-systemd
TODO:
Comments
troubadour
2015-04-27 22:03:11 UTC
The daemonized control-port-filter-python is nearly completed (with some tricks for wheezy).
Remarks and suggestions.
I will create a new repository, suggested name control-port-filter-daemon
. Instead of cpfp.py, the file name could be cpfpd
without the .py extension. Since it’s a daemon, perhaps a better path would be /usr/bin. I don’t know in systemd, but in init.d, it would look like
start)
echo "Starting control port filter"
cpfpd start
;;
stop)
echo "Stopping control port filter"
cpfpd stop
;;
restart)
"Restarting control port filter"
cpfpd restart
;;
esac
Which leads to the first issue. There is no status
in python-daemon. It could probably be implemented in a non standard way. To be explored.
Also, when the daemon is running, starting it again outputs a timeout error because the lock file (pid.lock) exists, it seems. Looks like a bug, the daemon is still running.
Quoting @nrgaway
cpfp.py should daemonize itself and write the PID to the proper location
What is the proper location for a daemon pid?
Patrick
2015-04-27 22:14:03 UTC
! In T274#4070, @troubadour wrote:
The daemonized control-port-filter-python is nearly completed (with some tricks for wheezy).
I guess this is sufficient for Whonix 11 / jessie.
I will create a new repository, suggested name control-port-filter-daemon
.
I don’t think this should become a separate package. Never saw something like this. Like apache-server and apache-daemon, doesn’t exist.
Just installing control-port-filter-python without the -daemon package, what’s the use case for that? Anyone up to that could rather just disable the systemd unit and have the same effect.
Instead of cpfp.py, the file name could be cpfpd
without the .py extension. Since it’s a daemon, perhaps a better path would be /usr/bin.
I guess /usr/sbin as per Filesystem Hierarchy Standard - Wikipedia .
Which leads to the first issue. There is no status
in python-daemon. It could probably be implemented in a non standard way. To be explored.
Hm. Do you know any existing python daemon(s) so we can do a standard thing?
What is the proper location for a daemon pid?
/var/run/control-port-filter-python.pid
? Looks like most packages do it that way.
Patrick
2015-04-27 22:17:23 UTC
I thought it’s working like this: you somehow import a library into a python script, that does the listening to events. Systemd starts the binary. Once the system shuts down or [manually] or so, systemd sends a signal to the daemon. It receives it and acts accordingly.
The python-daemon package is more like separate binary in between?
nrgaway
2015-04-27 22:21:05 UTC
! In T274#4070, @troubadour wrote:
The daemonized control-port-filter-python is nearly completed (with some tricks for wheezy).
Remarks and suggestions.
I will create a new repository, suggested name control-port-filter-daemon
. Instead of cpfp.py, the file name could be cpfpd
without the .py extension. Since it’s a daemon, perhaps a better path would be /usr/bin. I don’t know in systemd, but in init.d, it would look like
start)
echo "Starting control port filter"
cpfpd start
;;
stop)
echo "Stopping control port filter"
cpfpd stop
;;
restart)
"Restarting control port filter"
cpfpd restart
;;
esac
Which leads to the first issue. There is no status
in python-daemon. It could probably be implemented in a non standard way. To be explored.
We will not need a status
function for systemd
; sysv init
can also implement following procedure. status
will be based on PID and if PID is running, status will return with success when using systemctl status control-port-filter-daemon
.
If you also implement the watchdog function (sd_notify
), systemd will automatically restart the daemon if the watchdog has not heard from daemon in the specified time period which indicates the daemon process is most likely locked up.
Also, when the daemon is running, starting it again outputs a timeout error because the lock file (pid.lock) exists, it seems. Looks like a bug, the daemon is still running.
Quoting @nrgaway
cpfp.py should daemonize itself and write the PID to the proper location
What is the proper location for a daemon pid?
Either:
/var/run/control-port-filter-daemon.pid
/var/run/control-port-filter-daemon/pid
If a /var/run/control-port-filter-daemon
directory is not required for other related files, then it saves having to create such a directory by placing the pid in directly in /var/run
.
whonixcheck
would also need to be update to use new location for pid
troubadour
2015-04-28 18:53:07 UTC
! In T274#4071, @Patrick wrote:
I don’t think this should become a separate package. Never saw something like this. Like apache-server and apache-daemon, doesn’t exist.
Just installing control-port-filter-python without the -daemon package, what’s the use case for that? Anyone up to that could rather just disable the systemd unit and have the same effect.
I thought it’s working like this: you somehow import a library into a python script, that does the listening to events. Systemd starts the binary. Once the system shuts down or [manually] or so, systemd sends a signal to the daemon. It receives it and acts accordingly.
The python-daemon package is more like separate binary in between?
I should have mentioned that the daemon and control-port-filter are in a single package (single file). Instead of importing control-port-filter in the daemon, it’s in the same script. Same behavior, but self-contained.
Created a temporary repository. You can see the code at cpfpd/cpfpd at master · troubadoour/cpfpd · GitHub
Was thinking of a new repository because of the major code change and the different path. And the name of the package would become control-port-filter-daemon
.
The pid file should be in /var/run/control-port-filter-daemon/
, becuse the daemon creates three files:
pid
pid.lock```
>>! In T274#4073, @nrgaway wrote:
> We will not need a `status` function for `systemd`; `sysv init` can also implement following procedure. `status` will be based on PID and if PID is running, status will return with success when using `systemctl status control-port-filter-daemon`.
That's good news. It's probably why it's not implemented in python-daemon.
Patrick
2015-04-28 18:52:21 UTC
During upgrades we have more trouble when we deprecate packages (ex: control-port-filter-python
), want them uninstalled in favor of new packages (ex: control-port-filter-daemon
). The control-port-filter
→ control-port-filter-python
transition has cost quite some effort.
I see no reason why we should abstain from huge refactoring/moving around files in existing packages. We can keep control-port-filter-python
as is (perhaps with minor maintenance fixes if really urgent) in Whonix 10 and have a major revision in Whonix 11.
Sounds good.
troubadour
2015-04-28 19:01:24 UTC
Yes, nothing urgent. This change is not intended for Whonix 10, say we have a good base and we can fine tune the script before Whonix 11 release. I guess some small issues might solve themselves after it’s tested in a jessie based Whonix. We could change the tag, make tit Whonix 11
.
Patrick
2015-04-28 19:16:31 UTC
Yes.
(I just didn’t add the #Whonix_11 tag, because I didn’t intent to do it myself for #Whonix_11 . Was more like adding it to the pile. But if you plan on doing stuff for a specific release tag, please feel free to add the tag and/or to claim the task.)
(I’ll start merging your code and testing it on my system shortly after you’re done as usually so I am not very concerned about missing issues.)
troubadour
2015-05-03 21:01:55 UTC
troubadour
2015-05-08 21:27:42 UTC
Patrick
2015-05-09 15:06:22 UTC
I think using /bin/touch
in systemd unit files might work, but it’s non-ideal. systemd units aren’t shell scripts, and therefore fast. Forking to touch
is “slow”. The canonical way looks like using /usr/lib/tmpfiles.d/
- tmpfiles.d .
stdout stderr > /dev/null · troubadoour/control-port-filter-python@9226b29 · GitHub
In ControlPortFilter() init(), replace /dev/tty
with /dev/null
for stdout and stderr. /dev/tty
is unknown when the service is started by systemd.
That works and if we don’t need /dev/tty
, that’s fine. Otherwise we could tell systemd to wait until /dev/tty
is available by figuring out on which dependency to add.
Other commits:
create /var/run/control-port-filter-python directory · troubadoour/control-port-filter-python@f0eedc8 · GitHub
Create /var/run/control-port-filter-python directory, needed because three files are written by the daemon.
That’s fine for ports and running as root. But otherwise I think it’s up to systemd creating these files as owned by user debian:tor and running as user debian:tor.
Still trying to get the man page without warning.
override_dh_install:
make manpages
dh_installman $(CURDIR)/debian/tmp-man/*
Patrick
2015-05-11 11:13:57 UTC
Patrick
2015-05-13 12:25:05 UTC
An issue. When running sudo -u debian-tor /usr/sbin/cpfpd start
while /var/run/control-port-filter-python
does not exist, cpfpd will exit 0
. When it fails to write into /var/run/control-port-filter-python
it would be great to have it exit non-zero. Fixing that could help with debugging the systemd unit file.
Apart from that, running sudo -u debian-tor /usr/sbin/cpfpd start
and sudo -u debian-tor /usr/sbin/cpfpd stop
works for me.
Patrick
2015-05-13 12:54:53 UTC
troubadour
2015-05-14 20:13:19 UTC
sudo -u debian-tor /usr/sbin/cpfpd start
and sudo -u debian-tor /usr/sbin/cpfpd stop
works for me too. The problem is when the daemon is run with systemd.
sudo systemctl start control-port-filter-python
works but you have to ctrl-c to get back the terminal. sudo systemctl stop control-port-filter-python
also work, and here the the output of sudo systemctl status control-port-filter-python
:
Loaded: loaded (/lib/systemd/system/control-port-filter-python.service; enabled)
Active: activating (start) since Thu 2015-05-14 19:59:31 UTC; 4s ago
Process: 19753 ExecStartPre=/bin/chown --recursive debian-tor:debian-tor /var/log/%p.log (code=exited, status=0/SUCCESS)
Process: 19750 ExecStartPre=/bin/touch /var/log/%p.log (code=exited, status=0/SUCCESS)
Process: 19748 ExecStartPre=/bin/chown --recursive debian-tor:debian-tor /var/run/%p (code=exited, status=0/SUCCESS)
Process: 19745 ExecStartPre=/bin/mkdir -p /run/%p (code=exited, status=0/SUCCESS)
Process: 19741 ExecStartPre=/usr/lib/anon-shared-helper-scripts/torsocks-remove-ld-preload (code=exited, status=0/SUCCESS)
Control: 19756 (cpfpd)
CGroup: /system.slice/control-0.port-filter-python.service
└─19756 /usr/bin/python /usr/sbin/cpfpd start
The service is activating instead of running, and is restarted regularly at the timeout interval set in the unit file. It looks like the process is not detached. Looking into the bcfg2-server
code, there might be something useful in there (thanks for that one, by the way).
troubadour
2015-05-14 20:41:59 UTC
Patrick
2015-05-14 22:54:56 UTC
Patrick
2015-05-14 23:04:31 UTC
Patrick
2015-05-14 23:27:40 UTC
Anything left to do here?
Well, you could go for perfection. Implement sd_notify. Some pointers (dunno if they lead somewhere).
But from my perspective for now, that would be just for the sake of it. For learning. As an intellectual challenge. I don’t see an actual issue that would be solved. Perhaps @nrgaway has some argument in favor of sd_notify? We would open a follow up task if needed.
Otherwise, unless I missed something or you think something… This should be closeable or moveable to Whonix 12?
nrgaway
2015-05-15 19:10:20 UTC
sd_notify
would be used if you want to have watchdog monitor the service. If the service does not respond within X amount of seconds (IE did not send a notification it was still active within this time period), then systemd
would then restart the service.
Since this seems to be a critical service, I would suggest implementing it for Whonix 11
.
I leave that up to @troubadour if he even wants to implement this and @Patrick to decide the level of importance.
I am also assuming Whonix 11 is at least more than 4-6 weeks away from release. If you have a shorter release frame in mind, please let me know as I would need to shift my coding efforts around. My goal was to complete build of Whonix 11 this weekend; then shift to Qubes Pre-configuration
for a few weeks, then back to streamlining the Whonix 11
build
Patrick
2015-05-15 20:55:04 UTC
Difficult question. I am debating with myself. The normal case should be no daemons crashing. If they do crash. something is very wrong that we should apply a-real-fix ™ to, no? On the other hand, If the service was crashing due to some hypothetical parsing issue, I would actually prefer the affected user noticing and reporting an issue. An automatically restarted service with no one noticing because of the auto restart would be bad. The daemon is very stable at this preset. How much code would be added by implementing the sd_notify protocol? A reason against would be if it’s getting complexer so we actually understand it less. If it’s just a few lines, that doesn’t apply.
troubadour
2015-05-15 20:56:09 UTC
I am putting sd_notify
in the pipeline, without guarantee that it will be completed before Whonix 11 release.
On the other hand, systemd restarts the service afterTimeoutSec
if it does not receive a proper notification. It happens in one situation at least, when using Type = forking
in the unit file (cpfpd is running in that configuration, but systemd does not seem aware of it).
As Patrick suggests, we could move it to Whonix 12 and open a follow up task when required.
Moved import gevent
where it should be (beginning of the script).
import gevent at beginning of script (python-gevent 1.0.1 fix startin… · troubadoour/control-port-filter-python@a608e24 · GitHub
That should be it for the daemon part. There will be some PEP8 / pylint compliance adjustments, will use T243.
nrgaway
2015-05-15 21:21:52 UTC
! In T274#4522, @Patrick wrote:
Difficult question. I am debating with myself. The normal case should be no daemons crashing. If they do crash. something is very wrong that we should apply a-real-fix ™ to, no? On the other hand, If the service was crashing due to some hypothetical parsing issue, I would actually prefer the affected user noticing and reporting an issue. An automatically restarted service with no one noticing because of the auto restart would be bad. The daemon is very stable at this preset. How much code would be added by implementing the sd_notify protocol? A reason against would be if it’s getting complexer so we actually understand it less. If it’s just a few lines, that doesn’t apply.
Daemons can crash for many reasons, many of which have nothing to do with the daemon itself. There are kernel issues, memory, cpu, races, so many things that are not in a daemons control and that’s why watchdog exists
Patrick
2015-05-16 04:35:49 UTC
Patrick
2015-05-16 04:46:05 UTC
Patrick
2017-02-21 22:54:40 UTC
Great work by @joysn1980 with the initial implementation.
Merged. And added a few minor commits on top.
Tests:
“sudo service tor-controlport-filter status” contains
Status: “Serving Thread started”
Status: “Serving Thread active. count: 12 / 10000”
Status: “Serving Thread active. count: 55 / 10000”
etc.
Now, if n.notify("WATCHDOG=1")
gets commented out (to simulate what would happen), systemd correctly detects the stale daemon and restarts it.
Feb 21 17:48:26 host systemd[1]: Started Tor control port filter proxy.
Feb 21 17:48:36 host systemd[1]: tor-controlport-filter.service: Watchdog timeout (limit 10s)!
Feb 21 17:48:36 host systemd[1]: tor-controlport-filter.service: Killing process 5419 (tor-controlport) with signal SIGABRT.
Feb 21 17:48:36 host systemd[1]: tor-controlport-filter.service: Main process exited, code=killed, status=6/ABRT
Feb 21 17:48:36 host systemd[1]: tor-controlport-filter.service: Unit entered failed state.
Feb 21 17:48:36 host systemd[1]: tor-controlport-filter.service: Failed with result ‘watchdog’.
Feb 21 17:48:37 host systemd[1]: tor-controlport-filter.service: Service hold-off time over, scheduling restart.
Feb 21 17:48:37 host systemd[1]: Stopped Tor control port filter proxy.
Feb 21 17:48:37 host systemd[1]: Starting Tor control port filter proxy…
Similarly, if n.notify("READY=1")
gets commented out to simulate a daemon that fails to start, systemd rightly never considers the daemon started and tries starting it again.
I am not sure if it was the best idea to move n.notify("READY=1")
to def service_actions(self):
. By adding n.notify("READY=1")
just before server.serve_forever()
I was not sure if a race condition might happen. Currently not the very most pretty solution to keep calling n.notify("READY=1")
but it works great.
Also arbitrarily chosen 10
for WatchdogSec=10
. No idea if that is a good value or if a lower/higher one should be used.
Patrick
2017-03-30 01:59:47 UTC