poll() and EOF

poll() is a UNIX system call which allows a program to monitor multiple file descriptors and detect when they are readable or writable. A reasonable question to ask is: what does it do when a file descriptor reaches EOF (end of file)? Should it set POLLIN (as one might expect from select()), POLLHUP, or both?

The Single UNIX Specification says, in what is basically a copy of the SVR4 poll() man page:

POLLIN	Data other than high-priority data may be read without blocking. For STREAMS, this flag is set in revents even if the message is of zero length.
POLLHUP	The device has been disconnected. This event and POLLOUT are mutually exclusive; a stream can never be writable if a hangup has occurred. However, this event and POLLIN, POLLRDNORM, POLLRDBAND or POLLPRI are not mutually exclusive. This flag is only valid in the revents bitmask; it is ignored in the events member.

It’s not immediately clear from this whether “disconnected” means any end-of-file condition, or refers only to things like modem hangups. (When you type ^D at a program, only that program sees EOF - the calling shell doesn’t lose its input too. That only happens when the connecting modem hangs up, the telnet session is terminated, or whatever.)

Stevens thought the latter, and said in APUE that this implies that plain EOFs should be signalled by POLLIN, not POLLHUP.

Do the implementors agree? Unfortunately, my tests show that the behaviour is consistent neither within a single implementation nor between different versions of the same implementation, and certainly not across platforms:

Platform	Version	pipe	socketpair (closed)	socketpair (SHUT_WR)	socketpair (SHUT_RD)	regular file
Linux	2.2.19	POLLHUP	POLLHUP	?	?	POLLIN
Linux	2.4.25, 2.6.18, 2.6.32, 3.2.0	POLLHUP	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN
SunOS	5.6, 5.7, 5.8, 5.10	POLLHUP	POLLIN	POLLIN	POLLIN	POLLIN
Mac OS	10.3.4	POLLIN	POLLIN	POLLIN	POLLIN	POLLIN
Mac OS	10.4.11, 10.6.4, 10.7.4	POLLIN|POLLHUP	POLLIN|POLLHUP	POLLIN|POLLHUP	POLLIN|POLLHUP	POLLIN
FreeBSD	4.2, 4.5, 4.9, 6.2	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN	POLLIN
FreeBSD	8.0, 9.0	POLLIN|POLLHUP	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN
NetBSD	1.6.1	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN	POLLIN
OpenBSD	3.1	POLLIN	POLLIN	POLLIN	POLLIN	POLLIN
OpenBSD	5.4	POLLIN|POLLHUP	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN
AIX	4.3, 5.1L	POLLIN	POLLIN	POLLIN	POLLIN	POLLIN
AIX	5.3	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN	POLLIN
Cygwin		POLLIN	POLLHUP	POLLHUP	POLLERR	POLLIN
HP-UX	10.20	POLLIN	POLLIN	?	?	POLLIN
HP-UX	11.00, 11.11, 11.23	POLLIN	POLLIN|POLLHUP	0	POLLIN|POLLHUP	POLLIN
HP-UX	11.31	POLLIN	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN
UnixWare	7.1.1	POLLHUP	POLLIN|POLLHUP	?	?	POLLIN
IRIX	6.4, 6.5	POLLIN|POLLHUP	POLLIN	POLLIN	POLLIN	POLLIN
OSF1	4.0	POLLIN	POLLIN|POLLHUP	?	?	POLLIN

“?” in the table reflects combinations I didn’t have convenient access to when doing the tests. Feel free to send updates (see below for a test program). The other entries are the bits set in revents when the file descriptor as described is at end of file.

In all cases, poll() is asked to check for POLLIN on the reading end of the connection (where there’s a distinction).

The pipe column means that the other end of the pipe was closed. For the socketpair columns, the first means that the other end of the socket was closed; the second that it was shut down for writing; and the third that this end of the socket was shut down for reading (in which case, polling for read is probably a bug, but it’s interesting to know what happens in any case).

The final regular file column describes what happens when you poll a regular (disk) file. This is pretty pointless as they are always considered immediately readable (perhaps even if they are opened write-only, e.g. on Linux).

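HP-UX shutdown(2) appears to be broken: on 11.00 to 11.23, poll() reports nothing at all after the other end has been shut down for writing.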

On Cygwin, reading from a socket that you’ve shut down with SHUT_RD produces ESHUTDOWN rather than EOF, and POLLERR if you poll it. This seems fair enough to me.

The lesson for authors of portable code is clear: you should test whether either bit (POLLIN or POLLHUP) is set, and you should rely on the subsequent read() to tell you whether you reached EOF:

if(revents & (POLLIN|POLLHUP)) {
  n = read(fd, buffer, sizeof buffer);
  if(n < 0 && (errno == EINTR || errno == EAGAIN)) {
    // try again later
  } else if(n < 0) {
    // error
  } else if(n == 0) {
    // EOF
  } else {
    // n bytes read
  }
}

Also, you have to avoid at least one of shutdown() and HP-UX.

What should implementors do? I think reading POLLHUP as referring to modem hangups etc., rather than just end of file, is the most accurate interpretation of the Single UNIX Specification, and that implementations should therefore set POLLIN alone for other kinds of end of file.

poll.c is the program I use to update these results. If you run it on your own system, mail me the output. Thank you to all the people who’ve submitted results so far.
