Tomcat Error: Too many open files

  • 7011205
  • 20-Jul-2011
  • 19-Oct-2012

Resolution

If you are running Tomcat as your Web Application Server (WAS) on Linux, you may run into a condition where the log files grow exponentially and can exceed the available disk space on your server rather suddenly. The evidence of this issue can be found in the log files themselves, as they will contain messages similar to this:

19-Jul-2011 14:13:56 org.apache.tomcat.util.net.AprEndpoint$Acceptor run
SEVERE: Socket accept failed
org.apache.tomcat.jni.Error: Too many open files
        at org.apache.tomcat.jni.Socket.accept(Native Method)
        at org.apache.tomcat.util.net.AprEndpoint$Acceptor.run(AprEndpoint.java:1002)
        at java.lang.Thread.run(Thread.java:595)

This message states the problem for us: Tomcat is experiencing a JNI error, "Too many open files." JNI, the Java Native Interface, is a programming framework that allows Java code running in a Java Virtual Machine (JVM) to call, and to be called by, native applications (programs specific to a hardware and operating system platform) and libraries written in other languages.

This is not a limitation of either Tomcat or Access Governance Suite, but a configuration issue that needs to be reviewed on the server itself. The simple workaround, without identifying the cause, is to purge the log files and then restart the Tomcat service. This alleviates the symptom, but it does not alleviate the underlying cause; the issue will arise again unless and until the root cause is addressed within the Operating System (OS).
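
For illustration, that workaround might look something like the sketch below. The log directory, file names, and restart command are assumptions about a typical install, so substitute whatever your own environment uses:

cd /opt/tomcat/logs                  # assumed log directory; adjust to your install
gzip catalina.2011-07-18.log         # compress any large dated logs you want to keep
> catalina.out                       # truncate the active log rather than deleting it
sudo /etc/init.d/tomcat restart      # restart Tomcat with your init script or service manager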

To begin troubleshooting the issue, first verify the open files limit on the OS itself. This can be done with a simple command:

ulimit -a

This may return something similar to this:

chuck@nowhere~$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

In the output above, the interesting bit is the open files line, which shows an upper limit of 1024. Does this mean that Tomcat has 1024 open files? Not necessarily; the OS itself may have many files open, and Tomcat could simply have run out of available file handles. Some process has consumed too many of the open file handles, and you need to take corrective action to resolve the issue at hand.
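
To see which process is actually holding the file descriptors, you can count them per process. The pgrep pattern below is an assumption about how the Tomcat JVM is launched; adjust it to match your own startup:

PID=$(pgrep -f org.apache.catalina.startup.Bootstrap)   # assumed launch class for the Tomcat JVM
ls /proc/$PID/fd | wc -l                                 # descriptors that process currently holds
sudo lsof | awk '{print $1}' | sort | uniq -c | sort -rn | head   # heaviest consumers system-wide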

You can raise the ulimit permanently on the server to resolve the issue. Before you begin, though, you should understand that these limits have been put in place by design to ensure that no one user can take down an entire machine. Having said that, and forearmed with this knowledge, you can choose to edit the file:

/etc/security/limits.conf

and add the following lines:

* soft nofile 16384
* hard nofile 65536

which will give all users a soft limit of 16384 open files and a hard limit of 65536.

You can also add the lines:

* soft nproc 4096
* hard nproc 16384

to reset the number of processes that users are allowed to run, if you so choose. The caveats about security and granting users this much capability per machine should be carefully considered by your infrastructure team before making any changes. You should thoroughly understand the implications of these decisions, limit the number of actual users that have access to the machine, and audit those users carefully as well.
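
If your infrastructure team would rather not raise the limits for every account, the same settings can be scoped to just the account that runs Tomcat. The user name below is an assumption; substitute whichever account your installation actually uses:

tomcat soft nofile 16384
tomcat hard nofile 65536
tomcat soft nproc  4096
tomcat hard nproc  16384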

Although these changes will survive a reboot, there is one more thing you should do as well, and that is to verify the kernel's system-wide limit on open file handles (fs.file-max). You may wish to alter this setting if necessary, as too small a limit here could affect performance in your environment or cause things not to operate as expected. To verify this value, run the command:

cat /proc/sys/fs/file-max

which should return something similar to:

32678   

On my system this is set to 588108, but yours will vary depending upon the available RAM, kernel, processor model, and OS bit-level. Preferably you have a 64-bit OS and a file-max value larger than 32678, in which case no further modification should be required.
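
If you prefer, the same value can also be read through sysctl, which uses the very parameter name you will be setting in the next step:

sysctl fs.file-max
fs.file-max = 588108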

If not, however, you may have to modify /etc/sysctl.conf to alter the fs.file-max parameter (the same value we just read with the cat command above). Use the following command with your favorite text editor (and no, I won't choose one for you) to alter the file:

sudo $text-editor /etc/sysctl.conf

where $text-editor represents your text editor of choice, be it vi, gvim, emacs, ed, joe, gedit, leafpad, or whatever you choose to use. The line you would need to add should be written like so (if it does not already exist):

fs.file-max = 32678

Again, 32678 is considered a minimum here; your system may already have a much higher number, and if so, no alteration to this parameter should be required. You will then need to shut down and completely reboot the server, as these are kernel-level changes that require a reboot to take effect cleanly and persist. Once you have done so, they should remain in effect going forward.
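
If you want to confirm the setting before you get to the reboot, most distributions will also let you load /etc/sysctl.conf immediately; the reboot described above is still the surest way to prove the change persists:

sudo sysctl -p            # re-reads /etc/sysctl.conf and applies the values in it
sysctl fs.file-max        # confirm the value the kernel is now using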

Another very important and often overlooked item is to clean up the initial issue that was reported, in which the logs were being filled with the "too many open files" messages. The best thing to do is to compress those logs prior to restarting the WAS (Tomcat), so that you can clear up space on the drives for continued logging. Using the server's on-board compression tools, you can tar and gzip the files down to manageable sizes, as these logs contain huge numbers of similar lines. Once this operation is completed, you can then begin the verification process.
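
A minimal sketch of that cleanup, assuming the logs live under a typical install path and use the usual catalina/localhost naming, might be:

cd /opt/tomcat/logs                                              # assumed log directory
tar -czf tomcat-logs-$(date +%F).tar.gz catalina.* localhost.*   # archive and compress the bulky logs
tar -tzf tomcat-logs-$(date +%F).tar.gz                          # verify the archive reads back cleanly
rm -f catalina.* localhost.*                                     # then remove the originals to free the space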

Next, ensure that logrotate is set up to monitor the ~/Tomcat/logs/ directory so that it automatically rotates the logs when they reach a certain size, then compresses and timestamps them when stored. This ensures that you retain these web server logs in a compressed format on the server as required; should space become an issue, the archives may be offloaded to additional storage. Setting up logrotate is an institutional matter, so your choices will vary, but the basic methodology is discussed here. The configuration of logrotate is set in the file:

/etc/logrotate.d/tomcat

whose contents may look like this:

/home/Chuck/tomcat/logs/catalina.out {
    copytruncate
    daily
    rotate 7
    compress
    missingok
    size 5M
    }

That file is just for my local developer's box; your server implementation will be different. You should consult the man page for logrotate on your server, which describes in detail the available parameters and functions, and then point your own logrotate script at the local path of your Tomcat install so the logs on your instance are rotated.
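
Once you have written your own version of that file, you can ask logrotate to dry-run it, which is a handy sanity check before trusting it with the real logs:

sudo logrotate -d /etc/logrotate.d/tomcat    # debug/dry run: shows what would rotate, changes nothing
sudo logrotate -f /etc/logrotate.d/tomcat    # force an immediate rotation once the dry run looks right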

To verify that your work from above has survived a reboot, run the following commands:

ulimit -a (note the output for open file handles)

cat /proc/sys/fs/file-max  (note the output)

If everything is as expected and you now have sufficient drive space, restart your WAS (Tomcat) and ensure that everything is running properly on your server(s). If so, you now have a working environment and should not expect the "too many open files" issue to return, as the open file handle limits have been sufficiently increased to handle the requests as required.
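
As a final check that the running JVM actually picked up the new limits, you can read them straight from /proc. The pgrep pattern is the same assumption as before about how your Tomcat is started:

PID=$(pgrep -f org.apache.catalina.startup.Bootstrap)   # assumed launch class for the Tomcat JVM
grep 'open files' /proc/$PID/limits                     # soft and hard limits the running process received
ls /proc/$PID/fd | wc -l                                # descriptors it is holding right now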