Log4Shell - Lessons learned?
We currently see a lot of articles that focus on the current situation with the Log4J vulnerability and what you can do about it right now. In this post I will try to look at the situation from a more long-term perspective and see if we can find ways to prevent the exploitation of similar vulnerabilities in the future.
I would like to start with a supposedly simple question: What is the core problem of the current Log4Shell vulnerability? Your answer would probably be, well Log4J!!! And of course that seems to be the obvious answer. But I would argue, the core problem is actually a bit broader than that: Having functionality available that you're not using.
One important rule for securing any system is to get rid of things you don't use. Every bit of functionality that a system offers but doesn't actually need is available to an attacker as well. And every line of code, every library and every feature can potentially contain bugs.
Don't get me wrong, I am not arguing against the use of 3rd-party-libraries - not at all. Which functionality you really need, and whether you should write it yourself or use a library, is another discussion that I don't want to go into here. But features you are not using are an entirely different thing. No one will argue that it's a good idea to have those.
Analysis of the attack vectors
Let's look at the different ways the Log4Shell vulnerability can be exploited and see if we can learn some general lessons from that. In the following I am assuming that an affected version of Log4J is used, that user input is actually logged and that the attacked server is able to open connections to arbitrary hosts on the internet. These are obviously prerequisites for this kind of exploitation.
As a side note, before we continue: If it sounds strange to you that a server can open connections to arbitrary hosts on the internet, I would probably agree.
A server must of course be able to answer incoming requests. But this is different from opening connections to other hosts on its own. I would guess that many, if not most, servers could operate fine with a small whitelist of hosts they are allowed to contact and all other connections blocked.
This could be the first lesson, but as server administration is not really my area, I don't want to recommend things that might later turn out to be impractical. Still, if you can restrict your server from initiating connections itself, you should certainly do that.
Scenario 1
If your target uses a Java version before 8u191 (which was released in October 2018 - more than 3 years ago!), you have basically hit the jackpot, as exploitation is as easy as it gets.
You need an LDAP or RMI server (both of which you could quickly write yourself in Java with a few lines of code) and a simple web server, where you host your payload: A Java class. You can basically write any code you like in that class. It will be downloaded and executed when you trigger the exploit.
Lesson 1: You are probably reading this here for the first time ever, but try to keep the software you are running on your server up to date.
If you cannot update to the latest major version (like Java 11 or 17), even minor versions offer new security features. In this case Java 8u191 and Java 11.0.1 (both released in October 2018) no longer allow the download of classes from remote hosts via LDAP or RMI. You will only find an exception like this in your logs:
Error looking up JNDI resource [...]. javax.naming.ConfigurationException:
The object factory is untrusted.
Set the system property 'com.sun.jndi.rmi.object.trustURLCodebase' to 'true'.
Scenario 2
Your target uses a Java version >= 8u191 but has not yet updated to Java 17. You can no longer make your target download your payload, but you can still use all classes that are already on the server classpath. In many cases the server you are trying to exploit will be an Apache Tomcat which is very popular in the Java world.
Even Spring (Boot) uses it by default. There are attack vectors for other application servers as well, but we are going with Tomcat here. For the exploit to work, you still need your LDAP or RMI server, but this time it returns a so-called ResourceRef.
A ResourceRef is basically a recipe for how to construct and initialize a class. You define which class should be instantiated and which factory class to use for that. Both of these classes must already be on the classpath.
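To make the "recipe" idea a bit more tangible, here is a minimal and deliberately harmless sketch. It does not use Tomcat's ResourceRef but the plain JDK class javax.naming.Reference, which carries the same kind of information (class name, factory class name, factory location and a list of typed string entries). The class names "com.example.Widget" and "com.example.WidgetFactory" and the "color" entry are hypothetical placeholders that I made up for illustration.

```java
import javax.naming.RefAddr;
import javax.naming.Reference;
import javax.naming.StringRefAddr;

public class ReferenceRecipe {

    // Builds a harmless Reference. Both class names are hypothetical placeholders.
    public static Reference buildRecipe() {
        Reference ref = new Reference(
                "com.example.Widget",         // class that should be instantiated
                "com.example.WidgetFactory",  // factory class that builds it
                null);                        // no remote location for the factory
        // Typed entries that the factory may use to initialize the object.
        ref.add(new StringRefAddr("color", "blue"));
        return ref;
    }

    public static void main(String[] args) {
        Reference ref = buildRecipe();
        RefAddr color = ref.get("color");
        System.out.println("class:    " + ref.getClassName());
        System.out.println("factory:  " + ref.getFactoryClassName());
        System.out.println("location: " + ref.getFactoryClassLocation());
        System.out.println("color:    " + color.getContent());
    }
}
```

Nothing is instantiated here. The point is only that such an object is pure data, and the receiving side decides what to do with it.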
Luckily Tomcat ships with two classes that come in very handy. But it gets a bit crazier from here on. We use the constructed class to execute a payload that initializes Java's JavaScript engine (yes, Java has that built-in, it is called Nashorn). Then we pass the actual payload that we want to execute to this engine as JavaScript code, and the engine executes it.
I don't think that lesson 2 is already justified here, but I would like to reiterate: The core problem is functionality that is available although you are not using it. Well, unless you are using the JavaScript engine on your server (which might actually happen in a few cases).
Nonetheless, this is most likely not the common case, and Java 17 removed the JavaScript engine (to be precise Java 15 did, but I'm only talking about LTS versions here). So with Java 17 this concrete payload would not have worked. This emphasizes the previous lesson to keep the software you use up to date, but also to keep up with newer Java major versions.
But again, to be honest, that would also not have helped you very much as there are other payloads that don't use the JavaScript engine.
Scenario 3
Now your target uses Java 17, meaning there is no JavaScript engine available. The attack basically works as before, using an LDAP or RMI server and returning a ResourceRef. However, this time our payload is a bit simpler and just executes one Java statement directly to run a shell command of our choice. Not much more to say here.
Scenario 4
Your target is using a command line argument that restricts certain abilities of the JVM. Again Java 17 is used, but this time the JVM parameter "jdk.jndi.object.factoriesFilter" is also set to "!*", e.g.
-Djdk.jndi.object.factoriesFilter='!*'
This parameter is available since Java 8u291 or Java 11.0.11 (both released in April 2021). If you remember the previous two scenarios, we used a factory class to instantiate another class, which was then used to execute a payload. With this JVM parameter, the factory classes that are allowed to instantiate other classes can be restricted.
When "!*" is passed (which quite literally translates to "not any"), all factory classes are forbidden. What is nice about this command line parameter is that no exceptions are thrown. Instead you will find a log entry that contains the text representation of the payload:
ResourceRef[className=javax.el.ELProcessor,factoryClassLocation=null,
factoryClassName=org.apache.naming.factory.BeanFactory,
{type=scope,content=},{type=auth,content=},
{type=singleton,content=true},{type=forceString,content=x=eval},
{type=x,content=...}]
This closes (in theory) the attack vector we used in the previous two scenarios. "In theory" because I am not a security expert and don't want to rule out the possibility that there is some way to circumvent this filter. Nonetheless, I am currently not aware of any. By the way, this command line argument does not have to be set to "!*". If you seriously need this feature for your application, you could also set it to something like
-Djdk.jndi.object.factoriesFilter='tld.yourcompany.*'
or even a concrete class. Of course you have to make sure that the allowed class(es) cannot be misused for any attacks.
It seems we again need a new attack vector. There is one more that I am aware of, and it does not rely on any factory classes. This time we need an LDAP server that returns an Entry which contains an attribute with a special name ("javaSerializedData").
In this attribute we pass a serialized Java object (which can be generated with ysoserial). When you trigger the exploit, the object contained in the attribute is deserialized on the server. The code that runs during deserialization (for example in the "readObject" methods of the classes involved) is what ends up executing the payload. As before, ysoserial relies on certain classes to execute your payload. These classes have to be on the server classpath for this to work.
It's time for lesson 2: There are certain command line switches available to restrict features of the JVM. If you don't use the features, disable them using these switches. We will see two more parameters in the following sections.
Scenario 5
The setup is the same as in scenario 4. In addition the JVM parameter "com.sun.jndi.ldap.object.trustSerialData" is set to "false"
E.g.
-Dcom.sun.jndi.ldap.object.trustSerialData=false
This parameter is also available since Java 8u291 or Java 11.0.11. Since then the parameter has been extended to also apply to the attribute "javaReferenceAddress", which might also be used for an attack. This change is available since Java 8u311, Java 11.0.13 and Java 17.0.1. In your logs you will find an exception like this:
Error looking up JNDI resource [...].
javax.naming.NamingException: Object deserialization is not allowed;
remaining name '...'
For this setup I am currently not aware of a working attack vector, which does not mean very much. To make this very clear: This does not mean that there are no working attack vectors.
As you have seen in the five scenarios above, there are often alternative options available. Does that mean that setting these parameters is useless? I don't think so. As I am trying to make clear in this post: The fewer unused features are available, the more secure you are. Security is rarely about absolutes. The fewer options you give an attacker, the better.
If you read carefully, you noticed that I was talking about two more parameters that we would discuss. So far I have only mentioned one of them. All those parameters, and in fact all attack vectors that we have seen, work by using a Java feature called serialization.
If your application does not use serialization, you can disable it entirely. When I say serialization, I don't mean something like JSON serialization (and deserialization) for which you use a library like Jackson. Such serialization will continue to work fine. I am talking about the built-in Java serialization, which was used more often in the early Java years.
In modern Java applications this feature is usually no longer used and it's also not recommended to use it anymore (for example in basically every edition of Effective Java by Joshua Bloch). If you are still using Java serialization in your application, you should seriously consider getting rid of it.
You can set the JVM parameter "jdk.serialFilter" to "!*"
E.g.
-Djdk.serialFilter='!*'
This parameter is available since Java 8u121 and should disable deserialization. As with "jdk.jndi.object.factoriesFilter", you can use the filter syntax here as well, meaning you can disable deserialization only for certain classes and allow it for others. Again, you have to make sure that the allowed class(es) cannot be misused for any attacks. In your logs you will find an exception with root cause:
java.io.InvalidClassException: filter status: REJECTED
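If you want to see the filter syntax in action without restarting anything, the following sketch applies a pattern to a single stream instead of the whole JVM. It assumes Java 9 or newer, where java.io.ObjectInputFilter is a public API. The class and method names (SerialFilterDemo, serialize, isRejected) are my own and only serve the illustration. The second pattern in main is an assumed example of an allow list and not a recommendation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class SerialFilterDemo {

    // Serializes a value with the built-in Java serialization.
    public static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Tries to deserialize the data with the given filter pattern.
    // Returns true if the filter rejected the stream.
    public static boolean isRejected(byte[] data, String pattern)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            // Same pattern syntax as the value of -Djdk.serialFilter
            in.setObjectInputFilter(ObjectInputFilter.Config.createFilter(pattern));
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            // Thrown with a message like "filter status: REJECTED"
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> list = new ArrayList<>();
        list.add("hello");
        byte[] data = serialize(list);

        System.out.println("rejected with '!*': " + isRejected(data, "!*"));
        System.out.println("rejected with allow list: "
                + isRejected(data, "java.util.ArrayList;java.lang.*;!*"));
    }
}
```

The per-stream filter is only used here to keep the example self-contained. For the purpose of this post, the JVM-wide switch is the more interesting one, because it also covers code you did not write yourself.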
Summary
Attacking systems by making use of Java serialization is not a new thing. In fact most of the attack vectors we've seen here have been known for years. Log4J just made those attack vectors available again in a very, very simple way. Nothing more, but also nothing less.
Having these attack vectors lurking on your server, just waiting for a new method to access them, is risky, to say the least. Still, Java serialization is a core feature and I would be surprised if it were removed or even deprecated anytime soon. Too many (older) Java applications that still use it would break. Nonetheless, the Java maintainers seem to have realized that this poses a security risk for many applications.
Especially newer Java versions contain a bunch of switches to mitigate this risk. The only downside is, you have to enable them manually. To be able to profit from all three parameters you need at least one of the following Java versions:
- Java 8u311
- Java 11.0.13
- Java 17.0.1
If your application does not use serialization, disable it with the command line parameter
-Djdk.serialFilter='!*'
All these parameters can also be set via the environment variable "JAVA_TOOL_OPTIONS". If you need more fine-grained control over which serializations are allowed, use the parameters
-Dcom.sun.jndi.ldap.object.trustSerialData=false
and
-Djdk.jndi.object.factoriesFilter='!*'
(or with a concrete class instead of '!*' if you have to). This will make serialization attacks more difficult in the future. As always, there is no guarantee that this will prevent all attacks. If you are unsure whether your application uses serialization, enable those switches in a QA environment first and test for a while to see if everything still works.
Furthermore, security is a process. This means you have to work on it continuously and ideally have a process in place that facilitates it. Having your dependencies up to date is a good first step. This usually enables you to painlessly update to newer Java versions. One way to facilitate dependency updates might be Renovate, which will regularly scan your repository and open pull requests with updates for outdated dependencies.
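Since the switches have to be enabled manually, it is easy to forget them on one of your servers. One possible safeguard is a small check at application startup that warns if a switch is missing. The following is only a sketch under a few assumptions: the class name HardeningCheck and the method findMissing are made up, and the check only sees values passed as system properties (for example via -D or "JAVA_TOOL_OPTIONS"). A filter configured elsewhere, such as in the JDK's java.security file, would not be detected this way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class HardeningCheck {

    // Returns one warning per switch that is not set as suggested in this post.
    // The lookup is a parameter so the logic can be tested without touching
    // the real system properties.
    public static List<String> findMissing(Function<String, String> lookup) {
        List<String> warnings = new ArrayList<>();
        if (lookup.apply("jdk.serialFilter") == null) {
            warnings.add("jdk.serialFilter is not set");
        }
        if (lookup.apply("jdk.jndi.object.factoriesFilter") == null) {
            warnings.add("jdk.jndi.object.factoriesFilter is not set");
        }
        String trustSerialData = lookup.apply("com.sun.jndi.ldap.object.trustSerialData");
        if (!"false".equalsIgnoreCase(trustSerialData)) {
            warnings.add("com.sun.jndi.ldap.object.trustSerialData is not set to false");
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<String> warnings = findMissing(System::getProperty);
        if (warnings.isEmpty()) {
            System.out.println("All three switches are set.");
        } else {
            warnings.forEach(w -> System.out.println("WARNING: " + w));
        }
    }
}
```

The check only verifies that a value is present. It does not judge whether a custom filter pattern is actually safe.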
Bonus - custom JRE and Docker image
One other thing that we did not discuss in this post is Java's ability to talk to LDAP or RMI servers, which is often not used either. If we could disable this ability, we would also prevent many of the discussed attack vectors.
This can be partly achieved by building a custom JRE. Although that may sound very complicated, it is actually not that hard. The necessary tools are available since Java 9 (which introduced the Java module system), meaning you cannot build your own JRE with Java 8. You can use the command line tool "jlink" (which is part of any JDK) like this:
jlink --add-modules <comma separated list of Java modules> \
  --compress=2 --no-man-pages --no-header-files --output <target directory>
The only tricky part is figuring out which modules your application requires. In general, this can be achieved using the tool "jdeps". But this tool is not as flexible and fault tolerant as you would wish it to be. Suppose you have a fat jar from a Spring Boot application and use jdeps to determine the module dependencies like this:
jdeps --print-module-deps --recursive --ignore-missing-deps <jar file>
Most likely the tool will only output two or three dependencies, which is clearly not correct. The reason is that jdeps cannot handle the jar structure from Spring Boot. The best way to make it work for Spring Boot is to extract the fat jar to a temp directory. The jdeps parameters you need now are a bit more complicated. Ideally you export an environment variable first:
export extractPath=<path where you extracted the jar file>
Then you execute:
jdeps --class-path "${extractPath}/BOOT-INF/lib/:${extractPath}/BOOT-INF/classes:${extractPath}" \
  --print-module-deps --ignore-missing-deps \
  --module-path ${extractPath}/BOOT-INF/lib/<additional jar file> \
  --recursive --multi-release <the Java version your application uses> -quiet \
  ${extractPath}/org ${extractPath}/BOOT-INF/classes \
  ${extractPath}/BOOT-INF/lib/*.jar
You can try to execute the command without the "--module-path" parameter first. If it complains with a message like "Module <module name> not found, required by <other module name>", you add the jar file that contains this module to the module path. Often it is something like "jakarta.activation-api-X.X.X.jar" or "jakarta.annotation-api-X.X.X.jar" (or "javax" instead of "jakarta", depending on the versions you use). You can then use the module list to build your JRE with jlink as described above.
If the reported modules do not contain "jdk.scripting.nashorn" and "jdk.scripting.nashorn.shell" you can even build a Java 11 JRE which does not contain the JavaScript engine. If the modules do not contain "jdk.naming.rmi", your application should no longer be able to talk to RMI servers.
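To verify that a custom JRE really lacks the modules you wanted to leave out, you can ask the running JVM which modules it contains. The following sketch uses the module API that came with Java 9. The class name ModuleCheck and the helper isModulePresent are my own naming. The list of module names is taken from this post, plus "java.naming" as an assumption about where the JNDI API itself lives.

```java
import java.util.List;

public class ModuleCheck {

    // True if the named module is part of the boot layer of the running JVM.
    public static boolean isModulePresent(String moduleName) {
        return ModuleLayer.boot().findModule(moduleName).isPresent();
    }

    public static void main(String[] args) {
        List<String> modulesOfInterest = List.of(
                "java.naming",
                "jdk.naming.rmi",
                "jdk.naming.ldap",
                "jdk.scripting.nashorn",
                "jdk.scripting.nashorn.shell");
        for (String name : modulesOfInterest) {
            String state = isModulePresent(name) ? "present" : "absent";
            System.out.println(name + ": " + state);
        }
    }
}
```

Run it once with the full JDK and once with the JRE you built with jlink, and compare the two outputs.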
Stopping it from talking to LDAP servers does not seem to be that simple. There is a module called "jdk.naming.ldap" (at least in Java 11), but removing it changes nothing.
And as closing words: In the spirit of removing functionality you don't use, if you build a Docker image with that JRE, consider using a distroless base image like the ones here: https://github.com/GoogleContainerTools/distroless/blob/main/base/README.md