In a Rails application I work on we have a class in that provides a kind of null/nil value. It can be used in both the context of arithmetic operations, and in a string context. It is pretty similar to the example below:
With this we can do things like the following without needing to check for nil values:
irb(main):010:0> values = [BigDecimal('12.3'), BigDecimal('45.6'), NullValue.new]
irb(main):011:0> values.reduce(:+).to_f
=> 58.9
The way the coerce
method is implemented, it should work with any operation a Numeric supports.
However, it turned out that we’d never used this NullValue
in subtraction previously. When a developer in the team introduced code using this operation, it worked just fine using MRI/CRuby, however when it was deployed to our continuous deployment server that used JRuby, it suddenly failed.
I decided to track down what was going on.
The issue was easy to reproduce:
irb(main):001:0> RUBY_VERSION
=> "2.5.0"
irb(main):002:0> JRUBY_VERSION
=> "9.2.0.0"
irb(main):003:0> load './null_value.rb'
=> true
irb(main):004:0> BigDecimal(1) - NullValue.new
Traceback (most recent call last):
7: from /Users/seanm/.rbenv/versions/jruby-9.2.0.0/bin/jirb:13:in `<main>'
6: from org/jruby/RubyKernel.java:1180:in `catch'
5: from org/jruby/RubyKernel.java:1180:in `catch'
4: from org/jruby/RubyKernel.java:1418:in `loop'
3: from org/jruby/RubyKernel.java:1037:in `eval'
2: from (irb):4:in `<eval>'
1: from org/jruby/ext/bigdecimal/RubyBigDecimal.java:1134:in `-'
TypeError (NullValue can't be coerced into BigDecimal)
I tried a few different versions of JRuby to check if it was a recent regression, however those versions (9.1.17.0 and 1.7.27) had the same issue.
The stack trace gives a place to start. Examining line 1134 of org/jruby/ext/bigdecimal/RubyBigDecimal.java
@JRubyMethod(name = "-", required = 1)
public IRubyObject op_minus(ThreadContext context, IRubyObject b) {
return subInternal(context, getVpValueWithPrec(context, b, true), b, 0);
}
left me scratching my head a bit. Without a bit of context nothing stands out. I looked at the implementation of subInternal()
private IRubyObject subInternal(ThreadContext context, RubyBigDecimal val, IRubyObject b, int prec) {
if (val == null) return callCoerced(context, sites(context).op_minus, b);
RubyBigDecimal res = subSpecialCases(context, val);
return res != null ? res : new RubyBigDecimal(context.runtime, value.subtract(val.value)).setResult(prec);
}
That line containing callCoerced()
looked like the path to investigate. It is only called if val
is null. From the calling location in op_minus
, that val
is the result of getVpValueWithPrec(context, b, true)
.
To try to get some context of how this is used, I looked for similar calls to getVpValueWithPrec
. There’s a method op_plus which appears to be similar to op_minus
.
@JRubyMethod(name = "+")
public IRubyObject op_plus(ThreadContext context, IRubyObject b) {
return addInternal(context, getVpValueWithPrec(context, b, false), b, vpPrecLimit(context.runtime));
}
The difference in this case is that the third parameter to getVpValueWithPrec
is false
instead of true
. Time to see what that does.
private RubyBigDecimal getVpValueWithPrec(ThreadContext context, IRubyObject value, boolean must) {
if (value instanceof RubyFloat) {
// Unimportant details omitted
}
if (value instanceof RubyRational) {
// Unimportant details omitted
}
return getVpValue(context, value, must);
}
At this point we see the third parameter is named must
. I know that our value
isn’t a RubyFloat or RubyRational (it’s a NullValue), so this method falls through to call getVpValue
, passing through the must
value.
private static RubyBigDecimal getVpValue(ThreadContext context, IRubyObject value, boolean must) {
switch (((RubyBasicObject) value).getNativeClassIndex()) {
case BIGDECIMAL:
...
case FIXNUM:
...
// Unimportant details omitted
}
return cannotBeCoerced(context, value, must);
}
Nearly there. Again it falls through the other type checks and ends up at the call cannotBeCoerced
with the must
value passed in.
private static RubyBigDecimal cannotBeCoerced(ThreadContext context, IRubyObject value, boolean must) {
if (must) {
throw context.runtime.newTypeError(
errMessageType(context, value) + " can't be coerced into BigDecimal"
);
}
return null;
}
This method contains the text of the error message received from the failed subtraction that started this investigation. So if must
is true
, then the error is raised. Recall that the op_plus
has the must
value as false
.
Returning all the way back up the call stack to op_minus
, we see that if must
is changed to false
, then getVpValue
and subsequently getVpValueWithPrec
will return null
. Thus the val
passed to subInternal
will be null
.
private IRubyObject subInternal(ThreadContext context, RubyBigDecimal val, IRubyObject b, int prec) {
if (val == null) return callCoerced(context, sites(context).op_minus, b);
// Unimportant details omitted
}
With val
being null
, the callCoerced
method would actually be called in the first line of subInternal
, instead of an exception being raised.
The fix is to change that must
value, as seen in this diff:
diff --git a/core/src/main/java/org/jruby/ext/bigdecimal/RubyBigDecimal.java b/core/src/main/java/org/jruby/ext/bigdecimal/RubyBigDecimal.java
index 12b7788b2d..5c3373b7ee 100644
--- a/core/src/main/java/org/jruby/ext/bigdecimal/RubyBigDecimal.java
+++ b/core/src/main/java/org/jruby/ext/bigdecimal/RubyBigDecimal.java
@@ -1131,7 +1131,7 @@ public class RubyBigDecimal extends RubyNumeric {
@JRubyMethod(name = "-", required = 1)
public IRubyObject op_minus(ThreadContext context, IRubyObject b) {
- return subInternal(context, getVpValueWithPrec(context, b, true), b, 0);
+ return subInternal(context, getVpValueWithPrec(context, b, false), b, 0);
}
That addresses the exception that was being raised.
However, while I was looking around trying to understand how callCoerced worked, I noticed that there was a difference in the way it is called by methods similar to subInternal
.
subInternal
calls it as follows:
return callCoerced(context, sites(context).op_minus, b);
whereas addInternal
calls it with four parameters:
return callCoerced(context, sites(context).op_plus, b, true);
Most of the other invokations of callCoerced
used the 4 parameter version, with that last parameter set to true
.
The implementation of the 3 parameter form of callCoerced
method is:
public final IRubyObject callCoerced(ThreadContext context, CallSite site, IRubyObject other) {
return callCoerced(context, site, other, false);
}
Of interest is how it calls the 4 parameter version of the method with a value of false
. So I decided to check what that does.
The 4 parameter version of callCoerced looks like:
protected final IRubyObject callCoerced(ThreadContext context, CallSite site, IRubyObject other, boolean err) {
IRubyObject[] args = getCoerced(context, other, err);
if (args == null) return context.nil;
IRubyObject car = args[0];
return site.call(context, car, car, args[1]);
}
That 4th parameter is err
and that gets passed to getCoerced. The interesting part is below:
try {
result = sites(context).coerce.call(context, other, other, this);
}
catch (RaiseException e) { // e.g. NoMethodError: undefined method `coerce'
if (error) {
throw runtime.newTypeError(str(runtime, types(runtime, other.getMetaClass()), " can't be coerced into ", types(runtime, getMetaClass())));
}
return null;
}
If there is no coerce
method provided, an exception is raised. This exception is swallowed unless the error
parameter (which is the err
parameter from callCoerced
) is true
.
In the case of the NullValue
example from the beginning of this post we can remove the coerce
method (and the method_missing
one) to see what happens.
Firstly addition:
irb(main):001:0> load './null_value.rb'
=> true
irb(main):002:0> BigDecimal(1) + NullValue.new
Traceback (most recent call last):
7: from bin/jirb:12:in `<main>'
6: from org/jruby/RubyKernel.java:1181:in `catch'
5: from org/jruby/RubyKernel.java:1181:in `catch'
4: from org/jruby/RubyKernel.java:1415:in `loop'
3: from org/jruby/RubyKernel.java:1043:in `eval'
2: from (irb):2:in `evaluate'
1: from org/jruby/ext/bigdecimal/RubyBigDecimal.java:1039:in `+'
TypeError (NullValue can't be coerced into BigDecimal)
This fails with an exception as expected, because there is no coerce
method provided.
However, looking at subtraction:
irb(main):003:0> BigDecimal(1) - NullValue.new
=> nil
This fails quietly.
To make it consistent, and correct, the invokation of callCoerced
in subInternal
needs to change to the 4 parameter version with an err
value of true
, just like addInternal
.
The full PR for these fixes can be found at https://github.com/jruby/jruby/pull/5391.
As a final thought, it’s often interesting how much effort can go into what ultimately is a tiny code change.