Apr 9, 2018 - Explicit Return in Ruby

(or, the Perl community got return right)

In a Ruby function, the result of the last expression becomes the return value for that function.

  def demonstrate
    "I'm your return value"
  end

That is, unless there are any explicit return statements that execute before the end of the function.

  def has_a_guard_clause(value)
    return if value.nil?
    evaluate(value)
  end
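To make both behaviours concrete, here is a small sketch. The method names shout and shout_unless_nil are made up for illustration and are not from any library.

```ruby
# Hypothetical helpers, used only to illustrate implicit and early returns.
def shout(value)
  value.upcase              # last expression, so this is the return value
end

def shout_unless_nil(value)
  return if value.nil?      # a bare return hands back nil
  value.upcase
end

p shout("hi")               # => "HI"
p shout_unless_nil("hi")    # => "HI"
p shout_unless_nil(nil)     # => nil
```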

While you can use an explicit return statement as the last statement in a function, e.g.

  def encode(value)
    outcome = transform(value)
    return outcome.success?
  end

the Ruby community has come down in favour of not doing this. This is evidenced by the default rules applied by the Ruby linting tool RuboCop.

https://www.rubydoc.info/gems/rubocop/RuboCop/Cop/Style/RedundantReturn

Perl has the same behaviour as Ruby regarding the optional use of return.

However, the Perl community came down on the side of using an explicit return. This is reflected in the RequireFinalReturn policy that ships with the Perl linting tool Perl::Critic.

https://metacpan.org/pod/distribution/Perl-Critic/lib/Perl/Critic/Policy/Subroutines/RequireFinalReturn.pm

I’m going to suggest that the Perl community got it right.

A Ruby (or Perl) function will always return something, but it may not be intended for that something to be used.

There are three reasons I think it’s valuable to have an explicit return.

  1. Ruby has no static type checking to catch accidental return value errors
  2. Being explicit is good information for subsequent coders
  3. It can prevent accidental information leakage

Languages with static types allow us to declare the return type explicitly, or to declare the absence of a return value. A static type checker would catch, at compile time, a forgotten return value or a value of the wrong type. Quite helpful!

This method returns a value of type decimal:

    public decimal Gradient(decimal rise, decimal run)
    {
        return rise / run;
    }

This method returns nothing, as indicated by the void return type:

    public void Log(string message)
    {
        logger.log(timestamp + message);
    }

In Ruby, we have neither a static type system, nor a return type in our method signature, so we can’t indicate that a function is meant to return something, or that it doesn’t. Without this, we’re left with options like:

  • Use documentation to indicate the method’s return type, or lack thereof, and/or
  • Use the method name to help indicate that something (and possibly what) is being returned, and/or
  • Rely on users of our code to inspect the source code for themselves and determine what is being returned.

One way we can help our colleagues, successors, and our future selves is by being explicit in the source code at least. We can do this by using the return keyword to indicate that a function does have a return value intended for use.

  def current_time
    return Time.now
  end

  # No return keyword, so nothing notable being returned...
  def log(message)
    logger.log(timestamp + message)
  end

The Perl::Critic description for the explicit return rule goes further: it suggests we should always have a return statement, even if that is to return nil (undef in Perl). So the second example becomes:

  def log(message)
    logger.log(timestamp + message)
    return nil
  end

Again, this is explicit. It indicates that we should not expect a useful value from this function, whereas the previous version might leave you wondering if logger.log(...) might have some value.
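As a rough illustration of that doubt, here is a sketch using Ruby’s standard library Logger, whose info method returns true. The method names log_implicit and log_explicit are hypothetical, and the logger writes to an in-memory StringIO purely to keep the example self-contained.

```ruby
require "logger"
require "stringio"

# In-memory log target so the example has no side effects (an assumption
# made for the sketch, not something the original code does).
LOGGER = Logger.new(StringIO.new)

def log_implicit(message)
  LOGGER.info(message)      # Logger#info returns true, which leaks out as our result
end

def log_explicit(message)
  LOGGER.info(message)
  return nil                # states plainly that there is no useful result
end

p log_implicit("started")   # => true
p log_explicit("started")   # => nil
```

A caller who sees true from the first version may start treating it as a success flag, which was never the intent.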

There is another good reason that the Perl::Critic guide gives in favour of always having return. It is the potential for a function with no intended return value to accidentally leak information. Here’s another example.

  def add_to_group(email)
    group_email_list << email
  end

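group_email_list is going to leak here. Assuming it is an Array, << returns the array it appended to, so the whole list becomes the return value of add_to_group.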

Now admittedly this is quite a contrived example, but you get the idea, and I’m sure with some imagination you can envisage your library being used in unexpected ways where it emits information you didn’t intend.
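Here is a minimal sketch of how such a leak can bite. The Group class and its method names are invented for this example; the only library fact it relies on is that Array#<< returns the array itself.

```ruby
# Hypothetical class wrapping a private list of email addresses.
class Group
  def initialize
    @group_email_list = []
  end

  # Leaky: Array#<< returns the array, so callers receive our internal list.
  def add_leaky(email)
    @group_email_list << email
  end

  # Explicit: nothing useful is returned, and nothing leaks.
  def add(email)
    @group_email_list << email
    return nil
  end

  def size
    return @group_email_list.size
  end
end

group = Group.new
leaked = group.add_leaky("a@example.com")
leaked.clear                      # the caller just emptied the group's internal list
p group.size                      # => 0
p group.add("b@example.com")      # => nil
p group.size                      # => 1
```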

So, I’d advocate that we, the Ruby community, should also embrace the use of an explicit return statement, both to make our intention clear and to ensure we are returning the thing we intended.

Oct 7, 2014 - Rewrite or Refactor

You’ve inherited a mess of a codebase. Much of it seems impenetrable. Adding features is onerous, and you can’t tell what else you might break in the process. It’s brittle.

The appealing solution is to rewrite the application from scratch.

A rewrite is a chance to start clean and to use the latest technology, or at least up-to-date versions of what you’re familiar with.

Rewrites are for code with great tests and test coverage. Rewrites are for well-understood business rules. Rewrites are for legacy codebases that are still easy to understand. Rewrites are for an unavoidable platform change. Rewrites are a last resort, for when rebuilding and missing some important details is less costly than making gradual, measured improvements.

The problem with the codebase you have is that there’s some arcane business rule you don’t know about and can’t easily see. There’s some forgotten, hard-earned workaround that is crucial to interoperating with another system. You won’t know about these until the systems it integrates with start failing. Murphy’s Law says you’ll only find out in production.

The unappealing, but likely right, choice is to refactor. Call it a rebuild, renovation, or overhaul if you like.

Start by building some confidence around making changes by adding or refining tests. If the codebase you’re refactoring doesn’t have a good test suite, start by testing what you can, from the outside in.

Start with integration tests if you can’t do unit tests. As you break down large classes or functions later you can add unit tests to those bits.

Pull out the cruddiest bits into their own isolated class or method. Make it testable. Then make it good.
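As a sketch of what that extraction might look like, assume a legacy method that tangles a calculation together with its output. Everything here (the method names, the item hashes, the discount rule) is invented for illustration.

```ruby
# Hypothetical legacy method: calculation and output are mixed together,
# so the only way to test it is to capture what it prints.
def print_invoice_legacy(items)
  total = 0
  items.each { |item| total += item[:price] * item[:qty] }
  total -= 10 if total > 100      # arcane rule, preserved exactly as found
  puts "Total: #{total}"
end

# Step 1: pull the calculation into its own method, behaviour unchanged.
def invoice_total(items)
  total = items.sum { |item| item[:price] * item[:qty] }
  total -= 10 if total > 100
  return total
end

# Step 2: the printing method becomes a thin wrapper around it.
def print_invoice(items)
  puts "Total: #{invoice_total(items)}"
end

print_invoice([{ price: 10, qty: 2 }])    # prints "Total: 20"
print_invoice([{ price: 60, qty: 2 }])    # prints "Total: 110"
```

With the calculation isolated, a unit test can pin down the current behaviour, including the odd rule, before anyone tries to make it good.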

You’re going to tire, so gamify the process. Pick a measure, such as the number of tests or a code quality score, and try to improve it. Celebrate some milestones. Give before and after demonstrations. Make cosmetic changes to freshen the UI if there is one.

In the end the outcome will likely be better, and you can pride yourself on leaving something better than you found it.

Sep 27, 2012 - Java and Ruby and XML

I’ve been working with a colleague on a project to create a set of command-line applications to read and emit various genomics file formats. The clients asked that we write the applications in Java, since we’ll be handing them over and it’s the language they’re most comfortable with.

One of the file formats is mzIdentML, an XML format.

It’s been a while since I’ve used Java in anger, and using it to read XML probably longer, so it was virtually like learning it anew.

Even with my more Java-adept colleague, it took hours to work out how to use the Java XML API for this relatively simple task. So by noting the analogous operations using Ruby and Nokogiri, I’ll save myself the grief in future.

It’s also interesting to note where I think things in the Java XML API are more verbose or complicated, although some of the many external Java XML processing packages probably address these.

Namespaces

If your XML uses namespaces, like mzIdentML does:

<mzIdentML id="" version="1.0.0"
     xsi:schemaLocation="http://psidev.info/psi/pi/mzIdentML/1.0 http://mascotx/mascot/xmlns/schema/mzIdentML/mzIdentML1.0.0.xsd"
     xmlns="http://psidev.info/psi/pi/mzIdentML/1.0"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     creationDate="2012-09-14T11:21:00">

then you need to specify the namespace in your XPath expressions.

Here is the minimum required (as best as I could determine) to set an XML namespace:

xPath.setNamespaceContext(new NamespaceContext() {
    public String getNamespaceURI(String prefix) {
        if (prefix == null)
            throw new NullPointerException("Null prefix");
        else if ("mz".equals(prefix))
            return "http://psidev.info/psi/pi/mzIdentML/1.0";
        return XMLConstants.NULL_NS_URI;
    }
    public String getPrefix(String uri) {
        throw new UnsupportedOperationException();
    }
    public Iterator getPrefixes(String uri) {
        throw new UnsupportedOperationException();
    }
});

It can then be used in an XPath expression like so:

NodeList peptideList = (NodeList)xPath.evaluate("//mz:Peptide", rootNode, XPathConstants.NODESET);

In Ruby, using Nokogiri, it can basically be:

MZIDENTML_NS = 'http://psidev.info/psi/pi/mzIdentML/1.0'
# Look for <Peptide> elements in this namespace
peptide_list = doc.xpath("//mz:Peptide", {'mz' => MZIDENTML_NS})

I can appreciate that the Java NamespaceContext interface is trying to cover more than this simple use case. However, I feel there should be a simple alternative in the core Java API specifically for this common use case.

XPath Expressions

There was a hint of these in the section above.

To evaluate an XPath expression in Java:

NodeList peptideList = (NodeList)xPath.evaluate("//mz:Peptide", rootNode, XPathConstants.NODESET);

The casting and the constant are understandable in a statically typed language, although I did wonder if method overloading could be used to avoid the casting (the last parameter would then have to be something other than an XPathConstants value, perhaps NodeList.class for example).

The Ruby/Nokogiri version of searching for elements with XPath:

peptide_list = doc.xpath("//mz:Peptide", {'mz' => MZIDENTML_NS})

Retrieving Attributes

Given the following XML:

<PeptideEvidence id="PE_11_1_KPYK1_YEAST_0_469_474" start="469" end="474" pre="K" post="E" missedCleavages="0" isDecoy="false" DBSequence_Ref="DBSeq_1_KPYK1_YEAST" />

It took a while to work out a way to get an XML attribute value with the Java XML API. I was convinced this must not be right, but it seems that it is:

String id = node.getAttributes().getNamedItem("id").getNodeValue();

Contrast that with Ruby/Nokogiri:

id = node['id']

Completely straightforward and intuitive.

Conclusion

I’m sure we could have cast around and evaluated the available XML parsing/handling libraries for Java. Many probably make life a lot simpler.

Ultimately we didn’t because we thought the task was simple and straightforward, and hence didn’t need another potentially large library. We thought the provided Java XML API should be sufficient. It was, but it wasn’t straightforward.