Diff two modified JSON files in fish

Another interesting command line JSON exercise: you have two JSON files, you want to diff a modified version of one to the other, and your shell is fish.

For making JSON diffable, gron is a great choice: it transforms a JSON file into a list of assignment lines. Like this:

$ echo '{"a": ["b", "c"]}' | gron
json = {};
json.a = [];
json.a[0] = "b";
json.a[1] = "c";

gron doesn't help us with modifications, so jq is again a good choice there. In my case, I had one file that contained an array of objects, and wanted to compare the first element of that array to the content of a second file. Additionally I wanted to set the value of one key of the object from the first file to null. With jq that's quick work:

jq '.[0] | .large_field = null' file1.json | gron

The second file I wanted as-is, and that's just

gron file2.json

The last problem I wanted to solve was how to actually get the output from those two commands to diff (or colordiff). I could just pipe them to files and compare those, but that's untidy. As usual it's much easier to find answers to this for bash than for fish. In bash (and zsh, and ksh) it looks like this:

colordiff -u <(jq '.[0] | .large_field = null' file1.json | gron) <(gron file2.json)

A quick peek at the bash manual page reveals that <( is used for "process substitution". Search the web and you'll find that in fish that's accomplished with the magic command psub:

colordiff -u (jq '.[0] | .large_field = null' file1.json | gron | psub) (gron file2.json | psub)

And that's it. Pretty colored diff of the parts you're interested in, and no temporary files to remove afterwards.

Copy value with jq

I use jq heavily in my day to day work. It's a powerful tool but not always easy, so I have piles of notes about how to do things with it.

I had to do a few iterations of copying a value from one JSON file to another the other day, and the files are large, and copying and pasting and launching editors was getting old, so I reached for jq. And after digging for a while and reading Stack Overflow answers interpreting the manual page for me, here's how to do it, in fish:

jq --argjson js (jq .path.field a.json) '.path.field = $js' b.json | sponge b.json

The subshell invoked by () reads the value from a.json, --argjson sets the value the variable js without extra quotes or escapes, as it's already a valid JSON value, and then the assignment just sets the value to the document read from b.json.

It also requires sponge because the jq developers don't want it to have an -i option. Oh well.

Parsing and evaluating mathematical expressions in Swift / Part 3: Interpreter

Ever needed to interpret mathematical expressions with variables, like a.field1 + (a.field2 - b.field1) * 2, in Swift? I did. This series of blog posts will walk you through my solution. This is part 3 of the series:

  1. Tokenization
  2. Building a syntax tree
  3. Evaluating the syntax tree

Evaluating the syntax tree

To recap, we started with the string a.field1 + (a.field2 - b.field1) * 2 and ended up with this Swift structure:

ExpressionTreeNode.operation(
    .addition,
    .atom(.name(["a", "field1"])),
    .operation(
        .multiplication,
        .operation(
            .subtraction,
            .atom(.name(["a", "field2"])),
            .atom(.name(["b", "field1"]))
        ),
        .atom(.number(2))
    )
)

Now we want to compute the value of that tree. That requires two things: a depth-first walk of the tree to compute the nested values first, and a context where we can find values for the names. Let's look at the name context first.

Binding values to names is something we're very well equipped for. It's just a simple dictionary. One thing we do have to consider is the types of values. My implementation deals in floating point values, but we have those names as arrays, allowing nested contexts. You might be able to go with enum values, but in this case I ended up with plain dynamic typing: values typed as Any, types checked at runtime. So our dictionary is typed as [String: Any].

Do we want to build up dictionaries of all our values? I suppose we could. But as you might have guessed from the names in our expressions, the values are actually derived from some actual live objects. It might be nicer to feed those objects directly to the evaluator, right? We can use a protocol to make it happen. Swift's dictionaries use Swift's subscript functionality to allow access to values, and you can adopt that in your own types, too. So let's define a protocol:

protocol ExpressionEvaluatorNameContext {
    subscript(name: String) -> Any? { get }
}

So what do we want, the dictionary I discussed earlier or types implementing that protocol? Swift allows us to have our cake and eat it too:

extension Dictionary: ExpressionEvaluatorNameContext where Key == String, Value == Any {}

And if we want to wrap our objects in that protocol, we can define simple wrapper types:

struct EvaluatorContextUIViewWrapper {
    private let view: UIView

    init(view: UIView) {
        self.view = view
    }
}

extension EvaluatorContextUIViewWrapper: ExpressionEvaluatorNameContext {
    subscript(name: String) -> Any? {
        switch name {
        case "height": return Double(self.view.bounds.height)
        default: return nil
        }
    }
}

Now that we have our context types defined, we can start digging for the values. First let's define a couple of error types that we'll use in our evaluator. EvaluationError is what we'll use to signal problems outside, and NameLookupInContextFailed is used to accumulate errors in recursive name resolution.

enum EvaluationError: Error {
    case emptyName
    case unknownName([String], Error)
    case invalidOperation(ExpressionOperator, Any, Any)
}

struct NameLookupInContextFailed: Error {
    let name: String
    let cause: Error?
}

Then define a couple of functions for finding values in those nested contexts. One provides the outside interface, the other recursively calls itself with names from the list until reaches the end or runs into a problem:

func contextValue(for name: [String], context: ExpressionEvaluatorNameContext) throws -> Any {
    guard let nameHead = name.first else { throw EvaluationError.emptyName }
    do {
        return try contextValue(for: nameHead, tail: name.dropFirst(), in: context)
    } catch {
        throw EvaluationError.unknownName(name, error)
    }
}

func contextValue(
    for nameHead: String,
    tail: ArraySlice<String>,
    in context: ExpressionEvaluatorNameContext
) throws -> Any {
    let maybeValue = context[nameHead]

    guard let value = maybeValue else {
        throw NameLookupInContextFailed(name: nameHead, cause: nil)
    }

    if let tailHead = tail.first {
        guard let subContext = value as? ExpressionEvaluatorNameContext else {
            throw NameLookupInContextFailed(name: nameHead, cause: nil)
        }

        do {
            return try contextValue(for: tailHead, tail: tail.dropFirst(), in: subContext)
        } catch {
            throw NameLookupInContextFailed(name: nameHead, cause: error)
        }
    } else {
        return value
    }
}

That's pretty straightforward — look for a value, check if we have still more parts of the name to resolve, if so recurse, otherwise return the value, check types and throw errors as necessary. With those functions in place, we can resolve names in our expressions.

We can now move on to expression node evaluation. We have two kinds of expression nodes: ones with just a simple value, and ones with an operation on other nodes.

func evaluate(expression: ExpressionTreeNode, context: ExpressionEvaluatorNameContext) throws -> Any {
    switch expression {
    case let .atom(atom):
        return try evaluate(atom: atom, context: context)
    case let .operation(op, n1, n2):
        return try evaluate(operation: op, node1: n1, node2: n2, context: context)
    }
}

For atoms, it can be a number which we can return directly, or a name, in which case we'll use the contextValue functions we defined earlier.

func evaluate(atom: ExpressionTreeAtom, context: ExpressionEvaluatorNameContext) throws -> Any {
    switch atom {
    case let .name(name): return try contextValue(for: name, context: context)
    case let .number(num): return num
    }
}

And for operators we need to find a function that implements the operator and then evaluate both sides so we get the final operand values for the operation.

func evaluate(
    operation: ExpressionOperator,
    node1: ExpressionTreeNode,
    node2: ExpressionTreeNode,
    context: ExpressionEvaluatorNameContext
) throws -> Any {
    let opf = opFunc(operation)
    let val1 = try evaluate(expression: node1, context: context)
    let val2 = try evaluate(expression: node2, context: context)
    return try opf(val1, val2)
}

All right, almost done! Except for the implementation of opFunc which should give us actual operator implementations. We need functions that are ready to deal with two Any values and do math of them. I'm going to present simple versions here that deal only with Doubles for brevity. In reality you may want to support other numeric types too, as well as deal with Optionals. The op function is generic and will happily deal with any types you throw at it, but optionals would require a bit more work to unwrap them.

func opFunc(_ op: ExpressionOperator) -> (Any, Any) throws -> Any {
    switch op {
    case .addition: return opAddition(_:_:)
    case .subtraction: return opSubtraction(_:_:)
    case .multiplication: return opMultiplication(_:_:)
    case .division: return opDivision(_:_:)
    }
}

func opAddition(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, +, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.addition, lhs, rhs)
}

func opSubtraction(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, -, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.subtraction, lhs, rhs)
}

func opMultiplication(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, *, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.multiplication, lhs, rhs)
}

func opDivision(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, /, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.division, lhs, rhs)
}

func op<T>(_ type: T.Type, _ operation: (T, T) -> T, _ lhs: Any, _ rhs: Any) -> T? {
    if let lhsValue = lhs as? T, let rhsValue = rhs as? T {
        return operation(lhsValue, rhsValue)
    }

    return nil
}

Conclusion

That's it! That'll take you from the expression as a string to a final double value, as long as you have suitable values in the context. If we give it a spin, the following results in 42:

let context: [String: Any] =  [
    "b": ["field1": 4.0] as [String: Any],
    "a": ["field1": 10.0, "field2": 20.0] as [String: Any]
]

let tokenized = try tokenizeExpression(input: "a.field1 + (a.field2 - b.field1) * 2")
let node = try parseTokenizedExpression(expr: tokenized)
try evaluate(expression: node, context: context)

This was a fun problem to work on. I got to use a very functional approach to a problem I hadn't had to tackle before. I'm sure a lot of this is the kind of thing the first chapters of a compilers course would cover, but it was unfamiliar territory for me.

There's lots of room to expand. In the part about parser combinators I left lots of details uncovered. The evaluator only supports non-optional Doubles. And the expression language is very limited with only four operators and no function calls (spoiler: they're easy to add.) But if you ever need to work on something like this and don't have a modern compiler text book handy, this might get you off to a decent start.

© Juri Pakaste 2024