Parsing Embedded JSON and Arrays in Swift

Tony DiPasquale

In the previous posts (first post, second post) about parsing JSON in Swift, we saw how to use functional programming concepts and generics to make JSON decoding concise and readable. We left off last time creating a custom operator that allowed us to decode JSON into model objects using infix notation. That implementation looked like this:

struct User: JSONDecodable {
  let id: Int
  let name: String
  let email: String?

  static func create(id: Int)(name: String)(email: String?) -> User {
    return User(id: id, name: name, email: email)
  }

  static func decode(json: JSON) -> User? {
    return _JSONParse(json) >>> { d in
      User.create
        <^> d <|  "id"
        <*> d <|  "name"
        <*> d <|? "email"
    }
  }
}

We can now parse JSON into our model objects using the <| and <|? operators. The final piece we’re missing here is the ability to get keys from nested JSON objects and the ability to parse arrays of types.

Note: I’m using <|? to stay consistent with the previous blog post but ?s are not allowed in operators until Swift 1.1. You can use <|* for now.

Getting into the Nest

First, let’s look at getting to the data within nested objects. A use case for this could be a Post to a social network. A Post has a text component and a user who authored the Post. The model might look like this:

struct Post {
  let id: Int
  let text: String
  let authorName: String
}

Let’s assume that the JSON we receive from the server will look like this:

{
  "id": 5,
  "text": "This is a post.",
  "author": {
    "id": 1,
    "name": "Cool User"
  }
}

You can see that the author key is referencing a User object. We only want the user’s name from that object so we need to get the name out of the embedded JSON. Our Post decoder starts like this:

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String) -> Post {
    return Post(id: id, text: text, authorName: authorName)
  }

  static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author"
    }
  }
}

This won’t work because our create function is telling the <| operator to try to make the value associated with the "author" key a String. However, it is a JSONObject, so this will fail. We know that d <| "author" by itself will return a JSONObject?, so we can use the bind operator to get at the JSONObject inside the optional.

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String) -> Post {
    return Post(id: id, text: text, authorName: authorName)
  }

  public static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" >>> { $0 <| "name" }
    }
  }
}
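For readers on a current toolchain, here is a minimal sketch of the same bind-based lookup. It assumes JSONObject is a [String: Any] dictionary and that <| is a plain conditional cast; both operator definitions are simplified stand-ins for illustration, not the ones from the earlier posts.

```swift
typealias JSONObject = [String: Any]

precedencegroup BindPrecedence {
    associativity: left
}
precedencegroup DecodePrecedence {
    associativity: left
    higherThan: BindPrecedence
}
infix operator >>>: BindPrecedence
infix operator <|: DecodePrecedence

// Optional bind: run `f` only when `a` holds a value.
func >>> <A, B>(a: A?, f: (A) -> B?) -> B? {
    return a.flatMap(f)
}

// Simplified stand-in for the generic lookup: fetch the key and cast it.
func <| <A>(object: JSONObject, key: String) -> A? {
    return object[key] as? A
}

let author: JSONObject = ["id": 1, "name": "Cool User"]
let post: JSONObject = ["id": 5, "text": "This is a post.", "author": author]

// The closure parameter is annotated so the compiler knows the
// intermediate value is a JSONObject.
let name: String? = post <| "author" >>> { (o: JSONObject) in o <| "name" }
let missing: String? = post <| "editor" >>> { (o: JSONObject) in o <| "name" }
```

Here name holds "Cool User" and missing is nil, because the bind short-circuits as soon as the "editor" key is not found.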

This works, but there are two other issues at play. First, you can see that reaching further into embedded JSON can result in a lot of syntax on one line. More importantly, Swift’s type inference starts to hit its limit. I experienced long build times because the Swift compiler had to work very hard to figure out the types. A quick fix would be to give the closure a parameter type: { (o: JSONObject) in o <| "name" }, but this was even more syntax. Let’s try to overload our custom operator <| to handle this for us.

A logical next step would be to make the <| operator explicitly accept an optional JSONObject? instead of the non-optional, which would let us eliminate the bind (>>>) operator.

func <|<A>(object: JSONObject?, key: String) -> A? {
  return object >>> { $0 <| key }
}

Then we use it in our Post decoder like so:

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String) -> Post {
    return Post(id: id, text: text, authorName: authorName)
  }

  public static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" <| "name"
    }
  }
}

That syntax looks much better; however, Swift has a bug / feature that allows a non-optional to be passed into a function that takes an optional parameter and Swift will automatically turn the value into an optional type. This means that our overloaded implementation of <| that takes an optional JSONObject will be confused with its non-optional counterpart since both can be used in the same situations.
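The promotion behavior is easy to see in isolation. In this small sketch, describe is a hypothetical helper that exists only for the demonstration:

```swift
// A function that only declares an optional parameter.
func describe(_ value: Int?) -> String {
    switch value {
    case .some(let v): return "some(\(v))"
    case .none: return "none"
    }
}

let plain: Int = 3
// `plain` is a non-optional Int, yet the call compiles:
// Swift wraps it in an Optional automatically.
let promoted = describe(plain)
let empty = describe(nil)
```

The call with plain succeeds without any explicit wrapping, which is why an overload taking JSONObject? can also match a call site that passes a plain JSONObject.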

Instead, let’s specify an overloaded version of <| that removes the generic return value and explicitly sets it to JSONObject.

func <|(object: JSONObject, key: String) -> JSONObject {
  return object[key] >>> _JSONParse ?? JSONObject()
}

We try to parse the value inside the object to a JSONObject and if that fails we return an empty JSONObject to the next part of the decoder. Now the d <| "author" <| "name" syntax works and the compiler isn’t slowed down.
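A minimal sketch of the two overloads working together in current Swift follows. It assumes JSONObject is [String: Any] and replaces _JSONParse with a plain conditional cast, so treat it as an illustration of the overload resolution rather than the exact implementation.

```swift
typealias JSONObject = [String: Any]

precedencegroup DecodePrecedence {
    associativity: left
}
infix operator <|: DecodePrecedence

// Generic lookup: the caller's expected type picks A.
func <| <A>(object: JSONObject, key: String) -> A? {
    return object[key] as? A
}

// Non-generic overload: always yields a JSONObject, falling back to an
// empty one so the next <| in the chain simply finds nothing.
func <| (object: JSONObject, key: String) -> JSONObject {
    return (object[key] as? JSONObject) ?? [:]
}

let author: JSONObject = ["id": 1, "name": "Cool User"]
let post: JSONObject = ["id": 5, "text": "This is a post.", "author": author]

// The left <| must produce a JSONObject for the right one to consume,
// so the compiler selects the non-generic overload for it.
let name: String? = post <| "author" <| "name"
let missing: String? = post <| "editor" <| "name"
```

The trade-off of the empty-object fallback is that a missing "editor" key and a missing "name" key both end up as nil at the end of the chain, with no way to tell them apart.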

Arrays and Arrays of Models

Now let’s look at how we can parse JSON arrays into a model. We’ll use our Post model and add an array of Strings as the comments on the Post.

struct Post {
  let id: Int
  let text: String
  let authorName: String
  let comments: [String]
}

Our decoding function will then look like this:

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String)(comments: [String]) -> Post {
    return Post(id: id, text: text, authorName: authorName, comments: comments)
  }

  public static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" <| "name"
        <*> d <| "comments"
    }
  }
}

This works with no extra coding. Our _JSONParse function is already good enough to cast a JSONArray or [AnyObject] into a [String].
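Assuming _JSONParse boils down to a conditional cast (as?), the behavior can be sketched like this. Note that the cast is all-or-nothing: one element of the wrong type makes the whole array fail.

```swift
// Values as they might arrive from a JSON parser, typed as Any.
let homogeneous: Any = ["First!", "Great post."]
let mixed: Any = ["First!", 42] as [Any]

// Conditional casts to [String].
let comments = homogeneous as? [String]
let rejected = mixed as? [String]
```

Here comments contains both strings, while rejected is nil because 42 is not a String.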

What if our Comment model was more complex than just a String? Let’s create that.

struct Comment {
  let id: Int
  let text: String
  let authorName: String
}

This is very similar to our original Post model so we know the decoder will look like this:

extension Comment: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String) -> Comment {
    return Comment(id: id, text: text, authorName: authorName)
  }

  static func decode(json: JSON) -> Comment? {
    return _JSONParse(json) >>> { d in
      Comment.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" <| "name"
    }
  }
}

Now our Post model needs to use the Comment model.

struct Post {
  let id: Int
  let text: String
  let authorName: String
  let comments: [Comment]
}

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String)(comments: [Comment]) -> Post {
    return Post(id: id, text: text, authorName: authorName, comments: comments)
  }

  public static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" <| "name"
        <*> d <| "comments"
    }
  }
}

Unfortunately, _JSONParse isn’t good enough to take care of this automatically so we need to write another overload for <| to handle the array of models.

func <|<A>(object: JSONObject, key: String) -> [A?]? {
  return object <| key >>> { (array: JSONArray) in array.map { $0 >>> _JSONParse } }
}

First, we extract the JSONArray using the <| operator. Then we map over the array trying to parse the JSON using _JSONParse. Using map, we will get an array of optional types. What we really want is an array of only the types that successfully parsed. We can use the concept of flattening to remove the optional values that are nil.

func flatten<A>(array: [A?]) -> [A] {
  var list: [A] = []
  for item in array {
    if let i = item {
      list.append(i)
    }
  }
  return list
}
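In current Swift the standard library already covers this case: compactMap (available since Swift 4.1) drops the nil values and unwraps the rest. A sketch of flatten written on top of it:

```swift
// Same contract as the hand-written loop: keep only non-nil values.
func flatten<A>(_ array: [A?]) -> [A] {
    return array.compactMap { $0 }
}

let parsed: [Int?] = [1, nil, 3]
let values = flatten(parsed)
```

After this runs, values is [1, 3].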

Then we add the flatten function to our <| overload:

func <|<A>(object: JSONObject, key: String) -> [A]? {
  return object <| key >>> { (array: JSONArray) in
    array.map { $0 >>> _JSONParse }
  } >>> flatten
}

Now, our array parsing will eliminate values that fail _JSONParse and return .None if the key was not found within the object.
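Both behaviors can be checked with a small sketch. To keep it short it uses a hypothetical named function, parseArray, in place of the <| overload, and a conditional cast in place of _JSONParse.

```swift
typealias JSONObject = [String: Any]
typealias JSONArray = [Any]

// Hypothetical stand-in for the array overload of <|.
func parseArray<A>(_ object: JSONObject, _ key: String) -> [A]? {
    // A missing key, or a value that is not an array, yields nil.
    guard let array = object[key] as? JSONArray else { return nil }
    // Elements that fail to parse are dropped rather than failing the whole array.
    return array.compactMap { $0 as? A }
}

let object: JSONObject = ["comments": ["first", 2, "third"] as JSONArray]

let comments: [String]? = parseArray(object, "comments")
let missing: [String]? = parseArray(object, "tags")
```

The number 2 is silently skipped, so comments ends up with the two strings, and missing is nil because there is no "tags" key.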

The final step is to be able to decode a model object. For this, we need to define an overloaded function for _JSONParse that knows how to handle models. We can use our JSONDecodable protocol to know that there will be a decode function on the model that knows how to decode the JSON into a model object. Using this we can write a _JSONParse implementation like this:

func _JSONParse<A: JSONDecodable>(json: JSON) -> A? {
  return A.decode(json)
}

Now we can decode a Post that contains an array of Comment objects. However, we’ve introduced a new problem. There are two implementations for the <| operator that are ambiguous. One returns A? and the other returns [A]?, but an array of a type could also be A, so the compiler doesn’t know which implementation of <| to use. We can fix this by requiring every type that uses the A? version to conform to JSONDecodable. This means we will have to make the native Swift types conform as well.

extension String: JSONDecodable {
  static func decode(json: JSON) -> String? {
    return json as? String
  }
}

extension Int: JSONDecodable {
  static func decode(json: JSON) -> Int? {
    return json as? Int
  }
}

Then make the <| implementation that returns A? work only where A conforms to JSONDecodable.

func <|<A: JSONDecodable>(object: JSONObject, key: String) -> A?
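To see the constraint resolve the ambiguity end to end, here is a self-contained sketch for current Swift. Several things in it are assumptions made for the sake of a runnable example: JSON is modeled as Any, the decode functions use guard let instead of the <^> and <*> operators (the curried func declaration syntax used by create was removed in Swift 3), and the operator bodies are simplified.

```swift
typealias JSON = Any
typealias JSONObject = [String: Any]
typealias JSONArray = [Any]

protocol JSONDecodable {
    static func decode(_ json: JSON) -> Self?
}

extension String: JSONDecodable {
    static func decode(_ json: JSON) -> String? { return json as? String }
}

extension Int: JSONDecodable {
    static func decode(_ json: JSON) -> Int? { return json as? Int }
}

precedencegroup DecodePrecedence {
    associativity: left
}
infix operator <|: DecodePrecedence

// Single value: only chosen when A itself conforms to JSONDecodable.
func <| <A: JSONDecodable>(object: JSONObject, key: String) -> A? {
    return object[key].flatMap { A.decode($0) }
}

// Array of values: Array does not conform, so [Comment] can only match here.
func <| <A: JSONDecodable>(object: JSONObject, key: String) -> [A]? {
    guard let array = object[key] as? JSONArray else { return nil }
    return array.compactMap { A.decode($0) }
}

// Nested object, falling back to an empty object.
func <| (object: JSONObject, key: String) -> JSONObject {
    return (object[key] as? JSONObject) ?? [:]
}

struct Comment: JSONDecodable {
    let id: Int
    let text: String
    let authorName: String

    static func decode(_ json: JSON) -> Comment? {
        guard let d = json as? JSONObject,
              let id: Int = d <| "id",
              let text: String = d <| "text",
              let authorName: String = d <| "author" <| "name"
        else { return nil }
        return Comment(id: id, text: text, authorName: authorName)
    }
}

struct Post: JSONDecodable {
    let id: Int
    let text: String
    let authorName: String
    let comments: [Comment]

    static func decode(_ json: JSON) -> Post? {
        guard let d = json as? JSONObject,
              let id: Int = d <| "id",
              let text: String = d <| "text",
              let authorName: String = d <| "author" <| "name",
              let comments: [Comment] = d <| "comments"
        else { return nil }
        return Post(id: id, text: text, authorName: authorName, comments: comments)
    }
}

// Sample data: the second comment is malformed and should be dropped.
let author: JSONObject = ["id": 1, "name": "Cool User"]
let goodComment: JSONObject = ["id": 10, "text": "Nice!", "author": author]
let badComment: JSONObject = ["id": "oops"]
let rawComments: JSONArray = [goodComment, badComment]
let json: JSONObject = ["id": 5, "text": "This is a post.", "author": author, "comments": rawComments]

let post = Post.decode(json)
```

Because Array does not conform to JSONDecodable, the request for [Comment] can only be satisfied by the array overload, and the requests for Int and String can only be satisfied by the single-value overload. The malformed comment fails Comment.decode and is filtered out, leaving one comment on the decoded post.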

Conclusion

Through a series of blog posts, we’ve seen how functional programming and generics can be a powerful tool in Swift for dealing with optionals and unknown types. We’ve also explored using custom operators to make JSON parsing more readable and concise. As a final look at what we can do, let’s see the Post decoder one last time.

extension Post: JSONDecodable {
  static func create(id: Int)(text: String)(authorName: String)(comments: [Comment]) -> Post {
    return Post(id: id, text: text, authorName: authorName, comments: comments)
  }

  public static func decode(json: JSON) -> Post? {
    return _JSONParse(json) >>> { d in
      Post.create
        <^> d <| "id"
        <*> d <| "text"
        <*> d <| "author" <| "name"
        <*> d <| "comments"
    }
  }
}

We’re excited to announce that we’re releasing an open source library for JSON parsing based on what we’ve learned writing this series. We are calling it Argo, named after the Greek word for swift and Jason of the Argonauts’ boat. Jason’s father was Aeson, which is a JSON parsing library in Haskell that inspired Argo. You can find it on GitHub. We hope you enjoy it as much as we do.

The Bad News

During this part of the JSON parsing I quickly ran up against the limits of the Swift compiler. The larger your model object, the longer the build takes, because the compiler has trouble working out all the nested type inference. While Argo works, it can be impractical for large objects. There is work being done on a separate branch to reduce this time.