Wednesday, May 17, 2017

JSON Feed

Brent Simmons and Manton Reece (via John Gruber, Hacker News):

The JSON Feed format is a pragmatic syndication format, like RSS and Atom, but with one big difference: it’s JSON instead of XML.

For most developers, JSON is far easier to read and write than XML. Developers may groan at picking up an XML parser, but decoding JSON is often just a single line of code.

Our hope is that, because of the lightness of JSON and simplicity of the JSON Feed format, developers will be more attracted to developing for the open web.

Seems like a good idea. Sure, it’s another standard, so if it catches on this will create more work for people writing code in this area. But the fact that it’s so easy to use could open up more possibilities, and I assume that it will be more amenable to the needs of new services. There’s a WordPress plug-in.
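
As a rough illustration of how little code this takes, here's a minimal sketch in Python (assuming the spec site still publishes its own feed at jsonfeed.org/feed.json; field names per the JSON Feed 1 spec):

import json
import urllib.request

# Fetch and decode a JSON Feed; the decode really is one line.
with urllib.request.urlopen("https://jsonfeed.org/feed.json") as resp:
    feed = json.load(resp)

print(feed["title"])
for item in feed.get("items", []):
    print(item.get("date_published"), item.get("title"), item.get("url"))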

See also: Dave Winer (2012).

Update (2017-05-17): See also: Brent Simmons.

Update (2017-05-18): See also: Manton Reece.

Update (2017-05-30): John Gruber:

The DF RSS feed isn’t going anywhere, so if you’re already subscribed to it, there’s no need to switch. But JSON Feed’s spec makes it possible for me to specify both a url that points to the post on Daring Fireball (i.e. the permalink) and an external_url that points to the article I’m linking to. The way I’ve dealt with that in the RSS (technically Atom, but that’s sort of beside the point) is a bit of a hack that’s caused problems with numerous feed readers over the years.
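
Concretely, a JSON Feed item can carry both fields side by side; a minimal sketch (with hypothetical URLs) of how a reader might use them:

# "url" is the post's permalink; "external_url" is the thing being linked to.
item = {
    "id": "https://daringfireball.net/linked/example",         # hypothetical
    "url": "https://daringfireball.net/linked/example",        # permalink
    "external_url": "https://example.com/the-linked-article",  # linked article
    "title": "Example Linked Item",
}

def link_for(item):
    # Prefer the linked article when present, else fall back to the permalink.
    return item.get("external_url") or item.get("url")

print(link_for(item))  # https://example.com/the-linked-article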

Ben Ubois:

One of the criticisms I’ve seen of JSON Feed is that there’s no incentive for feed readers to support JSON Feed. This is not true. One of the largest-by-volume support questions I get is along the lines of “Why does this random feed not work?” And, 95% of the time, it’s because the feed is broken in some subtle way. JSON Feed will help alleviate these problems, because it’s easier to get right.

Update (2017-06-01): See also: The Talk Show.

Update (2017-06-02): See also: Chris Siebenmann.

Update (2017-06-12): See also: Dave Winer.

9 Comments

Oh for FUUUUUUUUCKSSAKE.

Somebody!

PLEASE!

STOP! THEM!

NOW!!!!!!

Web programmers now are the fucking Donald Trumps of our entire technology world, and it’s a tossup now which of these worlds gets nuclear-glass-plated first.

No, really. How is it even POSSIBLE that a barely-functioning, borderline brain-dead MORON like myself can understand HTTP, MIME, Content Negotiation, and REST to a working level with almost no trouble; while, simultaneous to this, SEVERAL MILLION Highly-Intelligent Highly-Paid University-Trained Professional Software Developers, many of whom are also Widely-Respected-And-Listened-To within their fields, somehow manage effortlessly to invent such an absolute polar MIS-comprehension of… well, EVERYTHING, that not only are they ALL #NotEvenWrong, THEY’RE NOT EVEN TETHERED IN THIS FUCKING DIMENSION.

And then, with huge pride and confidence, they vomit it all over the rest of the world like they actually know what they’re doing, and everybody else absolutely believes that they are doing it right too!!!

...

Please, let’s just save humanity a lot of time and trouble; rename the whole bloody crapshow to Computer Religion, and JMP straight to its own Book of Revelation. Then we can all get out of the way and let the cockroaches have a fresh crack next, because even those little horrors cannot possibly be any dumber, stickier, and downright EUUUUWWWW than what we already have got! For I have never, ever in my LIFE encountered such a vast, ignorant, and witlessly destructive bunch of super-intelligent idiots as programmers. I really should thank them: all my life I have suffered crippling “impostor syndrome”; here in the programming world, I find myself completely at peace, at last knowing the TRUE impostors are EVERYONE ELSE. But right now I just HATE THEM far too much.

...

AAAAAAAAAAAAAAAAAAAAAAAAAARRRRRGGGH.

...

Okay.

So.

HERE is how you DO make a pragmatic syndication format, like Atom, only in JSON in place of XML:

application/atom+xml ⟺ application/atom+json

That’s it. That. Is. IT.

The ONLY THINGS you have to do are:

1. swap the encoding format from XML to JSON, and

2. declare a new Content Type to describe the type of data structure you’ve encoded within it.

There is no 3. It is, at worst, A THIRTY-MINUTE JOB. Including Debugging, and “Hello, World”.

EVERYTHING ELSE you already got—all the field names, value types, usage patterns, user documentation, data consumers, data producers; everything—all of that stays absolutely perfectly 100-fucking-per-cent EXACTLY THE SAME AS BEFORE. You get 100% of the convenience of JSON encoding, and DON’T EVEN NEED to reinvent the entire fucking world, throwing out FIFTEEN YEARS of proven success and adoption, with Trott-knows how many MAN-HOURS of research, development, learning, testing, fixing, refining, documenting, proving, evangelizing, spreading, supporting, and making an Official RFC STANDARD out of the perfectly good, competent, and already widely-understood and even more widely-used Atom Syndication Format, already done, fully paid for, and here for the taking FOR FREE.
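
(A sketch of what that one-to-one re-encoding could look like: keep Atom’s element names exactly as they are and swap only the serialization. A hypothetical illustration, not any published mapping:)

import json
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def entry_to_json(entry):
    # Same element names, same structure; only the serialization changes.
    out = {}
    for child in entry:
        name = child.tag.replace(ATOM, "")
        if name == "link":
            out.setdefault("link", []).append(dict(child.attrib))  # links keep their attributes
        else:
            out[name] = (child.text or "").strip()
    return out

entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<id>tag:example.org,2017:1</id>'
    '<title>Hello</title>'
    '<updated>2017-05-17T00:00:00Z</updated>'
    '<link rel="alternate" href="http://example.org/1"/>'
    '</entry>'
)
print(json.dumps(entry_to_json(entry)))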

.

But NOBODY in this goddamned clown car of a global industry profession gets famous and adulated by all their peers just for doing THE SIMPLEST POSSIBLE THING THAT ACTUALLY WORKS. As I say: NOT – EVEN – WRONG.

Ohhh. Ada Wept.

--

Incidentally I recently #Followed some of the bright ambitious up-and-coming #LadyGeeks on Teh @Twitters, and cannot FOR THE LIFE OF ME figure out why they keep raging about “Pipeline Problems” and how to solve them. Not when the pipeline they so pleadingly covet is a four-inch soil pipe filled with the impacted fecal outpourings of ten million bloated narcissistic Dunning-Kruger IDIOT MALE NERDS who could not do the job right if the other 99% of humanity tossed them all in a pit and gave them the hose.

Oh, I could very nearly laugh at the utter insane ludicrousness of it all… were it not for the absolute-zero spine-freezing knowledge that one day soon this industry’s #NextGenerationSpawn will be writing the #NextGenerationSoftwares that hold in safekeeping all of our nuclear launch codes. Major Kong, I envy you!

No idea whether the proposed feed solution is a good one, and I don't care; this is possibly my favourite internet comment ever.

Upon a quick glance at the proposal, I cannot find any rule for turning XML attributes into JSON elements. Is there an obvious, generic rule for that? For example:

<tag1>
  <tag2 attr1="a">b</tag2>
</tag1>

How does one translate that to JSON? Sparkle appcast feeds have such attributes, for instance.

@Thomas: Any way you like. It's not magic. It's not a religion. Use common sense. XMLers tend to over-use attributes anyway, storing data, not metadata, just because they're too lazy to use subelements.
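
(If you do want a mechanical rule, one existing convention, used by Python's xmltodict library among others, is to prefix attribute names with "@" and put element text under "#text". A sketch:)

import json
import xmltodict  # third-party: pip install xmltodict

# Attributes become "@"-prefixed keys; element text becomes "#text".
doc = xmltodict.parse('<tag1><tag2 attr1="a">b</tag2></tag1>')
print(json.dumps(doc))
# {"tag1": {"tag2": {"@attr1": "a", "#text": "b"}}}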

As a rule of thumb, if I'm designing an XML representation for a data structure, I use an 'href' attribute on any element that is available as a separate subresource, e.g.

<iTunes href="http://itunes-server.local">
  <playerState>paused</playerState>
  <playlists href="http://itunes-server.local/playlists"/>
  <currentTrack href="http://itunes-server.local/playlists/1/tracks/230"/>
</iTunes>

If I think there is value in a subresource element including a 'name'/'id' attribute too then I'll include that. It can be handy for use cases where you just want to put a bunch of elements into a list from which the user can select an item, in which case including names saves having to do a bunch of GETs just to present that list:

<playlists href="http://itunes-server.local/playlists">
  <playlist name="Library" href="http://itunes-server.local/playlists/1"/>
  <playlist name="Music" href="http://itunes-server.local/playlists/2"/>
  <playlist name="Podcasts" href="http://itunes-server.local/playlists/4"/>
  ...
</playlists>

Obviously with a JSON representation you need to adopt a slightly different structure, but it's no big deal:

{
  "__type":"iTunes",
  "href":"http://itunes-server.local",
  "playerState":"paused",
  "playlists":{"__type":"playlists", "href":"http://itunes-server.local/playlists"},
  "currentTrack"{"__type":"track", "href":"http://itunes-server.local/playlists/1/tracks/230"},
  ...
}
[
  {"__type":"playlist", "name":"Library" "href":"http://itunes-server.local/playlists/1"},
  {"__type":"playlist", "name":"Music" "href":"http://itunes-server.local/playlists/2"},
  {"__type":"playlist", "name":"Podcasts" "href":"http://itunes-server.local/playlists/4"},
  ...
]

Once you've cooked up a JSON representation that's well presented and that you're happy with, write the formal spec for it and publish it. For example, you may prefer a slightly "fatter" representation for the collections, as that then gives you somewhere to include hrefs for paging through large collections:

{
  "__type":"playlists",
  "href":"http://itunes-server.local/playlists",
  "prev": null,
  "next": {"__type":"playlists", "href":"http://itunes-server.local/playlists?start=20&slice=20"},
  "items":[
    {"__type":"playlist", "name":"Library" "href":"http://itunes-server.local/playlists/1"},
    {"__type":"playlist", "name":"Music" "href":"http://itunes-server.local/playlists/2"},
    {"__type":"playlist", "name":"Podcasts" "href":"http://itunes-server.local/playlists/4"},
    ...
  ]
]
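
(A consumer of that fatter representation can then walk a large collection just by following the "next" hrefs; a sketch against the hypothetical structure above:)

import json
import urllib.request

def iter_items(url):
    # Yield every item in the collection, following "next" links page by page.
    while url:
        with urllib.request.urlopen(url) as resp:
            page = json.load(resp)
        yield from page["items"]
        nxt = page.get("next")
        url = nxt["href"] if nxt else None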

The whole point of the web is for everyone to standardize on common data representations for easy interop, regardless of where they're used. It's the web programmers who go "Durrrr...APIs" that screw everything up; idiots wouldn't know interop if it crawled up their ass and died there. (As it does.) Blame the schools too for feeding them nothing but OOP pablum that turns them into obedient monkeys but rots their brains.

By the way, has, you're halfway to JSON-LD, which gives you a standard way of serializing graphs. You could then concentrate your spec on the vocabulary. But of course people react to any hint of terms like Linked Data, RDF, and such with more horror than they do to XML. Depressing.

@ttepasse: You can do a lot just by getting standard naming conventions down pat (another form of reuse).

TBH though, the "__type" attributes are a code smell: they aren't needed if an attribute's type is defined in advance, and if they are needed then it strongly suggests the spec isn't tight enough. I put them in more as an aide-mémoire; were I defining an actual spec, I'd eliminate them from the data structures once I'd formally encoded that knowledge in the spec itself. They're also a reflection of the deep cultural and philosophical flaws that lie beneath JSON's superficial "programmer friendliness": encoding type information in the data itself encourages lazy and naive developers to make assumptions about the data they're consuming, and to trust that everything it tells them is true.

This is DANGEROUS and WRONG. All the knowledge of each attribute's type and constraints should always be encoded in the CONSUMER, not in the data itself. The data can't be trusted; nothing from an external source can. Just look at what happens when you deserialize data according to what IT WANTS instead of what YOU NEED.
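
(For instance, a consumer-side check might look like this; a hypothetical sketch that validates what the consumer needs instead of trusting embedded type tags:)

import json

def load_playlist(raw):
    # Validate the shape WE require; ignore whatever the data claims about itself.
    data = json.loads(raw)
    if not isinstance(data.get("name"), str):
        raise ValueError("playlist name must be a string")
    href = data.get("href")
    if not isinstance(href, str) or not href.startswith("http"):
        raise ValueError("playlist href must be an http(s) URL")
    return {"name": data["name"], "href": href}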

Part of this culture likely comes from untyped (dynamic) languages like PHP and Ruby, but that's not fundamentally a limitation of languages or their design; e.g. the weakly-typed, dynamic kiwi automation language I developed has very powerful coercion and constraint-checking rule support that makes it trivial to guard against bad inputs at runtime. The real problem IMO is that modern mainstream programmers analyze problems and assemble solutions algorithmically rather than compositionally; a consequence of generations being raised purely on Algol/C descendants, without a healthy mix of declarative and symbolic programming tools and techniques to foster a culture that shapes tools to solve specific problems, as opposed to hammering the problem to fit the crude, inexpressive tools that they've got. Heck, I still fall into this mindset myself when approaching new problems; but thankfully, being a bear of very little brain, I run out of steam before I can do much harm, forcing me out of the brute-force "how do I solve this problem?" mindset and into "how do I make a language for solving this problem?"

Real example (mildly anonymised):

# ACME Shipper
# © 2017 ACME Ltd. All rights reserved.

package: acme-shipper
  name: ACME Shipper Demo
  description: Shipper label demonstration.
  
  archive: all
    include: assets
    include: scripts
    include: template_rules
    include: workflow_rules

  executable: collate-packcopy
    interpreter: {python-id}
    path: scripts/export-packcopy.py
    language: python
    auto-import: {python-id}-site.library
    auto-import: acme-toolkit.kiwi.library
    auto-import: acme-client.library
  
  library: acme-shipper
    language: kiwi
    path: workflow_rules
    auto-import: acme-toolkit.jobrunner.library

  project: acme-shipper
    name: Shipper Demo
    client: acme-marketing.client
    auto-import: acme-shipper.library

    workflow: export-packcopy
      name: Collate Shipper Pack Copies
      program: acme-shipper.collate-packcopy.executable

    workflow: render-label
      name: Build Shipper Label Artwork
      program: acme-client.run-workflow.executable
      command: 
          choose file ("Please select one or more kiwi pack copy files:", 
                  {}, {$public project folder}/data, multiple files), 
          show acme client, 
          apply to list (
            render artwork ( {$ @ read file, parse (value)} )
          )

    folder: public
      path: ACME shipper
      make: artwork
      make: data

That's a manifest file for our new workflow system, containing everything a project needs to package up for distribution, deploy onto users' machines (including recursive dependency resolution), and run the workflows provided. The only syntax is KEY:VALUE pairs and indentation to indicate grouping. In developing the manifest data structure, I started out with Python's own .ini parser "to save time", then ditched it when it was obvious it lacked the expressiveness to describe non-trivial grouping. I could've used kiwi, but being a Lisp-y language it's heavy on parentheses, which are a pain to get right without an assistive editor, and manifests need to be easily human-writable. So I defined my own syntax and structure, creating a "language" for neatly expressing the information I need (the above shows the final field names and section hierarchy; obviously it evolved a bit as I "play-used" it to see what worked and what didn't).
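
(A minimal sketch of reading that KEY: VALUE and indentation syntax; hypothetical code, not the actual parser:)

def read_lines(text):
    # Yield (indent, key, value) for each meaningful line; "#" starts a comment.
    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(raw) - len(raw.lstrip())
        key, _, value = stripped.partition(":")
        yield indent, key.strip(), value.strip()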

As I was developing the data structure I started writing a parser for it, and being lazy did my usual thing of hardcoding a bunch of mutually recursive functions that would read a line, split it into KEY and VALUE strings, and then use a big old conditional block to check whether KEY was valid at that point in the structure and process its value accordingly. I think I did about three of these functions, then threw them all out as stupid make-work that was creating a giant blob of dumb, expensive code. It was useful starting that way, but only inasmuch as it gave me space to think about how to solve the problem better. So I got my language head on, and while Python (the implementation language) isn't the best for this, I managed to cook up a descriptive data structure that captures hierarchy, field names, types, constraints, and min/max occurrences in about the same amount of space as the manifests it describes:

importstype = [noneormany, listof(taggable, libraryid)]
descriptiontype = [noneoronce, multilinestring]

manifestdefinition = {
    'package': {
        'name': [once, namestring],
        'description': descriptiontype,
        
        'archive': {
            '.IDKEY': [enum('all', 'development', 'experimental', 'QA', 'UAT', 'production', 'deprecated')],
            'description': descriptiontype,
            'include': [noneormany, listof(pathstring)],
            'exclude': [noneormany, listof(pathstring)],
        },
        
        'executable': {
            'name': [defaultname, noneoronce, namestring],
            'description': descriptiontype,
            'interpreter': [once, taggable, simpleuid],
            'path': [once, pathstring],
            'language': [once, languagename],
            'auto-import': importstype,
            'self-import': importstype,
        },
        'library': {
            'name': [defaultname, noneoronce, namestring],
            'description': descriptiontype,
            'path': [once, pathstring],
            'language': [once, languagename],
            'auto-import': importstype,
            'self-import': importstype,
        },
        'project': {
            'name': [defaultname, once, namestring],
            'client': [once, clientid],
            'description': descriptiontype,
            'auto-import': importstype,
            'self-import': importstype,
            'workflow': {
                'name': [once, namestring],
                'program': [once, executableid],
                'command': [noneoronce, kiwistring],
            },
            'folder': {
                '.IDKEY': [enum('public', 'private')],
                'description': descriptiontype,
                'path': [once, pathstring],
                'make': [noneormany, listof(pathstring)],
                'copy': [noneormany, listof(pathstring)],
                'skip': [noneormany, listof(pathstring)],
            },
        },
    },
}

The parser's not super-efficient: it makes two passes, first to collate each section's fields into a dict of lists, {FIELDNAME:[VALUE1,VALUE2,...]}, then to apply the cast and constraint functions listed in the manifest definition, reducing the manifest file down to the final, validated Python data structure. Each conversion function is very simple, taking two values as input (source data and its environment), performing a single check or transform, and returning two values as output (transformed data and its environment), allowing them to be chained into field-processing pipelines. For example, functions such as once and noneormany take the initial list of gathered field value(s) and either reduce it down to a single item (once) or return it as-is (noneormany), after confirming the list has the correct number of items and no duplicates. Subsequent functions can then apply additional checks and transforms; e.g. the simpleuid function checks that a field contains a well-formed ID string (one or more hyphenated alphanumeric words):

import re

# Matches one or more hyphenated alphanumeric words, e.g. "acme-shipper".
uidpattern = re.compile(r'\A[a-z0-9]+(?:-[a-z0-9]+)*\Z')

def simpleuid(value, fieldstate):
    if not uidpattern.match(value):
        raise ParseError("Expected {!r} field to be simple UID but found {}.".format(fieldstate.key, value), fieldstate)
    return value, fieldstate
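
(For comparison, the once and noneormany functions described above might look roughly like this; a sketch reconstructed from that description, not the real code:)

def once(values, fieldstate):
    # Reduce the gathered list to exactly one value.
    if len(values) != 1:
        raise ParseError("Expected exactly one {!r} field but found {}.".format(fieldstate.key, len(values)), fieldstate)
    return values[0], fieldstate

def noneormany(values, fieldstate):
    # Zero or more values allowed; just reject duplicates.
    if len(values) != len(set(values)):
        raise ParseError("Duplicate {!r} fields.".format(fieldstate.key), fieldstate)
    return values, fieldstate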

All very simple operations individually, but completely composable, so you can build up a comprehensive and essentially self-documenting schema very fast. And it's trivially extensible too: just implement new check/convert functions as needed and add them to the definition. Plus it's no longer limited to parsing just one data structure (the manifest), because defining new document 'schemas' is just a case of declaring new definitions. I suspect I'll find a bunch of additional uses for it over the coming months as I build out our client- and server-side technologies.

All terribly old-hat and unimpressive to Lispers, I'm sure, but compare it to Python's .ini parsing module to see how different it is culturally. And then look at recent mainstream languages like Swift and wonder what on Earth they are doing, still scrabbling in the dirt to express any concept more sophisticated than "Ugg bang rock!"

Me, I like executable thought. :)

p.s. @Michael: Is there a trick to getting PRE tags to appear correctly? It seems to eat mine. Ta.

@has It should work if you type PRE instead of CODE.

[…] Currently the Evergreen blog only supports JSON Feed. […]
