Friday, May 4, 2018

Retrobatch Public Beta

Gus Mueller:

Retrobatch is a node based (not the JS language) batch image processor. A bit like Quartz Composer, and a bit like Audio Hijack. But for images. Lots and lots of images (or maybe a few or even one).

[…]

But why node based? Every batch image processor I’ve come across was linear. You put images in one end, and out they came the other side. But that’s so limiting! What if it was possible to take a folder of images and then operate on them twice with the same workflow? What if you could create branches where one would resize images to 50%, and another write out PNG files with the @2x suffix added to the file name? What if you had a workflow that referenced multiple folders which combined into a single output?

And all the possibilities! What if you could read an image from the clipboard, apply a filter to it, and write it to a folder and to the clipboard? What if you had a way to separate out PNG images of a certain size from a folder and only do an operation to those? What if you could script the application in response to new images being added to a shared folder? What about if it could capture all the open windows of your favorite application as images, then apply a filter to those, and then write out a layered PSD of those windows? What if you wanted to apply a machine learning model against your images, to figure out which contains pictures of hotdogs in them, and then perform some action based on that?
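To make the branching idea concrete, here is a toy sketch of a node graph in Python. It is not how Retrobatch is implemented; the `Image` and `Node` types, the `stem` helper, and the node labels are all made up for illustration. One source node fans out to two branches: one halves the width, the other renames the file with an @2x suffix and a .png extension.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Image:
    """Hypothetical stand-in for an image: just a file name and a pixel width."""
    name: str
    width: int


@dataclass
class Node:
    """One step in the graph; its transform may keep, change, or drop an image."""
    label: str
    transform: Callable[[Image], List[Image]]
    outputs: List["Node"] = field(default_factory=list)

    def connect(self, other: "Node") -> "Node":
        self.outputs.append(other)
        return other

    def run(self, image: Image, results: List[Image]) -> None:
        for out in self.transform(image):
            if not self.outputs:      # leaf node: collect the finished image
                results.append(out)
            for nxt in self.outputs:  # fan out to every connected branch
                nxt.run(out, results)


def stem(name: str) -> str:
    """File name without its extension."""
    return name.rsplit(".", 1)[0]


source = Node("read folder", lambda im: [im])
source.connect(Node("scale 50%", lambda im: [Image(im.name, im.width // 2)]))
source.connect(Node("write @2x PNG", lambda im: [Image(stem(im.name) + "@2x.png", im.width)]))

results: List[Image] = []
for im in [Image("icon.png", 200), Image("photo.jpg", 1000)]:
    source.run(im, results)
```

Each input image ends up in `results` twice, once per branch, which is exactly what a strictly linear pipeline cannot express.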

This is a really cool idea for an app, and I like the way he’s designed the interface. The beta seems to be pretty mature already.

FAQ:

The App Store requires apps be sandboxed, which would considerably limit Retrobatch’s functionality.

Update (2018-06-02): Gus Mueller:

Which is all to say Retrobatch 1.0 was released yesterday!

[…]

For instance, the initial work to bring Metal to Acorn 6.1 was originally done in Retrobatch. Since I had no legacy code to worry about with Retrobatch 1.0, I started with Metal from the beginning. And with that experience I was able to figure out how I could move code around and refactor Acorn in an intelligent way to bring Metal rendering there.

13 Comments

Generic visual node-based programming languages may not be a great idea (or maybe it's just that nobody has solved the problem yet), but domain-specific visual node-based programming languages are a great idea, and are being used successfully in lots of different places (e.g. shader programming in 3D graphics, or BPMN in business applications).

I currently use a combination of Folder Actions and ImageOptim to reduce PNG sizes for screenshots and whatnot. Retrobatch seems a lot more general purpose and approachable. This is the first new Mac app I’ve seen in a while that I can point to and say “this is why I’d like to stay on the Mac platform”.

> This is the first new Mac app I’ve seen in a while that I can point to and say “this is why I’d like to stay on the Mac platform”.

I had the exact same thought.

@Lukas: Dataflow languages describe directed graphs, so there’s a good argument for being able to represent them visually. The mistake is making visual the only representation, because, unlike text, it’s inefficient to work with and totally non-portable, and still ends up needing some sort of serialization format anyway.

Whereas if you start with a Lisp-y text-based representation, you get all the benefits of a powerful, expressive, mature 10,000 year-old technology and the tools and techniques we already have for working with that, plus you can trivially transform all or some of it to and from a visual graph when that representation is the more useful (e.g. for high-level overview).
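A rough sketch of that round trip, in Python rather than Lisp: a Lisp-style expression is parsed into nested lists, walked to produce graph edges that a visual editor could draw, and printed back out as text. The operation names (`read-folder`, `scale`, `write-png`) are invented for the example, and a nested expression like this only describes a tree, so a real dataflow language would need some way to name shared nodes.

```python
import re


def parse(text):
    """Parse one s-expression into nested Python lists of string tokens."""
    tokens = re.findall(r'\(|\)|"[^"]*"|[^\s()]+', text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing ")"
            return items
        return tok

    return read()


def to_edges(expr):
    """Each nested call feeds its enclosing call: return (producer, consumer) pairs."""
    edges = []

    def walk(node):
        head = node[0]
        for arg in node[1:]:
            if isinstance(arg, list):
                edges.append((walk(arg), head))
        return head

    walk(expr)
    return edges


def to_text(expr):
    """Serialize the nested lists back into the textual form."""
    if isinstance(expr, list):
        return "(" + " ".join(to_text(e) for e in expr) + ")"
    return expr


source = '(write-png (scale (read-folder "in") 50%) "out")'
tree = parse(source)
edges = to_edges(tree)
```

Here `edges` is the list a graph view would render, and `to_text(tree)` gives back the original string, so the text stays the thing you can paste into an email.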

Modern mainstream programmers are very good at designing these complicated, low-ceiling visual programming systems for the simple reason they have no clue what it is that makes programming so difficult for non-programmers to understand and do. (Hint: try a mirror.) Scratch, Automator, Workflow, etc are all born of the same misconception: that CLI is hard and GUI is easy, so to make programming accessible to non-programmers just GUI-fy it. But that mistakes form for function.

GUI didn’t beat CLI because graphical is easy; it did so because GUIs happened to provide transparency and safety—i.e. you could easily see what operations were available to use, inappropriate operations were prevented, and unsafe operations were [mostly] reversible. Whereas CLIs of the day were utterly opaque and intolerant of error—but that was a flaw in UI/UX design (and lack of processing power), not inherent limitation.

Now watch kids txt-ing on their phones, or asking Alexa to do stuff for them, and tell us what the value proposition of visual programming is. Any end-user programming language that can’t work across multiple devices and input/output systems is, frankly, already a dead-end product design, even before you get to the well-known spaghetti-code hell that visual programs naturally degenerate to.

Remember, it’s not the “visual” aspect that’s valuable; it’s safety and transparency. Unreadable code in any language is very bad for that. And code which you cannot post in an email or online forum to give to others or when asking for help is code in a language that will never scale beyond its own closed ecosystem; a terrible value proposition in any math.

Truth is: Abstraction is Easy, Punctuation is Hard. Papert proved that 50 years ago; my own work in the last 10 has found this too. There is something wonderfully simple, not to mention crazy powerful, in being able to create your own words to express the concepts and behaviors of interest and value to you. I once wrote a book about AppleScript; it took hundreds of pages just to explain the language before getting to the stuff users actually care about, yet I can explain Papert’s Logo in just three lines:

1. This is a word.

2. This is how you perform a word.

3. This is how you compose your own words.
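Those three lines map onto very little machinery. The following is a minimal Python sketch of the idea, not Logo itself (real Logo defines procedures with TO ... END and its words take arguments); the `words` table, `define`, and `perform` are names invented here.

```python
# Vocabulary: a word is either a primitive (a Python function)
# or a composition (a list of other words).
words = {}


def define(name, body):
    """Add a word to the vocabulary."""
    words[name] = body


def perform(name, trail):
    """Perform a word: run a primitive, or perform each word it is composed of."""
    body = words[name]
    if callable(body):
        body(trail)
    else:
        for word in body:
            perform(word, trail)
    return trail


# 1. These are words (primitives that record a turtle-style move).
define("forward", lambda trail: trail.append("F"))
define("right", lambda trail: trail.append("R"))

# 3. This is how you compose your own words.
define("side", ["forward", "right"])
define("square", ["side"] * 4)

# 2. This is how you perform a word.
path = perform("square", [])
```

Once `square` exists it is used exactly like a primitive, which is the layered vocabulary being described: nothing distinguishes the user's words from the built-in ones.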

Visual languages ultimately fail because they capture the worst tendencies of algorithmic thinking and then gamify it. Thus the reward becomes stringing together the longest chain of instructions that [appears to] produce the correct result. Whereas the compositional approach that the Lisps espouse and which Logo makes accessible to mortals rewards simplicity: building up a multi-layered vocabulary of general- and special-purpose commands for expressing solutions in a particular domain. And which, like I say, can still be skinned up as visual forms and graphs for when that is appropriate, since Lisp code is data and vice-versa, without being constrained to that representation in all the situations where it is not.

@remmah: TBH, all I see is a 15yo dead-end niche non-solution on a 30yo niche platform that is looking increasingly like a dead-end itself. #NerdToys won’t change the world.

--

TL;DR: “A curious thing about our industry: not only do we not learn from our mistakes, we also don’t learn from our successes.”—Keith Braithwaite

> The mistake is making visual the only representation, because, unlike text, it’s inefficient to work with

Not true. It depends entirely on what kinds of actions you're performing on your program, and how the user interface works. The idea that text-based user interfaces are *always* more efficient than visual user interfaces does not match reality. In fact, I have worked on a BPMN tool which used to have both a visual view, and a text-based view. The results were that people spent more time learning the tool because they had to learn two completely different user interfaces, made more mistakes (because the text-based view is just more prone to errors that can easily be prevented in the visual view), and were also slower (because text manipulation is *not* always faster than a visual user interface). It also created a lot of external costs, like duplicated documentation, and a weird cultural effect where long-time users were proud that they could use the harder-to-use text editor, and looked down on new users who preferred the visual editor.

We removed the ability to interact with, or even see, the textual representation.

Use the right tool for the task. Sometimes, text. Sometimes, visual. Never both at once.

@Lukas you must’ve faced pushback from the long term / expert users. How’d you manage to get rid of the text UI?

> you must’ve faced pushback from the long term / expert users.

Very little. I remember one person (jokingly) complaining about it.

> How’d you manage to get rid of the text UI?

With psychological trickery. At the same time as removing the text editor, we also renamed the product, restructured and visually redesigned the user interface, and added a lot of new features. So we never framed it as "we're removing a feature", we framed it as "this is a new product, a successor to our old product, much more modern, more user-friendly, more approachable, safer, more powerful - it's not just a new version of our old product". That way, we managed to avoid triggering people's loss aversion.

We also took care to make sure that all of the features that previously genuinely worked better in text mode (e.g. copy-pasting large sections of a diagram) were implemented in an even more convenient form in the visual editor.

To be clear, this was more than just framing - those things (more user-friendly, powerful, safer) were actually true. But they also worked well as a way of making people transition without feeling like we'd taken something away from them. People felt like they'd gained something from the transition, which, in fact, they genuinely had. Also, causation was backwards: we were redesigning the tool anyway, and took that opportunity to remove the text view, since we knew that this would be an opportunity to do so without too much pushback from users. We didn't redesign the whole tool just to get people to give up the text mode :-)

Martin Wierschin

@has I feel that you're overthinking the situation here. This doesn't have to "change the world" as you said, or be the most perfectly structured and flexible way to represent the desired functionality. It's just a tool to help a person quickly put together an image processing workflow. This app is perfectly positioned to be a nice "NerdToy" as you put it.

Personally I'd use a tool like this every once in a while, maybe a few times a year at most. For something like that I'm much happier to embrace the safety and approachability of a GUI over some much deeper text-based interface. For this type of infrequently used task I do not want to learn a new specialized niche syntax/language. I'd surely forget it in the months between needing to create new workflows. That's definitely what happens to me every time I fire up the "sips" command line tool provided with macOS.

@Lukas: "It also created a lot of external costs, like duplicated documentation, and a weird cultural effect where long-time users were proud that they could use the harder-to-use text editor, and looked down on new users who preferred the visual editor."

Sounds like a consequence of creating two independent interfaces, rather than designing a single toolset/UX that works across multiple UIs. As for your dysfunctional user community, that's a definite sign you've got something wrong. Go read Nardi's A Small Matter of Programming, specifically the chapter on Collaborative Programming. Multiple levels of users should be supporting those below (and above!).

I designed my kiwi language so non-programmers (artworkers) can assemble their own simple data processing pipelines by putting together a series of rules. Rules are self-documenting and their interfaces contain detailed type information, allowing a whole range of tools to be built on top: CLI 'manpage' and HTML documentation generators, search, autosuggest/autocomplete (for plain/structured text editors), GUI form builders (for visual interfaces).

Some of that tooling's built; much of it is TODO. (It's both a benefit and a curse that it doesn't require any high-level tooling to exist in order to use it; a lot of the cursing being around typos.) But I PoCd the form builder a year ago, so can confirm it has legs. Pick a rule name (e.g. EAN-13) and pass it to the show rule form rule and it creates a GUI form containing fields and buttons for the user to set its parameters (e.g. to enter optional bar width reduction and/or percentage scale values). On return, the result is a fully populated rule command (e.g. EAN-13 (20µm, 100%)) ready to insert into a pipeline and run. The documentation gets used in the GUI too, as tooltips on the controls and as a linked help page, and the type information gets used to validate fields as the user fills them in.
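As a hedged illustration of how rule metadata can drive both the form and the resulting command, here is a small Python sketch. Kiwi is not public, so none of this is its real API: the `Param` and `Rule` classes, the parameter ranges, and the help strings are all assumptions made for the example. Only the rule name EAN-13 and the output format `EAN-13 (20µm, 100%)` come from the comment above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Param:
    name: str
    unit: str       # suffix shown after the value, e.g. "µm" or "%"
    minimum: float  # illustrative limits, not real kiwi metadata
    maximum: float
    default: float
    help: str       # reused as the tooltip


@dataclass
class Rule:
    name: str
    params: List[Param]


EAN13 = Rule("EAN-13", [
    Param("bar width reduction", "µm", 0, 100, 0, "Narrows each bar to offset ink spread."),
    Param("scale", "%", 80, 200, 100, "Symbol size relative to nominal."),
])


def form_fields(rule):
    """Describe the form a GUI toolkit would render: one labelled field per parameter."""
    return [
        {"label": p.name, "suffix": p.unit, "value": p.default, "tooltip": p.help}
        for p in rule.params
    ]


def build_command(rule, values):
    """Validate the submitted values and return the populated rule command as text."""
    parts = []
    for p, v in zip(rule.params, values):
        if not p.minimum <= v <= p.maximum:
            raise ValueError(f"{p.name} must be between {p.minimum:g} and {p.maximum:g}{p.unit}")
        parts.append(f"{v:g}{p.unit}")
    return f"{rule.name} ({', '.join(parts)})"


command = build_command(EAN13, [20, 100])
```

The form never asks the user to type parentheses, commas, or units; they are added when the command text is generated, and out-of-range values are rejected before anything is inserted into a pipeline.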

One of our tooling TODOs for this year is to find us an experienced ExtendScript developer to build us a JS-powered GUI frontend for all this which we can use in both a kiwi code editor and a custom Illustrator panel. There's still lots of room for improvement, e.g. the documentation still needs to auto-adapt to different levels of user experience (beginner/intermediate/expert users each need content and volume tailored to suit), and text-based code examples need to be both directly runnable (so text-mode users can try them to see what they do) and transformable to GUI equivalents (so visual-mode users can do likewise). But the point is the whole kiwi platform was designed from the start to enable efficient reuse and repurposing of resources at every opportunity. It takes more up-front planning, but pays dividends further down the line, precisely by avoiding the unnecessary duplication of effort and incompatible learning and usage patterns you describe.

.

@Martin: Yeah, the *nix command line was a major inspiration to me, mainly in what NOT to do in my UI/UX design. (Though it did provide a few good ideas too—e.g. relaxed quoting rules for text and minimal punctuation, for quick input and low visual noise—though of course it screws up on the implementation in excitingly unsafe ways as any Unix fule kno.) And if you think sips is bad, try awk/sed (I won't!).

But again, those are products of UI/UX design by and for hardcore nerds who don't have clue zero how to design a good UI/UX even for power-users, never mind beginners. Scalability has to be designed in from the start. The *nix CLI starts high and can't scale down; Scratch-like GUIs start low and can't scale up. Whereas Logo scales like crazy because keeping eight-year-olds occupied was its primary requirement.

Couple kiwi tags I sent to an artworker over Skype earlier today, for those that are curious:

{Ingredients @ delete if no input, fit frame to text}

{Barcode Number @ EAN-13 (20µm, 100%)}

Text-based (they're tags attached to text frames and paths in Illustrator artwork), and a bit terser than ideal (due to that context's space limitations), but it shouldn't be hard to see how these could easily be constructed in a visual pipeline editor using point-n-click, while also being self-explanatory and easy to cut-n-paste as text.
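For readers wondering how such a tag could be broken into the pieces a visual editor would need, here is a guess at a parser in Python. The grammar is inferred from the two tags above and nothing else: a field name, an `@`, then comma-separated rules, each with optional parenthesized arguments. Treating the first tag as two separate rules is an assumption, and `split_top_level` and `parse_tag` are hypothetical names.

```python
import re


def split_top_level(text, sep=","):
    """Split on sep, ignoring separators nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_tag(tag):
    """Return (field name, [(rule name, [arguments]), ...]) for one tag."""
    tag = tag.strip()
    if not (tag.startswith("{") and tag.endswith("}")):
        raise ValueError("a tag must be wrapped in braces")
    name, _, rules_text = tag[1:-1].partition("@")
    chunks = split_top_level(rules_text) if rules_text.strip() else []
    rules = []
    for chunk in chunks:
        # Rule name, then an optional "(...)" argument list at the end.
        match = re.fullmatch(r"(.*?)\s*(?:\((.*)\))?", chunk)
        rule_name, args = match.group(1), match.group(2)
        rules.append((rule_name, split_top_level(args) if args else []))
    return name.strip(), rules


ingredients = parse_tag("{Ingredients @ delete if no input, fit frame to text}")
barcode = parse_tag("{Barcode Number @ EAN-13 (20µm, 100%)}")
```

Each `(rule name, arguments)` pair is what a point-n-click pipeline editor would show as one step, and joining the pieces back together gives the text form again.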

If I make it through the current venture, hopefully I'll have time to knock out a working release of entoli before I'm done. While kiwi evolved to fit a very specific domain and isn't publicly available, entoli was designed to be a general-purpose Logo/AppleScript successor and a potential "killer application" for the new voice-driven UIs as well as the more traditional point-n-click and text/txt modes. Or, as I said at the end of today's Skype sesh: I may not be smart but I'm cunning AF. :)

> Sounds like a consequence of creating two independent interfaces, rather than designing a single toolset/UX that works across multiple UIs

This statement is so vague, and open to so many different interpretations, that it's not really possible for me to respond.

> As for your dysfunctional user community, that's a definite sign you've got something wrong

Correct, and we fixed it by doing the opposite of your suggestion. I think you need to do a Bayesian readjustment of the confidence you have in your own opinions ;-)

> precisely by avoiding the unnecessary duplication of effort and incompatible learning and usage patterns you describe

Just to be clear, you're claiming that you've created a tool that has two different user interfaces, a text-based and a visual user interface, and that this has not created any additional effort on learning, training, or documentation.

@Lukas: "we fixed it by doing the opposite of your suggestion"

And you executed it very smoothly too. The question to ask is: how much have you empowered your users (by decluttering and focusing the toolset), and how much have you disempowered them (by reducing its expressive power)? Cos if your users can copy-paste graphical DAGs into your help forums, well done.

Out of interest, was your text-based language an algorithmic one (c.f. C*, Python, JavaScript, et al)?

"Just to be clear, you're claiming that you've created a tool that has two different user interfaces, a text-based and a visual user interface, and that this has not created any additional effort on learning, training, or documentation."

Built the language; prototyped the tooling. So far so promising. The cost of presenting two views onto the same language is not zero (obviously!), but it's not much more either (we automate our automation!); and the two interfaces complement and support each other, instead of competing as yours did.

..

As I said earlier, punctuation is hard. Kiwi users immediately get how to type basic tags, e.g. {product title}, and how to apply a single, stock rule to it: {product title @ uppercase}. That alone is some powerful data merge, especially as we can produce custom brand rules for them.

The UX problems kick in at the next step, when users start parameterizing rules. In theory it should be simple (like writing a shopping list!), but in practice they'd forget the parentheses, or the commas between values, or the quotes around text that has reserved characters in it. Kiwi has more punctuation rules than I'd like, but that's the price of having whitespace in names (which users love, cos it's WYSIWYG!) without injecting lots and lots of special forms (c.f. AppleScript, which is its own special usability hell).

For instance, {barcode number @ EAN-13 20µm 90°} is perfectly self-explanatory to the user and is syntactically valid too, but kiwi can't find a rule named "EAN-13 20µm 90°" so the tag fails to expand. The form builder avoids this: type in a rule name as text, e.g. EAN-13, click "Show as Form", and enter its arguments via a nice GUI form. That takes care of all the punctuation, checks everything as you type, and explains what each value is (BWR, scale, rotation).

ISTR one of the early Mac IDEs having a similar capability in its shell (caveat the GUI for each command was built manually). Kiwi rules are more granular so their parameters are not as complicated and have much richer metadata, so their forms can be generated completely automatically. Can't recall which one it was, but for other examples of the kind of UXes I take my influences from, take a look at REBOL, Mathematica, and Lisp Machine shells. No coincidence that they're all symbolic languages too (ditto Logo, ditto kiwi).

Anyway, better run; got some new rules to write. Tx.

> Cos if your users can copy-paste graphical DAGs into your help forums, well done.

This is an interesting question. We anticipated this problem, and initially implemented a feature where people could copy things from the visual editor, and, in addition to a format that could be pasted back into the editor, it would also add a text representation of the copied things to the clipboard. So you could paste into a text editor, and get a human-readable representation of the copied thing.

This worked reasonably well, but it was one-way. You could use it to communicate what you copied to humans (e.g. paste it into a forum to explain what you were looking at, or paste it into an email), but you couldn't copy and paste that text representation back into the editor (some data relevant to the editor was not converted to text, in order to make it nicely human-readable).
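A simplified Python sketch of that arrangement, with two clipboard flavours for one copy operation. This is not the actual tool's code; the flavour names, the node fields, and both function names are invented for illustration.

```python
import json


def copy_selection(nodes):
    """Return both clipboard flavours for a selection of diagram nodes."""
    lossless = json.dumps(nodes)  # everything the editor needs to paste
    # Human-readable summary: layout and other editor-only data are dropped.
    readable = "\n".join(f"{n['type']}: {n['label']}" for n in nodes)
    return {"application/x-editor+json": lossless, "text/plain": readable}


def paste(clipboard):
    """Rebuild nodes from the clipboard; only the lossless flavour is enough."""
    data = clipboard.get("application/x-editor+json")
    if data is None:
        raise ValueError("plain text alone cannot be pasted back into the editor")
    return json.loads(data)


selection = [{"type": "task", "label": "Review order", "x": 120, "y": 40}]
clipboard = copy_selection(selection)
restored = paste(clipboard)
```

Pasting within the editor round-trips via the lossless flavour, while text that has passed through a forum or email only carries the readable flavour, which is why the trip was one-way.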

It turned out that people never actually adopted this feature. Instead, they do two things:

1. Export the thing they're working on, and attach it to the forum post or email (we added a one-click export feature for this purpose)
2. Attach an image (we added one-click image export for this purpose)

I'm not entirely sure if people simply didn't want the text feature, or if it wasn't discoverable enough, or if people never adopted it because pasting the text representation back into the editor didn't work. But because the alternative options worked well, we eventually removed the text feature.

> and how much have you disempowered them (by reducing its expressive power)?

We did not reduce the language's expressive power. In fact, removing the need to have text editors allowed us to make the language more expressive, by adding features that could not easily have been translated to a human-compatible textual representation.

> Out of interest, was your text-based language an algorithmic one (c.f. C*, Python, JavaScript, et al)?

It's a combination of multiple domain-specific languages, some of which have features of algorithmic languages.
