Tuesday, July 6, 2021

Audacity’s New Privacy Policy

Tim Hardwick:

Two months ago, Audacity was acquired by Muse Group, which owns other audio-related projects including the Ultimate Guitar website and the MuseScore app. According to Fosspost, changes to the privacy policy section on the Audacity website indicate that several personal data collection mechanisms have since been added by the parent company.

Audacity:

Personal Data we collect

  • OS version [I presume they mean App version.]
  • User country based on IP address
  • OS name and version
  • CPU
  • Non-fatal error codes and messages (i.e. project failed to open)
  • Crash reports in Breakpad MiniDump format
  • Data necessary for law enforcement, litigation and authorities’ requests (if any)

The first four are pretty common for Mac apps to collect without opt-in, as part of a software update check. I don’t think IP addresses really count as personal data if they are not linked with other identifying information. Otherwise, anyone with a Web site who didn’t disable logging would be considered to be collecting personal information.

I don’t think error codes or crash reports should be collected without the user opting in.

The last item has people worried, but I’m not really sure what it means. You could imagine that Audacity is collecting information about which audio files you’re editing and making that available to companies who want to sue for copyright infringement. Or it could just be boilerplate saying that Audacity will comply with lawful requests for the not very personal information that it is collecting anyway. Whether or not it’s spelled out in a privacy policy, most companies probably don’t have a choice about that.

workedintheory:

We believe concerns are due largely to unclear phrasing in the Privacy Policy, which we are now in the process of rectifying.

See also: Reddit, Hacker News, 2, 3.

Update (2021-07-07): Syenta:

I have already uninstalled it and cleared out the %AppData% folder where I found the LastLog which listed:

Kalk = A calculator
WindowsApps
OpenSSH
Powershell
Python

None of which are in @getaudacity folder Why would you list things not used by Audacity like Kalk

Shoshana Wodinsky (via Nick Heer):

First came plans to add telemetry capture. Then came a new contributor license agreement. Then last week came a privacy policy update that some Audacity die-hards say turns the software into “spyware.” But Audacity isn’t “spyware”—if only because virtually every app we use is some form of spyware these days.

[…]

Ray adds that its data collection is “very limited” and only includes “pseudonymized” IP addresses that are “irretrievable after 24 hours,” system information that includes “OS version and CPU type,” and optional error report data—not users’ microphone recordings or personal details.

[…]

Also worth mentioning here is that some of the other products under the Muse Group umbrella—like the music notation software MuseScore—feature nearly identical privacy policies, which suggests the parent company just updated Audacity’s policies for some consistency across its catalog. But that doesn’t excuse the piss-poor wording on its original draft, which Ray swears will be “revised” soon enough.

cookiengineer (via Hacker News):

Stepdown as Maintainer of this Fork

Disclaimer: I really thought long about this, and I haven’t slept in two days due to ongoing harassments of 4chan.

As the first people were literally arriving at my place of living, where they knocked on my doors and windows to scare us, I am hereby officially stepping down as a maintainer of this project.

I don’t understand how this escalated.

Update (2021-07-14): Tom Nardi (via Hacker News):

While there was still a segment of the Audacity userbase that was skeptical about remote analytics being added into a program that never needed it before, representatives from the Muse Group seemed to be listening to the feedback they were receiving. Keary assured users that plans to implement telemetry had been dropped, and that should they be reintroduced in the future, it would be done with the appropriate transparency.

Unfortunately, things have only gotten worse in the intervening months. Not only is telemetry back on the menu for a program that’s never needed an Internet connection since its initial release in 2000, but this time it has brought with it a troubling Privacy Policy that details who can access the collected data. Worse, Muse Group has made it clear they intend to move Audacity away from its current GPLv2 license, even if it means muscling out long-time contributors who won’t agree to the switch. The company argues this will give them more flexibility to list the software with a wider array of package repositories, a claim that’s been met with great skepticism by those well versed in open source licensing.

7 Comments

Kevin Schumacher

Here's the thing with the way you formatted the list of data collected. It's not the way it's formatted in the linked page. Normally, OK, whatever. There's only so much you can do with certain formatting.

In this case, though, it makes a world of difference. They have chosen to separate "For legal enforcement" into its own section in their Privacy Notice. Supposedly, they mean it to only apply to the data already listed as collected, but that's not what it literally says right now. It says that the personal data they collect for the "legal enforcement" collection reason is "Data necessary for law enforcement, litigation and authorities’ requests (if any)", not the data types already enumerated.

If the table was formatted such that the entire "For legal enforcement" row was moved up to be additional bullet points in each of the respective sections in the row above, it would be clear they aren't collecting additional data specifically for legal reasons. As it stands, that is not clear at all, and the fact somebody there thought it was a good idea to make it appear this way is very problematic.

As for IP addresses, it's not clear to me that they have given up the right to combine them with other data purchased from data brokers to do exactly the type of tracking that was being discussed around the time Apple released App Tracking Transparency, with scummy marketing companies claiming high rates of user identification based on IP addresses.

Additionally, the whole process through which they store the IP addresses sounds absurd. They hash them with a salt that changes daily and is not recorded anywhere once it changes the next day. Then what is the purpose of storing it for a year, and what is the purpose of storing it in cleartext for one day? It's no longer comparable against anything else once the salt changes. And if you're trying to associate sessions within that 24-hour period, what's the point of storing it in cleartext? You could hash it immediately, compare the hashes until the salt changes, and then delete it. There's something else going on here, because as described, the hashes are useless for the next 364 days until deleted.
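To make the point about the rotating salt concrete, here is a minimal Python sketch of the scheme as the comment describes it. It is not Audacity's or Muse Group's actual code: the class name `DailySaltedHasher`, the choice of SHA-256, and the 16-byte salt are all assumptions made for illustration, and the IP address comes from a documentation-only range.

```python
import hashlib
import secrets


class DailySaltedHasher:
    """Hypothetical sketch: pseudonymize IPs with a salt that is replaced daily."""

    def __init__(self) -> None:
        # Assumption: a random 16-byte salt held only in memory.
        self._salt = secrets.token_bytes(16)

    def rotate(self) -> None:
        # Called once per day; the previous salt is discarded, not stored.
        self._salt = secrets.token_bytes(16)

    def pseudonymize(self, ip: str) -> str:
        # Assumption: SHA-256 over salt + address; the policy names no algorithm.
        return hashlib.sha256(self._salt + ip.encode("ascii")).hexdigest()


hasher = DailySaltedHasher()
day1_a = hasher.pseudonymize("203.0.113.7")
day1_b = hasher.pseudonymize("203.0.113.7")
hasher.rotate()  # the next day begins
day2 = hasher.pseudonymize("203.0.113.7")

print(day1_a == day1_b)  # same salt, same address: hashes match
print(day1_a == day2)    # new salt: the old hash can no longer be matched
```

Under these assumptions, a stored hash is only comparable to other hashes made with the same day's salt, which is why keeping it for a year afterwards seems to serve no obvious purpose.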

> I don’t think IP addresses really count as personal data if they are not linked with other identifying information. Otherwise, anyone with a Web site who didn’t disable logging would be considered to be collecting personal information.

Sadly, the GDPR disagrees and European Data Protection Authorities have been known to enforce this: IP addresses are considered personal information in the EU, whether or not they are cross-referenced with other identifiers.

This is madness, but yes, it means that EU operators who want to comply with GDPR and data retention laws (enacted for “safety”) must navigate a maze of regulations involving retaining some IPs for some time in some circumstances, but absolutely, positively, definitely not in other cases… Many smaller companies in the EU have totally disabled logging on their sites for fear of running afoul of the GDPR.

IP addresses (even when dynamic) are PII; this has been established in the CJEU. If you log them, warn your customers at least once if you provide a feature that uses the Internet for functionality (including update checks). I think this is perfectly reasonable--use of your IP address manifestly discloses at least the time and location of access to a remote party, which may be information you would prefer was private. The fact that world+dog now thinks abusing peoples' privacy on a routine basis is acceptable is an indictment of our industry, IMO, and (at least on this occasion) not the regulators.

>The first four are pretty common for Mac apps to collect without opt-in, as part of a software update check. I don’t think IP addresses really count as personal data if they are not linked with other identifying information. *Otherwise, anyone with a Web site who didn’t disable logging would be considered to be collecting personal information.*

They are.

If you have an IP address and a timestamp (which a log would), you can at least roughly trace someone's location. If you're an ISP, you can even trace it down to a single household.

>The last item has people worried, but I’m not really sure what it means.

Arguably, "I'm not really sure what it means" is exactly the biggest problem here: lots of the FAQ is inconsistent and vague. It pretends to answer questions, but really poses more.

The table defines two categories (the distinction here is important, and your quote conceals this):

1. OS version, CPU, etc.
2. "data necessary for law enforcement"

For the first category, it then says they collect it for app analytics and improvement. So far, so good. Then they claim a "legitimate interest to ensure the proper functioning". Not only is that obviously false (the app functioned just fine before it had analytics); it also conflicts with the remainder of the table row.

The second row is worse. "Data necessary for x" is not a valid answer to "data we collect". The "why" column is equally redundant. The "legal grounds" column, as with the first row, is meaningless. I don't think "we collect data necessary for x" will fly in an EU GDPR suit.

Many other quibbles. For example:

>We may disclose the Personal Data listed above (your hashed IP address)

Isn't a hashed IP address fairly easy to crack? There's only 32 bits, and presumably, they don't use a salt.

What they should've done, but don't want to because it would make things more transparent, is make a third row in that table that mentions the collection of hashed IP addresses, with different "why we collect it" and "legal grounds" values.
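On the question above of whether a hashed IP address is easy to crack: an IPv4 address has only 2^32 (about 4.3 billion) possible values, so an unsalted hash can be reversed by hashing every candidate and comparing. The sketch below shows the idea on a single /16 block so that it finishes quickly. It assumes, as the comment does, that no salt is used, and it assumes SHA-256; the policy states neither. The function names `hash_ip` and `crack` and the example addresses are hypothetical.

```python
import hashlib
import ipaddress
from typing import Optional


def hash_ip(ip: str) -> str:
    # Assumption: plain, unsalted SHA-256 of the dotted-quad string.
    return hashlib.sha256(ip.encode("ascii")).hexdigest()


def crack(target_hash: str, network: str) -> Optional[str]:
    # Enumerate every address in the network and compare hashes.
    for addr in ipaddress.ip_network(network):
        candidate = str(addr)
        if hash_ip(candidate) == target_hash:
            return candidate
    return None


leaked = hash_ip("198.51.100.23")           # what a log might contain
recovered = crack(leaked, "198.51.0.0/16")  # 65,536 candidates to try
print(recovered)
```

The same loop over `0.0.0.0/0` would cover the whole IPv4 space; it would be slow in pure Python but is well within reach of optimized tooling. A secret salt that is later discarded is what would defeat this kind of enumeration.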

"Or it could just be boilerplate saying that Audacity will comply with lawful requests for the not very personal information that it is collecting anyway. Whether or not it’s spelled out in a privacy policy, most companies probably don’t have a choice about that."

Then why did they say it at all? Especially if there's nothing of value to law enforcement. That's the mystery.

Apple's App Store has a huge "About App Store & Privacy" document, and they don't mention law enforcement at all, so it's not a requirement.

As written, Audacity's draft reads like "We're collecting these 6 specific pieces of information plus also ANYTHING ELSE THE COPS WANT". As Kevin points out, its position in the table makes it suspicious. The first 6 lines are clearly a disjunction, yet apparently we were supposed to understand that the 7th line is a conjunction?

My sympathies to the poor employees on the other side of the outrage machine. My company has been there, and it wasn’t fun.

The proper response to this is: "Hey, could you clarify this?" not "Oh my god this is now evil spyware, we must fork it and burn the company at the stake". It’s only a badly written privacy policy. They haven’t actually been caught *doing* anything evil.
