Monday, March 14, 2022

DuckDuckGo Will Down-Rank Russian Disinformation Sites

Tom Parker (Reddit):

The founder of DuckDuckGo, a Google-alternative search engine that has touted its “unbiased” search results for years, has announced that it has started down-ranking sites based on whether they’re deemed to be associated with Russian disinformation.

[…]

The practice of suppressing content that is deemed to be disinformation while elevating content that’s deemed to be “high quality information” is something that has been embraced by Google, particularly on YouTube where so-called “high authority channels” are up to 20x more likely to top search results and censoring “misinformation” is its number one priority.

Prior to Weinberg’s announcement, DuckDuckGo had made multiple statements over a period spanning more than five years that positioned DuckDuckGo as a search engine that provides “unbiased results,” criticized other search engines for failing to show “neutral, unbiased results,” and criticized bias in algorithms.

Via Mike Rockwell:

We want search engines to rank results based on relevancy and that can be determined by numerous factors. That’s literally what search engines do. But they’re making a determination as to whether or not a piece of content is “disinformation” and then down-ranking content based on that. That’s an editorial decision.

And what if they’re wrong? What if they down-rank content that is later found to be true? What if someone is specifically looking for “disinformation” content for research purposes — to see what the opposing perspective has to say in order to better form their opinions or to point at the absurdity?

[…]

I don’t use DuckDuckGo directly anymore, that changed last year when I started self-hosting SearX. But I still use DuckDuckGo as one of the search engines powering SearX’s results.

A. Khalid (via Nick Heer):

Earlier this month, DuckDuckGo announced it would pause its relationship with Russian-state owned search engine Yandex.

A number of platforms including the Meta-owned Facebook and Instagram have also demoted posts from Russian state media. Google has been down-ranking search results from Russian state news agencies since 2017.

DuckDuckGo:

The primary utility of a search engine is to provide access to accurate information. Disinformation sites that deliberately put out false information to intentionally mislead people directly cut against that utility. Current examples are Russian state-sponsored media sites like RT and Sputnik. It's also important to note that down-ranking is different from censorship. We are simply using the fact that these sites are engaging in active disinformation campaigns as a ranking signal that the content they produce is of lower quality, just like there are signals for spammy sites and other lower-quality content.

So this sounds like a very coarse adjustment. They are not evaluating whether a given piece of content is disinformation, and I guess true content on these sites will be down-ranked as well. The search results will still be “complete,” just perhaps in a different order than before.

The examples given are sites that are well known to be controlled by the Russian government. It would be interesting to know what the other sites are, and whether they are only targeting Russian sources of disinformation.
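
To make the "coarse adjustment" concrete, here is a minimal sketch of a site-level ranking penalty. The domain list, the penalty factor, and the function names are all assumptions for illustration; DuckDuckGo has not published its actual signals or weights. The point is only that nothing is removed from the results, the penalty applies to every page on a flagged site regardless of that page's content, and the order changes.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration only; not DuckDuckGo's real list or weights.
FLAGGED_DOMAINS = {"rt.com", "sputniknews.com"}
PENALTY = 0.5  # relevance scores of flagged sites are multiplied by this factor


def domain_of(url):
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def rerank(results):
    """Re-order (url, relevance) pairs, applying a site-level penalty.

    Nothing is dropped: the output contains the same items as the input.
    """
    def adjusted(item):
        url, score = item
        return score * PENALTY if domain_of(url) in FLAGGED_DOMAINS else score

    return sorted(results, key=adjusted, reverse=True)


# Made-up results with made-up relevance scores.
results = [
    ("https://www.rt.com/news/example", 0.9),
    ("https://example.org/report", 0.8),
    ("https://example.com/blog", 0.4),
]
reranked = rerank(results)
for url, score in reranked:
    print(url, score)
```

With these made-up numbers the flagged page drops from first to second place but is still listed. A per-article judgment would need a very different (and much harder) signal than a per-site multiplier like this.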


21 Comments

Well, time to find a different search engine.

> The primary utility of a search engine is to provide access to accurate information.

I would disagree. The primary utility of a search engine is to provide access to *relevant* information. There are ways to discern accuracy without making direct editorial decisions: how often people click on links beyond a given link, for example.

> It would be interesting to know what the other sites are, and whether they are only targeting Russian sources of disinformation.

It would be nice for them to publish a report with information like this. If they’re going to make editorial decisions in results ranking, at least be as transparent about it as possible.

Old Unix Geek

At one point I considered working for DDG. No more.

Tools are supposed to be neutral, not have opinions.

If I want to know what the Russian view, the anti-vax view, the anti-whatever-those-in-charge-don't-like view, that's my prerogative in a free country. It doesn't seem that the West understands this anymore. "Nudge" has devolved into "Bully, Lie, Misinform, Coerce". It disgusts me.

Meanwhile, try searching DDG for "sitescreenname" and see what floats to the top.

"Tools are supposed to be neutral, not have opinions."

This is a good take for a word processor, but it's an odd take for a tool whose job is literally to select a few things out of billions of options. That can't be done without opinions.

"whether they are only targeting Russian sources of disinformation"

They're obviously not. This is apparently not something people have realized before, but search engines have always made an effort to downrank misinformation in their search results. This isn't exactly a new idea.

If you search for stuff like "flat earth", you get links to actual scientific content, not to flat-earther websites, both on DDG and Google. This isn't what a generic pagerank-type algorithm would produce. And, in fact, it didn't use to be the case in the past.

"If they’re going to make editorial decisions in results ranking"

In my opinion, this is drawing a distinction without a difference. Any ranking algorithm will make editorial decisions. Unless you want to go back to a situation where relevancy is determined by the amount of times a word appears on a page, that's unavoidable.

"If I want to know what the Russian view, the anti-vax view, the anti-whatever-those-in-charge-don't-like view, that's my prerogative in a free country"

You can. The only thing this does is provide actually useful search results for the 99% case where you *don't* want misinformation shown at the top of your search results.

What exactly do you think the alternative here is? Search engines should literally stop caring about the quality of the search results they provide?

In my view *no* tool is neutral. You don't whisk your eggs with a hammer, just as you don't clean your windows with a balloon whisk.

Tools are meant to be heavily biased towards a certain task. I'm thinking @Old Unix Geek meant that tools shouldn't have biases within that task, a hammer that only drives nails from a certain manufacturer, a whisk that beats bearnaise but not hollandaise. Those would be crap.

As a fun exercise people can spend five minutes thinking what a truly unbiased list of search results would look like.

@Mike Rockwell How does SearX order search results?

Search engines have always downranked sites that lead to less useful results.

First PageRank downrated unpopular information in favour of popular information (measured by link count), on the benign assumption that popularity was correlated with accuracy and depth.

Spammers realised that if you made false information popular via a network of fake pages (eg take this one pill to XYZ) you could ensure frauds were suggested before facts. This was Google bombing, and was popular a decade ago.

From that point search engines like Google, Bing (and therefore DuckDuckGo) started evaluating the quality of link origin sites against various criteria, invented by humans and then encoded into algorithms, to ensure facts were suggested before fraud.

(Note that even machine learning approaches are designed by humans in that it’s humans that design the training datasets, evaluate the models against their criteria of human invention, and decide to deploy or not deploy the model)

This was pretty much always directed at sites, since modelling individual pages in spammy link farms would be computationally prohibitive.

It might superficially appear that what’s being proposed here is different, but in reality people have identified a series of sites that are employing popularity to promote fraud over fact, and have developed an algorithm to combat it.

Ultimately search engines’ primary focus, the primary thing that they promote themselves on, is the most accurate information, rather than the most popular. In many domains (eg pharmaceuticals) spam is more popular, in terms of basic links, than quality content, but users have shown a clear preference for quality content.

“Relevance” to a query has therefore historically been related to concepts of accuracy, not popularity, despite the ontological acrobatics I see some people engaging in.

Ultimately fraud harms individuals and society at large. There is a legitimate social interest therefore in frustrating fraudsters in their attempts to create such harm.

Tools are supposed to be neutral, not have opinions.

Few tools are truly neutral. A lot of tools are optimized for right-handedness, for example. It’s not malice; it might be a market calculation, or it might be unconscious bias.

As for search engines, they’re not the Yahoo! Directory or dmoz. Google started out by ranking pages not by what their own meta tags claimed, but by other pages vouching for them. There’s already an inherent bias to that. It elevates the status quo. What if a page is really good but nobody knows about it yet? What if a page is good but people don’t want to link to it, because they’re biased against it? Well, you might find that page on SERP 27, but probably not, since almost nobody goes beyond page 1.

So, “neutral” is already too broad a concept.

If I want to know what the Russian view, the anti-vax view, the anti-whatever-those-in-charge-don’t-like view, that’s my prerogative in a free country.

And you can have those, just not through DDG.

(What if this were dmoz? Well, that’s biased, too. Who designs the taxonomy, i.e. who decides what categories exist in the first place?)

Old Unix Geek

Information is that which allows one to correctly predict something. Noise is that which doesn't.

Lossy compression algorithms work by reducing the amount of "irrelevant" information (namely high frequencies in pictures and sound which mostly correspond to noise). They keep the information that corresponds to what lets me see the picture. As a result they may fail on cartoons, but then that's all cartoons. Such algorithms are neutral tools: they don't compress pictures of certain types of "approved content" and reject "unapproved content". Apple's Neural Hash was an example of the latter, a non-neutral tool. To me that's dangerous, even though I'm totally against the exploitation of anyone. The tool is no longer neutral.

What DDG is doing is also the latter: reduce the rank of unapproved information. What it should be doing is the former: privilege information that lets people better predict what will happen next.

Yes, that's hard, very hard.

However the solution isn't simply to replace the problem by another you can solve. That's what only reporting what those in power want to be reported is doing. This "solution" changes the information landscape of our societies from free societies to non-free ones.

Promoting "approved narratives" is what the big tech companies have been doing, over the last year. If you look in detail at the approved CDC/WHO COVID narratives, you'll find that you predicted much worse based on their information than if you followed "unapproved" narratives that were based on what was known before the pandemic (e.g. N95's work, other masks, not so much). In other words, noise, not information, by my definition, was promoted.

The societal consequences of approved narratives can be enormous: we are often lied into wars by our governments. WW2 started when "Poles attacked Germany" according to the official German narrative of the time. The same may be true for Pearl Harbour, the Gulf of Tonkin incident, etc.

Search engines are a gateway to information. As such they have a special responsibility. Recall Google's mission statement: to organize the world's information. Many people only look at one search engine's results. To look at many is rare. So I don't find the argument that "you can get it elsewhere" particularly persuasive, and therefore I find DDG's recent move (and if I understand correctly, Bing's, and who knows who else's) to be reprehensible.

The silver lining is that DDG said they were doing this, so I can now avoid them, but most people won't know that, and it will change society's understanding of what is happening.

I'm sorry, but I just don't subscribe to this postmodernist idea that reality doesn't exist, and everything is a matter of narratives and approvals and perspectives, and I'm really, really glad for every single tech company that doesn't, either, and instead strives to provide factual, scientifically valid, reality-based information to its users.

Old Unix Geek

@Plume You misunderstood what I said.

Predictions are checked by comparing to reality once time has elapsed.

The tech companies aren't providing "factual, scientifically valid, reality-based information to (their) users". If they were, what they said would match reality later. It doesn't. That's my problem with opinion based ranking of information.

E.g.: the CDC said COVID vaccines would stop transmission. They didn't. People with lifetimes in vaccine research predicted they wouldn't beforehand. They were right, and have a model that is too complex to go into here which explains why. Therefore I adopted their model and pay attention to them. The CDC got it wrong. Walensky recently said she was hopeful the vaccines would work at the time, which doesn't seem to me to be firm ground on which to base policy.

All I'm doing is following basic scientific methodology: one should prefer the model that predicts something correctly over the model that doesn't. As a lesser consideration, one should prefer the model that is parsimonious over the one that isn't. However I can't do that if I'm prevented from knowing about some models.

"Predictions are checked by comparing to reality once time has elapsed."

That's not a useful measure, because it depends on the future. Since you don't know what the future is, you can't use it in the present to determine how to rank search results. So the only conclusion one can draw from this position is that ranking is inherently pointless, which is not a conclusion anyone agrees with, presumably not even you.

"People with lifetimes in vaccine research predicted they wouldn't beforehand"

I assume this implies that your position is that there actually *is* a way to determine ranking based on currently available information: that's good, I agree with that.

If your point is "Google should rank data based on the current scientific consensus, rather than on one government agency's position", then yes, we're on the same page.

But that's precisely what search engines are doing when they downrank state-sponsored media like RT, which has repeatedly and intentionally published false information. So now I'm not exactly sure what you're proposing should be done: do you want search engines to prefer models that make better predictions (as you seem to say now), or do you want them to be neutral (as you said before)?

>Lossy compression algorithms work by reducing the amount of "irrelevant" information (namely high frequencies in pictures and sound which mostly correspond to noise). They keep the information that corresponds to what lets me see the picture. As a result they may fail on cartoons, but then that's all cartoons. Such algorithms are neutral tools

If I could ask one thing, Old Unix Geek, it would be to examine this assessment from one level higher up the chain of abstraction.

In this example you highlight, I don't know whether the compression algorithm's bias toward photography and away from line art was deliberate or unintended, but that bias is there, and it's baked into the algorithm. To me, the algorithm is not neutral, because it is biased toward preserving the perceived visual quality of one kind of image over another. Having a bias — whether it's unintended or a design choice — is by definition not neutral.

As an analogy, if an algorithm is biased against a minority, but that bias was applied equally to all members of that minority, does that make the algorithm neutral in your assessment? Again, to me the answer seems clear that it's not neutral, because it is exhibiting a bias.

This isn't a theoretical concern, as a lot of our technology tends to disproportionately harm or exclude women, minorities, people of color. Often those harms are perpetuated under claims that these technologies are neutral when in fact they are reflecting and amplifying the biases and disproportionate vulnerabilities that already exist in our society.

Old Unix Geek

@Plume: it is a useful measure. Einstein's theory of relativity postulated in 1915 that light would be bent around a star (or other heavy object). Newton's didn't. Needless to say, the scientific consensus was for Newton and against Einstein. When there was an eclipse in 1919, Einstein was proven right. A fun story if you'd like to read about it:

https://earthsky.org/human-world/may-29-1919-solar-eclipse-einstein-relativity/

Had no one been able to learn about Einstein's theory, we'd have simply maintained the "scientific consensus". That's what today's "fact checkers" (often people without any scientific degree) would impose: "Fact Check: False. The accepted Scientific Consensus is that Newton, an eminent physicist, correctly described the movement of all bodies with his three laws of motion in 1686. Mr Einstein, a lowly patent clerk in Switzerland, studying part time for a yet to be awarded PhD, published this year on such a wide range of topics (photo-electric effect, Brownian motion, special relativity, mass-energy equivalence) that it would be quite miraculous if any of his ideas amounted to much. He'd be better advised to concentrate on a single topic."

Predictions about the future are key: it's how the state of our knowledge improves. The fact that a theory predicts something that others cannot, makes that theory particularly valuable.

@Remah, lossy image compression is neutral with respect to content. It doesn't care whether the image is pro-Russian, pro-flat-earth-theory, pro-whatever. Sure, there is a trade-off: it's not good at line art, but you wouldn't start arguing that phones are not neutral because they can't transmit the ultrasonic utterances of bats, would you? Everything in life involves physical and mathematical trade-offs. If I only have black pens, I can't draw colour pictures, but that's not the same as telling me what I'm allowed to draw, and what I'm not.

"Disinformation sites that deliberately put out false information to intentionally mislead people directly cut against that utility." Right. Sure. I get it. "Current examples are Russian state-sponsored media sites like RT and Sputnik." Maybe so. But who decides? "Quis custodiet ipsos custodes?" has never been more relevant than today.

It's been clear to me for decades that media sites like CNN and MSNBC publish outright lies constantly. Does DDG "down-rank" them? Of course, they're not "state-sponsored". No, they're just owned by the same entities that also own the State here in the once-USA, and reliably promote the narrative issued by that State.

Certainly all search engines make "editorial" decisions as to which results to display out of the millions available. There is in fact no such thing as "neutral", or "objective". We all make choices as to what we take to be real. However, such mealy-mouth language appearing precisely at this moment makes it transparently clear that DDG IS "biased", by any rational person's understanding of the meaning of the word. That's enough for me to down-rank THEM.

Now, I expect that Brendan Eich and I would probably disagree on a number of significant issues, but that he has explicitly stated that Brave Search is not changing how it shows results at this time, in this atmosphere of deliberately-induced hysteria, is enough to earn my respect – and use of his platform.

Gotta agree with HandyMac -- who's to say that RT is any less reliable than NYT or CNN? All media sometimes tells the truth and sometimes lies, often on behalf of the government of their country whether directly funded by tax dollars or not. At least RT is honest about where their funding comes from. How is it any different than the BBC? or CBC? or NHK? or PBS? Is anyone going to claim that the BBC etc has never lied, and that when it is telling the truth it's telling the 110% whole truth from all perspectives? DuckDuckGo is censoring results based on the desires of the US Gov, whether DDG realizes it or not.

"it is a useful measure"

It's not a useful measure to rank pages right now. Obviously, it's a useful measure to validate hypotheses later on.

"Had no one been able to learn about Einstein's theory, we'd have simply maintained the "scientific consensus"

See, that's where you completely lose me, because I genuinely do not understand the argument you're making. Do you think scientists use Google to do their research? When I studied physics and the prof explained quantum mechanics, she didn't tell us to open Google and see what it returns. So search engines upranking sites that explain the current scientific consensus has absolutely no negative impact on scientific progress.

And, in fact, even assuming that was how science was done, what you're proposing basically means that, instead of the current scientific consensus being ranked the highest, we'd instead get the thing ranked highest that has the staunchest supporters, whether it's state-sponsored lies, or whether it's insane conspiracy theories. Russian disinformation groups have big budgets, and flat-earthers have a hell of a lot of free time on their hands. How do you think that would help advance science?

All it would achieve would be to make us even dumber.

"It's been clear to me for decades that media sites like CNN and MSNBC publish outright lies constantly"

So you're saying there's no difference between RT and CNN at all? If the answer to that is "Yes", then you've genuinely lost all connection to reality. You don't exactly see CNN reporters leave and then publicly write about how they were forced to publish things they knew were false. In fact, I loved the recent Palin defamation lawsuit, where the NYT was forced to release internal communication, which clearly showed that these people are not intentionally lying, even when they publish falsehoods.

If, on the other hand, your answer is "No", and you agree that there is an actual difference between RT and CNN, then DDG is justified in ranking one site higher than the other.

Had no one been able to learn about Einstein’s theory, we’d have simply maintained the “scientific consensus”.

Are you actually arguing that “Russia is in Ukraine to fight nazis” is a valid scientific hypothesis that might one day become the consensus, rather than the current consensus of “the Ukrainian President is Jewish, so, that’s probably not why”, or are you just pretending?

lossy image compression is neutral with respect to content

But it isn’t.

If one culture is more likely to use art such as cartoons, and another culture is more likely to use photorealistic art, guess which one lossy compression is biased against?

If you make something harder to use for some people than other people, intentionally or not, that’s bias. See the often-cited example of a water faucet that detected light skin quite well, but dark skin poorly. Was there malice involved? Probably not. But the effect is nonetheless that biased technology was installed in that bathroom.

No, they’re just owned by the same entities that also own the State here in the once-USA, and reliably promote the narrative issued by that State.

That doesn’t even make sense. Did CNN and MSNBC publish pro-Trump messages when he was President? Not really.

They do promote a certain narrative (slightly left in the case of MSNBC, and rather centrist in the case of CNN), but whether that has anything to do with the State’s narrative depends a lot on, y’know, elections.

who’s to say that RT is any less reliable than NYT or CNN?

Me. I’ll happily say that RT is a lot less reliable than NYT or CNN.

Is anyone going to claim that the BBC etc has never lied

Intentionally? They probably haven’t. Have they made editorial decisions I’ve disagreed with? Yes.

See, that’s where you completely lose me, because I genuinely do not understand the argument you’re making. Do you think scientists use Google to do their research?

I think OUG is arguing that public opinion is shaped too much by existing biases. Which I think is fair. Where OUG kind of loses me is that I don’t get the impression that fact checkers would have discredited Einstein.

"I think OUG is arguing that public opinion is shaped too much by existing biases."

This is technically true, but in the context of this discussion, not really a compelling argument. The current scientific consensus is still the best available information, so it *should* shape public opinion, despite the fact that, inevitably, some of it will be falsified in the future.

Old Unix Geek

@Plume

Of course scientists use Google to find papers in their fields. You do realize the internet was initially only available in universities? We used it to disseminate papers. It was only later that everyone else joined in, and that's been a mixed blessing. In some ways it's gotten better over time: interlibrary loans to get relevant conference papers was quite the pain when I did my PhD. Now more research is available online. On the other hand, there's a lot more rubbish too.

Secondly, funding in Science unfortunately depends on what is trendy. Funding decisions depend therefore on the "zeitgeist", which often ends up depending on what the public believes. This is another way that distorting public opinion has an effect.

This isn't new. It's a variation of the common saying that Physics advances each time a prominent physicist dies. Why? Because each such physicist has his pet theories and prevents alternatives from rising to the fore.

The same phenomenon applies to news, by the way. Editors believe what's in the newspapers they read, rather than what their journalists on the ground might report, which means what the audience hears is informed by the consensus in London (for the BBC), not necessarily by the facts on the ground. If you're a journalist on the ground it can be quite annoying, when they delete key sentences that do not fit their view.

However my point was rather that the basic scientific methodology is our best way of determining reality. To me, that means we should follow it not just in Science, but also to understand other aspects of the world, such as world events.

@Sören asks whether the Ukrainian government contains significant numbers of neo-Nazis. Well...

The Azov battalion was a volunteer paramilitary group absorbed into the Ukrainian military:

https://cisac.fsi.stanford.edu/mappingmilitants/profiles/azov-battalion

https://www.aljazeera.com/news/2022/3/1/who-are-the-azov-regiment

https://thehill.com/policy/defense/380483-congress-bans-arms-to-controversial-ukrainian-militia-linked-to-neo-nazis

It, and other such groups, were funded by billionaire Kolomoisky who also funded Zelenskyy's rise to power. Zelenskyy was a comedy actor, and he made a rather funny TV show (funded by Kolomoisky) about becoming president of Ukraine, called "Servant of the People", rather in the vein of Yes, Prime Minister in the UK (Gorbachev's favorite program).

https://www.youtube.com/playlist?list=PLJo-obgJSqxbzEDvUHX9jiX2DXWGSvv0T

Since:

* Russia lost up to 30 million people fighting the Nazis in WW2,

* and therefore is very very sensitive to Nazis, like most Jewish people are,

* and since these extreme right wing government entities have been shelling the Russian ethnic areas of Ukraine since 2014

* it would not surprise me if Russians thought that "Russia is in Ukraine to fight nazis", and push Nato away from Moscow.

Putin, and the vast majority of Russian politicians in the Duma, agree on this point. Putin's popularity has risen, which suggests Russians agree with this narrative. Of course one can invent reasons why this doesn't mean anything, but the evidence I've read doesn't concord with that.

Since I don't support wars, I believe understanding how others think is important to solving conflict before it reaches that point.

I believe I explained my point about compression being content-neutral as clearly as I can, with the drawing with a black pen analogy. If you don't get it, I give up.

Did Russia invade Ukraine to fight Nazis? Probably not.

Are Russia currently fighting Ukrainian Nazis? Yes, absolutely, that's not questionable. And people in the West are sharing those Nazis' posts and cheering them on unabashedly.
