@phnt@fluffytail.org why so much hate towards anubis? I would have thought having some free software against scraping bots would be a good thing (?)
@phnt@fluffytail.org @waifu@mai.waifuism.life IIRC the point of Anubis is to prevent generic scraper bots, not targeted ones. It doesn't even stop scraping, just slows them down.
I find its quick rise in popularity to be a bit sus tbh
@phnt@fluffytail.org @waifu@mai.waifuism.life If the scraping bots remove the "I" from Mozilla, server operators can easily just block the unique user agent. Anubis isn't "a tool to stop bots", it's a tool to verify that connections pretending to be a browser are actual users.
@phnt @waifu Read the post of mine again. Anubis isn’t a catch-all bot filter, it’s there to filter out connections pretending to be a browser. I know the “lmao I can bypass Anubis with this simple trick” is the peak of /g/ hacker mentality, but in the real world everyone already knew this. It is mentioned in the blog post where the guy talks about it.
Anubis exists solely to make connections that act like browsers prove they are browsers. If a connection doesn’t pretend it’s a browser, Anubis does nothing, as is the intended effect, because at that point if the connection misbehaves, you can just block that particular UA.
>don’t actually serve degraded content to other user agents probably
Depends on what you mean, I do “serve degraded content” to other UAs in the sense that I block LLM scrapers that misbehave when I can. To reiterate, Anubis exists because LLM scrapers pretend to be a browser with a regular browser UA. If one were to block that, you’d block all legitimate browser users, so realistically it’s not an option.
If you pretend to be a regular browser, Anubis exists to verify this. If you don’t pretend to be a regular browser, it does nothing.
How is this so hard to grasp.
@icedquinn @phnt @waifu Ah, yeah if you don’t do any UA blocking otherwise I guess there’s no point, but in my experience people seek out Anubis because general UA blocking couldn’t get the job done (because LLMs started to pretend to be browsers).
I would expect anyone that has set up Anubis to have reached the point where other forms of access control have stopped being practical.
@phnt@fluffytail.org @waifu@mai.waifuism.life You're having a really hard time reading my posts I guess. I'm sorry for you.
>There is no reason to prove a browser is “real” in the normal world.
Okay, keep believing that.
>you are trying to combat a bot attack
Correct. Sounds like there are reasons after all. Only took a whole sentence to figure that one out.
>your mitigation simply has no effect
It appears to have an effect, and not just for my personal cgit. It appears a lot of people are using it because they are seeing a (positive) effect in combatting LLM scrapers with it.
>it can be easily bypassed
It can, and that’s ok. Because if you bypass it, you become a unique UA that I can just block with any regular UA block in HAProxy. Even if you automate “random” UAs, I can put in a pretty excessive UA blacklist with patterns if I so desire. The entire point is that a connection using a regular browser UA has to prove they are in fact a regular, legitimate browser, because blocking those isn’t feasible, because you’d block nearly all legitimate traffic otherwise.
It’s not a hard concept, I think. I don’t know why I have to reiterate the same thing three times for you, but I truly hope this time it’ll stick. If not, for the love of Stallman please just cancel your Internet subscription.
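The reasoning in this post can be sketched as a three-way decision on the User-Agent header. This is only an illustration of the argument, not how Anubis or HAProxy are actually configured: the `classify` function, the pattern list, and the "starts with Mozilla/" test are all assumptions made for the example.

```python
import re

# Hypothetical blacklist patterns, matched case-insensitively.
# The names are illustrative; a real list would be whatever the operator sees misbehaving.
BLOCKED_UA_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"claudebot", r"meta-externalagent", r"amazonbot")
]

# Assumption: "claims to be a browser" is approximated as a UA starting with "Mozilla/".
BROWSER_UA = re.compile(r"^Mozilla/")


def classify(user_agent: str) -> str:
    """Return "block", "challenge" or "pass" for a User-Agent string."""
    # Known-bad UAs are cheap to block outright, even if they also say Mozilla.
    if any(p.search(user_agent) for p in BLOCKED_UA_PATTERNS):
        return "block"
    # Blocking browser UAs would hit nearly all legitimate traffic,
    # so those connections get a challenge instead.
    if BROWSER_UA.search(user_agent):
        return "challenge"
    # Anything else is left alone; if it misbehaves, its unique UA
    # can be added to the blacklist.
    return "pass"
```

A scraper that drops the "I" from Mozilla ends up in the "pass" branch with a UA nobody else uses, which is exactly the case where a plain UA block is enough.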
@tyil @waifu Along with this one:
map_hash_bucket_size 256;
map $http_user_agent $git_scrapers {
    default 0;
    "~*claudebot" 1;
    "~*meta-externalagent" 1;
    "~*amazonbot" 1;
}
server {
    server_name whatever;
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    location / {
        if ($git_scrapers = 1) {
            return 402;
        }
    }
}
@phnt@fluffytail.org @waifu@mai.waifuism.life It's cute that you only get 20rq/s and think that's the scale of "the real world".
As a sidenote, I'm already blocking entire ASs.
@ullard@shitposter.world @phnt@fluffytail.org @waifu@mai.waifuism.life /g/ brainrot is a real disease :(
@p Anubis is total garbage that makes some sites basically unusable for me when my connection is poor (often). If AI is bad then mechanisms implemented to combat it are worse. You don't combat enshittification with more enshittification. This is like in 《The Matrix》 where they block out the sun to combat the machines, just making an already-bad problem worse. @tyil @phnt @waifu
@p@fsebugoutzone.org @tyil@fedi.tyil.nl @phnt@fluffytail.org what if I want to only get connections from users and want zero bots reading my sites (that already use JavaScript) wouldn't using Anubis work for me?
@waifu@mai.waifuism.life @p@fsebugoutzone.org @tyil@fedi.tyil.nl @phnt@fluffytail.org Bots can access the site even with anubis, it just takes slightly more computational resources
@earslash@ebiverse.social @p@fsebugoutzone.org @tyil@fedi.tyil.nl @phnt@fluffytail.org I'm guessing that's enough to deter a few of them, no? Why do people use it if it doesn't work for this specific purpose?
@p@fsebugoutzone.org @phnt@fluffytail.org @tyil@fedi.tyil.nl what if I block every user agent that isn't a browser? I only want connections by users not bots
@waifu@mai.waifuism.life @p@fsebugoutzone.org @tyil@fedi.tyil.nl @phnt@fluffytail.org It does deter "some" bots - i don't know how effective it is in the larger context. I think a lot of the sites using it have poor coding behind them that can't efficiently handle any sort of major traffic, bot or not.
My insider source says the creator of Anubis is still surprised at the sudden influx of sites using it, and they never expected it to get this far (it was a personal side project after all). The PoW thing is supposedly temporary until they find a new way to identify bot traffic. See https://github.com/TecharoHQ/anubis/blob/main/web/index.templ
"Ultimately, this is a hack whose real purpose is to give a "good enough" placeholder solution so that more
time can be spent on fingerprinting and identifying headless browsers (EG: via how they do font rendering)
so that the challenge proof of work page doesn't need to be presented to users that are much more likely to
be legitimate."
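For anyone wondering what "the PoW thing" costs a client, here is a generic hashcash-style sketch. It is not Anubis's actual code: the function names, the string concatenation of challenge and nonce, and measuring difficulty in leading zero hex digits are assumptions chosen to keep the example short.

```python
import hashlib


def _digest(challenge: str, nonce: int) -> str:
    # SHA-256 over the challenge string followed by the nonce, as hex.
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce whose digest
    starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while not _digest(challenge, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest(challenge, nonce).startswith("0" * difficulty)
```

The asymmetry is the whole trick: the solver needs about 16^difficulty hashes on average, the verifier needs one. That is also why it only raises the cost for a bot rather than locking it out.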
@p Mastodon block-lists never get investigated or audited. If you get added to one you're there forever. And, they LARP about things like a "consensus model" for determining block validities---but, they all just blindly copy blocks over, meaning that they have a de facto engineered "consensus" through unanimity. Mastodon + Mastodon users really are like AI/Borg or something. Their very existence is a Fediverse blight. @earslash @tyil @phnt @waifu
@p@fsebugoutzone.org @phnt@fluffytail.org @waifu@mai.waifuism.life @tyil@fedi.tyil.nl
Archive.today (and ghostarchive.org) do not use headless chrome, they use an actual Chrome instance (some sites can tell the difference).
@phnt
>Fediverse as a network can't survive a split where basically the Mastodon and the rest divide and are unable to talk to each other
We're already at that point. The Mastodon side of the Fediverse essentially exists behind a virtual Iron Curtain of their own creation; a digital Berlin Wall of sorts with things like #fediblock. Which, is fine; This instance, my instance, blocks most of the Mastodon network preemptively, leading us to almost entirely exist within the consciousness of non-Mastodon-Fediverse. And, it's a very comfortable place to be. There was nothing gained in federating with Mastodon and, conversely, nothing lost in basically cutting them from our network collectively.
>Imagine a Fediverse, where ……
Simply, instance software should not respect the wishes of remote instance software. Simple as. As far as I'm concerned this stuff only becomes an issue whereupon our own software begins respecting remote instance demands, which I would hope developers like Silverpill never implement. Don't respect blocks, don't respect post denials, don't respect remote deletion requests. All of that, if desired, can be handled locally on the remote server. That's their own problem. You should not ever be able to issue demands against my own server.
E.g.: my server can issue any actions it wants, and receive any actions it wants; Your server does not need to respect any of the actions issued from my server, and vice-versa. If your server does not want to accept posts from my server, or deletes from my server, or wants to delete posts from my server, that's fine……only insofar as it being handled locally on your server---you should not be able to affect my own instance, however. If you want my post removed from your timeline or thread? Fine. But, it ought to remain on my own instance and others' own respective instances. You want to block me? Okay. But, I can still communicate with others included in the thread and continue the conversation.
If GoToSocial wants to adopt some hivemind Borg model as well, where any server gains control over remote servers, then so be it; They can do that if they wish. I don't know anybody worth following or caring about that runs GoToSocial and my general impression of the software is that it's exclusively utilized by people somehow worse than Mastodoners.
>I think they’re mostly static blogs run by neurotics and they don’t actually have a problem with bots.
To be completely honest this is my take as well. There is little to no reason why the Linux kernel project, one of the best-resourced (if not the best-resourced) FOSS organizations in the world, should have difficulty finding servers or writing code that can handle the load without Anubis.
>They inserted the rent-seeking maneuvers from the start. They were hoping it would happen.
It wasn’t until the viral thelibre article came out that people started mass adopting Anubis. It is possible the creator set themselves up for promotion
@phnt@fluffytail.org @adiz@mtl.jinxian.casa @tyil@fedi.tyil.nl @p@fsebugoutzone.org @waifu@mai.waifuism.life While on the topic of Mastodon keep in mind the default robots.txt on Mastodon is still just "GPTBot", despite the existence of numerous other major AI scraping bots and the addition of GPT-Search. Eugen can not merge a simple change that adds the new bots, making it almost useless. really shows the priorities and the mindset of Mastodon devs
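To illustrate why a robots.txt that names only GPTBot does little against other crawlers, here is a small check with Python's standard `urllib.robotparser`. The two-line robots.txt is an assumed stand-in based on the description in this post, not a copy of Mastodon's actual file, and the URL and the second bot name are just examples.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: a single rule aimed at GPTBot and nothing else.
ASSUMED_ROBOTS_TXT = [
    "User-agent: GPTBot",
    "Disallow: /",
]


def allowed(user_agent: str, url: str) -> bool:
    """Ask the parser whether `user_agent` may fetch `url` under the rules above."""
    parser = RobotFileParser()
    parser.parse(ASSUMED_ROBOTS_TXT)
    return parser.can_fetch(user_agent, url)


gptbot_ok = allowed("GPTBot", "https://example.test/users/alice")
other_ok = allowed("ClaudeBot", "https://example.test/users/alice")
```

With these rules `gptbot_ok` is false and `other_ok` is true: any crawler not named in the file is allowed by default, and all of this only matters for bots that choose to honour robots.txt in the first place.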
@earslash If you are a user of, or administrator of, Mastodon then you deserve the worst possible Fediverse experience and I hope only bad things for you. To quote Silky Johnston:
>I hate you. I hate you. I don't even know you, and I hate your guts. I hope all the bad things in life happen to you and nobody else but you.
@phnt@fluffytail.org @tyil@fedi.tyil.nl @waifu@mai.waifuism.life @p@fsebugoutzone.org I remember when 100% of the Linux money went into diversity lmfao
@p@fsebugoutzone.org @phnt@fluffytail.org @tyil@fedi.tyil.nl @waifu@mai.waifuism.life
Headless Chrome doesn't output to a screen at all, the "headful" Chrome they use outputs to a virtual framebuffer (similar to headless Firefox with your description). Because headless Chrome didn't output to anything it was possible to detect when it was running. For example, a certain variable relating to graphics would be different, or a driver would be missing from the headless Chrome. There were programs that could "patch up" the discrepancies from the webpage through the browser instrumentation - but there would always be more 'telltale signs' than what the programs could hide. Also, they were able to archive sites that were known to block headless Chrome browsers.
https://freeman.vc/notes/headfull-browsers-beat-headless has a bit more information on the differences
@laurel@fsebugoutzone.org @p@fsebugoutzone.org @phnt@fluffytail.org @tyil@fedi.tyil.nl @waifu@mai.waifuism.life KiwiFlare only shows up once per session. Anubis shows up every 3rd time i am navigating a site with Anubis enabled. It seems to me KiwiFlare is better designed than Anubis though they have different goals.
+bonifartius 𒂼𒄄
@p @earslash @tyil @phnt @waifu @adiz
> Hell is in sight.
does depend on the people though? i don't think NAS or dobbs.town would flip the switch on this kind of functionality to "on". the people who do like this hugbox stuff already import the blocklists and de facto created their own part of fedi - if they want remote permission to post things it's just another way for them to kneecap themselves.
/GTS side. Maybe it will make more people using instances running on those aware of the madness. If I stop following like three people on the Pleroma side of Fedi, all the broken threads I see would go away. But those people also make interesting posts, so it's always a balance of I'm annoyed, but not enough.
@laurel@fsebugoutzone.org @phnt@fluffytail.org @tyil@fedi.tyil.nl @waifu@mai.waifuism.life @p@fsebugoutzone.org IIRC the kiwiflare was forked from another programmer, so most of the grunt work might have already been done
Please, sir, may I have some GETs?
Here's a page with a thousand images on it. You won't see the images, they'll just fail to load.
@p@fsebugoutzone.org @laurel@fsebugoutzone.org @phnt@fluffytail.org @tyil@fedi.tyil.nl @waifu@mai.waifuism.life
The creator of Anubis claims there is "bait" in Anubis which AI companies are taking
Make of this what you will - in my opinion it's a 100% bluff considering it is easily bypassable by anyone
@adiz @phnt @earslash @tyil @p @waifu @bonifartius a lot of the people using bsky are people who hate mastodon because of the shit mastodon instances pull (which you never see around here) but also because of that and "problematic" people on this side of the internet, they want to be in the cool kids club.
If you're not on an instance that blocks, you can feel like Patrick Bateman in the business card scene, able to Link Up (tm) to all the techbros to get yourself a feature because you linked up to the right person and made something foss.
arstechnica.com/gadgets/2022/05/microsoft-open-sourced-the-code-for-1995s-3d-movie-maker-because-someone-asked/
@sendpaws@mitra.pawslut.party @adiz@mtl.jinxian.casa @phnt@fluffytail.org @tyil@fedi.tyil.nl @p@fsebugoutzone.org @waifu@mai.waifuism.life @bonifartius@qoto.org I hate this foone guy since back in his Twitter days
@earslash @phnt @bonifartius @tyil @waifu @p @adiz part of why I'm into PC98 stuff is it repels some of the worst people in that so called community I mean some are into it.......but they seem to have a melty about Japanese video games. Turns out being an ex-$10 forum poster rots your brain and makes you feel some type of way about them.
@phnt @earslash @tyil @p @waifu @adiz @bonifartius foone is literally a typical "retro tech eceleb" (annoying, half the shit's wrong) but formatted for Twitter retweets.
Which is the point about bsky I'm making too, the Twitter "cool kids" crowd moved there when they could no longer do the same shit they did back in the day with zero opposition.
@mrsaturday @earslash @tyil @phnt @p @waifu @adiz @bonifartius remember if you're an ex "internet hate site" janny, you can wipe the sins away by eating whoppers and big macs
@mischievoustomato There will be no "war". You're just going to see the network schism between those who want to have fun vs. those who want to control everything and everyone. Which, is basically the contemporary status quo anyway. So, I don't really think that much will change in general. @earslash @tyil @phnt @p @waifu
@p @earslash @tyil @phnt @waifu @adiz I'll never understand why people are so fervent on blocking AI user-agents.
I could maybe understand it if the rationale was traffic/load, but it really does appear to be "le ai bad" which has been rolled into the hivemind shit along with "le free speech bad"
But ultimately, all you're doing is ensuring that your message, the message you supposedly wanted out on the Internet for people to interpret and maybe accept, will not be included in the model and therefore cannot influence its output.
My Ekko lore archive gets a *lot* of traffic from OpenAI and Anthropic and I'm very happy about this because that means I'm in a position to countermessage TV Show before the LLM even begins producing a response to the user query.
@mrsaturday @earslash @tyil @phnt @p @waifu @adiz @bonifartius "nobody doesn't like me here"
@sendpaws
>fewer people tell me to kill myself here
Didn't BS just experience a sort of cultural disruption when administration decided that people couldn't tell other people to commit suicide anymore? @mrsaturday @earslash @tyil @phnt @p @waifu @bonifartius
@mrsaturday @earslash @tyil @phnt @p @waifu @adiz @bonifartius gonna name this "bsky interactions in irl" .mp4
@adiz @mrsaturday @earslash @tyil @phnt @p @waifu @bonifartius does it matter if it's selectively enforced anyhow?
@mrsaturday @earslash @tyil @phnt @p @waifu @adiz @bonifartius that's to cover the rage hole he punched in his wall after raging on stream
@p @earslash @tyil @phnt @waifu @adiz @bonifartius @mrsaturday that's foone irl (real selfies)
but using it globally will do nothing since you can still get posts from elsewhere and without a browser user-agent. Otherwise the federating protocol wouldn't be able to work.
Check this out, here are your public posts even if you would completely disable public access to Misskey's API:
curl -LH "Accept: application/activity+json" "https://novoa.nagoya/users/8ukmmetqq5/outbox?page=true"
How did I get there?
1. https://novoa.nagoya/.well-known/webfinger?resource=acct:sarvo@novoa.nagoya in a browser
2. copy the URL from href field in "0" field in the links array
3. curl -LH "Accept: application/activity+json" "https://novoa.nagoya/users/8ukmmetqq5"
4. find the outbox field
5. curl -LH "Accept: application/activity+json" "https://novoa.nagoya/users/8ukmmetqq5/outbox"
6. copy URL from the "first" field
7. curl -LH "Accept: application/activity+json" "https://novoa.nagoya/users/8ukmmetqq5/outbox?page=true"
8. Profit
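The same walk can be scripted. This is a rough sketch of the steps above using only the Python standard library; the helper names are made up for the example, and preferring the WebFinger link with rel "self" (falling back to the first entry, as in step 2) is my assumption about which link is the actor URL.

```python
import json
import urllib.request

ACTIVITY_JSON = "application/activity+json"


def fetch_json(url: str, accept: str = ACTIVITY_JSON) -> dict:
    """GET a URL with the given Accept header (redirects are followed, like curl -L)."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def actor_url_from_webfinger(doc: dict) -> str:
    """Step 2: pick the actor URL out of the WebFinger "links" array."""
    links = doc["links"]
    for link in links:
        if link.get("rel") == "self" and "href" in link:
            return link["href"]
    return links[0]["href"]  # fallback: first entry, as in the manual steps


def outbox_url(actor: dict) -> str:
    """Step 4: the actor document carries an "outbox" field."""
    return actor["outbox"]


def first_page_url(outbox: dict) -> str:
    """Step 6: "first" is either a URL string or an embedded page object."""
    first = outbox["first"]
    return first if isinstance(first, str) else first["id"]


def public_posts_page(user: str, host: str) -> dict:
    """Steps 1 to 7 chained together; performs real network requests."""
    webfinger = fetch_json(
        f"https://{host}/.well-known/webfinger?resource=acct:{user}@{host}",
        accept="application/jrd+json",
    )
    actor = fetch_json(actor_url_from_webfinger(webfinger))
    outbox = fetch_json(outbox_url(actor))
    return fetch_json(first_page_url(outbox))
```

Calling `public_posts_page("sarvo", "novoa.nagoya")` would repeat the manual walk, assuming the server still answers these requests. Note that none of the requests carries a browser user agent.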
>I could maybe understand it if the rationale was traffic/load
That is my rationale, at least. I only deploy Anubis on cgit because that’s the only service I host that goes OOM whenever it gets several thousand requests per second. My blog is a completely static site, it can handle that kind of load from LLM scrapers, so I have no need to try and limit the amount of connections there.