Will the webmasters run a script to dereference all the URL shorteners? They can.
@corbet
Good. I think those shorteners created more problems than they solved.
@corbet ripgrep query that finds 'goo.gl' links in a directory, e.g. archive of mail or social media dump:
rg -oNI --no-heading -e 'https?://goo\.gl/[0-9a-zA-Z/]+'
@corbet this makes me wonder whether the Wayback Machine tracks shortened links properly…
since I guess that would be a way to recover them after it gets shut off, albeit in a very annoying way
@tofugolem @corbet sure, but Google must have the resources to simply make it read only rather than breaking all those links
@corbet Recently watched a talk by @textfiles about losing our history. And apparently link shorteners were already a problem back then.
"URL shorteners are the stupidest idea we've come up with in the last 10 years."
@clarfonthey @corbet Sort of; URLTeam (part of ArchiveTeam) has been continuously archiving link shorteners: https://wiki.archiveteam.org/index.php/URLTeam, and although not a part of the Internet Archive, the crawls *do* end up there I believe
@clarfonthey @corbet yes, they do work properly inside WBM in most cases. There is a long history of doing this for dead or dying link shorteners: https://wiki.archiveteam.org/index.php/URLTeam
@corbet another Google Graveyard…and another reason to stop using Google products as much as possible.
@corbet I think the vast majority of these are from syzbot emails and many are the same, a link to the syzbot docs. Something like https://lore.kernel.org/all/?q=nq%3Agoo.gl+and+NOT+%28f%3Asyzbot+OR+s%3Asyzbot%29 returns only about 700 emails
@corbet Developing that interstitial page and writing that blog post has to be more work than keeping the service running forever! Fricking Google.
(Unless another team deprecated the infrastructure it runs on.)
@corbet We deserve this for relying on proprietary services frivolously.
weird. that’s perfect for analytics and they love data
@corbet @darrell73 If there's no image description attached, I'm not doing anything with it. I thought I'd gotten rid of this left-performative profile by unfollowing. Apparently I still have to put in a block. Too bad.
@corbet@social.kernel.org
World needs to take the hint and stop relying on Google for anything.
@corbet I guess there's an "easy" fix. Scrape all of the URLs to get the redirection and edit the links in the history.
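A minimal sketch of that "easy" fix, assuming the shortlinks have already been collected into a list fed in on stdin (the function name and the tab-separated output shape are my own invention, nothing Google provides):

```shell
#!/bin/sh
# Sketch: expand shortlinks read from stdin (one URL per line) into
# "short<TAB>target" pairs on stdout, without following the redirect.
resolve_shortlinks() {
    while IFS= read -r url; do
        # curl -s: silent; -I: HEAD request; -o /dev/null: discard headers;
        # -w '%{redirect_url}': print the redirect target curl extracted.
        target=$(curl -sI -o /dev/null -w '%{redirect_url}' "$url")
        printf '%s\t%s\n' "$url" "$target"
    done
}
```

You'd pipe the deduplicated output of the earlier `rg` search into `resolve_shortlinks > mapping.tsv`, ideally with a politeness delay before pointing it at the live service.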
@andrewt @tofugolem @corbet They did, they made it read-only 6 years ago. And now they're giving another 12 months warning, adding a little pain to links using it, even more chance for those using it to do something about it.
@corbet FWIW I captured a snapshot of those (non-syzbot) redirects here: https://github.com/vegard/vegard.github.io/blob/master/linux/2024-07-19/goo.txt
Of course some (many?) of the redirected-to URLs are themselves already defunct...
@corbet This seems to happen every month or so. It's amazing to me that people still keep going back for more. 295 products/services & counting!
Google Graveyard - Killed by Google
https://killedbygoogle.com/
@corbet I feel more vindicated self hosting everything I can every passing day.
@hatter @andrewt @tofugolem @corbet the point is it breaks every archived post that ever used it, it's a huge loss
@shiri @corbet ArchiveTeam has software you can run on your computers to help archive all kinds of services that are about to shut down, and one of their long-term projects (URLTeam) archives URL shorteners (from what I can tell, goo.gl isn’t currently being actively archived, but I assume that’ll change soon)
@corbet hopefully someone resolves all of them and stores them in an index somewhere
@vitriolix @andrewt @tofugolem @corbet Someone maintains goo.gl, someone maintains those archives. When the people still maintaining the shortener stop caring about the shortener, it's time for the archivist to do the work to preserve what they care about. Also, other archivists are doing what they can to preserve all links, regardless of immediate value that anyone else gives to each link. Likely very little will be lost in such a long deprecation cycle, and even less of value will be lost.
@corbet thankfully, ArchiveTeam had a long-going project that collected millions of those URLs, so they won’t be lost.
my infra helped at some point, I was running archiveteam runners for quite a while :3
@a1ba@suya.place I literally always thought they were both a security and a longevity risk, and I'm not glad to see that I'm right. Curse Twitter for making people feel the need to shorten their URLs so much. I've seen several other smaller shortener services dying over the years but this is the worst one.
@corbet I always wonder about the amount of history that would be lost if the old mailing lists from Google Groups were to vanish from one day to another
@corbet Google is intentionally breaking the internet. I wonder what they plan to try and sell us in the smoke of the damage?
@corbet obviously this is a bad-faith fuck-you to their customer base, with the fuck gradient correlating closely with length of customership.
seems crazy that they're capable of wanting to destroy this, but not as crazy as the fact that there are still people willingly using google products.
Wonder if they know that this is gonna break google workspace? Lots of the auto-generated URLs there are goo.gl
@oleksandr @corbet respect to the man bringing a sense of fairness to a grudge fight 😂😂💯
@corbet What a rotten thing to do. I’m sure someone else would be willing to take over running that if it’s too hard or too expensive for Google.
Self host url shorts without Google tracking 😁
I recommend https://yourls.org/docs
@corbet Happens when you put the fate of technology in the hands of organizations that only care about short-term profits. This is also why I'm against things like streaming and SaaS. If something requires you to connect with a company's servers every time you use/consume it, it will be gone as soon as it no longer serves that company's bottom line.
@corbet crazy to me, they must be running the service on a $10 a month vps, why can't they just keep it going?
@corbet yet another display of the consequences of dependency on profit-driven organisations.
@corbet You have to be really careful using almost any Google service; Their history of just dumping things on a whim is long.
They just did another update on the GitHub site last week and I don't see anything about end of life, but it's mostly PHP, so I could handle it.
Some URLs on our servers could use it, but most of ours are fairly short anyway after I started getting into the rewrite codes.
I'll keep digging to see if they are discontinuing it though, thanks.
All true but my point is not using Google for anything is the best reason.
With any of those corporate assholes using tracking and AI, all the better to take the Internet back 😀
@corbet Filed under glad I didn’t use it and you can’t trust Google for anything.
url shortening was always stupid tho... it's hecking stupid to expect google to keep anything around, especially with how shortsighted their CORS setup is and has been.
like yes let me make it even easier for the spy company to spy on me -_-
On the one hand, this sucks.
On the other hand, it might help some people realize:
1) It is always a bad idea to expect any Google service to remain in place, given their ... ha ha ha ... track record.
2) Don't use URL shorteners. Ever. WTF is wrong with you, just don't do it.
@corbet I get the feeling this is one of those things that is going to cause mass havoc because some legacy software uses these internally and that when enough people scream about it, Google will be forced to keep the links working.
@corbet
Ah, makes sense. I mean they removed their "Don't be evil" statement. Now they have to act accordingly...
@netzwerkgoettin
@corbet My first thought was that should be no problem for the Internet Archive to back up.
...then I thought, why does a scrappy nonprofit have to do it instead of the $400 billion company keeping them up in the first place?
@corbet this is irresponsible behavior. Old links should be kept alive.
@corbet I always felt that URL shorteners were the wrong solution for a problem that didn't exist. Unless, of course, you want to use them to ensure that people DON'T know what link they are clicking.
@corbet They should release the database into the public domain, so everyone can do what they can do.
Thank you. This was what I was looking for.
Is almost certainly already aware of this, but tagging him here just in case.
[Edit] never mind. He already chimed in down thread.
PrOpRiEtArY link shortener service does PrOpRiEtArY things.
Use #commons for all that has worth.
@corbet Penny-wise and Pound foolish. One more for "Killed by Google"
@hatter @vitriolix @tofugolem @corbet oh yeah, it's definitely costing them money to run it and they're going about it in as good a way as you can expect — most of the old Twitter-era services just quietly stopped working while nobody was looking. To be fair OP was right, really this is Twitter's fault for creating an artificial need for these silly forwarding services in the first place, although I'm sure analytics services would have normalised it anyway
But it does feel a bit odd. Like, Google's core business is (was?) a constantly updating, publicly searchable live index of almost every page on the internet, and they really find it too expensive to maintain a static index of a billion or so string-to-string key-value pairs, which you can only look up by primary key, with exactly zero UI, that's already set up and presumably doesn't see much traffic any more? It's going to cost them more to shut it down this gracefully than it would to run it for another decade, surely?
@corbet It's a good thing they dropped their former motto as it would be really hypocritical given their current policy of doing as much evil as possible.
Oh what? So goo.gl has been deprecated since 2018, and an automated Google bot still has goo.gl in their email footer? They really don’t know what they are doing.
Up to message 18000 these are just footer links, so maybe 2000 real messages, which includes quotes and whole mail body copies.
@corbet General recommendation: don’t “shorten” URLs. That’s just another gatekeeper/database between readers and your website.
@hatter @vitriolix @andrewt @tofugolem @corbet If no one writes a script for this, I'd say they don't even care about their archives. I mean, I would if I needed to. I bet it could be done with a one-liner!
Google is in the business of harvesting human behavioural data from its users and selling results from human prediction models to the highest bidder. They are also in the business of using their knowledge of said users to influence user decision-making and opinions, also at the behest of the highest bidder.
It is likely that the URL shortening service offers no additional behavioural data for them to harvest. Therefore, it is useless. So, shut it down as it consumes resources above 0 and running anything carries with it operational risks (however minimal) that they can do without.
@tagomago @hatter @vitriolix @tofugolem @corbet I mean they *don't* care about that, we know that. They've clearly long since decided that old services, even well liked and used ones, are going to get shut down and it's up to users to deal with that. And that's mostly fair enough, they used to experiment a lot but that means most of them would fail sooner or later, and I'm sure as much as we complain when they do it, most people don't care and it doesn't really hurt Google's numbers. But I mean, their propensity to sunset everything has got to be a big part of why more businesses don't use their cloud offering. I even run Gmail from behind a forwarder in part because I don't entirely trust it will exist in five years or I'll want to use it if it does.
@Sweetshark @corbet Good idea. @textfiles are you aware of this?
@corbet Ah, great, another step to digital wasteland. Time to admit that the internet is no cultural heritage, but ephemeral.
Truth is, (external/public) link shortening services always were a bad idea. They only exist because of microblogs, where you have to fight for each precious message character.
However, I feel like having created such an abomination should lock you forever into the obligation to keep it alive, until the last referred-to link breaks.
@corbet
Would it be possible to take all those 19,000 links, look at where they go, and make them available under another domain?
@corbet damned good reason for oss-security list to insist on including the important content from websites in posts to the list!
@danderson@hachyderm.io @shiri@foggyminds.com @corbet@social.kernel.org I thought it wasn't, because in the Warrior projects section the legend at the start says pink means currently being scraped, and the row for goo.gl is white
> Today, the time has come to turn off the serving portion of Google URL Shortener.
Not very long ago, everyone who worked at Google would have understood instinctively that there is no such time.
@corbet That's probably a substantial cultural loss. But on the other hand, who would have thought that embedding a link somewhere for the long term while needlessly relying on some specific proprietary third party service is a bad idea? I wish people would be more aware of the fact that all companies they interact with will try to lock them in in their product ecosystem if they can. Letting that happen might be convenient but always comes with a risk.
Archive.org is up for helping...
The original URL shorteners thought about this, and archived their links with archive.org .
https://archive.org/details/301works?tab=about
I hope google joins now, and gives us the host domain so we can make them continue to work (redirect into the wayback machine that would archive the redirect).
please.
@corbet After downloading the mailbox file from the linked search result and poking in it, it appears a lot of those mentions are duplicates; removing duplicates gets it down to about 600~800 unique goo[.]gl links. In the case of LKML, that's fairly easy to archive.
I'm not sure the mailbox file is everything, though, so this may still be off.
Regardless, the closing of the service is still a massive loss.
@brewsterkahle
@c3manu said in https://chaos.social/users/c3manu/statuses/112812473668724559 :
-
@drewdevault if you wanna help without the luxury of getting the db, people are currently organising in #archiveteam-bs (still deciding on a dedicated channel name)
the url shortener project has been running for a while now, including for goo.gl urls
We at MPAQ are always setting up services that the corporates use so that we don't need them anymore.
We have blogs, email, live music and many things. ATM, I'm working on a url shortener 😁
It doesn't matter what corporation it is, Fakebook and TwitterDumb are also on my hate list.
We are even hosting our own social network, Beamship 😁
@brewsterkahle @corbet thank you for having taken over purl.org when the library monopoly, err, cooperative, gave up on it
@corbet can the @internetarchive get a backup of this database for posterity?
@albertcardona @corbet not all of them can, and not all of them will. You could also query the shortener as much as possible before the shutdown and export as many of the mappings as you can.
@corbet hey, I wrote you a short thing: https://git.sr.ht/~gnomon/fetch-goo.gl-shortlink-dereferences
It turns out that the ~19,000 goo.gl shortlinks in that lore search you posted deduplicate down to about 360 unique shortlinks. The script in that repo can pull down about 285 of them. Stuffs 'em in a smol sqlite3 database with a simple index that makes lookups, even in a tight loop, close to instant.
It's only the very simplest proof of concept, but it _does_ work. The Lore picture is not as bad as I expected.
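For anyone curious what that shape looks like, here's roughly how such a lookup table can be built with the stock sqlite3 CLI (the file names and the sample row are invented for illustration; the linked repo is the real thing):

```shell
#!/bin/sh
# Sketch: a short->target table whose PRIMARY KEY doubles as the index,
# making point lookups effectively instant. The sample data is made up.
printf 'https://goo.gl/abc123\thttps://example.com/long/page\n' > mapping.tsv
sqlite3 shortlinks.db <<'SQL'
CREATE TABLE IF NOT EXISTS redirects (
    short  TEXT PRIMARY KEY,  -- the goo.gl URL
    target TEXT NOT NULL      -- where it redirected to
);
.mode tabs
.import mapping.tsv redirects
SQL
# Point lookup against the indexed column:
sqlite3 shortlinks.db \
    "SELECT target FROM redirects WHERE short = 'https://goo.gl/abc123';"
```

Because `short` is the primary key, even a tight loop of lookups stays fast without any extra index.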
@corbet @monsieuricon indeed! By coincidence I happened to do a deep dive into the public-inbox codebase last week¹; it's part of why that script I wrote is the way it is. The sqlite3 DB fits into the existing prerequisites for that codebase.
While I agree that the git history rewriting would be bad, regenerating the Xapian indices would be even worse.
However I think a _render time_ transformation might work, conceptually like git's mailmap. I am experimenting.
@corbet Maybe Google could ameliorate the pain of killing their Short URL service by setting up a system where you could query one of their short URLs that are going away, and get back a redirect to the URL it originally pointed to.
It's the #1 reason I try to avoid URL-shorteners (reason #2 is that they can insert unwanted stuff in the redirect).
Hopefully @internetarchive, @textfiles or @ArchiveTeam will step in and try to archive as much as possible, and users of googl shortener will expand the shortened URLs in time.
I will track down the ones used on my blog today and, if they're still there, expand them.
@Sweetshark@chaos.social @corbet@social.kernel.org that will take a while.
@bastelwombat @Sweetshark @corbet been working on it for 10 years so yes