A professional kernel hacker, born on August 6, 2000, and living in Korea (the South one!).

- Linux Kernel Developer @ Oracle (Linux Kernel MM) (2025.02 ~ Present)
- A slab subsystem co-maintainer and a reviewer for the reverse mapping subsystem
- Former Intern @ NVIDIA, SK Hynix, Panmnesia (Security, MM and CXL)
- B.Sc. in Computer Science & Engineering, Chungnam National University (Class of 2025)

Opinions are my own.

My interests are:
Memory Management,
Computer Architecture,
Circuit Design,
Virtualization

Hoshino Lina (星乃リナ) 🩵 3D Yuri Wedding 2026!!!

Edited 28 days ago

Typical ML argument: "If I can read something legally, why can't I train an LLM on it?"

Humans are capable of reading things and later writing a similar thing that is still a copyright violation. If I go and write a book that follows the plot line of Star Wars, that's still a copyright violation, even if no text is literally the same. If I play the melody to a song on my piano and release it without the appropriate mechanical cover license, that's also a copyright violation.

The reason this does not happen often is that, as humans, we are aware that that's plagiarism and there are rules. Sometimes it happens by accident, and people still get sued and lose.

LLMs have no such awareness and routinely output things which are blatant copyright violations when appropriately prompted. That means the model weights encode that work, and therefore, are themselves a derivative work.

Your brain encodes a massive amount of copyrighted information. You are not a walking copyright violation because humans aren't data, can't be copied and distributed en masse, have human rights, etc. This is why "mind reading machines" are a classic dystopian plot point (monetizing your thoughts etc).

An LLM is not a human, does not have human rights, nor human privileges. It is data, and if it encodes copyrighted information, that's a derivative work. If you aren't following the license of the training data, that's a copyright violation.


Harry (Hyeonggon) Yoo

it is quite confusing that the floor index starts from zero here in Zagreb
@Logical_Error some people say "we're running out of data to train AI, and that's a problem. we need more data"

but no. that's not the problem. the problem is that you can't make LLMs that are experts on every single field, even after training them on all the public data.
@lkundrak @ljs so true! unsettling thoughts keep showing up when you have lots of time to think

Harry (Hyeonggon) Yoo

Edited 1 month ago
@ljs

I guess mine is a tiny complaint compared to yours... even Korean workaholics don't work like that :'(

life sometimes forces unavoidable pain on us but it's important to keep going

Harry (Hyeonggon) Yoo

Edited 1 month ago
I need a break, but I can't take off so just zoning out for 30 minutes
@Logical_Error yeah AI is quite a nice tool if you can review its output.

But decoding the ideas behind the code from history and past discussions takes time (whether you are human or AI)
@ptesarik @axboe @ljs @vbabka @gregkh @Aissen

see my dominance signal, a claude subscription!
@axboe @ljs @vbabka @gregkh @Aissen

oh geez, that's not adding much value to the project :/

Harry (Hyeonggon) Yoo

Edited 1 month ago
@axboe @ljs @vbabka @gregkh @Aissen

it's really sad that people do that.
what do these people care about, then?
@Aissen @axboe @gregkh @ljs @vbabka

bittersweet because people report bugs on features they don't use rather than fixing them and stepping up to help?
@lkundrak

hello sir,
hail satan!🍷

Harry (Hyeonggon) Yoo

Edited 1 month ago
when I open Mastodon, and think "hmm, there's a name and profile picture that I don't recognize. who is it?"

and then I realize: oh wait, it's @lkundrak !

the last few names I remember are "hammer smashed filesystem" and "pope of nope"
@ljs @vbabka

fish and chips without the fish,
mac and cheese without the cheese,
haircut without.... OMG I've almost crossed the line!
@ljs @vbabka

oh mom and dad, I'm growing up!

but what can I do if I love growing up but hate getting older?

Harry (Hyeonggon) Yoo

Edited 1 month ago
I don't find code commentary helpful unless it explains the design first, because these days the implementation is way too complicated for readers to grasp the design by reading the implementation.

Oftentimes such code commentary mechanically describes a list of statements, rather than explaining the idea behind them.

A well-written document should try to describe the idea behind the implementation, rather than the implementation itself. (Yeah, that's challenging)

Comments that merely restate the code don't save much time compared to reading the code directly.

(some random rant of the day)

Harry (Hyeonggon) Yoo

I used to think that using en/em dashes in writing was pretty elegant... until LLMs started ruining them, and now using them makes the text look like it was generated by an LLM.
@ljs @vbabka @hny oh no, what can we do about it?