
Thorsten Leemhuis (acct. 1/4)

linus-next: improving functional testing for to-be-merged pull requests

https://lore.kernel.org/all/ZxZ8MStt4e8JXeJb@sashalap/

"[…] Linus Torvalds expressed concerns about the quality of testing that code receives before he pulls it. The subsequent discussion side-tracked to the testability of linux-next, but we didn't directly address Linus's original concern about pre-pull testing quality.

In an attempt to address the concerns, we're trying out a new "linus-next"
tree […]"


2/ @kees replied to the "linus-next" proposal from Sasha and raised a few points I fully agree with, as that proposal felt a bit off to me.

https://lore.kernel.org/all/792F4759-EA33-48B8-9AD0-FA14FA69E86E@kernel.org/

"Are people putting things in linux-next that they don't expect to send to Linus? That seems like the greater problem.

[…]

Why not just use linux-next? […]

[…] have a bot that replies to all PRs with a health check, and Linus can pull it if he thinks it looks good. […]"


@kernellogger @kees I don't see the point in getting another testing tree.


@vbabka and @ljs, that "Are people putting things in linux-next that they don't expect to send to Linus? That seems like the greater problem." from @kees reminded me of a question you might be able to help out with:

From a quick look it seems to me that the "mm-unstable" branch is in -next (via "mm-everything"). Does that contain stuff for the next merge window only, or more experimental stuff as well? It looks like the latter to me.

@kernellogger @vbabka @kees what does 'experimental' mean?

mm-unstable is everything that _appears_ to be going to Linus because nobody objected to it yet but some stuff might not end up going because it's got issues.

I'm not sure how you're supposed to differentiate between stuff that's eventually going to get reviewed to the point of not being submitted vs. stuff that'll go?

I think this is a bad take to be honest.

Next _should_ contain stuff we _expect_ will go to Linus, which _is_ all of that.

The issue is that people aren't bloody testing next!

@ljs @kees @vbabka

Well, good points, but fwiw, afaiui only patches that *were* reviewed by one of the official maintainers are supposed to be included in -next. To quote Stephen from https://lore.kernel.org/linux-next/20240716083116.27f179bd@canb.auug.org.au/

"You will need to ensure that the patches/commits in your tree/series have
been:
[…]
* reviewed by you (or another maintainer of your subsystem tree),
* successfully unit tested, and
* destined for the current or next Linux merge window."

@kernellogger @kees @vbabka they are ostensibly reviewed by Andrew.

I mean thanks for trying to make stuff get _less_ tested before rc though...
@kernellogger @kees @vbabka I don't necessarily agree with the process in mm as it stands, but this is how it works right now: a lack of objection plus Andrew not finding major flaws means Andrew in effect approves, and unless it's from a relative newcomer or he otherwise has reason not to want it, the series will get merged.

So it's representative of what will be in the next merge window (+ hotfixes not yet upstreamed for rc).

I'd like there to be more of a 'needs a tag from at least someone' rule in mm, but can't control that. It adds workload to people who have to act fast to stop stuff getting pulled in.

If we limited it to mm-stable or something then -next would be _miles_ behind what is going into the next merge window, and you'd also _not_ be stabilising as while not enough people test -next NOBODY tests mm-unstable so it'd become a pretty useless tree.

Again, I think the issue is that more people need to test against -next.

@ljs @kees @vbabka

> I think the issue is that more people need to test against -next.

Avoiding "this might eat my data" fears among the testers is a very important detail here – having patches in there that come from an "unstable" branch is enough to scare people away when it comes to mm or filesystems.

So changing that name might already help.


@ljs @kees @vbabka

But Andrew replaces whole patch series with newer ones over time, or am I wrong there?

That's something other subsystems normally don't do.

So from the outside mm looks a bit odd and scary, sorry.

@kernellogger @kees @vbabka I don't even know what you mean by that...

Patch series go through review + different versions that's how kernel development works?
@kernellogger @kees @vbabka who is testing unstable unreleased kernel code while also being scared of 'having their data eaten'??

The whole point of -next is 'here is a snapshot of what is probably coming next, please test it'.

rc's might eat your data too, the whole point is for testing to catch this stuff

@ljs @kees @vbabka sure, that's how it works, but most other subsystems *afaics* only add series to the trees they include in -next once they are considered ready; all problems that surface later must then be fixed by patches submitted on top of the subsystem tree, not by sending yet another, newer version of the series.


@ljs @kees @vbabka

eating data is always a risk spectrum (even in a stable release there is a risk that it happens) -- and depending on how risky something looks, people then decide what they use/test: stable, stable-rc, mainline outside the merge window (that's me currently), mainline all the time, -next.

The less likely the "might eat data" risk is perceived in -next, the more likely people will be willing to help with testing it.

@kernellogger @kees @vbabka so other subsystems go through multiple versions of a series, and then do a bunch of 'fix' patches on top for some reason? That's not really sane?

I doubt very much that's the case, if it is that's silly, git has rebase you know.
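[Editor's note: the reroll-by-rebase flow alluded to here can be sketched with stock git. Everything below (repo name, file, patch subject) is invented for illustration; `git range-diff` is one common way to show reviewers what changed between two versions of a series.]

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name dev

git commit -q --allow-empty -m "base"

# v1 of the (invented) series
git checkout -qb series-v1
printf 'int x = 1;\n' > feat.c
git add feat.c && git commit -qm "demo: add feat"

# v2: start again from the base and rewrite the patch itself,
# instead of stacking a "fix" commit on top of v1
git checkout -qb series-v2 series-v1~1
printf 'int x = 2;\n' > feat.c
git add feat.c && git commit -qm "demo: add feat"

# compare the two rerolls patch by patch
git range-diff series-v1~1..series-v1 series-v2~1..series-v2
```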
@kernellogger @kees @vbabka I mean some series take weeks and weeks and weeks to get through review.

So what you're saying is - during all of those weeks - we must not have build bots on the series, we must not have anything doing even ostensibly small levels of testing, we just wait until rc and ship it as is and that's 'safer' and will make people 'less scared'.

OK.
@kernellogger @kees @vbabka to be clear I am not a fan of the 'automerge' stuff, but I do think once something has at least sufficient review it should go in to -next, even if it might have unexpected later rounds of review.

Andrew won't take anything that has outstanding review on it btw. It is literally only the stuff that truly looks like it's going in.

But I concede I'd like there to be at least 1 tag and a day or 2 to make sure no obvious other objection before moving to -next.

The basically 'wait until we are really really sure there's nothing more before subjecting to testing' take though, yeah profoundly disagree.

What I've been finding with my series is that it goes to Linus and _suddenly_ you get a bunch of reports and are doing 12 hour days to fix things.

If you want less 'you ate my data!' I'd say test sooner.

Anybody testing -next and expecting serious stability is being silly even if all of your conditions were met; it's very fresh, unstabilised code and you can't rely on it.

And I literally run an rc as my main system atm btw...

@ljs @kees @vbabka

I'm not saying that. Seems I didn't get my point across. Whatever, let's please ignore this; discussing it here is unlikely to change anything anyway.

@kernellogger @kees @vbabka but that would be the result of what you're saying basically.

Which is why I object.... based on personal experience.

I'm a little mystified that you would tag me and I take all this time to respond on this issue to you and now you tell me to ignore it, but OK.
@kernellogger @kees @vbabka I would actually suggest something like a next-stable and a next-unstable to make it _super_ clear.

So somebody could then choose to use next-stable for only things subsystem maintainers are somewhat (as far as can be) sure are ok, and next-unstable for bots and workloads where you don't care

EDIT: adjusted for maturity
@kernellogger @ljs @kees he normally applies per-patch fixups from a new version, and squashes them before moving from mm-unstable to the mm-stable branch, at which point (around rc5?) the rebasing should stop. If the rewrite of a series is too drastic, it's replaced.
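[Editor's note: the per-patch fixup-then-squash flow described here can be sketched with stock git's autosquash machinery. The repo and subject lines below are invented; this illustrates the mechanism only, not mm's actual scripts.]

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name dev

# the original patch, as first merged into the "unstable" branch
printf 'v1\n' > a.c
git add a.c && git commit -qm "demo: add a.c"

# a per-patch fixup taken from a newer version of the series
printf 'v2\n' > a.c
git commit -qam "fixup! demo: add a.c"

# before "promoting" the branch, squash fixups into their parents;
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list as-is
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash --root

git log --oneline   # one clean "demo: add a.c" commit, now containing v2
```

`git rebase --autosquash` pairs each `fixup!` commit with the patch whose subject it names, so review fixes land inside the patch they belong to rather than as separate commits.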

@ljs @kees @vbabka

I tagged you as I wanted to better understand the -mm workflow; your answers helped with that. Thx for that.

But from there things went on to different topics and to whether that workflow is wise or unwise; that's not something I wanted to discuss.

@kernellogger @kees @vbabka lol man if I didn't like you you'd annoy me with this.

But I do like you so, fine fine... but you owe me a beer bro

@ljs @kees @vbabka beer! np! As long as I don't have to drink one. 🥴

@ljs @kees @kernellogger I think the "sometimes merges new series too quickly" part is perceived as the issue, especially if it breaks -next and prevents testing there. At last LSF Andrew said he might create mm-experimental as another branch that would not be in -next, so maybe that would move in the direction Thorsten would like too?
@vbabka @kees @kernellogger yeah I think we all agree that needs to be calmed down.

I am not a huge fan of having to jump on series and NACK early just to absolutely avoid some broken series going in; it's tiring and makes me the villain a bit

And @vbabka has always said I'm a lovely guy so you know not me at all...
@vbabka @kees @kernellogger but I'd definitely want at least build bot tests even on super super experimental stuff.

Nothing more soul destroying than finding out fucking nommu broke your series AGAIN

@vbabka @kees @ljs

yeah, sounds like it; and bots could even process that mm-experimental branch, like they do for quite a few other subsystem trees not included in -next.

@kernellogger @vbabka @kees hey we could even have a

next-for-humans
next-for-bots

;)

@ljs @kernellogger @vbabka Why? Just use linux-next. Nothing should go in it that's not ready to go to Linus. Who is it that isn't testing -next? I'm always testing there. All the CIs I see reporting to lkml are testing -next. I genuinely don't understand what's missing. -next is for testing. Do people need to be reminded to test on _copies_ of their data/images/workloads?


@ljs @kernellogger @vbabka Seriously, this is the classic development/testing/production cycle. Develop patches, test them alone (in a specific tree), test them integrated (in linux-next), send them to Linus (production).

But maybe I'm weird because the hardening folks do so much core/treewide work? Still, everyone should test that way.

@kees @kernellogger @vbabka I mean mm already only puts stuff in there that, if there is no objection, will go to Linus.

if there's objection or contention on a series, or it's a larger series, then it doesn't go in.

My guard pages aren't in for example...

There's 2 parts to this - how akpm handles his workflow + expectations around next.

-next should be 'things that are heading to Linus's tree as far as we know right now'. Putting further restrictions on it risks doing stuff like bcachefs, where completely unseen stuff goes in.

I've had several cases of things not being tested until rc so I am a bit passionate about this.
@kees @kernellogger @vbabka yup.

I'm not sure how much testing happens in the specific trees, but yup, this is how I see it too.

I've actually found things quite early in the -next integration stuff before.

The build bots at least run there, and people have also chased up things from -next.

But there's a lot of stuff that isn't tested until rc.

For me this whole thing is noise and the real issue is that _more_ testing needs to be focused on -next.

Plus obviously yes don't expect data to be safe in anything but 'stable' releases of linux (insert debate about that here)

@ljs @kees @kernellogger @vbabka

> Plus obviously yes don't expect data to be safe in anything but 'stable' releases of linux (insert debate about that here)

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

@ljs @kees @kernellogger I think there are bugs that you just can't catch with bots running VM guests with test suites, and that will only manifest on bare hw, maybe a specific laptop. And understandably, people would not start running -next on those. But testing -next more intensively than now, with the same guest VM scheme, is also unlikely to uncover these bugs.
@vbabka @kees @kernellogger I think you're just talking about yourself testing rc's on your literal laptop here...

I'm talking about both bugs that VMs and hardware could catch. Catching them as early as possible is a good idea.

I don't get what the 'understandable' thing is, if you have a giant farm of hw for linux testing at a larger level then bare metal testing does make sense just as much as for rc?

I also don't really understand your 'more testing won't test anything more' point either but maybe I'm missing something.

I'm not expecting randoms to run -next on a laptop obviously.

For me it's simple - test as much stuff as possible as early as possible.

That's it.

Surprised anybody would argue against it but there we are.
@ljs @kees @kernellogger yes I'm only talking about the bugs I've seen so clearly you've dealt with different ones that prove even VM based -next testing is insufficient, ok. It would be great if it improved, I agree and won't argue against it.
@vbabka @kees @kernellogger while there's good coverage from bots across the board I don't think quite everything is tested in that realm.

And I've definitely had stuff that's come up on rc first, or it feels that way at least...

To focus on the positives I think we agree that:

- mm needs a smoother process to avoid any actual -next breakages or putting patch series into -next that are clearly going to get yanked
- test as much as possible as early as possible.

@ljs @kees @vbabka

> I'm not expecting randoms to run -next on a laptop obviously.

Agreed, but I wonder if we should aim for "all kernel developers should feel safe enough to use -next as their daily driver" (side note: one kernel developer once stated they avoid mainline -rc1 releases after being burned by the famous problem back in 5.12-rc1 [hope I don't misremember the version])

@kernellogger @kees @vbabka 6.12-rc1-3 had a serious problem too :P especially if you use proton...

Yeah I don't think we should aim for that tbh, because it might cause people to be less willing to submit stuff earlier to get hammered by bots.

To me an average random who wants to test should be on an rc. I'm on an rc and I'm very very average I'm constantly told

@ljs @kees @kernellogger @vbabka Sane or not, it is pretty much how some subsystems do it. The trees are rolled and re-rolled before submitting to go into `-next`. Though I think fs rebases and rejiggers their trees every time things need to be fixed up instead, similar to mm.


@ljs @kees @kernellogger @vbabka You shouldn't be paying too much attention to what others say.

@oleksandr @kees @kernellogger @vbabka I mostly pay attention to what you say Sasha and mostly about Czech cuisine

@ljs @kees @kernellogger @vbabka Obsession is your profession.


@ljs @kees @kernellogger @vbabka This one is poisonous, beware.


@kernellogger @ljs @kees @vbabka There is also the option of running tests in a VM or whatever - dogfooding -next is probably foolish but you don’t have to give it your live production data.


@kees @ljs @kernellogger @vbabka The usual complaint with testing -next is that it moves fast enough to be an issue for longer-running tests (which is going to be a problem for any integration tree at kernel scale) and that occasionally it does badly enough that you just lose all your testing due to breakage (again, I think that's relatively unavoidable) and can't focus on what you're actually interested in.

I do have sympathy with those who feel they can’t keep up with the volume of change.
