@maija @Suiseiseki I think there’s still a big difference between having the source code closed vs available. Closed source is a huge red flag - like, what are you trying to hide? Are you doing something sus? Nothing to hide, nothing to fear, right?
The absence of this red flag doesn’t mean you should blindly trust the software (as e.g. NPM shows), but it makes establishing trust easier, especially in conjunction with other possible signs of trustworthiness.
they can just give you fake, misleading code
Reproducible builds would help ensure that they can’t just put random bullshit in the binary, because anyone can independently verify that it matches the source code (by obtaining the exact same binary from the exact same source code). It won’t help much if the source code itself is malware, but it’s still better because there are fewer points at which malware can be introduced.
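To make that concrete, here’s a minimal sketch of the verification step, assuming the build is actually deterministic (same toolchain, same flags, timestamps stripped, etc.). The file paths are made up for illustration:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical paths: the binary you built yourself from the published
# source, and the binary the vendor actually distributes.
local = sha256_of("my-build/app-1.0.bin")
official = sha256_of("downloads/app-1.0.bin")

if local == official:
    print("Match: the distributed binary was built from this source.")
else:
    print("Mismatch: the distributed binary differs from your build.")
```

The whole point is that anyone can run this check independently; the vendor never has to be trusted about what went into the binary.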
@maija @Suiseiseki It doesn't happen right now, but it was happening back when Copilot first got released: it would "generate" code from GPL projects. Since then Microsoft changed Copilot so that it prioritises code under MIT-, BSD- and Apache-style licenses for AI completion/generation, and it will only ever spit out GPL code when the license of the project is also GPL. Though incompatibility between the specific GPL version the project uses and the GPL version of the code Copilot spits out is still bound to happen.
@Suiseiseki @maija >Most software is terrible, so most inputs were terrible, so most outputs are terrible
LLMs are, at their most basic, mathematical models that spit out an average of their input dataset. When I say "average", I mean the very top of the curve the specific LLM uses: most use a simple bell curve to calculate results, others use fancier math, like an average within a constrained negatively skewed distribution, to give you better results. But the point stands: AI-generated code will be some function of the average of the code fed in as the dataset, and most of the code on, for example, GitHub turns out to be terrible, with probably 15% or less being actually good code. Unless Copilot is specifically trained on just that 15% of good code, all the code it spits out will tend hard toward terrible, and only once in a blue moon will it output something remotely resembling the good code.
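As a toy illustration of that probability claim (not of how an LLM actually works internally), plugging in the assumed 15% figure from above:

```python
import random

# Toy model of the argument: if a model's outputs follow the quality
# distribution of its training data, and only ~15% of that data is
# good code, then only ~15% of generations will be good.
GOOD_FRACTION = 0.15  # assumed share of good code in the training set
N = 100_000           # number of simulated generations

good_outputs = sum(random.random() < GOOD_FRACTION for _ in range(N))
print(f"good generations: {good_outputs / N:.1%}")  # ~15.0%
```

Under that assumption, curating the dataset down to only the good code is the only lever that moves the output distribution, which is exactly the point about training Copilot on just the good 15%.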
@maija @Suiseiseki @EdBoatConnoisseur you two should get a room ngl
@maija @Suiseiseki @EdBoatConnoisseur you're living the good life god damn