Migrated over from Hazzard@lemm.ee

Cake day: June 28th, 2025

  • It’s usually pretty good about that: very apologetic (which is annoying), and it generally does take corrections into account, although it sometimes needs reminders as that “context” gets lost in later messages.

    I’ll give some examples. In that same networking session, it disabled a security feature to test whether it was related. It never remembered to turn that back on until I specifically asked it to re-enable “that thing you disabled earlier”, to which it responded something like “Of course, you’re right! Let’s do that now!”. So: helpful tone, it “knew” how to do it, but it needed human oversight or it would have “forgotten” entirely.

    Same tone when I’d tell it something like “stop starting all your commands with SSH, I’m in an SSH session already.” It would reply something like “of course, that makes sense, I’ll stop prefixing commands with SSH immediately”. And that one sticks, I assume because it sees itself not using SSH in its own later messages, thereby “reminding” itself.

    Its tone is always overly apologetic, flattering, etc. For example, if I tell it bluntly that I’m not giving my security credentials to an LLM, it’ll say something along the lines of “great idea! That’s a good security practice”, despite having directly suggested the opposite moments prior. And as we’ve seen in plenty of examples, it takes that tone even when it actually can’t do what you’re asking, such as when people ask ChatGPT for a picture of a “glass of wine filled to the very top”, so its tone tells you nothing about whether it can actually correct the mistake. It’s always willing to take another attempt, but I haven’t found it consistently capable of solving the issue, even with direction.


  • Man, AI agents are remarkably bad at “self-awareness” like this. I’ve used one to configure some networking on a Raspberry Pi, and found myself reminding it frequently: “hey buddy, maybe don’t lock us out of connecting to this thing over the network, I really don’t want to have to wipe it because it’s running a headless OS”.

    It’s a perfect example of the kind of thing that “walk or drive to wash your car?” captures. The model needs to pick up on some non-explicit context and make some basic logical inferences before it can be even remotely trusted to do anything important without very close expert supervision, and that degree of supervision makes it nearly worthless for this kind of task, because the expert could just do the work themselves.