• kibblebits@quokk.au
    English
    arrow-up
    2
    arrow-down
    6
    ·
    15 hours ago

    Give me an example of what you’ve asked it to do? And, what model and app did you use?

    • ozymandias117@lemmy.world
      English
      arrow-up
      6
      ·
      edit-2
      14 hours ago

      Not OP, but I was pretty disappointed trying Claude 4.6

      Prompted

      Write a C program to find the longest word in a static 5x5 array of characters.
      
      These characters shall be defined in a header file, you may allocate it with any letters for now
      
      This program should find the longest word, using words available in a file at /usr/share/dict/words
      This file will have one word per line
      
      The rules of the longest word are that you may select the next letter in any direction from your current letter one character away, including diagonals
      
      Any index may be the starting point, and you may not repeat a space on the grid
      

      It did a breadth-first search for the longest path and only then checked whether that path was a word, rather than checking at each step, so it never found any words

      When I asked it to fix that, it then opened and reread the entire dictionary for each character

      Once I got it to fix that, I asked it to read the input array from a file, and after 30 minutes of asking it in different ways, it never managed to successfully read that file in

      All in all, it took longer than just writing it myself, even for what I would call an interview question
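
For reference, a minimal sketch of the approach the comment above is asking for: load the word list once, then run a depth-first search from every cell and check the dictionary at each step, abandoning a path as soon as no word starts with it. This is an illustration, not the code Claude or the commenter wrote. The names `Dict`, `dict_load`, `lower_bound`, `dfs` and `longest_word` are hypothetical, the sorted-array-plus-binary-search dictionary is just one option (a trie would also work), and it assumes the grid holds lowercase a-z only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N 5
#define MAXW (N * N)   /* a path visits each cell at most once */

typedef struct {
    char **words;      /* lowercase a-z only, sorted with strcmp */
    size_t count;
} Dict;

int cmp_str(const void *a, const void *b) {
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Read the word list ONCE. Assumes each line is shorter than 255 bytes. */
int dict_load(Dict *d, const char *path) {
    FILE *fp = fopen(path, "r");
    char line[256];
    size_t cap = 0;
    d->words = NULL;
    d->count = 0;
    if (!fp) return -1;
    while (fgets(line, sizeof line, fp)) {
        size_t len = strcspn(line, "\r\n");
        int ok = len > 0 && len <= MAXW;
        line[len] = '\0';
        for (size_t i = 0; ok && i < len; i++)
            ok = line[i] >= 'a' && line[i] <= 'z';
        if (!ok) continue;             /* skip names, apostrophes, etc. */
        if (d->count == cap) {
            char **tmp;
            cap = cap ? cap * 2 : 1024;
            tmp = realloc(d->words, cap * sizeof *tmp);
            if (!tmp) { fclose(fp); return -1; }
            d->words = tmp;
        }
        d->words[d->count] = malloc(len + 1);
        if (!d->words[d->count]) { fclose(fp); return -1; }
        memcpy(d->words[d->count++], line, len + 1);
    }
    fclose(fp);
    if (d->count) qsort(d->words, d->count, sizeof *d->words, cmp_str);
    return 0;
}

/* Index of the first word >= key (binary search lower bound). */
size_t lower_bound(const Dict *d, const char *key) {
    size_t lo = 0, hi = d->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(d->words[mid], key) < 0) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

/* Extend path with cell (r, c), check the dictionary, then recurse. */
void dfs(const Dict *d, const char grid[N][N], int r, int c,
         int used[N][N], char *path, size_t len, char *best) {
    size_t i;
    assert(len < MAXW);
    path[len++] = grid[r][c];
    path[len] = '\0';
    i = lower_bound(d, path);
    /* Prune: if no dictionary word starts with path, stop here. */
    if (i == d->count || strncmp(d->words[i], path, len) != 0) return;
    if (strcmp(d->words[i], path) == 0 && len > strlen(best))
        strcpy(best, path);
    used[r][c] = 1;
    for (int dr = -1; dr <= 1; dr++)
        for (int dc = -1; dc <= 1; dc++) {
            int nr = r + dr, nc = c + dc;
            if ((dr || dc) && nr >= 0 && nr < N && nc >= 0 && nc < N
                && !used[nr][nc])
                dfs(d, grid, nr, nc, used, path, len, best);
        }
    used[r][c] = 0;                    /* free the cell for other paths */
}

/* Writes the longest word found into best (MAXW + 1 bytes); "" if none. */
void longest_word(const Dict *d, const char grid[N][N], char *best) {
    int used[N][N] = {{0}};
    char path[MAXW + 1];
    best[0] = '\0';
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            dfs(d, grid, r, c, used, path, 0, best);
}
```

Intended use would be one call to `dict_load(&d, "/usr/share/dict/words")` at startup, followed by `longest_word(&d, grid, best)`. The prefix check is what keeps the search small: most paths die after two or three letters instead of running to all 25 cells.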

      • kibblebits@quokk.au
        English
        arrow-up
        2
        arrow-down
        4
        ·
        13 hours ago

        In a single prompt I would not expect that specific exercise to produce efficient code, but within a few prompts it should. Certainly less time than it would take someone to write it themselves.

        There are always creative ways to squeeze extra performance out of code if you spend enough time on it.

        • pinball_wizard@lemmy.zip
          English
          arrow-up
          5
          ·
          edit-2
          13 hours ago

          Certainly less time than it would take someone to write it themselves.

          I mean, sure - for you and me, who aren’t qualified to write that specific code, maybe we can prompt the electronic idiot to get there. Of course, neither we nor the electronic idiot knows where there is, and at best we will copy in existing better code that we should have imported from a library. So we gave up automated updates to avoid reading the manual pages.

          In contrast, for domains I’m an expert in, babysitting the electric idiot is always a complete waste of time. I can just call the correct library, the correct way, on the first attempt.

          Today’s AI really highlights existing technical debt. If there’s already a mountain of it, I can see how the learning model may help wrangle it, and how it may be hard to see the added costs.

          • kibblebits@quokk.au
            English
            arrow-up
            1
            arrow-down
            4
            ·
            12 hours ago

            Aren’t qualified? I mean… I’m qualified. You aren’t?

            What “domains” are you an expert in?

        • ozymandias117@lemmy.world
          English
          arrow-up
          3
          ·
          edit-2
          12 hours ago

          If it can’t output ~50 lines of code that is reasonably common from textbooks with one minor modification, I’m not clear what the benefit is

          It’s certainly not faster

          I already stated I kept prompting it for over 30 minutes and it still hadn’t fully completed the problem

            • ozymandias117@lemmy.world
              English
              arrow-up
              6
              ·
              12 hours ago

              So, it’s the same answer as every other time I’ve tried to talk to people supporting AI…

              If it didn’t work, I just didn’t guide it enough, and if I did guide it, it’s a skill issue…

              It is pretty hard to come up with an easier problem for it to solve for an example case

              • kibblebits@quokk.au
                English
                arrow-up
                1
                arrow-down
                2
                ·
                12 hours ago

                Mine worked fine. I didn’t copy and paste your prompt, though. It was inefficient on the first prompt, but it worked, and by the third it was pretty speedy. I used codex 5.5, which imo is better than Claude for the time being.

                • ozymandias117@lemmy.world
                  English
                  arrow-up
                  3
                  ·
                  edit-2
                  12 hours ago

                  Claude 4.6 was doing shit like

                  extern char grid[5][5]

                  fgets(grid[i], 6, fp); grid[i][6] = '\0';
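
The trouble with the snippet above is that each row of `grid` is only 5 bytes: `fgets(grid[i], 6, fp)` may store 5 characters plus a terminator (6 bytes), and `grid[i][6]` indexes past the end of the row. One way to avoid that, as a rough sketch: read each line into a larger scratch buffer, validate it, and copy only the 5 letters. The function name `read_grid` and the file layout (5 lines of exactly 5 lowercase letters) are assumptions for illustration, not anything from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define N 5

/* Reads N lines of exactly N lowercase letters from fp into grid.
 * Rows are raw letters with no terminator, since char[N] has no room for one.
 * Returns 0 on success, -1 on a short, long, or malformed line. */
int read_grid(FILE *fp, char grid[N][N]) {
    char line[64];                      /* scratch: letters, newline, NUL */
    assert(fp != NULL);
    for (int r = 0; r < N; r++) {
        size_t len;
        if (!fgets(line, sizeof line, fp)) return -1;   /* too few lines */
        len = strcspn(line, "\r\n");    /* length without the line ending */
        if (len != N) return -1;
        for (int c = 0; c < N; c++)
            if (line[c] < 'a' || line[c] > 'z') return -1;
        memcpy(grid[r], line, N);       /* copy the letters only */
    }
    return 0;
}
```

The caller opens the file with `fopen`, passes the handle in, and closes it afterwards; taking a `FILE *` rather than a path also makes the function easy to exercise with `tmpfile()`.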

                  • kibblebits@quokk.au
                    English
                    arrow-up
                    1
                    arrow-down
                    2
                    ·
                    11 hours ago

                    Yeah codex does some stuff where I’m pretty disappointed. It never really gets me 100% to where I need to be without human interaction. But I’m aware it won’t (probably ever) do that and I’m fine with it. It got me 70% there, while I play with my cat… and charge for it. 🤷‍♂️

    • confuser@lemmy.zip
      English
      arrow-up
      6
      arrow-down
      1
      ·
      15 hours ago

      Kibblebits wants to make the information known so newer models can train on it and win at life

      • kibblebits@quokk.au
        English
        arrow-up
        1
        arrow-down
        10
        ·
        15 hours ago

        I’ll be surprised if there is any information to be had. Most people stop at this point because it either never happened or they never actually put any effort into it, which is why it failed.

        • IrateAnteater@sh.itjust.works
          English
          arrow-up
          5
          ·
          14 hours ago

          I usually stop at this point because it’s a complete waste of my fucking time. I already know where the relevant sources of information are, and the current AI models have proven themselves to be incapable of distinguishing between firmware versions or subtle differences in model numbers. I try things again every once in a while to see if anything has improved, and so far, no dice.