• ozymandias117@lemmy.world
    English
    arrow-up
    4
    ·
    edit-2
    9 hours ago

    Not OP, but I was pretty disappointed trying Claude 4.6

    Prompted

    Write a C program to find the longest word in a static 5x5 array of characters.
    
    These characters shall be defined in a header file, you may allocate it with any letters for now
    
    This program should find the longest word, using words available in a file at /usr/share/dict/words
    This file will have one word per line
    
    The rules of the longest word are that you may select the next letter in any direction from your current letter one character away, including diagonals
    
    Any index may be the starting point, and you may not repeat a space on the grid
    

    It did a breadth first search for the longest path, then checked if that longest path was a word, rather than checking each step, so it never found any words
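For illustration, checking each step rather than only the finished path could look roughly like the sketch below. This is not the code from that session. It assumes a lowercase-only grid and dictionary, and the names `Trie`, `trie_insert`, and `longest_word` are made up for the example. The search is depth-first and abandons a path as soon as no dictionary word starts with it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N 5

/* Hypothetical trie node: one child per lowercase letter. */
typedef struct Trie {
    struct Trie *next[26];
    int is_word;
} Trie;

static Trie *trie_new(void) {
    Trie *t = calloc(1, sizeof *t);
    assert(t != NULL); /* sketch only: abort on allocation failure */
    return t;
}

/* Insert a word; words containing anything other than a-z are skipped. */
static void trie_insert(Trie *root, const char *w) {
    for (const char *p = w; *p; p++)
        if (*p < 'a' || *p > 'z') return;
    for (; *w; w++) {
        Trie **slot = &root->next[*w - 'a'];
        if (!*slot) *slot = trie_new();
        root = *slot;
    }
    root->is_word = 1;
}

/* Extend the current path by cell (r, c) if that keeps it a dictionary prefix. */
static void dfs(char g[N][N], int r, int c, const Trie *node,
                int used[N][N], char *path, int len, char *best) {
    if (r < 0 || r >= N || c < 0 || c >= N || used[r][c]) return;
    char ch = g[r][c];
    if (ch < 'a' || ch > 'z') return;
    node = node->next[ch - 'a'];
    if (!node) return; /* no word starts with this path, so stop here */
    used[r][c] = 1;
    path[len++] = ch;
    path[len] = '\0';
    if (node->is_word && (size_t)len > strlen(best)) strcpy(best, path);
    for (int dr = -1; dr <= 1; dr++)
        for (int dc = -1; dc <= 1; dc++)
            if (dr || dc) dfs(g, r + dr, c + dc, node, used, path, len, best);
    used[r][c] = 0; /* free the cell again for other paths */
}

/* Try every cell as a starting point; best is "" if no word is found. */
void longest_word(char g[N][N], const Trie *root, char best[N * N + 1]) {
    int used[N][N] = {{0}};
    char path[N * N + 1];
    best[0] = '\0';
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            dfs(g, r, c, root, used, path, 0, best);
}
```

The trie is never freed here, which is acceptable for a one-shot program but would need cleanup in anything longer-lived.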

    When I asked it to fix that, it then opened and reread the entire dictionary for each character
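A sketch of the alternative: read the word list into memory once and answer every later lookup from that copy. The `Dict` type and the `dict_load` / `dict_contains` names are hypothetical. The code assumes one word per line and lines shorter than 255 characters (a longer line would be split into pieces).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory dictionary: a sorted array of heap-allocated words. */
typedef struct {
    char **words;
    size_t count;
} Dict;

static int cmp_str(const void *a, const void *b) {
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Read every non-empty line of `path` once. Returns 0 on success, -1 on failure. */
int dict_load(Dict *d, const char *path) {
    assert(d != NULL && path != NULL);
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    size_t cap = 1024;
    char line[256];
    d->count = 0;
    d->words = malloc(cap * sizeof *d->words);
    if (!d->words) { fclose(f); return -1; }
    while (fgets(line, sizeof line, f)) {
        line[strcspn(line, "\r\n")] = '\0'; /* strip the line ending */
        if (line[0] == '\0') continue;
        if (d->count == cap) {
            char **grown = realloc(d->words, (cap *= 2) * sizeof *grown);
            if (!grown) { fclose(f); return -1; }
            d->words = grown;
        }
        char *copy = malloc(strlen(line) + 1);
        if (!copy) { fclose(f); return -1; }
        strcpy(copy, line);
        d->words[d->count++] = copy;
    }
    fclose(f);
    /* Sort with strcmp so bsearch works regardless of the file's own ordering. */
    qsort(d->words, d->count, sizeof *d->words, cmp_str);
    return 0;
}

/* Binary search over the in-memory copy; the file is never touched again. */
int dict_contains(const Dict *d, const char *w) {
    return bsearch(&w, d->words, d->count, sizeof *d->words, cmp_str) != NULL;
}
```

Calling `dict_load(&d, "/usr/share/dict/words")` once at startup and then `dict_contains(&d, candidate)` during the search keeps file I/O out of the inner loop.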

    Once I got it to fix that, I asked it to read the input array from a file, and after 30 minutes of asking it in different ways, it never managed to successfully read that file in
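For reference, reading the grid from a file can be quite short. The sketch below guesses at a file format, since the thread does not give one: the first 25 letters in the file fill the grid row by row, and whitespace or any other character is ignored. `read_grid` is a made-up name.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define N 5

/* Fill grid from `path`, lowercasing each letter.
   Returns 0 on success, -1 if the file cannot be opened
   or holds fewer than N*N letters. */
int read_grid(const char *path, char grid[N][N]) {
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    int filled = 0, ch;
    while (filled < N * N && (ch = fgetc(f)) != EOF) {
        if (isalpha(ch)) { /* skip spaces, newlines, punctuation */
            grid[filled / N][filled % N] = (char)tolower(ch);
            filled++;
        }
    }
    fclose(f);
    return filled == N * N ? 0 : -1;
}
```

Ignoring non-letters means both `abcde` per line and `a b c d e` per line are accepted, at the cost of not validating the row layout.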

    All in all, it took longer than just writing it myself, even for what I would call an interview question

    • kibblebits@quokk.au
      English
      arrow-up
      1
      arrow-down
      3
      ·
      8 hours ago

      In a single prompt I would not expect that specific exercise to produce efficient code, but within a few prompts it should. Certainly less time than it would take someone to write it themselves.

      There are always creative ways to squeeze extra performance out of code if you spend enough time on it.

      • pinball_wizard@lemmy.zip
        English
        arrow-up
        4
        ·
        8 hours ago

        Certainly less time than it would take someone to write it themselves.

        I mean, sure - for you and me, who aren’t qualified to write that specific code, maybe we can prompt the electronic idiot to get there. Of course, neither we nor the electronic idiot knows where “there” is, and at best we will copy in existing, better code that we should have imported from a library. So we gave up automated updates to avoid reading the manual pages.

        In contrast, for domains I’m an expert in, babysitting the electric idiot is always a complete waste of time. I can just call the correct library, the correct way, on the first attempt.

        Today’s AI really highlights existing technical debt. If there’s already a mountain of it, I can see how the learning model may help wrangle it, and how it may be hard to see the added costs.

        • kibblebits@quokk.au
          English
          arrow-up
          1
          arrow-down
          2
          ·
          7 hours ago

          Aren’t qualified? I mean… I’m qualified. You aren’t?

          What “domains” are you an expert in?

      • ozymandias117@lemmy.world
        English
        arrow-up
        2
        ·
        7 hours ago

        If it can’t output ~50 lines of code that is reasonably common from textbooks with one minor modification, I’m not clear what the benefit is

        It’s certainly not faster

        I already stated I kept prompting it for over 30 minutes and it still hadn’t fully completed the problem

          • ozymandias117@lemmy.world
            English
            arrow-up
            4
            ·
            7 hours ago

            So, it’s the same answer as every other time I’ve tried to talk to people supporting AI…

            If it didn’t work, I just didn’t guide it enough, and if I did guide it, it’s a skill issue…

            It is pretty hard to come up with an easier problem for it to solve for an example case

            • kibblebits@quokk.au
              English
              arrow-up
              1
              arrow-down
              1
              ·
              7 hours ago

              Mine worked fine. I didn’t copy and paste your prompt, though. It was inefficient on the first prompt, but it worked, and by the third it was pretty speedy. I used codex 5.5, which imo is better than Claude for the time being.

                • kibblebits@quokk.au
                  English
                  arrow-up
                  1
                  arrow-down
                  1
                  ·
                  6 hours ago

                  Yeah codex does some stuff where I’m pretty disappointed. It never really gets me 100% to where I need to be without human interaction. But I’m aware it won’t (probably ever) do that and I’m fine with it. It got me 70% there, while I play with my cat… and charge for it. 🤷‍♂️