• 0 Posts
  • 8 Comments
Joined 3 years ago
Cake day: July 3rd, 2023

  • context: I wanted to know whether the open source projects currently being spammed with PRs would be safe from people running slop models on their own computers if those people couldn’t use Claude or whatever. Answer: yes, the local models are still terrible

    but while I was searching I found this comment and the fact that people hated it is so funny to me. It’s literally the person who posted the thread. less thinking and words, more hype links please.

    conversation

    https://www.reddit.com/r/LocalLLaMA/comments/1qvjonm/first_qwen3codernext_reap_is_out/o3jn5db/

    32k context? is that usable for coding?

    (OP’s response, sitting at a steady -7 points)

    LLMs are useless anyway so, okay-ish, depends on your task obviously

    If LLMs were actually capable of solving actual hard tasks, you’d want as much context as possible

    A good way to think about is that tokens compress text roughly 1:4. If you have a 4MB codebase, it would need 1M tokens theoretically.

    That’s one way to start, then we get into the more debatable stuff…

    Obviously text repeats a lot and doesn’t always encode new information each token. In fact, it’s worse than that, as adding tokens can _reduce_ information contained in text, think inserting random stuff into a string representing dna. So to estimate how much ctx you need, think how much compressed information is in your codebase. That includes stuff like decisions (which LLMs are incapable of making), domain knowledge, or even stuff like why does double click have 33ms debounce and not 3ms or 100ms in your codebase which nobody ever wrote down. So take your codebase, compress it as a zip at normal compression level, and then think how large the output problem space is, shrink it down quadratically, and you have a good estimate of how much ctx you need for LLMs to solve the hardest problems in your codebase at any given point during token generation

    *emphasis added by me
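
The only checkable part of the quoted estimate is the first step, so here is a minimal sketch of just that arithmetic. The 4-bytes-per-token ratio is the quoted comment's rule of thumb, not a measured value, and `zlib` at level 6 is my stand-in for "a zip at normal compression level". The function names are made up for illustration, and the "shrink it down quadratically" step is left out because the comment never defines it.

```python
import zlib

# Rule of thumb from the quoted comment (an assumption, not a measurement):
# roughly 4 bytes of source text per token.
BYTES_PER_TOKEN = 4


def estimate_tokens(num_bytes, bytes_per_token=BYTES_PER_TOKEN):
    """Naive token count for a body of text of the given size in bytes."""
    return num_bytes // bytes_per_token


def compressed_size(data, level=6):
    """Size in bytes after DEFLATE compression (stand-in for 'zip it')."""
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    # A 4 MB codebase, taking 1 MB as 1,000,000 bytes.
    print(estimate_tokens(4_000_000))
    # Highly repetitive text compresses to far fewer bytes than the original.
    sample = b"def handler(event): return event\n" * 1000
    print(len(sample), compressed_size(sample))
```

The compressed size only bounds how much non-repeated text there is. It says nothing about the unwritten decisions the comment mentions.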






  • Sanders why https://gizmodo.com/bernie-sanders-reveals-the-ai-doomsday-scenario-that-worries-top-experts-2000628611

    Sen. Sanders: I have talked to CEOs. Funny that you mention it. I won’t mention his name, but I’ve just gotten off the phone with one of the leading experts in the world on artificial intelligence, two hours ago.

    . . .

    Second point: This is not science fiction. There are very, very knowledgeable people—and I just talked to one today—who worry very much that human beings will not be able to control the technology, and that artificial intelligence will in fact dominate our society. We will not be able to control it. It may be able to control us. That’s kind of the doomsday scenario—and there is some concern about that among very knowledgeable people in the industry.

    taking a wild guess it’s Yudkowsky. “very knowledgeable people” and “many/most experts” is staying on my AI apocalypse bingo sheet.

    even among people critical of AI (who don’t otherwise talk about it that much), the AI apocalypse angle seems really common and it’s frustrating to see it normalized everywhere. though I think I’m more nitpicking than anything because it’s not usually their most important issue, and maybe it’s useful as a wedge issue just to bring attention to other criticisms about AI? I’m not really familiar with Bernie Sanders’ takes on AI or how other politicians talk about this. I don’t know if that makes sense, I’m very tired


  • I’m in the same boat. Markov chains are a lot of fun, but LLMs are way too formulaic. It’s one of those things where AI bros will go, “Look, it’s so good at poetry!!” but they have no taste and can’t even tell that it sucks; LLMs just generate ABAB poems and getting anything else is like pulling teeth. It’s a little more garbled and broken, but the output from a Markov chain generator is a lot more interesting in my experience. Interesting content that’s a little rough around the edges always wins over smooth, featureless AI slop in my book.


    slight tangent: I was interested in seeing how they’d work for open-ended text adventures a few years ago (back around GPT-2 and when AI Dungeon was launched), but the mystique did not last very long. Their output is awfully formulaic, and that has not changed at all in the years since. (of course, the tech optimist-goodthink way of thinking about this is “small LLMs are really good at creative writing for their size!”)

    I don’t think most people can even tell the difference between a lot of these models. There was a snake oil LLM (more snake oil than usual) called Reflection 70B, and people could not tell it was a placebo. They thought it was higher quality and invented reasons why that had to be true.

    Orange site example:

    Like other comments, I was also initially surprised. But I think the gains are both real and easy to understand where the improvements are coming from. [ . . . ]

    I had a similar idea, interesting to see that it actually works. [ . . . ]

    Reddit:

    I think that’s cool, if you use a regular system prompt it behaves like regular llama-70b. (??!!!)

    It’s the first time I’ve used a local model and did [not] just say wow this is neat, or that was impressive, but rather, wow, this is finally good enough for business settings (at least for my needs). I’m very excited to keep pushing on it. Llama 3.1 failed miserably, as did any other model I tried.

    For storytelling or creative writing, I would rather have the more interesting broken English output of a Markov chain generator, or maybe a tarot deck or D100 table. Markov chains are also genuinely great for random name generators. I’ve actually laughed at Markov chains before with friends when we throw a group chat into one and see what comes out. I can’t imagine ever getting something like that from an LLM.
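
For anyone who hasn't built one, a name generator along these lines fits in a few lines of Python. This is a rough sketch, not any particular tool: it is a character-level chain, the function names and sample names are invented for illustration, and it assumes the input names don't contain the `^` and `$` marker characters.

```python
import random
from collections import defaultdict


def build_chain(names, order=2):
    """Map each `order`-character context to the characters seen after it."""
    chain = defaultdict(list)
    for name in names:
        # "^" pads the start so the first letters have a context; "$" marks the end.
        padded = "^" * order + name.lower() + "$"
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain


def generate_name(chain, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until an end marker or max_len."""
    context = "^" * order
    out = []
    while len(out) < max_len:
        nxt = rng.choice(chain[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = context[1:] + nxt
    return "".join(out).capitalize()


if __name__ == "__main__":
    # Made-up training names, purely for illustration.
    names = ["aldric", "alwin", "brenna", "brandis", "cedric", "elwin", "edric"]
    chain = build_chain(names)
    for _ in range(5):
        print(generate_name(chain))
```

A lower `order` gives more garbled output and a higher one mostly reproduces the input names, so the interesting results sit somewhere in between. Swapping characters for words gives the group-chat version.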