• Verserk@lemmy.dbzer0.com
    10 months ago

    Considering some of the very wrong yet upvoted domain-specific knowledge I’ve seen on Reddit over the years, I’m not sure the training data is going to be useful for much beyond what every other model can already do.

    • aStonedSanta@lemm.ee
      10 months ago

      lol, subreddits with troll names, like trees vs. marijuana enthusiasts. Good fun. John Cena has one too, but I can’t recall which subreddit is actually about John Cena.

    • 【J】【u】【s】【t】【Z】@lemmy.world
      10 months ago

      The legal advice in /r/legaladvice was some of the worst garbage I’ve ever seen. I have zero doubt that numerous people had bad outcomes, at best wasting money and time, at worst spending years in jail because of things that sub told them to say and do. Zero doubt.

      • evatronic@lemm.ee
        10 months ago

        That sub was mostly cops just repeating their own bad interpretation of the law. Terrible.

    • peopleproblems@lemmy.world
      10 months ago

      I can only assume they are training some specific model to appear more human-like.

      As useless as that will be, considering how fucking wildly differently we all type.

  • thawed_caveman@lemmy.world
    10 months ago

    I feel like AI companies have been scraping Reddit for their datasets already since the beginning and without permission. In fact, unless there’s been a regulation change that i’m not aware of, i’m not sure why they would have Reddit “sign away” the data when they can just scrape it.

    i’m also dubious whether the current form of AI has a future. These models seem like they should revolutionize every sector when you look at their capabilities, but in practice their applications might be more limited than we thought.

    Anyway, if Reddit does go public i will be deleting my account within the hour. The only reason i haven’t yet is that i’ve been a moderator of the same subreddit for eight years and it’s the only thing that’s been consistent in my life in that time, so i’m kind of attached. The reason i will is that i didn’t sign up to create value for shareholders, i signed up to create value for a community.

    • ChunkMcHorkle@lemmy.world
      10 months ago

      I feel like AI companies have been scraping Reddit for their datasets already since the beginning and without permission.

      Well yeah, Sam Altman (OpenAI) was even on the board of Reddit for a while. It’s a safe bet that they’ve been doing it for years.