Mama told me not to come.

She said, that ain’t the way to have fun.

  • 0 Posts
  • 1.34K Comments
Joined 1 year ago
Cake day: June 11th, 2023

  • What we are talking about is the act of reading and/or learning and then using that information in order to synthesize new material.

    Sure, but that’s not what LLMs are doing. They’re breaking down works to reproduce portions of them in answers. Learning is about concepts; LLMs don’t understand concepts, they just compare inputs with training data to produce synthesized answers.

    The process a human goes through is distinctly different from the process current AI goes through. The process an AI goes through is closer to a journalist copy-pasting quotations into their article, which falls under fair use. The difference is that AI will synthesize quotations from multiple (many) sources, whereas a journalist will generally just do one at a time, but it’s still the same process.




  • I disagree that it needs to be explicit. The current law is the fair use doctrine, which generally has more to do with the intended use than specific amounts of the text/media. The point is that humans should know where that limit is and when they’ve crossed it, with motive being a huge part of it.

    I think machines and algorithms should have to abide by a much narrower understanding of “fair use” because they don’t have motive or the ability to intuit when they’ve crossed the line. So scraping copyrighted works to produce an LLM should probably generally be illegal, imo.

    That said, our current copyright system is busted and desperately needs reform. We should be limiting copyright to 14 years (as in the original Copyright Act of 1790), with an option to explicitly extend for another 14 years. That way LLMs could scrape content published >28 years ago with no concerns, and most content produced >14 years ago (esp. forums and social media, where a copyright extension is incredibly unlikely). That would be reasonable IMO and sidestep most of the issues people have with LLMs.










  • I’m pretty sure I do understand the issue. Here are some facts (and an article to back it up):

    1. putting memory closer to the CPU improves performance through lower latency and much higher bandwidth - from 96 GB/s with socketed RAM to 200 GB/s (M1) or 400 GB/s (M1 Max)
    2. customers can’t easily solder on more RAM
    3. Apple’s RAM upgrades are way more expensive than socketed options on the market

    And here’s my interpretation/guesses:

    1. marketing sees 1 & 2, and sees an opportunity to do more of 3
    2. marketing probably asked engineering what the bare minimum is, and they probably said 8GB (assuming web browsing and whatnot only), though 16GB is preferable (that’s what I’d answer)
    3. marketing sets the minimum @ 8GB, banking on most users who need more than the basics to buy more, or for users to buy another laptop sooner when they realize they ran out of RAM (getting after-sale RAM upgrades is expensive)

    So:

    • using soldered RAM is an engineering decision due to improved performance (roughly double the bandwidth of socketed RAM on Intel Macs with the M1, quadruple with the M1 Max)
    • limiting RAM to 8GB is a marketing decision
    • if you don’t have enough RAM, that doesn’t mean the RAM isn’t performing well, it means you don’t have enough RAM

    Using socketed RAM won’t fix performance issues related to running out of RAM; that issue is the same regardless. Only adding RAM will fix those performance issues, and Apple could just as easily make “special” RAM so you couldn’t buy socketed RAM on the regular market (e.g. they’d need a different memory standard anyway due to Unified Memory).
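    The “double/quadruple” figures follow directly from the bandwidth numbers quoted above; a quick sanity check (numbers as quoted, not independently measured):

```python
# Peak memory bandwidth in GB/s, as quoted above (not independently measured).
bandwidth = {"Intel-era socketed": 96, "M1": 200, "M1 Max": 400}

base = bandwidth["Intel-era socketed"]
for chip in ("M1", "M1 Max"):
    ratio = bandwidth[chip] / base
    print(f"{chip}: {ratio:.1f}x the socketed baseline")
```

    That works out to about 2.1x for the M1 and 4.2x for the M1 Max, i.e. roughly “double” and “quadruple.”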

    I have hated Apple’s memory pricing for decades now; it has always been way more expensive to add RAM to an Apple device at order time vs PC competitors (I still add my own RAM to laptops, but it’s usually way cheaper through HP, Lenovo, etc. than Apple at build time). I’m not defending them here, I’m merely saying that the decision to use soldered RAM makes a lot of engineering sense, especially with the new Unified Memory architecture they’re using in the M-series devices.


  • The built-in Digital Wellbeing & Parental Controls feature works. I have it on my Android 11 device; I haven’t tested anything newer (it’s not on my GrapheneOS device based on the most recent Android, though).

    Settings > Digital Wellbeing & Parental controls > Dashboard > click the timer icon next to an app and set a limit

    If you want something outside of the Google ecosystem (e.g. you’re running GrapheneOS), the following should work (untested):

    There are probably others, that was just a cursory check.



  • code generated by an AI is arguably not a “substantial portion” of the software

    How do you verify that though?

    And does the model need to include all of the licenses? Surely the “all copies or substantial portions” clause would apply to LLMs, since they literally include the source in the model as a derivative work. That’s fine if it’s for personal use (fair use applies), but if you’re going to distribute it (e.g. as a centralized LLM), then you need to be very careful about how licenses are used, applied, and distributed.

    So I absolutely do believe that building a broadly used model is a violation of copyright, and that’s true whether it’s under an open source license or not.



  • This source seems to indicate that’s not the case:

    1. Google Search & Other (56.93%)

      2023 Total Google Search & Other Revenue: $175.04 billion
      This is revenue generated primarily from ads shown on Google’s search results pages and other search-related services.

    2. YouTube Ads (10.26%)

      2023 Total YouTube Ads Revenue: $31.51 billion
      This is revenue from ads shown on YouTube videos, including display ads, overlay ads, skippable video ads, and non-skippable video ads.

    3. Google Network (10.20%)

      2023 Total Google Network Revenue: $31.316 billion
      This is revenue from ads displayed on websites and apps that are part of Google’s ad network, beyond Google-owned properties.

    4. Google Other (11.26%)

      2023 Total Google Other Revenue: $34.68 billion
      This is revenue from Google’s other ventures and products, such as hardware (like Pixel phones and Nest devices), Play Store purchases, and other non-advertising sources.

    5. Google Cloud (10.75%)

      2023 Total Google Cloud Revenue: $33.08 billion
      This is revenue from Google’s cloud computing services, such as computing power, storage, and data analytics offered to businesses and developers.

    So, 57% from search, and only 10% from ads on non-Google pages.
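    As a quick sanity check on those percentages (segment figures as quoted from the source; the company-wide total is my own back-calculation from the 56.93% share):

```python
# 2023 segment revenues from the source above, in billions of USD.
revenue = {
    "Google Search & Other": 175.04,
    "YouTube Ads": 31.51,
    "Google Network": 31.316,
    "Google Other": 34.68,
    "Google Cloud": 33.08,
}

# The quoted 56.93% search share implies a company-wide total of ~$307.5B;
# the five listed segments sum to ~$305.6B, so the small remainder is
# presumably Alphabet revenue not broken out in this list.
implied_total = revenue["Google Search & Other"] / 0.5693
listed_total = sum(revenue.values())
print(f"implied total:   ${implied_total:.1f}B")
print(f"listed segments: ${listed_total:.1f}B")
```

    The segment percentages all check out against that implied total, so the “57% from search, ~10% from the ad network” split holds up.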