element14 Community
Embedded and Microcontrollers
1029 views · 6 likes · 1 comment

Tags: llm, artificial intelligence, compute, cabeatwell, ai, nvidia, cerebras
Cerebras Launches Its AI Inference Chip With 44GB of Memory---Outperforming NVIDIA's DGX100

Catwell
25 Sep 2024

(Image Credit: Cerebras)

Cerebras recently launched an AI inference chip that it positions as a competitor to NVIDIA's DGX100. The chip packs 44GB of memory, enough to work with AI models that have billions or even trillions of parameters. When a model exceeds the memory capacity of a single wafer, Cerebras splits it at layer boundaries and distributes it across multiple CS-3 systems: one CS-3 system can handle 20-billion-parameter models, while four systems together can handle 70-billion-parameter models.
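The layer-boundary splitting described above can be sketched as a simple greedy partition. This is an illustrative toy, not Cerebras's actual scheme; the per-layer sizes and the 44GB-per-wafer budget are assumptions for the example.

```python
# Illustrative sketch: assign consecutive layers to systems, never splitting
# a layer across a memory boundary. All sizes are assumed, not real figures.

def partition_layers(layer_sizes_gb, budget_gb):
    """Greedily pack consecutive layers into systems of budget_gb each."""
    systems, current, used = [], [], 0.0
    for i, size in enumerate(layer_sizes_gb):
        if used + size > budget_gb and current:
            systems.append(current)   # close out the full system
            current, used = [], 0.0
        current.append(i)
        used += size
    if current:
        systems.append(current)
    return systems

# A 70B-parameter model in 16-bit weights is roughly 140 GB of weights;
# modeled here as ~80 transformer blocks of ~1.75 GB each (illustrative).
layers = [1.75] * 80
print(len(partition_layers(layers, budget_gb=44)))  # -> 4 systems
```

With these assumed numbers, the greedy split lands on four systems, consistent with the article's claim that four CS-3 systems serve a 70B model.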

According to Cerebras, its 16-bit models offer more precision than competitors' 8-bit models, which suffer from accuracy loss. Cerebras claims its 16-bit models score up to 5% higher in math, reasoning, and multi-turn conversations than 8-bit counterparts, producing more reliable and precise results.
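The precision gap between 16-bit and 8-bit weights is easy to see numerically. This is a generic illustration using symmetric per-tensor int8 quantization (a common 8-bit scheme), not a description of any vendor's pipeline; the weight distribution is an assumption.

```python
# Compare round-trip error of float16 vs int8 weight storage with NumPy.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=100_000)  # assumed transformer-like weight scale

# float16: round each weight to the nearest representable 16-bit float
err_fp16 = np.abs(w - w.astype(np.float16).astype(np.float64)).mean()

# int8: symmetric per-tensor quantization, a common 8-bit scheme
scale = np.abs(w).max() / 127.0
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
err_int8 = np.abs(w - q.astype(np.float64) * scale).mean()

print(err_fp16 < err_int8)  # 16-bit storage loses far less information
```

The mean round-trip error for float16 is orders of magnitude smaller than for int8 here, which is the intuition behind the accuracy claims above.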

Developers can integrate the Cerebras inference platform through its chat interface and API, which will feel familiar to anyone who has used OpenAI's Chat Completions format. One of the chip's standout features is that it runs Llama 3.1 70B models at 450 tokens per second, making it the only platform to reach such speeds with large-scale models. That performance matters for real-time LLM intelligence and for complex, token-hungry AI workflows such as scaffolding.
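Because the API follows OpenAI's Chat Completions format, a request looks like the familiar OpenAI payload. The sketch below builds such a request; the base URL, API key, and model name are placeholders, not confirmed Cerebras values, so check the platform docs for the real endpoint.

```python
# Hypothetical sketch of an OpenAI-style Chat Completions request.
# Endpoint, key, and model name are illustrative assumptions.
import json
import urllib.request

def build_chat_request(prompt,
                       base_url="https://api.cerebras.example/v1",  # placeholder
                       api_key="YOUR_API_KEY",                      # placeholder
                       model="llama3.1-70b"):
    """Build (but do not send) a Chat Completions-style HTTP request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# To actually send it: urllib.request.urlopen(build_chat_request("Hello"))
```

Keeping the wire format identical to OpenAI's means existing client code often only needs a new base URL and key to switch providers.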

Cerebras is pushing for adoption by offering developers one million free tokens per day at launch, and the company says pricing for larger deployments will undercut popular GPU cloud services. The platform launched with support for the Llama 3.1 8B and 70B models, with support for larger models like Llama 3 405B and Mistral Large 2 planned.

Have a story tip? Message me at: http://twitter.com/Cabe_Atwell

DAB, 9 months ago:

    That is the important issue with systems development.

    As soon as someone comes out with the "best" system to do something, someone else is already trying to do it better, bigger, and hopefully cheaper.

An Avnet Company © 2025 Premier Farnell Limited. All Rights Reserved.

Premier Farnell Ltd, registered in England and Wales (no 00876412), registered office: Farnell House, Forge Lane, Leeds LS12 2NE.

ICP 备案号 10220084.
