Inside Meta’s scramble to catch up on AI
According to a company memo dated Sept. 20, 2022, Meta CEO Mark Zuckerberg assembled his top lieutenants for a five-hour examination of the company’s computing capacity, focusing on its ability to do cutting-edge artificial intelligence work.
The social media giant had been slow to adopt expensive AI-friendly hardware and software systems for its main business, hindering its ability to keep pace with innovation at scale even as it increasingly relied on AI to support its growth, according to the memo, company statements, and interviews with 12 people familiar with the changes, who spoke on condition of anonymity to discuss internal company matters.
“AI development tooling, techniques, and processes are lacking. We need to invest heavily here,” read the memo from Santosh Janardhan, which was posted on Meta’s internal message board in September and has not previously been reported.
Supporting AI work at Meta (META.O) will require “fundamentally shifting our physical infrastructure design, our software systems, and our approach to providing a stable platform,” the memo added.
Meta has spent over a year revamping its AI infrastructure. The company has publicly acknowledged “playing a little bit of catch-up” on AI hardware trends, but details of the overhaul, including capacity crunches, leadership changes, and a cancelled AI chip project, have not previously been reported.
Asked about the memo and the restructuring, Meta spokesperson Jon Carvill said the company “has a proven track record in creating and deploying state-of-the-art infrastructure at scale combined with deep expertise in AI research and engineering.”
“We’re confident in our ability to continue expanding our infrastructure’s capabilities to meet our near-term and long-term needs as we bring new AI-powered experiences to our family of apps and consumer products,” added Carvill. He declined to say whether Meta had abandoned its AI chip.
Amid those investments, Meta has been laying off employees since November at a scale not seen since the dotcom bust.
After its Nov. 30 debut, Microsoft-backed OpenAI’s ChatGPT became the fastest-growing consumer app in history, sparking a race among tech giants to release products using generative AI, which creates human-like written and visual content in response to prompts.
Generative AI consumes enormous amounts of computing power, intensifying Meta’s capacity crunch, five of the sources said.
FALLING BEHIND
Those five individuals indicated Meta’s late adoption of GPUs for AI work was a major problem.
GPU chips can analyze billions of data points in parallel, making them ideal for artificial intelligence processing.
GPUs are more expensive than other chips, and Nvidia Corp (NVDA.O) controls 80% of the market and dominates software, sources said.
Nvidia declined comment for this story.
Meta used its fleet of commodity central processing units (CPUs), the computing world’s workhorse chip, to run AI tasks until last year.
Two people said the company also started using its own custom-designed chip for inference, an AI process in which algorithms trained on massive quantities of data make judgments and generate responses to prompts.
By 2021, the two-pronged solution was slower and less efficient than one based on GPUs, which were more adaptable in running multiple models than Meta’s processor, the two people said.
Meta didn’t comment on its AI chip.
Four sources said that as Zuckerberg pivoted the company toward the metaverse—a set of digital worlds enabled by augmented and virtual reality—its capacity crunch was slowing its ability to deploy AI to respond to threats like TikTok and Apple-led ad privacy changes.
The stumbles caught the attention of former Meta board member Peter Thiel, who resigned in early 2022 without explanation.
According to two sources, Thiel told Zuckerberg and his executives at a board meeting before he left that they were complacent about Meta’s core social media business and focused too much on the metaverse, leaving the company vulnerable to TikTok.
Meta declined to comment.
CATCH-UP
After canceling a 2022 rollout of Meta’s own inference chip, management ordered billions of dollars of Nvidia GPUs, one source said.
Meta declined to comment.
Meta was already several steps behind competitors like Google, which had begun deploying its own custom-built AI chip, the TPU, in 2015.
In March, executives reorganized Meta’s AI groups, hiring two new heads of engineering, including Janardhan, the September memo’s author.
According to LinkedIn profiles and a source familiar with the departures, more than a dozen executives left Meta during the months-long upheaval, a near-total turnover of AI infrastructure leadership.
Meta then redesigned its data centers to handle GPUs, which consume more power and produce more heat than CPUs and must be clustered closely together with specialized networking.
Janardhan and other executives declined interview requests made through the company.
According to company reports, the redesign increased Meta’s capital spending by about $4 billion a quarter, nearly doubling its 2021 level, and led the company to pause or cancel four data center builds. Janardhan’s memo and four of the sources said the facilities needed 24 to 32 times the networking capacity and new liquid cooling systems to manage the clusters’ heat.
Meta also began designing a more ambitious in-house chip that, like a GPU, could both train AI models and perform inference. Two sources said the previously unreported project is set to be completed around 2025.
Meta spokesperson Carvill said construction of the paused data centers would resume later this year once they had been moved to the new designs. He didn’t discuss the chip project.
TRADE-OFFS
While it scales up its GPU capacity, Meta has little to show as Microsoft and Google debut commercial generative AI products.
CFO Susan Li said in February that “basically all of our AI capacity is going towards ads, feeds and Reels,” referring to Meta’s TikTok-like short video format.
Four of the sources said Meta did not prioritize generative AI products until after ChatGPT launched in November. They said the company was not focused on turning its well-regarded research into products, even though its research lab FAIR, or Facebook AI Research, has been publishing prototypes of the technology since late 2021.
Investor interest is rising. Zuckerberg announced a top-level generative AI team in February to “turbocharge” the company’s development.
This month, Chief Technology Officer Andrew Bosworth said that generative AI was a focus for him and Zuckerberg and that Meta would debut a product this year.
Two people familiar with the new team said its early work focused on building a foundation model, a core program that can be fine-tuned and adapted for different products.
Meta spokesperson Carvill said multiple teams have been building generative AI products for over a year. In the months since ChatGPT arrived, that work has accelerated, he said.