Arista Networks, Inc. (NYSE:ANET) Bank of America Securities Global A.I. Conference 2023 September 12, 2023 12:00 PM ET

Company Participants

Jayshree Ullal - President and CEO

Conference Call Participants

Tal Liani - Bank of America Merrill Lynch

Tal Liani

Hi. Good morning, everyone. Thanks for joining us over the last two days. We're hosting multiple companies to speak about the entire value chain of AI. We heard from contract manufacturers. We heard from semiconductor companies. We heard yesterday from Cisco about their silicon.

And now I'm very pleased to host Jayshree Ullal, President and CEO of Arista, to speak about Arista. There are so many questions I want to ask her. But before I start, I want to welcome Jayshree to our small virtual conference. Thanks for coming, Jayshree.

Jayshree Ullal

Tal, thank you for having me. As you said, it's been 25 years together. I look forward to many more.

Question-and-Answer Session

Q - Tal Liani

Yes. I can guarantee you it's not going to be another 25. I want to start kind of with the high level of the question. I told Jayshree -- by the way, I'm saying it to the audience, I told Jayshree before that we're not going to talk about the quarter. We're not going to talk about the numbers. We just want to focus -- this is an opportunity to speak with Jayshree about AI, about the readiness of the company and about her views of the market. And I want to talk about the evolution of Arista over the last few years in the context of how you support and participate in AI and generative AI.

Jayshree Ullal

Yeah. So I think you all know Arista very much is a pioneer of cloud networking, and we are most known for pushing the envelope of scale in networking, whether it's the frontend network, the amount of traffic and the traffic patterns for leaf-spine topologies to connect thousands, if not hundreds of thousands or millions of CPUs, virtual machines and containers, right?

But while this has all been going on, there's been an incredible phenomenon that started as recently as last year, and Arista has been working on it with some of our leading customers, on what I call Arista 2.0, where we are now building a platform that's not only capable of carrying large workloads and workflows in the data center, but really what I call centers of data. And the centers of data may be in a campus, may be in a routed WAN environment and may be in a new branch. But now, given the topic of today being AI, there's another interesting whole area emerging of what I call the backend network, which Arista has traditionally not participated in.

And so I think the next phase of Arista is really building a platform for all of what we've done already for the cloud and Web 3.0 era, but now bringing that to bear as AI clusters in the back end of the network.

Tal Liani

What is the backend network? And as far as I understand, right now the choice of technology is InfiniBand. How do you participate?

Jayshree Ullal

Yeah. Yeah, absolutely. As you all know, InfiniBand has been around 25 years. There has been a very well-recognized InfiniBand Trade Association, but it doesn't have a broad set of vendors. In fact, in many ways, it's a vendor of one, NVIDIA. But they have been delivering consistently for several years on high-performance compute.

But now when you look at the role of InfiniBand and Ethernet for AI, first, I think it's important to step back and say, why is this even relevant? Because there's a massive AI data exchange going on where the AI workloads and demands on the network are both data and compute intensive. In fact, the workloads are so large, and the parameters and the metrics are so distributed across thousands, hundreds of thousands, sometimes millions of processors, that you have to look at both the large language models and the recommendation systems, LLM and DLRM, and how you share all of your parameters across these thousands or millions of processors.

And so this, as I said, requires you to do a constant compute, exchange and reduce cycle. And the volume of the data that's exchanged is so significant that any slowdown in the network with these expensive GPUs will ultimately impact the applications. A poor network is a poor choice, and as you rightly point out, there are two very good choices.
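To put rough numbers on that compute-exchange-reduce cycle, here is a small illustrative Python sketch (all figures are assumptions, not Arista data) of how much gradient traffic a ring all-reduce pushes over each GPU's link per training step, and how long that communication alone takes at an assumed link speed.

```python
# Rough all-reduce traffic arithmetic; every number here is an assumption
# chosen only for illustration, not a measurement or a product figure.
GPUS = 1024                    # assumed cluster size
GRAD_GB = 20.0                 # assumed gradient volume per step, gigabytes
LINK_GBps = 50.0               # assumed per-GPU network bandwidth, GB/s

# A ring all-reduce moves roughly 2*(N-1)/N of the gradient bytes
# across each GPU's link on every step.
bytes_per_gpu_gb = 2 * (GPUS - 1) / GPUS * GRAD_GB
comm_time_s = bytes_per_gpu_gb / LINK_GBps

print(f"~{bytes_per_gpu_gb:.1f} GB on the wire per GPU per step")
print(f"~{comm_time_s:.2f} s of pure communication per step at {LINK_GBps} GB/s")
```

Any extra latency or loss in that exchange stalls every GPU participating in the collective, which is why the network, not the GPU, can become the gate on job completion.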

Today, the most commonly used technology bundled with NVIDIA GPUs is InfiniBand. But I believe the future is very much Ethernet, and I have never bet on a non-Ethernet technology, although I worked on many of them, ATM, FDDI and Token Ring to name a few. And those worked in the file-print-share environments. But as Ethernet needs to be stretched and subsumed for AI applications, you need additional capabilities.

So in general, I would say you need a mission-critical AI network, and neither InfiniBand nor Ethernet is fully achieving those goals today. Both need a lot of optimization going into the future.

Tal Liani

Got it. And why -- just in simple terms, why did they start AI with InfiniBand and not with Ethernet?

Jayshree Ullal

Yeah. So I think it was very difficult to reimagine a high-speed transit network when the real problem right now is large language models, training, inference and GPUs. So the natural connection with the GPUs from NVIDIA became InfiniBand. And the upshot here is wire-speed delivery of packets, large synchronized bursts of data and, I would say especially, latency, which has been a strength of InfiniBand.

But when you go back and look at Ethernet and InfiniBand even over the last 10 years, historically, Ethernet has always lagged a little behind InfiniBand. When InfiniBand was doing 40 gig, Ethernet was doing 10. When InfiniBand went through DDR, EDR, HDR and NDR rates and was always doubling, Ethernet was always behind. And that's changed dramatically in the last year.

Today, you can push the envelope of Ethernet at 100 gig, 200 gig, 400 gig, 800 gig, and you can see a path to 1.6 terabits. And this is what I think makes Ethernet a natural standards-based ecosystem with a wide set of capabilities as well as troubleshooting techniques. And as you know, the Ultra Ethernet Consortium is on a mission to enhance the capabilities of AI and HPC using Ethernet. So the advantage of Ethernet is a no-brainer. It brings broad economics of wide deployments, familiarity of tools, and then obviously, we can push the envelope of silicon geometries with Moore's Law with all of the silicon vendors we work together with.

Tal Liani

Got it.

Jayshree Ullal

One of the other things, I think, is that when you were building InfiniBand clusters, you only needed to worry about L2 subnets. Now today, as you start to build a backend network and you build these clusters, you also have to think of the uniformity of and connection to the frontend network.

So the ability for Ethernet to be a routed protocol and run over IP is a tremendous advantage. So if you can get all the advantages of a back-end AI, high-performance network with Ethernet and connected seamlessly to your front end then one plus one is far greater than two.

Tal Liani

Right. How -- so how is it going to play out for you? You're focusing on Ethernet. You're talking about maybe different flavors of Ethernet. How long does it take? How do you participate in the build-out of AI and generative AI in the intermediate term?

And then how do you participate in the longer term? What needs to happen for you to participate in the longer term?

Jayshree Ullal

Yeah. I think you have to parse the problem and look at it differently; different strokes for different folks. If you're really building a small cluster within a server rack, I don't know that InfiniBand or Ethernet even plays into that. If it's 100 nodes or so, you're just going to connect with an internal I/O of some kind, almost a bus technology, not a network. And that may be PCIe, CXL or NVLink at the back end, right?

But when you start to talk about thousands and thousands of nodes and needing AI at scale, for those AI jobs the underlying network's ability to improve job completion time is very critical. So Arista's focus has been on making sure we can work with the GPU vendors, NVIDIA being our friend there, and with the NIC vendors, again coming not from Arista but from different vendors, be it Broadcom, NVIDIA or Intel, and then really applying the right scale for the different traffic patterns.

And I'll give you a couple of examples. In the 1990s, when we talked about scale, it was just Ethernet with spanning tree because you're mostly detecting loops. In the 2000s, when you talked about scale, you had technologies like MLAG at Layer 2 or ECMP at Layer 3 that allow you to build scale with active paths in a leaf-spine design. In the next phase, an AI network topology needs a heck of a lot more packet spraying and load balancing, where you can allow every flow to simultaneously access all paths to the destination to improve the job completion time, because your entire completion time is dictated by the last packet, and that's the worst culprit.

So we're doing a lot of work to bring dynamic load balancing and packet spraying. And this is something that's also being endorsed by the UEC, the Ultra Ethernet Consortium.
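To illustrate why per-flow hashing falls short and packet spraying helps, here is a minimal hypothetical Python sketch (not Arista or UEC code; link speeds and flow sizes are invented) contrasting classic per-flow ECMP with idealized per-packet spraying across equal-cost paths. The point is that the most loaded path, i.e. the last packet to land, sets the job completion time.

```python
import random

# Illustrative only: 8 equal-cost uplinks, 32 equal-sized collective flows.
NUM_PATHS = 8
FLOW_BYTES = 100e6            # assumed 100 MB per flow
LINK_Bps = 50e9 / 8           # assumed 50 Gb/s per uplink, in bytes/sec
flows = [FLOW_BYTES] * 32

def per_flow_ecmp(flows):
    """Classic ECMP: each flow is hashed onto one path for its lifetime."""
    load = [0.0] * NUM_PATHS
    for f in flows:
        load[random.randrange(NUM_PATHS)] += f    # hash collisions pile up
    return max(load) / LINK_Bps                   # JCT tracks the busiest path

def packet_spraying(flows):
    """Idealized spraying: every flow's packets are spread over all paths."""
    per_path = sum(flows) / NUM_PATHS             # perfectly balanced load
    return per_path / LINK_Bps

random.seed(0)
print(f"per-flow ECMP   JCT ~ {per_flow_ecmp(flows):.3f} s")
print(f"packet spraying JCT ~ {packet_spraying(flows):.3f} s")
```

Real fabrics also have to keep sprayed packets from arriving badly out of order, which is part of what the UEC work addresses.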

Another thing that's very, very important as we enhance Ethernet is having the right monitoring and visibility techniques. You have to be able to poll not at millisecond or minute intervals, but really at nanosecond and microsecond intervals, to get all of the logging, visibility, counters and characteristics, because things are moving so fast and furiously. And Arista has always been developing features like that for the cloud. And now we're extending that with our EOS for features like AI Analyzer.
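A simple hypothetical arithmetic sketch (invented numbers, not EOS or AI Analyzer output) of why the polling interval matters: a microburst that saturates a link for a few hundred microseconds is invisible in a one-minute counter average but obvious at microsecond granularity.

```python
# Illustrative microburst arithmetic; all values are assumptions.
LINK_bps = 400e9        # assumed 400 Gb/s link
WINDOW_S = 60.0         # a one-minute counter poll
BURST_S = 200e-6        # a 200-microsecond burst at line rate
BASE_UTIL = 0.10        # assumed 10% background utilization

background_bits = BASE_UTIL * LINK_bps * WINDOW_S
burst_bits = LINK_bps * BURST_S

avg_util = (background_bits + burst_bits) / (LINK_bps * WINDOW_S)
print(f"60 s polling sees: {avg_util:.3%} utilization")   # burst is invisible
print("microsecond sampling sees: 100% utilization (and queue buildup) during the burst")
```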

Network congestion is a key metric. And we always get trapped in this, okay, Ethernet is not lossless and InfiniBand is. But none of that matters when you look at the aggregate of how many nodes you're trying to support. A common incast congestion problem can occur at the last link of any AI receiver when multiple uncoordinated senders are just jamming traffic. A good example of that is the all-to-all AI operation across GPU clusters.

And so having the right Ethernet-based congestion control mechanisms and algorithms is critical, so you can spread the load across multiple paths; and it has to be designed to work in conjunction with this multipath spraying, to have the right virtual output queuing for incast, and then the egress buffer memory and output have to be appropriately balanced. So we have a VOQ fabric, and a lot of the things we've done in the cloud will replay here for an AI network as well.
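As a back-of-the-envelope illustration of the incast problem just described (all numbers invented), consider N uncoordinated senders bursting simultaneously toward one receiver: the last link can only drain its own line rate, and the excess must be absorbed by buffers, paced by congestion control, or dropped.

```python
# Illustrative incast arithmetic; values are assumptions, not product specs.
SENDERS = 16
SENDER_Gbps = 400
RECEIVER_Gbps = 400
BURST_MS = 1.0                          # all senders burst for 1 ms at once

offered_gb = SENDERS * SENDER_Gbps * (BURST_MS / 1e3)   # gigabits arriving
drained_gb = RECEIVER_Gbps * (BURST_MS / 1e3)           # gigabits the last link drains
excess_mb = (offered_gb - drained_gb) * 1e3 / 8         # megabytes to buffer or drop

print(f"offered {offered_gb:.1f} Gb, drained {drained_gb:.1f} Gb in {BURST_MS} ms")
print(f"~{excess_mb:.0f} MB must be absorbed by VOQ buffering, ECN/PFC pacing, or drops")
```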

Tal Liani

Got it. We spoke about backend and frontend networks. And I have two questions. First, do you see a different opportunity for the training portion and the inferencing portion of AI? Or is it kind of the same thing?

Jayshree Ullal

No, they're definitely different in the envelope of the number of parameters and algorithms you're pushing. I talked about small networks being in the server. A good way to look at it is that medium applications like inference may not require as many GPUs or as much of a network, but would still be substantial scale. But the ultimate scale, training billions of parameters and the associated tokens, et cetera, is really in the training. And this is where most of our forward-looking customers are focusing, because if we can solve the training problem, we can naturally solve the inference in smaller networks.

So today, I would say, and I think I mentioned this in earlier calls, we're largely in trials and pilots, where we are really proving that the training algorithms can work across a lossless, congestion-free Ethernet network and map to the scale of their clusters. And in the scale of their clusters, we've seen anywhere from 1,000 GPUs on a single tier, for example a 7800 AI spine, to a two-tier cluster where you can add more GPUs at the scale of 4,000 to 8,000. And of course, eventually, we're going to be building clusters for training that are 32,000 to 100,000 GPUs.

So the number of GPUs defines the metrics of the training clusters you want. And right now, most of them are in pilots and trials, but I fully expect production in 2025. And those will get even larger over time.
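For a sense of how the single-tier versus two-tier choice bounds cluster size, here is a hypothetical radix sketch (port counts are assumptions, not specific Arista product specs): a single-tier spine connects roughly as many GPUs as it has ports, while a non-oversubscribed two-tier leaf-spine multiplies that by the leaf downlink count.

```python
# Hypothetical radix arithmetic; port counts are illustrative assumptions.
SPINE_PORTS = 512     # assumed ports on one modular AI spine chassis
LEAF_PORTS = 64       # assumed ports per fixed leaf switch

# Single tier: every GPU NIC plugs straight into the spine chassis.
single_tier_gpus = SPINE_PORTS

# Two tier (non-oversubscribed): each leaf splits its ports evenly between
# GPU-facing downlinks and spine-facing uplinks; with one link from every
# leaf to every spine plane, the leaf count is bounded by the spine radix.
leaf_downlinks = LEAF_PORTS // 2
max_leaves = SPINE_PORTS
two_tier_gpus = max_leaves * leaf_downlinks

print(f"single tier : up to ~{single_tier_gpus:,} GPUs")
print(f"two tier    : up to ~{two_tier_gpus:,} GPUs")
```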

Tal Liani

When you grow the cluster size from -- I'm making up a number, from 1,000 GPUs to 4,000 GPUs, does it mean that the networking cost is four times also?

Jayshree Ullal

I hope not. Well, it depends on the design. The beauty of our architecture is you can build a single-stage AI spine and connect 1,000 GPUs. As you want to add more GPUs -- and again, the numbers will vary, your mileage will depend on whether you want to connect with 400 gig or 800 gig -- the idea wouldn't be that you just keep adding linear cost, but you would add a layer of leaf switches, AI leaves, to connect those GPUs and get sort of a two-tiered additional aggregation of port density.

So it wouldn't be 4x, but certainly you would add more ports, and it can be at least 2x.

Tal Liani

Got it. In today's world, until we migrate to Ethernet everywhere, when the backend network is being built with InfiniBand, what's the impact on your frontend network?

Jayshree Ullal

I think today, the backend networks, again, are largely built in silos. It's a cluster that doesn't connect to the frontend. If you think about it, how do I connect InfiniBand to Ethernet without substantial loss in latency? It's a gateway function, and nobody does that. So I think the clusters in the back end are largely not talking to the front end, because we're still in this mode where there are two different islands.

I do think this is why Ethernet will be heavily favored, because once we can solve the load balancing, the monitoring and visibility, and get the congestion control algorithms on the right track, whether it's PFC, priority flow control, or dealing with all the system-level mechanisms, then it's seamless. You have no translation and you can move back and forth. You're no longer limited to a Layer 2-only subnet, and your high availability isn't constrained by the number of subnet managers that you can support.

So everything gets a lot better when you have Ethernet front and back. But today, that's not how it's happening. It's mostly silos.

Tal Liani

Got it. I want to maybe speak about the support from the GPU community. It's public information: Google has its own GPU. Others are using NVIDIA. AMD is developing. Others are developing. There are even small startups. What is your view on the support for GPUs? And is there any preference, or is there any case where you look at it differently, for example?

Jayshree Ullal

No, listen, I think NVIDIA has definitely won the first phase of AI systems and solutions with the GPU, and the maniacal focus not just on one GPU, but the different types they have and then building them into systems like DGX and HGX, is very remarkable. And as you know, there's a shortage. There's no glut of GPUs yet; there's an extreme shortage. So in the foreseeable future, like the next one or two years, I think there's pretty much only one major vendor supplying GPUs.

However, if I look forward, much like I look forward on InfiniBand versus Ethernet, I think the industry always needs a diverse ecosystem. And as you rightly classified, I think there will be three types of players. There will be alternative vendors to NVIDIA, and this is where companies like AMD and Intel with its Habana come in.

And then there will be startups; it's difficult for startups to compete, except perhaps with appropriate differentiation. But don't rule out also our own customers, many of whom will develop GPU accelerators or GPUs that are customized for their environment and very committed to price performance for their applications. So I think those are the three categories of XPUs we will see. And Arista will be Switzerland on that and looks forward to connecting all of those scenarios with a fantastic AI fabric.

Tal Liani

And the other way around, NVIDIA is -- they have a competing Ethernet offering in the Spectrum networking platform. Is this a concern for you?

Jayshree Ullal

No, I mean I don't think it's a concern for NVIDIA or Arista. We're going to be partners on most occasions. But when it comes to the Ethernet switch, we'll have some overlap. I think building an AI and general-purpose Ethernet switch isn't easy. We've been at it now for a decade or more, whether it's the software stack or the hardware. And so our applications are going to be both for AI and, obviously, for the multitude of the other platforms I talked about in the datacenter, campus and WAN. And we are very comfortable that ultimately you cannot build an island for an Ethernet switch; you need to build multiple use cases that connect the back to the front, and that's where Arista will shine.

Tal Liani

Right. How is 800 gig and above connected to our discussion of AI? Is it the only driver, the main driver? How do you view the market?

Jayshree Ullal

Yeah, that's a really good question, because historically, as you know, these speed transitions took time. I still remember when we were waiting for 10 gig to happen in the beginnings of Arista, and I think that was a very long tail. It took at least 10 or 15 years because there wasn't any port density or server connectivity available back in 2008, when Arista was shipping products. And that changed over time.

I think the acceleration to 100 gig -- I'm going to skip 40 gig because it's kind of neither here nor there -- happened a lot sooner. It didn't take 10 or 20 years; it took more like five years for the cloud. And of course, there's now another five to 10 years with the enterprise.

So when I look at 400 and 800 gig, in theory, it should take time. But the reason it won't is because I think there are really three speed transitions happening. One is where the classical enterprises are moving to 100 gig. Then the cloud providers, building spines and frontend networks and distributing their centers across different geographies, are starting to do a 400-gig migration. And usually, I would have told you, well, 800 gig will take time, but now you have this killer AI application for the back-end network, and this is where we envision 800 gig.

Now as we start to deploy 400 and 800 in the backend, by the way, that will affect the frontend spine networks as well, and that will have a 20% effect on the 400-gig network, which will then need higher performance. So I think these use cases will have a virtuous cycle across them. And AI is definitely a killer application for faster deployment of 800 gig.

Tal Liani

Got it. When I look at the market today, not AI, just the Cloud Titans or big datacenters, some companies are using white boxes, white box switching. Some companies are using branded solutions, mainly from yourself. And we all know who is doing what in the market; there are only four big players. Do you think the appetite for using branded versus white boxes, or the way that the architecture works, will change with AI versus regular networks?

Jayshree Ullal

Not really. I think you can only talk about white boxes when things are well defined and mature in terms of the hardware and software stack. And as you know, you and I have talked about this many times, the discussion on white box comes up a lot, and it is something we actually embrace; we recognize it's there in some use cases, but it isn't there in the most complex use cases.

And I think AI would be a very difficult thing to white box at a time when things are moving and changing -- the performance, the latency, the aggregate scale, the functionality, the UEC forum. Everything is in flux.

So I think until we get to some stability, it's difficult to think of any of these things as white boxes. In fact, quite the opposite: I think you will see a lot of AI focus on customizing for generative AI, for inference, for very high-performance training models. And I think it will be at least a few years before we see any kind of white box in the AI world.

Tal Liani

Got it. Is Cisco a bigger threat now versus before? We hardly saw Cisco before, especially with Cloud Titans. They're talking now about more orders, higher level of orders and backlog with Cloud Titans. Do you see more of Cisco in the market now?

Jayshree Ullal

I always view Cisco as a very respectable competitor. I was there 15 years, and I'm now 15 years with Arista. So we never take Cisco lightly, and we respect their ability to do things as a very large, dominant company in a lot of market sectors. Specific to Cloud Titans, I think especially the strength from the acquisition of Acacia and optics cannot be underestimated. We were partnering with Acacia for a long time. So I definitely think they've always had a presence in Cloud Titans. And of course, Arista's presence is much stronger.

Tal Liani

In fact, one of the questions that I had is about optical integration. The fact you don't have in-house optical integration, is it a weakness that you need to strengthen, or can you work without it?

Jayshree Ullal

I think we can work without it, because we choose to work with best-of-breed optical vendors, and optics, as you know, by its nature is changing all the time. You have to have a specialized set of expertise in it, and we've chosen to work with partners on that specialized expertise. A good example of that, at OFC earlier this year, is where Arista demonstrated that using our electrical serdes instead of just doing co-packaging, we could drive longer distances with reduced power on our switches for long-haul optics -- or medium haul, I should say, because long haul could imply hundreds of kilometers.

And this is a pretty powerful demonstration of LD, Linear Drive, where you can sort of remove the DSP and push the envelope of capability. So embracing all of these different optical options rather than locking ourselves into one particular solution, including our own has always been our thesis.

Tal Liani

Got it. I love talking to you, by the way, Jayshree, because I can shoot all my questions, and I know that you're going to answer them like an engineer, not like the CEO that you are.

Jayshree Ullal

Yeah. I guess I'll work on answering like a CEO. What's your next question?

Tal Liani

No, no, no. Once Ethernet is in the backend, is it a different product? Are Ethernet switches for the backend different than Ethernet switches for the frontend, or is it the same product?

Jayshree Ullal

I think they can be -- I don't know exactly and precisely how the shapes and turns will be. There are definitely products like our AI spine that can be the same, but they can have flavors of functionality and capability that make them more AI friendly.

And then in other cases, if you're starting to build extremely large clusters and you're very optimized for job completion time and end-to-end latency, they can also be different products. So I think we can have both options depending on the use case.

Tal Liani

Talking kind of more about the applications and the networks -- we always say AI, hyperscalers -- is it just about hyperscalers? Or are you seeing, or will we see, an appetite to build AI networks also in other parts of the market?

Jayshree Ullal

Well, I think if you're looking for large size production trials, it definitely favors at least the Cloud Titans. And I would say maybe some Tier 2 cloud providers and even some extremely large enterprises. So I think it's still single-digit customers, but they may not just be Cloud Titans. They would be those who have an appetite to really invest to offer an AI service of some sort.

So you've got to be thinking customers with large CapEx and deep pockets, whether it's in the enterprise or specialty cloud or Cloud Titans. We see that as single-digit customers right now that will be very large, but we see them as very meaningful.

Now that doesn't mean the enterprise is uninterested. I think just about every enterprise will have some sort of a small cluster to prove the thesis on some of these AI applications. But I think the largest ones will be in these three categories, either Cloud Titans or specialty cloud or extremely large enterprises.

Tal Liani

Got it. And will all AI networks look the same?

Jayshree Ullal

No. Not at all. Going back to my small, medium, large: I think if you start with the premise that you want an extremely large LLM or DLRM with billions of parameters, then -- one size doesn't fit all -- you're going to build something that's AI at scale, AI Ethernet at scale. You're going to worry a lot about the congestion control and load balancing, the monitoring, the visibility, not just the hardware but the software.

If you're building a small cluster within a rack, you can go to the other extreme, and it may never have to be a network at all. So I think the way to look at this is small, medium, large, based on application and also based on size of network.

Tal Liani

Got it. I'm just going through my list of questions to make sure I don't have anything left. My last question is about what drives AI. We kind of take it for granted that these networks are going to be deployed, and we speak about the architectures and technologies. But as an expert in the space, can you take a step back and think about what drives AI, from an enterprise point of view, from an applications and consumers' point of view? Why are these hyper clouds or Cloud Titans investing in AI so much?

Jayshree Ullal

Well, I think if you take a step back and ask why customers are building AI clusters, there's an incredible amount of optimization work required to get high FLOP utilization. If they're putting in all these GPUs, it's for applications, real-time streaming, gaming, that need that high FLOP utilization. If they didn't, then they'd just run the traffic like they do today on a regular network. So these AI clusters are very specific to deal with the high bandwidth, high scale and predictable latency, not always ultra-low latency but predictable latency, right?

There's a second part to this, which is not just the application performance but also the lack of storage bandwidth. As you increase the number of GPU cores, a lot of these systems don't have enough memory and storage. And if the memory is not large enough and the storage isn't large enough, you're going to do frequent checkpointing; to get the highest bandwidth, every few hours you'll need to dump a checkpoint and deal with different types of mechanisms to do that at 400 gig, 800 gig.

So AI is not just about optimizing the application and job completion time but also about the fast and reliable connection between GPUs and the associated memory and storage.
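Some rough, hypothetical checkpoint arithmetic (model size, bytes per parameter and storage bandwidth are all assumptions) shows why those periodic dumps put real pressure on the 400-gig/800-gig paths to memory and storage.

```python
# Illustrative checkpoint arithmetic; every figure is an assumption.
PARAMS = 175e9            # assumed parameter count
BYTES_PER_PARAM = 12      # e.g. fp32 weights plus optimizer state (assumed)
STORAGE_Gbps = 800        # assumed aggregate bandwidth to checkpoint storage

checkpoint_tb = PARAMS * BYTES_PER_PARAM / 1e12
dump_seconds = (PARAMS * BYTES_PER_PARAM * 8) / (STORAGE_Gbps * 1e9)

print(f"checkpoint size : ~{checkpoint_tb:.1f} TB")
print(f"dump time at {STORAGE_Gbps} Gb/s : ~{dump_seconds:.0f} s of stalled or degraded training")
```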

There's another thing, I think, that people miss, which is that in the traditional CPU world, you had well-defined compilers and you had very good frameworks for them. Today, whether you look at NVIDIA's CUDA or open-source frameworks like PyTorch, this is an area where there's a significant amount of work that needs to be done, because relative to AI, you can think of PyTorch as something like a new abstraction for the C language. You can't just have a classic compiler and library; you've got to really map the software ecosystem to be able to do all of this.

And obviously, TensorFlow and Google TPUs are doing similar things. And in these compiler-based systems, which is again why these applications are driving so much optimization, you have to be looking at the count of operations, GPU trips, memory trips, et cetera.

So the ramifications of all of these applications have a huge repercussion on the network being mission-critical and high performance, on the memory, on the storage, and on the applications and the ecosystem to run them. These are all things we did in the CPU world and have taken for granted these last couple of decades, and that's what we have to look forward to.

Tal Liani

Got it. Last question. I always ask about the impact of power and physical constraints, and it's so important within AI. What does it mean for future AI products on the switching side? We discuss these topics less, but those that design datacenters know that this is probably the most important topic for them.

Jayshree Ullal

Yes. It's probably one of those dull, boring topics, but the most critical, as you said, because it's real world. If I just step back and say what's happening to power: as you start to go into 400-gig, 800-gig territory, the optics is playing a bigger role in the power. It can be 30% of your switch power, whether it's AI or not, by the way; AI will just make that worse. And this is why Linear Drive and the things we're doing to improve the power of the optics are so critical, to bring it down by half.

The second thing is you don't always need optics if you're within a data center. There are a lot of different types of non-optical cables we can use to reduce that power in a network configuration. But remember, the network is only probably 10% or 15% of the power contribution. In the AI network, we have to worry about the other 85% to 90%, particularly as you're bringing all these clusters of GPUs together. And we're starting to work with customers that are looking at very, very advanced techniques of liquid cooling, not just ambient, where they have to really worry about the immersion and cooling systems required for these GPUs, which then, of course, has an impact on the network as well. And that is the 90% problem in AI today.
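A quick back-of-the-envelope sketch of that power split, using only the percentages mentioned above (optics roughly 30% of switch power, the network roughly 10-15% of the total, Linear Drive aiming to roughly halve optics power); the absolute wattage is invented for illustration.

```python
# Illustrative power-budget arithmetic; absolute numbers are assumptions,
# percentages follow the figures discussed above.
cluster_kw = 1000.0                     # assumed total AI cluster power
network_kw = 0.125 * cluster_kw         # network ~10-15% of the total
optics_kw = 0.30 * network_kw           # optics ~30% of switch power
rest_kw = cluster_kw - network_kw       # GPUs, memory, cooling: the 85-90%

optics_with_lpo_kw = optics_kw / 2      # Linear Drive aims to halve optics power

print(f"network          : {network_kw:.0f} kW "
      f"(optics {optics_kw:.0f} kW -> ~{optics_with_lpo_kw:.0f} kW with Linear Drive)")
print(f"GPUs and cooling : {rest_kw:.0f} kW  <- the 85-90% problem")
```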

Tal Liani

Got it. One day, someone will learn how to operate these GPUs inside an aquarium.

Jayshree Ullal

I see them along with the fishes, I guess, that will be a new museum.

Tal Liani

Jayshree, thank you so much. It was a very deep and thorough discussion. I managed to ask all the questions I got from investors and pretend that these were my questions. So thanks for the time and the effort. And for the investors, if you have any other questions, please don't hesitate to call me; if I don't know the answer, I will forward it to the IR team of Arista.

Jayshree Ullal

Tal, always a pleasure, and thank you, guys, for having me. I look forward to connecting again soon.

Tal Liani

Perfect. Thank you so much.

Jayshree Ullal

Take care.

Tal Liani

Bye-bye.

Jayshree Ullal

Bye now.