Video: Nebius Edit V1 | Duration: 3307s | Summary: Nebius Edit V1 | Chapters: Welcome and Introduction (0.56s), Nextflow and Seqera (94.68s), Introducing Nebius AI (299.045s), Protein Design Pipeline (679.03s), Kubernetes Cluster Setup (1735.94s), Kubernetes Cluster Integration (2236.395s), Seqera Platform Benefits (2677.225s), Upcoming Events and Q&A (2806.55s), Q&A and Conclusion (2911.95s), Conclusion and Farewell (3267.6848s)
Transcript for "Nebius Edit V1": Okay, I'm back. Welcome again, everyone. Thanks for joining us today in our webinar. I'm gonna start sharing my screen in a second, and we'll get going with the webinar. We have a bit of a presentation, some slides in the beginning, and then we'll kick it off into some live demos. Okay. I hope everyone can see my screen. And let's kick it off. Yeah. Thanks so much for joining us today for our webinar, From Blank Page to Pipeline, where we will show you how you can use Seqera AI, Nebius and Seqera Platform to run Nextflow workflows for your analysis at scale. My name is Florian Wudemann. I'm a bioinformatics engineer at Seqera. And I'm joined today by Daria Balashova, who's an AI engineer at Nebius, and Ilya Burkov, who's the global head of healthcare and life sciences at Nebius. To give you a bit of an overview of what we're going to talk about today, we'll start off by introducing you to Nextflow and Seqera. Ilya will then talk about Nebius and introduce you to their products. I will then come back and explain to you what the Nipah binder competition is and give you some background about protein design in general. And then I'll show you how to build a protein design Nextflow pipeline with Seqera AI. After that, I'll pass it over to Daria, who's gonna show us how you can set up a managed Kubernetes cluster on Nebius. And then I'll finish this off by showing how to use that cluster to launch your pipeline at scale. And then we'll have some closing slides and time for questions. So let's get started. So what are Nextflow and Seqera? Nextflow has become the standard for reproducible, scalable scientific workflows compared to other solutions. We've seen the adoption of Nextflow increase not only by citation number, but also by the number of bioinformaticians that run and write Nextflow.
The number of modern bioinformatics workflows written in Nextflow is also steadily increasing, as we can see by the numbers, and there are about 2,500,000 executions of Nextflow monthly around the world. So, why has Nextflow become the standard for scientific analysis? Nextflow is extremely powerful for prototyping extremely fast. You can write a pipeline on your laptop, run it on a small test dataset, and then basically go to the cloud and run your analysis at scale. The ability to use a large range of containerization engines, such as Docker, Singularity, and Podman, makes your analysis really reliable and reproducible. The fact that you have a large range of cloud executors and executors for local HPC systems also makes it extremely powerful, which means, again, you can code a Nextflow pipeline on your local computer and then easily port that same pipeline to AWS, Nebius or other providers. And this portability is really one of the greatest features of Nextflow. Write it once and run it anywhere. So, who is Seqera? At Seqera, we are the company behind Nextflow. In 2013, our founders, Evan and Paolo, developed Nextflow at the CRG in Barcelona. And since then, as I've just shown you, it's really risen in popularity. It got so popular that in 2017, a bunch of people created nf-core, a community project that now hosts over 100 different state-of-the-art Nextflow pipelines that people can use off the shelf to analyze their omics data. In 2018, Evan and Paolo then founded Seqera with the goal of really allowing enterprises, large corporations and power users to run Nextflow at scale in their organizations. Today, Seqera serves more than 130 customers, of which more than half are biopharma and biotech. But we've also seen the adoption of Nextflow in areas outside of biopharma and biotech. For example, we have people that run Nextflow for agriculture pipelines and satellite imagery. You can basically use it for whatever you want.
And it's used by more than 100,000 scientists worldwide. At Seqera, we provide a unified, governed execution layer for modern computational science. That means that you can run your workflows in a standardized way. You can orchestrate them across all these different executors and cloud providers that we've shown you. You have full execution and lineage auditability. You have access control that allows your organization to determine who gets to see and execute what. You get cost visibility across your providers, and you really get production-grade, scalable and reliable execution. And all of this is augmented now with Seqera AI, which can help you set up, run, debug, and really run your Nextflow pipelines at scale. At Seqera, we basically take away the infrastructure headache so you can deploy your Nextflow pipelines reliably and at scale. Now I'll pass it over to Ilya Burkov, who's going to tell us all about Nebius. Ilya, welcome on stage. Hi. Thanks, Florian. Hi, everyone. Huge thank you for joining us today. My name is Dr. Ilya Burkov. I'm the Global Head of Healthcare and Life Sciences here at Nebius. It's an absolute pleasure to co-host this event. This webinar is going to be a fantastic opportunity to share what we've done with our friends at Seqera. Before we jump into the live build, I want to give you a quick, clear picture of who Nebius is, why we're so passionate about life sciences and why our partnership with Seqera is such a game changer for all of you. Nebius is a global AI infrastructure company. We're headquartered in Amsterdam, and we are publicly listed on the NASDAQ. We build full-stack solutions that are vertically integrated within our AI cloud infrastructure. And this infrastructure is purpose built, from the silicon up, for the huge demands of modern AI. Think massive clusters of the latest NVIDIA GPUs, from the H100s all the way to the GB300s and everything in between.
We offer ultra-low-latency InfiniBand networking and production-grade orchestration tools with Kubernetes and Slurm, delivered with the simplicity and affordability of a modern cloud environment. All at prices below those of the hyperscalers, in a package that offers far more than most neoclouds out there. We offer this all in one place for all of the training and inference needs that you might have. Our mission is simple: we want to democratize access to world-class AI compute so that innovators everywhere, from startups to research labs and enterprises, can move faster, cheaper and at greater scale than ever before. Nowhere is this mission more exciting than in life sciences and healthcare. We all know that AI is fundamentally rewriting the rules of drug discovery, genomics, personalized medicine and protein design, the very topic we're going to be tackling today in the webinar. But these breakthroughs demand access to enormous compute: training on millions of molecules or multi-billion-parameter foundation models, running thousands of docking simulations in parallel, or scaling AlphaFold-style predictions across entire proteomes. That's exactly what Nebius was engineered for. We give you access to state-of-the-art GPUs in multiple locations around the world at highly competitive prices, and we are in hyper-growth mode. As you can see, we are increasing our capacity around the world, including in Europe and in the US, and we're just getting started. So we offer GPUs that are pre-optimized with BioNeMo and NVIDIA NIM microservices, if you need that. We have SOC 2 and HIPAA compliant environments that are very good for sensitive biological data. One less thing for you to think about. And more than 90% GPU utilization, all out of the box. So customers regularly tell us that they're seeing two to four times faster training. Even when they use the same GPUs in other environments, they are not seeing that kind of performance increase.
They're cutting their model development cycles from weeks or months down to a few days in some instances. And we're very excited to be able to share that we even put our money where our mouth is. Last year, we launched the Nebius AI Discovery Awards. Those awards basically gave hundreds of thousands of dollars in GPU credits to pioneering teams in cancer diagnostics, protein targeting, single-cell transcriptomics and precision medicine. Why? Well, because we believe that the next generation of life-saving breakthroughs should never be bottlenecked by the infrastructure. And we're happy to be launching the 2026 awards next week. So consider applying if you've got a brilliant idea or you're working on something interesting. And that brings me to why today's webinar is so special. Great infrastructure alone isn't enough. You also need great workflows. And that's where our close partnership with Seqera comes in. Seqera's Nextflow and the new Seqera AI tools that Florian mentioned, and that we'll see later on, are the gold standard for reproducible, portable, cloud-native bioinformatics pipelines. We worked hand in hand with the Seqera team so that you can go from a Jupyter notebook or a blank Git repo straight to a fully orchestrated, production-scale Nextflow pipeline running on a few or perhaps a thousand GPUs, with zero lock-in, full auditability and effortless scaling. You'll see exactly that live in the next hour or so. And by the end of the session, you'll walk away knowing precisely how to turn your own ideas into scalable, reproducible, production-ready workflows in a matter of days, not months. So thank you again for being here. We're incredibly excited to share this with you. Florian, Daria, the stage is yours. Let's build something amazing together. Thank you. Thank you so much, Ilya. That was really inspiring.
I'm gonna start sharing my screen again, and we're gonna jump now to the section where we're gonna talk about protein design, as Ilya mentioned. For those of you that are not familiar with it, I'll first give you a short introduction, in very rough terms, to what protein design means and how we think about it. And then we're going to jump into how we've built a protein design pipeline using Seqera AI and Nextflow. So, what is protein design? In essence, in protein design it is often the case that we have a target, and this might be some virus protein, a transmembrane protein, a cancer target. Usually what you want to do is block part of that protein from interacting with another protein, which prevents some kind of signaling. For example, maybe you want to block inflammatory signaling or you're trying to block a virus from binding somewhere. To do that, there has been a huge explosion of tools recently that allow you to basically use generative modeling to come up with these new binding proteins. And these tools are often written in a vast array of languages. Many use Python, but Python is certainly not the only language being used. Sometimes they require specialized hardware, such as the GPUs Ilya mentioned. And Nextflow basically makes it easy to use all of this together and combine it into one functional pipeline by offering containerization using Docker, Singularity, or Conda, or really any of the container engines that you are used to, and wrapping these different languages in these containers. That means that we can run these analyses anywhere we want, in any cloud, on any computer, and they run reproducibly, which means a pipeline I write today on my computer runs the same way tomorrow and in a year and in two years on the newest cloud or the newest Nebius infrastructure that comes out.
So what we're going to show you today is why we think that using Nextflow to orchestrate these different protein design tools, and to build pipelines connecting these tools across different languages, is the way to go to make reproducible and scalable analyses that you can trust. And what you end up with in the end is basically these candidate binders, novel proteins that we've dreamed up that might be able to become novel therapeutics and cure diseases in the future. So, a bit of background on why we really started with this last year. In 2025, a company called Adaptyv organized this Nipah binder competition. The goal of this competition was, in essence, quite simple. It was to target the glycoprotein G of the Nipah virus, which is a quite deadly zoonotic virus, and come up with a new binding protein. There were some rules. The binding protein had to be smaller than 250 amino acids, and it had to be 10 amino acids away from the closest protein out there. Basically, so that people would not just take the known binders and put them into the competition. We had dabbled a bit before in making a protein design pipeline. For our annual summit in Barcelona in 2024, my colleagues and I from the scientific development team at Seqera actually already built a protein design pipeline by hand. At the time, we used the tools that were state of the art for using generative modeling to come up with new binders. We used RFdiffusion followed by ProteinMPNN and a bunch of other tools to pretty much dream up new binders and get some metrics about them that would tell us whether they are good binders or not. However, in 2024, some of you might remember, AI coding tools were really not at the point where you could just throw them at a problem and they would write you reliable code. And so when we did this, we really had to do it the old way, I would say now in 2026, which was quite time consuming and required a significant amount of manual work.
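As a rough illustration of the two competition rules mentioned above, here is a minimal Python sketch. It assumes that "10 amino acids away" means a Levenshtein edit distance of at least 10 from every known binder; the talk does not say how the competition actually measured distance, so treat that as an assumption. The function and constant names are hypothetical and are not taken from the pipeline.

```python
# Sketch only: interprets "10 amino acids away" as Levenshtein distance >= 10.
# This interpretation is an assumption, not the competition's documented metric.

MAX_LENGTH = 250   # binder must be smaller than 250 amino acids
MIN_DISTANCE = 10  # assumed minimum edit distance to any known protein


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def is_eligible(candidate: str, known_binders: list[str]) -> bool:
    """Return True if the candidate passes both (assumed) competition rules."""
    if len(candidate) >= MAX_LENGTH:
        return False
    return all(edit_distance(candidate, known) >= MIN_DISTANCE
               for known in known_binders)
```

A check like this could sit at the end of a design run as a cheap filter before any expensive refolding step, which is one reason to keep it free of external dependencies.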
It pretty much took us a few weeks to build up this POC pipeline. And simply getting containers to work and finding out how the tools work took a lot of searching around. Now, in 2026, with AI tools, this is a lot easier. And that's pretty much what I wanna show you: how can we, in 2026, use AI tooling to really help and guide us in writing these pipelines and accelerate the development workflow from weeks or months to days? And that's what I'm gonna be talking about and showing you today: Seqera AI, the tool that we built at Seqera to build entire Nextflow pipelines from scratch. Seqera AI can really do a lot. It can generate and validate entire Nextflow pipelines. You can convert legacy pipelines from other workflow managers, such as CWL, WDL, or Snakemake, or any Bash scripts you have flying around, and pretty much weave them into a functional Nextflow pipeline quite quickly. And the cool part with Seqera AI is that we just brought out this week a new Seqera AI CLI, similar to other CLI tools you might have seen, that can also directly connect to your GitHub repositories. And so you can use Seqera AI while you're developing your Nextflow pipeline and have it author commit messages and pull requests on your behalf, meaning it can be your partner in developing and testing your Nextflow pipelines. Overall, Seqera AI really accelerates the writing, deployment and optimization of your Nextflow pipelines. Before we actually get started with the demo, I just wanna talk about the actual pipeline that we built for this Nipah binder competition. While I'm going to show you Seqera AI writing some pipeline code today, I think we can appreciate that while pipeline development is accelerated, we're still not able to write an entire pipeline end to end in the length of a webinar.
And so, we're going to use NF Protein Design, the pipeline that we built for the Nipah binder competition, as our guiding light, and we'll run this pipeline throughout the webinar as the example of a full-scale protein design pipeline. Now, when we thought about how we wanted to build this pipeline, we started out by basically saying we would like a sample sheet. For those that are familiar with nf-core, that's kind of the way that we put data into nf-core pipelines. The sample sheet is a CSV file that contains the regions of the target protein that we want to target and some of the parameters of the models we wanna use. At the end of last year, a new model came out called BoltzGen that was really exciting. And we really wanted to try to use a state-of-the-art model in this pipeline. And so the first step of the pipeline is running BoltzGen to basically come up with binders. We then put the output of that through ProteinMPNN. And I'm putting here the pull request messages that Seqera AI made. Every little tool that we added here was completely implemented by Seqera AI at the time. And the pull request was opened and merged by me and my colleagues. And that pretty much allowed us to go from a blank page to having a full-scale Nextflow pipeline that we're going to run at the end of the webinar today. ProteinMPNN is there to optimize the sequence so that it is better expressed, basically more expressible in bacteria. We then take that optimized sequence and have to refold it with the target protein to get metrics on whether it is a potentially good binder or not. And we then chose PRODIGY and a custom script to do some ipSAE calculations; ipSAE was the metric that the competition used to rank the binders. Ultimately, we then put all of this into an interactive HTML report that we could explore. The point here, though, is that the specific tools we were using are not actually that important.
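To make the sample sheet idea described above a bit more concrete, here is a small Python sketch that parses and validates an nf-core style CSV. The column names (`sample`, `target_structure`, `target_chain`, `hotspot_residues`, `num_designs`) and the file name `nipah_g.pdb` are hypothetical placeholders chosen for illustration; the real pipeline's sample sheet schema is not shown in the talk and may well differ.

```python
import csv
import io

# Hypothetical sample sheet: column names and values are illustrative only.
SAMPLESHEET = """sample,target_structure,target_chain,hotspot_residues,num_designs
nipah_g_site1,nipah_g.pdb,A,"45,46,50",10
nipah_g_site2,nipah_g.pdb,A,"120,121",10
"""

REQUIRED_COLUMNS = ("sample", "target_structure", "target_chain",
                    "hotspot_residues", "num_designs")


def parse_samplesheet(text: str) -> list[dict]:
    """Parse the CSV text and convert fields to typed values."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {missing}")
    rows = []
    for row in reader:
        rows.append({
            "sample": row["sample"],
            "target_structure": row["target_structure"],
            "target_chain": row["target_chain"],
            # Residue numbers on the target that the binder should contact.
            "hotspot_residues": [int(r) for r in row["hotspot_residues"].split(",")],
            "num_designs": int(row["num_designs"]),
        })
    return rows


rows = parse_samplesheet(SAMPLESHEET)
print(rows[0]["sample"], rows[0]["hotspot_residues"])
```

Each row would correspond to one design job, so scaling up a run is mostly a matter of adding rows or raising the number of designs per row.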
The point I want to get across is that Nextflow and Seqera allow you to write your pipelines with any tools that are state of the art at the moment and that you wanna try out in your organization. You're not locked into any specific language or any specific tool. This is just kind of a snapshot of the field at the time when we were writing this pipeline. And this is what we're gonna use today as an example. Okay, now let's jump into the demo. I'm gonna go right here to my screen. And I'm on seqera.io here, on the Seqera AI page. And I've already populated this with a prompt, but I wanna just shortly walk you through what Seqera AI in the browser offers you. So when you are on seqera.io and you're an existing cloud customer, you'll be automatically signed into Seqera AI with the credentials that you also use to sign into Seqera Platform. This behaves like any other chat AI interface that you might know, with a few added bonuses. One of the coolest features, one that I think is really nice, is this managed token, basically connecting it to a GitHub repository. If you go to your little user button here and you click on managed token, you can add a GitHub token here that needs content and pull request permissions, which will connect Seqera AI to that GitHub repository. I've already done this here, and I've created a repository on the right here that's called Seqera Nebius Protein Design Webinar. I've made a main branch that already contains some code, but Seqera AI doesn't see this. I've also created an empty branch, and that's what we're going to use to let Seqera AI write Nextflow code for us. What I've asked Seqera AI to do here is pretty much recreate the first step of the pipeline that I've shown you on the previous slide. I said, Hey, please let's use BoltzGen. I'm pointing it to the GitHub repository so it can look at the code. And we wanna use nf-core best practices and nf-core style sample sheets.
We'll then pass these binders to another process that we'll add later on using the Seqera AI CLI on a virtual machine. And we wanna use Docker as our main container engine. And I also like documentation, so please write a readme and then commit all of these changes. So I'm gonna submit this live now. This might take a second or two to actually work. But what we're gonna see now is Seqera AI make a plan, figure out what it has to do to implement this first process of the pipeline, and hopefully write us a first functional step. Here we go. There's a bunch of tasks that Seqera AI scheduled for itself. And the beautiful thing with this is that Seqera AI actually uses a sandbox environment. So what it will do is git clone your Git repository. That means you can also connect it to existing Git repositories with legacy pipeline code from other languages or with existing Nextflow pipelines that you want to update or add features to. Whatever you can dream of with your Nextflow pipeline, Seqera AI can hopefully do it. In the sandbox, it'll pretty much now change the code, potentially run it and test things. This is great and really allows you to work asynchronously. I just have this in my browser. I don't need a dedicated machine to run Nextflow. I don't even need to have Nextflow installed to do anything here. I can immediately get going writing code. However, as you might imagine, especially when working with protein design tools, sometimes we need dedicated infrastructure like GPUs to actually test these tools, as many of these protein binder design tools require NVIDIA GPUs to do what they do best. Now for that, while Seqera AI is working in the background, I'm going to switch my tab here and I'm gonna go to Nebius. And on Nebius, you have a bunch of services. The first thing I will use is a virtual machine. Making a virtual machine is extremely easy.
I just click on create virtual machine and it gives me the option to either create a machine with GPU or without GPU, and to make it preemptible or regular. I wanna have this machine available and not have it taken from me. And then I can use any reservations I have here, for example, to say I have some GPU capacity, and then select what type of GPU I want. I'm gonna go back to, again, show the options of what we have here. I can use an H200 or an H100. In this case, I'm gonna use an H100. One of the great things here is I can pick the operating system I wanna use, but one of the really cool things is that I can connect it to a shared file system. I'm going to do that because I have a prepared shared file system here that we will use to hold some of the databases that some of the tools in our pipelines require. And by providing these databases upfront, Nextflow doesn't have to download them every time and get slowed down by this, but can directly access them from the shared file system. By being able to connect this to a virtual machine beforehand, I can pretty much prepare my production-type runs using an available, quickly accessible virtual machine. Okay, that's pretty much all we need. I have an SSH key connected to this machine. I'm gonna assign it an IP so that I can connect to it from my computer. And then I'm gonna create this. Since it's obviously gonna take a second to boot up and I prepared stuff for the webinar, I already have a virtual machine running here. And I'm going to switch to that in a second. We're just gonna check in on whether Seqera AI is done pushing our changes. It's still running right now. Okay. So, what we're gonna do then: I have prepared, like I said, another branch where we already have some code, and I'm going to the right here to my VS Code environment. So, this VS Code is now connected to a Nebius virtual machine that has a GPU available. So, if I run nvidia-smi here, I can see that I have an H100 available.
As I mentioned, while the Seqera AI web interface is great for when you wanna do async development, sometimes you need a GPU and you wanna actually be in the terminal developing with the AI. For this, we have the Seqera AI CLI. What I will do first is quickly, in that same repository, go back to the main branch, and this one already has that first BoltzGen module in it. I'm gonna start the Seqera AI CLI here. Hide this. And this brings us to the CLI interface of Seqera AI. I'm first gonna ask it, actually, what is this pipeline doing? And what it will do now is go and look at the pipeline code in the repository that it's in, tell me which files it's looking at, and then hopefully explain quickly what the code in this repository is doing. So when you're looking at a new code base, or when you're developing on code from someone else, or you're trying to add features to an existing pipeline, this is really helpful to get a quick overview. We can see here: I have a complete understanding of the pipeline, this is a protein binder pipeline, it uses BoltzGen, and what it does. So, this gives us a very nice high-level overview. But now I could say, Great. Please add ProteinMPNN as a method after BoltzGen. And I obviously could be a bit more verbose here, but in the interest of time for the webinar, I'm using a very simple prompt. But this will basically kick off Seqera AI to go and add this module to our pipeline code. Again, this can take a bit longer than we have time for here. The point that I want to get across to you is that you can basically seamlessly jump from developing in the web browser, using the Seqera AI web interface, to developing on a dedicated machine with GPUs, testing your pipeline and, should the AI still make mistakes, fixing them manually, and really quickly get to the point where your pipeline can run. So, while this is running, I'm actually going to jump to the existing pipeline that we have.
I told you I have a copy here of the full NF Protein Design pipeline that contains our full Nextflow pipeline code. We don't have to go through the details here. The repository is available online for you to look at. What I want to show you is that now that I'm on this GPU-enabled VM, I can pretty much run this pipeline. I could also ask Seqera AI to run it for me and check that it runs end to end with a small test dataset. So here we're really running a very, very small test dataset. I have, of course, already run this pipeline previously, because running it would take about five to six minutes, which is a bit too long for the webinar. But Nextflow has this great caching mechanism, which means if it knows it has already produced and processed this data, it will just use the cached results. We can see that it really runs end to end here and that all the steps that we had implemented for the competition are functional. So now, going from pipeline development in the browser to testing it on a GPU-enabled machine, this gives us the confidence that this pipeline works end to end on a small test dataset. And the next step now is obviously scaling this up. Instead of doing a few small designs, maybe do hundreds of thousands of designs that run in parallel on dedicated Kubernetes infrastructure, maybe on hundreds of GPUs, to really get the data processed or create these new binding proteins that you're interested in. Okay, and for this, I will pass it over to Daria, who will walk us through how to set up a Kubernetes cluster on Nebius. Take it away, Daria. Thanks, Florian. Hello, everyone. My name is Daria, and I'm an AI engineer at Nebius. Today, I'll continue the demonstration by deploying a Kubernetes cluster and then configuring it to run a Seqera Nextflow pipeline. So let me share my screen. Okay. Now I guess you can see the Nebius console.
And what I'm going to do here: first, I'll create a Kubernetes cluster, and I'll start with a service account that gives the CPU and GPU nodes the permissions they need to access cloud services, such as pulling images, mounting storage, and so on. So let's start here. I go to service accounts and create a new one. Let's call it Seqera service account. And I'll give it admin rights. Okay. Now we can see it here. Then I'll continue with the shared file system. A shared file system is storage that multiple nodes and pods can access at the same time. So we'll need it during a run of our pipeline. It's here, shared file system. I'm creating a new one. Let's give it 1,000 gigabytes of capacity and call it Seqera file system. Okay. We see the progress, and the next step is to create the Kubernetes cluster itself. I go to Compute, Kubernetes, and create cluster. I call it Seqera Cluster, and we enable the public endpoint so external tools like Seqera can connect to the Kubernetes API, and I can do it here. Now you can see that the cluster is on the way, and for our pipeline, we actually need CPU and GPU nodes. Let's start with CPU nodes. I go here and create the first node group. So it will be the CPU node group. I'll stay with three nodes, without GPU, and I'll configure the preset with CPU two and sixteen CPUs for every node. What else? About disk size. So disk size defines how much local storage a node has and also affects disk performance, like IOPS and bandwidth. For now we stay with 96. We don't need more, but for some larger pipelines, we can change this. And I'm attaching the shared file system that I just created. And I'll change the mount tag to data, so later, with the persistent volume claim, I can connect to it with this tag. What else? Here I use the service account that I created before. And I guess that's it. So I'm creating the first group, and then I'll continue with the second node group. The second node group will be GPUs. I call it GPU node group. And here I will use autoscaling nodes.
It means that we can scale from zero to, let's say, two nodes. And when we actually have a GPU workload, these nodes start running and work on our tasks. When we don't have a GPU workload, they just stop running, and we don't pay anything for them. So we work with GPUs, and we reserved some capacity before, so I'll use specific reservations. Everything is right here, so we will use two nodes with H100s, with eight GPUs per node. Mhmm. Everything is fine. So the disk is in place, and I'm attaching the same shared file system. And the same mount tag, data. And finally, I use the Seqera service account and create the node group. Okay. That's fine. We see both node groups in the provisioning state. Let's just check quickly what's going on here. I think it is on the way, our CPU node group. And meanwhile, I'll go and configure the cluster from the command line. So for that, I will need the project ID. I can find it here. And here you see the plan. So what am I gonna do? First, I connect the Nebius CLI with the project ID and run it in the command line. Next, I'll connect the CLI with the existing cluster. So I'll go to how to connect and just copy and paste this command. Mhmm. Everything seems to be fine. Next, I'm going to mount the shared file system to the pods of the Kubernetes cluster. So first, we set up shared storage. We install a CSI driver called csi-mounted-fs-path using Helm. So these steps are here. Okay. Everything is on the way, and then I create a namespace and apply the PVC. The code is here. So here we create a persistent volume and a persistent volume claim with 100 gigabytes of storage, and we use ReadWriteMany so multiple pods can read and write to the same storage. And I'm using these commands to apply it. Okay. Seems to be fine. And as the next step, I'm preparing the cluster for Seqera Platform. For this, first, I need to apply the Tower Launcher. This code is here. So we create a service account and give it permission to create and manage pods, jobs, and other resources needed to run workflows. Mhmm. So I'm running this code.
And after that, finally, I generate a token for the service account, and Seqera uses this token to connect to the Kubernetes cluster and launch workflows. Okay. Now you can see the token on the screen, and that's it. So the next step is to go back to the Seqera website and actually use this cluster to run the pipeline. Florian, over to you. Yes. Thank you, Daria. Okay. Perfect. Thank you so much for this walkthrough of how to set up a Kubernetes cluster on Nebius. I'm going to share my screen again, and I'm jumping right into Seqera Platform now. As Daria has prepared the Kubernetes cluster for us and it's running, the next thing we have to do is basically tell Seqera Platform how to interact with it. The window you can see here on the left is Seqera Platform. I'm in a workspace in Seqera Labs called Protein Design. You can find Seqera Platform on seqera.io, on the top right, when you click on platform. As you can see, we already added two pipelines to our Launchpad here: a Hello World pipeline I used to test some things beforehand, and the NF Protein Design pipeline. This is exactly the same pipeline that I was running earlier on our GPU-enabled VM. Now, to enable and use the Kubernetes cluster in Seqera, there are two things that we have to do. The first one is we have to add some credentials here so that we can talk to that Kubernetes cluster via the service account. And the second one is adding a compute environment that lets us define where these pipelines should run. Now, I can do this in the UI and say, Hey, I wanna add workspace credentials here. But I actually wanna show you how I can do that using Seqera AI as well. Because Seqera AI uses your login from Seqera Platform, you can ask it to do things for you on the platform end. So I'm going to open the Seqera AI CLI. I wrote it wrong, sorry about that. Start it up here. I'm gonna say, Can you please add credentials for me for Kubernetes? Using this command.
And this is the command that Daria used to get the token. I'm just telling Seqera AI specifically that it should use this command so it doesn't use anything else that it could come up with for Kubernetes. And it asks me for approval now. It wants to run a kubectl command that will basically get the secret. I'm gonna say yes. We see the secret here. And now it should use the Seqera API to interact with Seqera Platform on my behalf. It knows that it's in this workspace. It has my user permissions. And we should see a new Kubernetes credential pop up here in a second. I already added one in preparation for this webinar, of course, so that we can run on an existing Kubernetes cluster, as the one that Daria has created is still setting up. It takes a bit longer to have this cluster available than the time between her handing it over and me using it. So I'm going to set up credentials and a compute environment here, but for actually running the pipeline, we are going to use one that has already been set up. You can see that it's making a Seqera API call here. And if I refresh the page, a Tower Launcher Kubernetes token has been created. So I can really use Seqera AI as an assistant to help me set up infrastructure, launch pipelines, and debug pipelines on Seqera Platform as well. The next thing is the compute environment. Okay, let's say: awesome, can you also create a new K8s compute environment for me using these new credentials, please? And now it's gonna check what compute environments exist. It might tell me that there is already a Kubernetes compute environment, or it'll go directly and create it. The nice thing here is, if I wanted to create that compute environment by hand, I would have to select the platform, Kubernetes, give it a name, and there's a bunch of configuration for Kubernetes that we have to do. What is the control plane URL of the cluster? What's the SSL certificate?
And I could go and find all of this information, but I could also let Seqera AI find this information for me and automatically enter it via an API call. Now, this is obviously a small, isolated toy Kubernetes cluster, and I'm not suggesting you necessarily do this on your production organization cluster, because there might be rules preventing this. I just wanna show you what is possible with agents and AI CLIs today. It's basically finding out all of the information that we've been asked for, that I've just shown you in the create compute environment form, collecting that information, and then making an API call to create that compute environment for me. Okay, it should only take a few more seconds here. And then once that is done, we're actually going to go and launch our pipeline with a larger dataset on the Kubernetes cluster. Okay. It says it created it, and here we go. A new Kubernetes compute environment for us, without me doing pretty much anything. It's quite handy. Okay. Now that we have our cluster registered with Seqera, we're going to jump back to the Launchpad and launch NF Protein Design. I've added this previously, but adding it could also be done via Seqera AI, by just pointing it at the GitHub repository and telling it to add this pipeline to the Launchpad. In the interest of time, I've done this previously. And when we click launch, we're basically presented with a configuration page. Again, you could have Seqera AI launch this for you and give it natural language commands for the parameters you wanna use. For this part, I'm actually gonna do it in the UI. I'm gonna check that we're using the production Kubernetes cluster that I previously created and that works. We have a shared file system mounted on scratch, which contains some of the references that we require. And I registered a dataset, so a sample sheet that contains the designs that I care about. It's the actual designs we used in the competition, and we want to run these.
I'm gonna write the results to scratch. Webinar results. And one more thing, I'm gonna point it to this gen models, to the actual path that contains the checkpoints for Walsh Gins, so that it doesn't download them on each run. Then I can click launch, and we're gonna see that this pipeline is being launched on our runs page. The really cool thing with Kubernetes is that, because these are hot nodes and they're already running, we don't have to wait for anything to spin up. The pipeline will immediately kick off here and start running. It failed, because I probably misconfigured something, but I've already created a previous run here that can show you some of the things that we can see in Seqera Platform. You can see that the same kind of metrics that we had on the VM are visible. We can jump straight into any task and look at the execution log of the tasks here, which is really handy when you're debugging things and looking at what might've gone wrong. We can also have some reports here that could show us things. And ultimately, when we're done with all of this, what we care about are the proteins that we created. We can use Data Explorer here to directly look at these protein renderings and find out what kind of binders we created. So Seqera really allows you to go from the small pipeline that you've developed on your computer or on your virtual machine, scale it up, run it across thousands of pods or nodes on your Kubernetes cluster, and use Seqera AI to guide that interaction with the platform for you and help you debug things. Okay, I'm going to go back to the slides now. So, what Seqera Platform really offers is scale. It allows you to run thousands of workflows without any operational chaos. It provides governance, which means you know what your teams are doing. Only the people that should have access to specific compute infrastructure have access to that infrastructure. It gives you traceability.
All these workflow runs in your organization are visible to the people that need to know about them, to the stakeholders. It also provides security boundaries. Seqera Platform allows you to connect to your compute environments, to your infrastructure, like we did just now with Nebius. The data is in our Nebius account, not in Seqera's account. The same is true for any other cloud provider. It provides reproducibility by being Nextflow native, and it seamlessly integrates with the tools you are using today. With that, we believe Seqera is the infrastructure for modern science. Okay, now to sum up what we've done today. We've built a Nextflow pipeline, and using Seqera AI, we were able to do this in days rather than weeks. After testing it on a small scale on a virtual machine, we then deployed that pipeline to a full-scale Kubernetes cluster that can have hundreds of thousands of GPUs and really run extremely large workloads. Now, I've mentioned that we participated in this binder competition. We actually ranked sixth on the in silico metrics, which was quite exciting to us. Unfortunately though, the actual in vitro validation by Adaptyv showed that the protein we designed did not actually bind to the target in vitro. What was very cool to me though, was that the region that we targeted was also targeted by another team with a slightly different binder design. And theirs actually had the strongest binding possible, which suggests to me that in general our pipeline wasn't too bad. I'm not a domain expert in protein design, but using these AI tools, I was able to make an end-to-end pipeline that created a result that was realistic enough that it ranked on some in silico metrics and was actually tested in vitro. And the important part, again, is not the actual tools we used, but the approach: using Nextflow and AI-driven development with some domain expertise to drive these types of developments.
If you are interested in trying out Seqera AI yourself and you have a Seqera Platform account, you can scan these barcodes, and they'll bring you to the pages. You can use Seqera AI immediately, today. If you don't have an account, you can create one. If you are an enterprise customer, talk to your account executives about how we can enable Seqera AI in your organization. And to round it up, I wanna highlight some of the events that we have going on at Seqera. We have some really cool Seqera Sessions coming up on the American West Coast, in San Diego and San Francisco. We also have one for the European people, in London. And then we have our annual Nextflow Summit Boston happening in April. And for all of you that are just getting started with Nextflow and would like some training, there's a lot of material online that you can self-serve at any time, but we also have an online training, with people participating from all over the world, happening in May. With that, it's time for any questions you have, and thank you for your attention and for joining us today. And I'll get Daria on stage with me in case there are any questions specific to Nebius. Okay. Let's go through some of the questions that have been asked. Are there any integrations with other clouds besides Nebius? I think I should probably start at the earliest question that was asked. Will there be a recording of this webinar? I believe so. I don't know where it is. It's gonna be on YouTube, Lizzie. It's gonna be on YouTube, yes. So we do have expertise with Nextflow and MLOps to train and update models. Contact us, send an email to our Seqera support or to our marketing department, and we're happy to give you more information on that. Does Seqera use Nebius infrastructure? Yes. So we basically ran the pipeline just now on the Kubernetes cluster that was running on Nebius, by connecting that service account to Seqera Platform.
And so the service account was doing actions on our behalf, submitting the pipeline, monitoring the pipeline. But everything, the data, everything that was done, was on the actual Nebius infrastructure. So for data privacy reasons, all of the data stays on the actual cloud object storage or shared storage, file storage, wherever that is, and in that region. Can this be done locally instead of using a browser and a seqera.io login? So if by this you mean Seqera AI, then the answer is no, because Seqera AI needs a Seqera login to check your credentials, to connect you to your Seqera Platform resources, and also ultimately to take your credits when you reach the limit of the daily free usage. So there's some daily free usage, but if you go over it, obviously at some point we're going to charge you for the usage, and to be able to bill that, it goes through your Seqera Platform account. Can I use Seqera AI on my local laptop to build a pipeline? So, you can definitely use it on your local laptop. Like I said, you do need a Seqera login for that, though. But there's absolutely no issue with running it locally on your machine rather than on a virtual machine, as I did in this webinar. How can we rerun locally? Okay, I think we've answered that question. Is there a way to connect editors like Cursor to this model, or do you have to use the CLI? Yeah, so we are obviously aware that not everyone wants to adopt a new CLI. Maybe they wanna keep using Claude or Codex or any other model. We actually have a page where we show you how you can install Seqera AI as a skill in your existing coding agent, and that will basically use it as a sub-agent. So you can use it in your existing agents like Claude Code or Codex. What model is Seqera AI based on? So, at the moment we use Anthropic models. That means we're running on the Haiku, Opus, and Sonnet state-of-the-art models. Where are the cached results stored in Nebius?
So, the cached results mentioned here, I believe, are the Nextflow cache, from when I was showing that the pipeline ran very quickly because the results already existed. There are two answers to this. The first one is, when I ran it on the virtual machine, those cached results were on the disk that was mounted to the virtual machine by default. So if I delete that virtual machine, those cached results are gone. I can also define where the cache will be written, and I did that for the Kubernetes cluster, where I wrote it to the shared file system. So in this case, even if the virtual machine goes away, that shared cache is still there, and I could boot up a new virtual machine that has access to the cache. That is basically how, on Kubernetes, you can allow Nextflow to still have resume functionality, even though the pods and the virtual machines that are spinning up will die over time and be replaced by other pods. Can you share the prompt? If you go to the video recording, you can see the prompt. There was really nothing extremely special about it, to be honest. I don't think you need to be a prompt engineer nowadays to write prompts that give you good results with AI tools like Seqera AI. But check back in the recording in case you really wanna see how I prompted it. Is there integration with any other clouds besides Nebius? Yeah, as I've mentioned on some of the slides in the beginning, and as you could maybe see when I was making the compute environment, Seqera definitely integrates with all of the major clouds like AWS, Google Cloud, and Azure, and a bunch of others. You can go to the Seqera homepage to find out if the cloud that you're specifically interested in is supported, and if not, you can let us know, and we can see if there's already development work going on to support the cloud you care about, or if there's enough demand that it warrants development effort. Is there a Seqera plugin for PyCharm? Is JetBrains supported? I don't believe so.
We do have a VS Code plugin. I don't know if those VS Code plugins work in PyCharm or JetBrains. We'll have to pass that to someone else who knows. If I configure a pipeline with Seqera, how easily can I run it locally with fewer resources? Extremely easily. It's the same pipeline. You just need to provide a different configuration that tells it about the limited resources you have. If some of the processes require a lot of resources by default, you can override that. And so, as long as the pipeline can actually run on your computer or your machine, you can provide configuration that will allow you to do that. Okay, I think I actually made it to the last question here. Since the file system is all connected, would you get any error reports directly on Seqera? I believe the question is about what happens if there is a problem with the file system. So yes, we would get the same errors that you would get if you tried to run Nextflow with a connected file system. Often, these would be permission issues or other things that went wrong when you configured the file system. And yes, you can see them in the Nextflow log that is directly exposed in the Seqera Platform interface, so you do not necessarily have to be connected to a terminal at the same time as you are connected to Seqera Platform. Okay, those were all the questions, I believe. I'm gonna check with my backstage staff if there's anything else we should address. No? Okay, well, with that, I would say... Okay, so we do have one question for Ilya, so I'm gonna let Ilya answer that one. Yeah, I just replied in the chat. The thing is, you can set up various limits within the user interface. So there's quite a lot of functionality behind it, and you can have a look at that. Perfect. Awesome. Well, thank you all so much again for joining us today. If you have any questions, don't hesitate to reach out to us at Seqera or at Nebius. We're happy to connect you with the right sources and answer any of your questions async.
I hope you have a great day and a great rest of your week, and see you again soon. Bye bye. Thanks everyone. Thanks, bye.