tl;dr
- Announced major compute innovations: Graviton4, Trainium2/3, and P6 instances with NVIDIA Blackwell, delivering significant performance improvements for both general and AI workloads
- Revolutionized storage with S3 Table Buckets for Iceberg tables (3x better query performance) and S3 Metadata for instant data discovery and analytics
- Launched Aurora DSQL and enhanced DynamoDB Global Tables, enabling truly distributed databases with strong consistency and 4x faster performance than competitors
- Introduced Amazon Nova AI model family and enhanced Bedrock with automated reasoning checks and multi-agent collaboration, making GenAI practical for production
- Reimagined SageMaker as a unified platform for data, analytics, and AI with Zero-ETL capabilities, representing a fundamental shift in how enterprises handle data and AI workloads
Contents
Knowledge Graph
graph LR
subgraph AWS_Core[AWS Core Services]
Compute --- Storage --- DB[(Databases)] --- AI --- Analytics
end
subgraph Compute_Services[Compute]
Compute --> EC2
EC2 --> Instances[Instance Types]
Instances --> |New| P6[P6 with NVIDIA Blackwell]
EC2 --> Hyperpod
EC2 --> Chips[Custom Silicon]
Chips --> Nitro
Chips --> Graviton4
Chips --> Trainium
Trainium --> Trainium2
Trainium --> |2025| Trainium3
end
subgraph Storage_Services[Storage]
Storage --> S3
S3 --> |New| TableBuckets[S3 Table Buckets]
TableBuckets --> IcebergTables[Iceberg Tables]
IcebergTables --> Performance[3x Query Performance]
S3 --> |New| S3Meta[S3 Metadata]
S3Meta --> AutoUpdate[Real-time Updates]
S3Meta --> MetadataQuery[Queryable Metadata]
end
subgraph Database_Services[Databases]
DB --> Aurora
DB --> DynamoDB
Aurora --> |New| DSQL[Aurora DSQL]
DSQL --> MultiRegion[Multi-Region]
DSQL --> LowLatency[Low Latency]
DSQL --> StrongConsistency[Strong Consistency]
DynamoDB --> GlobalTables[Global Tables]
GlobalTables --> |New| GTConsistency[Strong Consistency]
end
subgraph AI_Services[AI & ML]
AI --> Bedrock
AI --> |New| Nova[Amazon Nova]
AI --> SageMaker
Bedrock --> |New| AutoReasoning[Automated Reasoning]
Bedrock --> |New| MultiAgent[Multi-Agent Collab]
Bedrock --> ModelDistillation
Bedrock --> RAG
Nova --> NovaModels[Nova Models]
NovaModels --> Foundation[Foundation Models]
Foundation --> Micro
Foundation --> Lite
Foundation --> Pro
Foundation --> Premier
NovaModels --> |New| Canvas[Nova Canvas]
Canvas --> ImageGen[Image Generation]
NovaModels --> |New| Reel[Nova Reel]
Reel --> VideoGen[Video Generation]
NovaModels --> |Coming| Speech[Speech-to-Speech]
NovaModels --> |Coming| AnyToAny[Any-to-Any]
end
subgraph Developer_Tools[Developer Tools]
AmazonQ[Amazon Q] --> QDev[Q Developer]
AmazonQ --> QBusiness[Q Business]
QDev --> |New| Transform[Q Transformation]
Transform --> Windows[Windows .NET]
Transform --> VMware[VMware Workloads]
Transform --> Mainframe[Mainframe]
QDev --> AutoAgents[Autonomous Agents]
AutoAgents --> Testing
AutoAgents --> Documentation
AutoAgents --> CodeReview
QDev --> |New| OpsTools[Operations Tools]
OpsTools --> Investigation[Issue Investigation]
QBusiness --> Automation
QBusiness --> QIndex[Q Index]
QIndex --> ThirdPartyAPI[3rd Party API Access]
end
subgraph Analytics_Platform[Analytics]
Analytics --> Redshift
Analytics --> EMR
Analytics --> OpenSearch
Analytics --> |New| ZeroETLApps[Zero-ETL for Apps]
SageMaker --> |New| UnifiedStudio[Unified Studio]
UnifiedStudio --> |New| SagemakerLH[SageMaker Lakehouse]
SagemakerLH --> IcebergCompat[Iceberg Compatible]
UnifiedStudio --> DataCatalog[Data Catalog]
end
Opening and Community Updates
Welcome and re:Invent Overview
- Please welcome the CEO of AWS Matt Garman.
- [MUSIC]
- Hello everyone, and welcome to the 13th annual AWS re:Invent.
- So awesome to see you all here.
- Now this is my first event as CEO, but it’s not my first re:Invent.
- I’ve actually had the privilege to have been to every re:Invent since 2012.
- Now, 13 years into this event, a lot has changed.
- But what hasn’t is what makes re:Invent so special.
- Bringing together the passionate, energetic AWS community to learn from each other.
- Just hearing you this morning and as you walk through the halls, you can feel the energy.
- I encourage you all to take advantage of this week to learn from each other.
Event Scale and Community
- This year we have almost 60,000 people here in person and another 400,000 watching online.
- Thank you to everyone who’s watching.
- We have 1900 sessions for you all to attend and almost 3500 speakers.
- Many of those speakers and sessions are led by customers, partners and AWS experts, many of whom are in the audience today.
- Thank you so much.
- Your sharing of your content and expertise is what makes re:Invent so special.
- Thank you.
- [APPLAUSE]
- Re:Invent has something for everybody.
- It has stuff for technologists, for executives, for partners, students and more.
- But at its core, re:Invent is a learning conference really dedicated to builders and specifically to developers.
AWS Heroes and Developer Community
- In fact, one of the first things that I did when I took over this role was to spend a little bit of time with our AWS Heroes.
- [APPLAUSE]
- A shout out to the Heroes who I believe are sitting over here.
- I can hear them already.
- Our Heroes are some of our most dedicated and passionate AWS developers.
- The way the whole AWS developer community has grown is really incredible.
- We now have 600 user groups all around the world, spanning 120 different countries.
- The entire AWS community is many millions all around the world.
- One of the great things about that community is we get feedback from you that goes directly into the products.
- This feedback informs what we build and announce here today.
Startup Initiatives and Early Enterprise Adoption
- It was 2006 when I first started at AWS, and when we launched the business, our very first customers were startups.
- Startups have a really special place in my heart.
- One of the things I love about startups is they are anxious to use new technologies.
- They will jump in and they give us great feedback.
- They push us to innovate, they innovate on top of us, and they move really fast.
- We learn a ton from startups.
- Now with generative AI, there is never a more exciting time out there in the world to be a startup.
- Generative AI has the potential to disrupt every single industry out there.
- When you look at disruptors, disruption comes from startups.
- It’s a fantastic time if you’re a startup, to really be thinking about how you disrupt industries.
- I’m excited to announce that in 2025, AWS will provide $1 billion in credits to startups globally, as we continue to invest in your success.
- [APPLAUSE]
Early Enterprise Journey
- While startups were our first customers, it actually didn’t take long for enterprises to catch on.
- Enterprises quickly realized there’s a ton of value in the cloud.
- Today, the largest enterprises across every single industry in the world, every single vertical, every single size, every single government are running on AWS.
- Millions of customers running in every imaginable use case.
- But it wasn’t actually always this way.
- Let me share one story from early on.
- It was early on in our AWS journey, and we took a trip to New York to visit some of the banks.
- They were really interested in what this whole cloud computing thing was.
- We sat down with them and they were very curious.
- We outlined our vision for how cloud computing could change how they run their IT and technology.
Core Infrastructure Innovation
AWS Building Blocks and Security Foundation
- When you're innovating, one of the important things to remember is you really want to start with the customer.
- You want to ask them what’s important to them, but then you don’t just always deliver what the customer asks for.
- You want to invent on their behalf.
- We call this starting with the customer and working backwards.
- The original AWS vision document that we wrote was written in 2003.
- At the time, there were a lot of technology companies building bundled solutions that would try to do everything for you.
- What they ended up delivering were these big monolithic solutions that did everything just okay.
- It was good enough, but we had this observation that good enough shouldn’t be all you strive for.
- You want the best possible component.
Compute and EC2 Evolution
- Today, AWS offers more compute instances than any other provider by a long shot.
- It all started with EC2.
- Some of you might know I actually used to lead the EC2 team for many years.
- Technically, I’m probably not allowed to say I have favorites, but I really love EC2.
- The good news is lots of customers love it too.
- EC2 has more options, more instances, and more capabilities.
- It helps you find exactly the right performance for the app or workload that you need.
- At this point, we’ve grown to where EC2 has 850 different instance types across 126 different families.
Graviton and Trainium Developments
- In 2018, we saw a trend in compute.
- We were looking out there and we saw that ARM cores were getting faster.
- Most of them were in mobile, but they were getting more powerful.
- We had this idea that maybe we could combine that technology curve with our knowledge of what’s most important to customers.
- We decided to develop a custom, general purpose processor.
- At the time, this was a very controversial idea.
- Today, Graviton is widely used by almost every AWS customer out there.
- Graviton delivers 40% better price performance than x86.
- It uses 60% less energy, which is fantastic.
- You can both reduce your carbon footprint and get better price performance.
Latest Compute Innovations
- Today I’m happy to announce the P6 family of instances.
- [APPLAUSE]
- P6 instances will feature the new Blackwell chips from NVIDIA.
- They’ll be coming early next year.
- P6 instances will give you up to 2.5 times faster compute than the current generation of GPUs.
- Today I’m excited to announce the GA of Trainium2-powered EC2 Trn2 instances.
- [APPLAUSE]
- Trainium2 instances are our most powerful instances for generative AI.
- These are purpose built for the demanding workloads of cutting edge generative AI training and inference.
Trainium and AI Acceleration
- [APPLAUSE]
- It’s fantastic to see how some of the most innovative companies anywhere in the world are leaning into Trainium2 for their cutting edge AI efforts.
- One of the biggest, most innovative companies in the world is Apple.
- Here to talk about how Apple and AWS are working together, over a long-term partnership, to accelerate the training and inference behind the unique features they build for their customers, is Benoit Dupin, Senior Director of Machine Learning and AI at Apple.
- [MUSIC]
- Thank you Matt, good morning everyone.
- [MUSIC]
- I’m happy to be back today as a customer. I spent many great years at Amazon, where I got to lead product search.
- Ten years ago I joined Apple and I now oversee our machine learning, AI and search infrastructure for the company.
- This includes our platform for model training, foundation model inference, and other services used across Siri, search, and more.
- At Apple, we focus on delivering inference experiences that enrich our users lives.
- What makes this possible is hardware, software and services that come together to create unique experiences for our users.
- Many of these experiences come as part of the devices we make, and some run in the cloud, like iCloud, Music, Apple TV, News, the App Store, Siri, and many more.
- One of the unique elements about Apple business is the scale at which we operate, and the speed with which we innovate.
- AWS has been able to keep the pace, and we have been customers for more than a decade.
- They consistently support our dynamic needs at scale and globally.
- As we have grown our efforts around machine learning and AI, our use of AWS has grown right alongside.
- We appreciate working with AWS.
- We have a strong relationship, and the infrastructure is reliable, performant, and able to serve our customers worldwide.
- And there are so many services we rely on, we could not even fit them all on this tiny screen.
- As an example, when we needed to scale inference globally for search, we did so by leveraging AWS services in more than ten regions.
- More recently, we have started to use AWS solutions with Graviton and Inferentia for ML services like approximate nearest neighbor search and our key value streaming store.
- We have realized over 40% efficiency gains by migrating our AWS instances from x86 to Graviton, and we have been able to execute some of our search text features twice as efficiently after moving from G4 instances to Inferentia2.
- This year has marked one of our most ambitious years for machine learning and AI to date, as we have built and launched Apple Intelligence.
- Apple Intelligence is personal intelligence.
- It’s an incredible set of features integrated across iPhone, iPad and Mac that understand you and help you work, communicate, and express yourself.
- Apple Intelligence is powered by our own large language models, diffusion models, and adapters that run on device and on our servers.
- Features include our system-wide writing tools, notification summaries, improvements to Siri, and more, including my favorite, Genmoji.
- And these all run in a way that protects users’ privacy at every step.
- To develop Apple Intelligence, we needed to further scale our infrastructure for training.
- To support this innovation, we needed access to a large amount of the most performant accelerators.
- And again, AWS has been right there alongside us as we’ve scaled.
- We work with AWS services across virtually all phases of our AI and ML lifecycle.
- Key areas where we leverage AWS include fine-tuning our models; post-training optimization, where we distill our models to fit on device; and building and finalizing our Apple Intelligence adapters, ready to deploy on Apple devices and servers.
- As we continue to expand the capabilities and features of Apple Intelligence, we will continue to depend on the scalable, efficient, and high performing accelerator technologies that AWS delivers.
- Like Matt mentioned, Trainium2 is just becoming GA.
- We’re in the early stages of evaluating Trainium2, and from early numbers we expect to gain up to 50% improvement in efficiency in pre-training.
- With AWS, we found that working closely together and taking advantage of the latest technologies has helped us be more efficient in the cloud.
- AWS expertise, guidance and services have been instrumental in supporting our scale and growth and most importantly, in delivering incredible experiences for our users.
- Thank you so much.
- [APPLAUSE]
- [MUSIC]
- All right, thanks a lot, Benoit.
- We really appreciate the longtime partnership together and we’re excited about all of those super useful features that you’re delivering that we can all take advantage of.
- Can’t wait to see what you come up with using Trainium2.
- All right, now, while we’re really excited, as you can tell, about announcing the GA of Trainium2 today, it turns out that the generative AI space is moving at lightning speed.
- And so we’re not slowing down either.
- We are committed to delivering on the vision of Trainium long term.
- We know that we have to keep up with the evolving needs of generative AI and everything that you all need from us and your instances, which is why today I’m excited to also announce the next leap forward.
- Today we’re announcing Trainium3, coming later next year.
- [APPLAUSE]
- Trainium3 will be our first chip that AWS makes on a 3-nanometer process, and it’ll give you two times more compute than you get from Trainium2.
- It'll also be 40% more efficient, which is great as well.
- So it’ll allow you all to build bigger, faster, more exciting Gen AI applications.
- More instances, more capabilities.
- We offer more compute than any other cloud, with silicon innovations across Nitro, Graviton, and Trainium.
- We have high-performance networking to make sure the network doesn’t get in your way.
- And this is all why, on average, 130 million new EC2 instances are launched every single day.
Storage Solutions and S3 Enhancements
- Pretty incredible.
- Now, every day we continue to reinvent what compute means in the cloud.
- But of course, your applications don’t stop with compute.
- So let’s move on to our next building block, and that’s storage.
- Now, if every application needs compute because it provides all that processing power of course every application also needs storage because that’s where your data lives.
- Now, I know it’s a long time ago, but some of you may actually remember what storage used to be like before AWS.
- You would have a room and you just have all these boxes, storage boxes, and you’d fill up a box and you’d have to have another one, and then you’d have to have another one.
- Then you’d have to go back to the first one, because you have to replace some disks, and it was really hard to manage.
- And in today’s world, it would actually be almost impossible to keep up with the scale of data that you all have.
- Back in 2006, we envisioned a better way: we could provide simple, durable, highly scalable, secure storage that could work with virtually any application.
- And so we launched S3.
- It was our very first service, launched in 2006, and it fundamentally changed how people manage data. It was built from the ground up to handle the explosive growth of the last 18 years.
- S3 now stores over 400 trillion objects.
- That is just incredible.
- It’s really hard to get your head around what that is.
- Here’s one interesting thing.
- Ten years ago, we had fewer than 100 customers that stored a petabyte of data inside of S3.
- A petabyte’s a lot of storage.
- Today we have thousands of customers, all storing more than a petabyte and several customers that are storing more than an exabyte.
- And this scaling is something that customers today just take for granted.
- It’s not just scaling though.
- Data today is just exploding.
- And the scaling part you largely take for granted; S3 takes care of it for you.
- The next thing however many of you do have to worry about is cost.
- But it turns out we make that easier for you too.
- No one gives you more options to balance the performance you need from your storage together with the cost.
- We have a ton of SKUs to help you with this.
- Things like S3 Standard, which is highly durable, and it’s good for the majority of workloads that you regularly access.
- But when you have objects that you don’t access that frequently, we have S3 Infrequent Access that allows you to lower your costs.
- We have things like S3 Glacier, which can further reduce costs by up to 95% for objects like backup and archive that you don’t need to access very much at all.
- Now, customers have told us that they love having all of these different SKUs that help them balance cost and performance, but it’s also a lot of work, and it can be complex to figure out which SKU to use.
- And so we decided to make that easier.
- A couple years ago we launched S3 Intelligent-Tiering.
- What S3 Intelligent-Tiering does is it analyzes the access patterns for your storage, and it automatically moves your data to the right tier.
- Since we launched S3 Intelligent-Tiering, customers have saved over $4 billion with zero additional work on their part.
- It’s pretty awesome.
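The idea behind Intelligent-Tiering is a simple access-pattern rule: objects that go untouched for a while move to cheaper tiers automatically. A minimal sketch of that logic in Python, where the 30- and 90-day thresholds mirror AWS's documented tiering windows but the function itself is illustrative and not part of any AWS SDK:

```python
# Illustrative sketch of the access-pattern rule behind S3 Intelligent-Tiering.
# The tier names and day thresholds mirror AWS documentation; the function
# itself is hypothetical, not an AWS API.

def choose_tier(days_since_last_access: int) -> str:
    """Pick a storage tier based on how recently an object was accessed."""
    if days_since_last_access < 30:
        return "FREQUENT_ACCESS"
    if days_since_last_access < 90:
        return "INFREQUENT_ACCESS"
    return "ARCHIVE_INSTANT_ACCESS"

# An object accessed yesterday stays in the frequent tier; one untouched
# for four months drops to the archive-instant tier.
```

The point is that this decision runs per object, continuously, so you never have to pick a SKU up front.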
- And what’s really powerful is when you can completely eliminate all of that complexity and just focus on growing your business.
- And so then when you’re managing gigabytes of data or petabytes of data or even exabytes of data, S3 can handle it for you.
- And the benefits of that scale, the performance, cost, ease of use, and advanced capabilities, are why S3 underlies more than a million data lakes all around the world.
- Now data lakes support large analytics workloads.
- Things like financial modeling, things like real time advertising and AI workloads and S3 over the years has delivered a number of data lake innovations.
- We’ve delivered transactions-per-second increases so you can support faster analytics.
- They added support for strong consistency.
- They added lower latency SKUs so that you could get quicker access in the cloud.
- And oftentimes when I talk to customers, it’s funny.
- You’ll say, what is it that you like best about S3?
- And what they’ll tell me is, you know what I like best about S3? S3 just works.
- And we take that as quite a compliment.
- But as you know, AWS is never satisfied.
- And so the S3 team stepped back and said, how can we make S3 work better?
- And so they thought about how to improve S3 to support large analytics and AI use cases.
- And first, actually, let’s take a step back and understand a little bit about how you think about your analytics data.
- It first helps to understand how it’s organized.
- So most analytics data is actually organized in tabular form.
- It’s a highly efficient way to work with a lot of different data that you want to query.
- And Apache Parquet has effectively become the de facto open standard for how you store tabular data in the cloud.
- And most of that is stored in S3.
- In fact, because it’s such a good fit for data lakes, Parquet is actually one of the fastest growing data types in all of S3.
- Now, when you have a bunch of these Parquet files, and many customers have millions, actually some customers in AWS store billions of Parquet files, you want to do things like query across them, and so you need a table format to support that.
- And today most people use Apache Iceberg to support this.
- Iceberg is an open source, highly performant table format that allows you to work across all of these Parquet files, and it enables some really useful things.
- It enables SQL access across this broad data lake, so you can have different people in your organization using various different analytics tools.
- Maybe they’re using Spark or Flink or whatever, and they can all safely work on the data without having to worry about messing up each other’s workloads.
- Iceberg is a super useful open source construct that enables a lot of these capabilities.
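The reason multiple engines can safely share one table is Iceberg's snapshot model: each write commits a new immutable snapshot atomically, and a commit only succeeds if it was based on the current snapshot, so concurrent writers never corrupt each other. A toy sketch of that optimistic-commit idea (illustrative only, not the Iceberg spec or API):

```python
# Toy model of Iceberg-style snapshot commits: each write produces a new
# immutable snapshot, and a commit succeeds only if it was based on the
# current snapshot (optimistic concurrency). Illustrative only.

class Table:
    def __init__(self):
        self.snapshots = [frozenset()]  # snapshot 0: empty table

    def current(self):
        return len(self.snapshots) - 1

    def read(self, snapshot_id=None):
        """Readers always see one consistent, immutable snapshot."""
        sid = self.current() if snapshot_id is None else snapshot_id
        return self.snapshots[sid]

    def commit(self, based_on, new_rows):
        """Append a new snapshot, but only if no one committed in between."""
        if based_on != self.current():
            return False  # another writer won; caller must retry
        self.snapshots.append(self.snapshots[based_on] | frozenset(new_rows))
        return True

t = Table()
base = t.current()
ok = t.commit(base, {"row-1"})     # first writer succeeds
stale = t.commit(base, {"row-2"})  # second writer used a stale snapshot, fails
```

A Spark job and a Flink job pointed at the same table follow this same retry-on-conflict protocol, which is what makes concurrent access safe.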
- But a lot of customers will tell you that, like many open source projects, Iceberg is actually really challenging to manage, particularly at scale.
- It’s hard to manage the performance.
- It’s hard to manage the scalability.
- It’s hard to manage the security.
- And so what happens is most of you all out there hire dedicated teams to do this.
- You do things to take care of things like table maintenance.
- You worry about data compaction.
- You worry about access controls.
- All of this work goes into managing your Iceberg implementations and trying to get better performance out of them.
- So we asked the question, what if S3 could just do this for you?
- What if we could just do it automatically?
- Well, I am thrilled to announce the launch of a new S3 bucket type: S3 Table Buckets.
- [APPLAUSE]
- S3 Table Buckets, this is S3 Tables, is a new bucket type built specifically for Iceberg tables.
- What it does is improve the performance and scalability of all of your Iceberg tables.
- If you store your Parquet files as Iceberg tables in one of these S3 Table Buckets, you get three times better query performance.
- You get ten times higher transactions per second compared to storing these Iceberg tables in a general purpose S3 bucket. That is massive performance for doing no additional work.
- Now, how does this work? S3 does the work for you.
- You put your tables here, and we’ll automatically handle all the table maintenance: things like compaction, things like snapshot management, all of that undifferentiated work.
- We’ll remove unreferenced files to help manage the size.
- And we’ll continually optimize query performance and cost for you as your data lake scales.
- So this is a fantastic thing: S3 is completely reinventing object storage specifically for the data lake world, to deliver better performance, better cost, and better scale.
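To make "compaction" concrete: query engines slow down when a table is scattered across thousands of tiny files, so maintenance rewrites them into fewer large ones. A toy Python sketch of that grouping step (sizes in MB; the function and 128 MB target are illustrative, not the S3 Tables implementation, which rewrites actual Parquet files and updates table metadata):

```python
# Toy illustration of small-file compaction: merge many small files into
# fewer files of roughly a target size. Real Iceberg compaction rewrites
# Parquet files and commits new table metadata; this models the grouping.

def compact(file_sizes_mb, target_mb=128):
    """Group small files into output files of roughly target_mb each."""
    compacted, current = [], 0
    for size in file_sizes_mb:
        current += size
        if current >= target_mb:
            compacted.append(current)
            current = 0
    if current:
        compacted.append(current)
    return compacted

# A thousand 1 MB files collapse into eight files, which means a query
# opens 8 files instead of 1000.
```

Fewer, larger files is most of where the quoted query-performance win comes from.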
- I think this is a game changer for data lake performance, but performance is actually only a small part of the equation.
- You all know as your data volume scales, it actually gets harder and harder to find the data you’re looking for.
- And so as you get really large, as you get petabytes of data, metadata becomes really important.
- Metadata is this information that helps you organize and understand information about the objects that you store in S3.
- So you can really find what you’re looking for, whether you have petabytes or exabytes of data.
- That metadata helps, but you have to have a way to look at it.
- I’ll just use an example of why metadata is useful on my phone.
- I don’t know about you all.
- I have tons of photos, and I wanted to find an old photo of me from an early re:Invent, and it wasn’t that hard.
- I actually searched for Las Vegas.
- I searched for 2012, and I quickly found this.
- Now this is a picture of me and Don MacAskill, who’s sitting right here in the front row, CEO of SmugMug.
- And Don was actually our very first S3 customer back in 2006.
- Thank you, Don.
- [APPLAUSE]
- Now, how did I quickly find this photo?
- I don’t know, my phone just automatically added metadata, right?
- It added the location.
- It added the dates of the photo when it was stored, and so it was easy for me to search it.
- You need a way to find this data easily, but when you’re doing it in S3 today, it’s actually really hard.
- You have to build a metadata system where you have to build, first of all, a list of all of your objects that are in storage, and then you create and manage this event processing pipeline, because you’re going to have to figure out how you add metadata and associate it with all your S3 objects.
- And so you basically build these event processing pipelines that do this.
- You store the metadata in some sort of database that you can query, and then you develop code to keep these things in sync.
- So as objects are changed, added, or deleted, you keep the metadata in sync with them.
- Now at scale, you can imagine first of all, this is undifferentiated heavy lifting, and it’s pretty impossible to manage at scale.
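The do-it-yourself pipeline described above, capture change events and keep a queryable index in sync, can be sketched with an in-memory index. In a real system this would be S3 event notifications feeding a database; every name here is illustrative:

```python
# Minimal sketch of the DIY metadata pipeline: an event handler keeps a
# queryable metadata index in sync with object changes. In production this
# would be S3 event notifications feeding a real database, not a dict.

metadata_index = {}  # object key -> metadata dict

def handle_event(event_type, key, metadata=None):
    """Apply a create/update/delete event to the metadata index."""
    if event_type in ("created", "updated"):
        metadata_index[key] = metadata
    elif event_type == "deleted":
        metadata_index.pop(key, None)

def find_objects(**filters):
    """Return keys whose metadata matches all the given filters."""
    return [k for k, m in metadata_index.items()
            if all(m.get(f) == v for f, v in filters.items())]

handle_event("created", "photos/1.jpg", {"city": "Las Vegas", "year": 2012})
handle_event("created", "photos/2.jpg", {"city": "Seattle", "year": 2012})
handle_event("deleted", "photos/2.jpg")
```

Even this toy version shows the moving parts you have to own: the event plumbing, the index, and the sync logic, which is exactly the heavy lifting at issue.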
- Now there’s a better way.
- I’m excited to announce S3 Metadata.
- [APPLAUSE]
- S3 Metadata is the fastest and easiest way for you to instantly discover information about your S3 data. It just makes sense.
- When you have an object, we take the metadata that’s associated with it, and we make it easily queryable, updating in near real time.
- How does it work?
- We take all of your object metadata and store it in one of these new table buckets that we talked about.
- So we automatically store all of your object metadata in an Iceberg table.
- And then you can use your favorite analytics tool to easily interact and query that data.
- So you can quickly learn more about your objects and find the object you’re looking for.
- And as objects change, S3 automatically updates the metadata for you within minutes.
- So it’s always up to date.
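Because the metadata lands in an Iceberg table, finding objects becomes a plain SQL query from whatever engine you already use. A sketch with SQLite standing in for an analytics engine; the table and column names are illustrative, not the exact S3 Metadata schema:

```python
import sqlite3

# SQLite stands in for an analytics engine querying the metadata table.
# Table and column names are illustrative; the real S3 Metadata schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE object_metadata (
    key TEXT, size_bytes INTEGER, storage_class TEXT, last_modified TEXT)""")
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?, ?)",
    [("logs/a.gz", 1024, "STANDARD", "2024-11-30"),
     ("logs/b.gz", 2048, "GLACIER", "2024-01-15"),
     ("img/c.png", 512, "STANDARD", "2024-12-01")])

# "Which STANDARD-class objects were modified in the last month?"
rows = conn.execute(
    "SELECT key FROM object_metadata "
    "WHERE storage_class = 'STANDARD' AND last_modified >= '2024-11-01' "
    "ORDER BY key").fetchall()
```

The contrast with the DIY approach is that S3 populates and refreshes this table for you; your only job is the SELECT.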
- We think customers are just going to love this capability, and it’s really a step change in how you can use your S3 data.
- We think that this materially changes how you can use your data for analytics, as well as really large AI modeling use cases.
- Super excited about these new S3 features.
- From day one, we’ve been pushing the boundaries of what’s possible with cloud storage.
- We’ve helped many of you grow to just unprecedented scale.
- We’ve helped you to optimize your cost, and we’ve helped you to get unmatched performance.
- And now we make it incredibly easy for you to find the data that you’re looking for.
- But I will tell you, we’re never done.
- Our promise to you is that we will keep automating work.
- We’ll keep simplifying all of these complex processes that you have, and we’ll keep reinventing storage so you all can focus on innovating for your customers.
- Now let’s hear from another startup customer to see how they’re reinventing their own industry.
- [MUSIC]
- In 1995, the way you existed digitally was to have a website.
- In 2015, it was to have a mobile app you could install on your phone.
- In 2025, the way you’re going to exist digitally is to have an AI agent.
- We moved from an era of rule based software to an era of software built on goals and guardrails.
- We’re in the age of conversational AI, and the way you exist digitally is to have a conversation with your customer any time of day 24/7, and that’s what we’ve built with Sierra on Amazon Web Services.
- [MUSIC]
- Old software was based on rules.
- If you think about a typical workflow automation, it looked like a decision tree.
- You had to enumerate every possibility that your customers could do.
- With AI it’s different, and with Agent OS, you can model your company’s goals and guardrails to build any customer experience that you can imagine.
- What does it mean if 90% of your customer experience is conversational?
- [MUSIC]
- It's remarkable how easy it is to start a company thanks to services like Amazon Web Services.
- We can focus on where we want to add value, really riding the coattails of the incredible investment that Amazon has made in their cloud infrastructure.
- [MUSIC]
- I think the companies that have run successful experiments today will be the ones that, five years from now, have business transformation driven by AI.
- I think the way to deal with technologies like that is to dive in and learn as quickly as possible.
- [MUSIC]
- [APPLAUSE]
- Very cool.
- Thanks, Brett.
- I think customers are really going to love that.
- All right.
Database Evolution
- I want to shift our focus to another important building block: databases.
- Now, early on in AWS, we saw an opportunity to really improve how databases operated.
- It turns out databases were super complicated, and there was a ton of overhead in managing them.
- And customers spent a lot of time doing things like patching and managing.
- And we knew that there was a lot that we could take on for them.
- So we set off to remove this heavy lifting.
- We launched RDS, the first fully managed relational database service.
- And when you talk to customers today, they’ll tell you they are never going back to an unmanaged database service.
- They love managed databases.
- Now, when we first launched RDS, the vast majority of applications out there in the world were running on relational databases, but it turns out the nature of applications was evolving a bit.
- With the internet, applications started to have more users.
- They were increasingly distributed all around the world, and customers started to have very different expectations around performance and latency.
- We ourselves experienced this at Amazon.com in our retail site.
- So back in 2004, we had a couple of engineers who realized that over 70% of our database operations were just simple key value transactions, meaning we’d run a simple SQL query with a primary key and get a single value back.
- And we asked ourselves why are we using a relational database for this?
- It seemed heavyweight, and the thought of the team at the time was that we could make this faster, cheaper, and more scalable if we built a purpose built database.
- So those engineers, two of whom you see up here, our very own Swami and Werner, who are giving keynotes later this week, wrote what is now called the Dynamo paper, about a technology that really spawned the NoSQL movement.
- And it also led us to develop DynamoDB.
- Dynamo is a serverless, NoSQL, fully managed database that gives you single-digit millisecond latency at any scale, and it scales completely up and all the way down.
- But Dynamo is just the first purpose built database that we built.
- We got really excited about that, and we started building lots of purpose built databases from graph databases to time series databases to document databases.
- And the idea here was that you all needed the best tool for the job.
- And that’s what these databases all provided.
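The access pattern the Dynamo team observed can be sketched in a few lines. This is a hypothetical illustration with made-up data, using a plain Python dict to stand in for a key-value store (in DynamoDB this would be a GetItem call against a table's partition key):

```python
# Hypothetical sketch of the pattern behind the Dynamo paper: over 70% of
# operations were effectively "give me the value for this key" -- no joins,
# no multi-table transactions. A key-value store serves that directly,
# which is what makes horizontal scaling straightforward.

# Relational style would be a full SQL round trip for a single-row lookup:
#   SELECT * FROM carts WHERE customer_id = 'alice';

# Key-value style: a direct lookup by primary key (dict as a stand-in).
carts = {
    "alice": {"items": ["book", "lamp"]},
    "bob": {"items": ["kettle"]},
}

def get_cart(customer_id):
    """Single key-value read: O(1) on one partition, easy to shard."""
    return carts.get(customer_id)

print(get_cart("alice"))  # {'items': ['book', 'lamp']}
```

Because every read and write touches exactly one key, the data can be partitioned across many machines without coordination, which is the property a relational engine could not offer as cheaply.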
Database Evolution and Aurora Updates
- Now, these NoSQL databases and this wide swath of purpose-built databases have been incredibly popular.
- They have enabled workloads that otherwise just wouldn’t have been possible.
- And you all have loved them.
- But it turns out that sometimes the best database for the job is still relational.
- And so relational didn’t go away.
- It’s still by far the best solution for many applications.
- So we kept innovating there as well.
- You asked us to build your relational database with the reliability of commercial databases out there, but with friendlier licensing terms and the portability of open source.
- And so we’re actually celebrating the ten-year anniversary of launching Aurora at re:Invent.
- [APPLAUSE]
- Aurora is, of course, fully MySQL and Postgres compatible, and it delivers 3 to 5x the performance that you get from self-managed open source, all at one tenth the cost of commercial databases.
- It’s no surprise, really, that Aurora became our fastest growing, most popular service with hundreds of thousands of customers.
- But we didn’t stop innovating, of course.
- And here’s just a sample of the innovations that we’ve delivered in Aurora over the years.
- We delivered serverless so you could get rid of managing capacity.
- We delivered I/O-Optimized for Aurora to give you better price performance and better predictability of price.
- We gave you limitless database, which allowed you to have completely unlimited horizontal scaling of your databases.
- And we’ve added vector capabilities in AI inside of Aurora to help with Gen AI use cases.
- And there’s been many others, and we continue to push the boundaries of cost, performance and ease of use and functionality.
- So the team took a look at all of these innovations, and they sat down with some of our very best database customers, and they asked them: what would a perfect database solution look like?
- Like if you just take away the constraints, what would a perfect database look like?
- And the customers told us, look, we assume you can’t give us everything, but if you could, we’d like a database that had high availability.
- That was, of course, multi-region, offered really low latency for reads and writes, offered strong consistency, had, of course, zero operational burden for them, and of course had SQL semantics.
- Now that is a lot of ands.
- And, you know, a lot of people will tell you you can’t have everything.
- In fact, how often, when you’re trying to build something, are you given the choice: do you want A or B?
- And here’s the interesting problem: when you have to pick A or B, it actually kind of limits your thinking.
- And so at Amazon that’s not how we think about it.
- In fact we call that the tyranny of the or.
- It creates these false boundaries, right?
- You instantly start thinking I have to do A or B, but we push teams to think about how you do A and B, and it really starts to help you think differently.
- Now look, there are databases out there that will give you some of these capabilities already.
- Sometimes you can get a database today that’ll give you low latency and high availability, but you can’t get strong consistency out of those.
- Now, there are other database offerings that are global and have strong consistency and high availability across multiple regions.
- But for those the latency is really, really high.
- And forget SQL compatibility with those.
- So we challenged ourselves to go solve for the “and.”
- And it turns out because we control the end to end environment for Aurora, right?
- We control the engine.
- We control the infrastructure, we control the instances, everything we can change a lot of things.
- And so one of the things we first did is look at the core database engine, how it works in combination with our global footprint, to see if that might help us deliver.
- And so the first big problem we needed to tackle, if we were really going to deliver all these capabilities, is how would we achieve multi-region, strong consistency while also delivering low latency.
- That is a really hard problem.
- So you’ve got these apps that are writing across regions, right?
- And when you do that, the transactions need to be sequenced in the right way, so that all of your applications are guaranteed to read the latest data.
- But when you do that, you typically take a lock on the data to avoid conflicts as you write back and forth.
- Now it turns out you can actually do this today with the database engines and how they operate.
- But it’s incredibly slow and actually, I’ll take a second to explain why.
- So let’s just say we have an active-active database setup across two regions.
- And we want to complete a transaction much like this one.
- This transaction has about ten statements to it, which is, I think, pretty average for a database transaction.
- In a traditional database, what would happen is you’re in a single region or a single location, and you’d do ten round trips between the application and the database.
- And if you’re all in the same location, the latency is really fast.
- And this just works fine.
- And that’s how databases have operated and how they’ve been built from the beginning.
- But now let’s say you’re doing that across regions.
- It becomes really slow, because communication actually has to go back and forth ten times before that transaction can actually commit.
- So in this example let’s say we have a database running in Virginia.
- And another one running in Tokyo, okay?
- The round trip is about 158 milliseconds between Virginia and Tokyo.
- Now in this example, that data has got to go back and forth ten times, right, to commit every single one of those pieces of the transaction.
- That's 1.6 seconds.
- And if you add more regions, it actually becomes even slower.
- So that is way too slow for today’s applications, for most use cases.
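The arithmetic behind that 1.6-second figure checks out directly from the numbers in the example (a 158 ms round trip and ten statements):

```python
# Back-of-the-envelope check of the cross-region commit latency described
# above, using the Virginia <-> Tokyo numbers from the talk.
ROUND_TRIP_MS = 158   # approximate Virginia <-> Tokyo round trip
STATEMENTS = 10       # statements in the example transaction

# Traditional engine: each statement costs a cross-region round trip.
naive_ms = ROUND_TRIP_MS * STATEMENTS
print(naive_ms)  # 1580 ms, i.e. roughly 1.6 seconds

# If all statements could be settled in a single round trip at commit
# time, the whole transaction would cost just one round trip.
single_commit_ms = ROUND_TRIP_MS
reduction = 1 - single_commit_ms / naive_ms
print(f"{reduction:.0%}")  # 90%
```

This is exactly the 90% latency reduction claimed later in the talk: ten round trips collapse to one.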
- Now, given how databases operate, this is a bit of a physics problem.
- Unfortunately, you’re all going to have to wait for us at AWS to solve the speed of light until a future re:Invent.
- But today, we are going to look at fundamentally changing how this database engine works.
- We had this thought what if we built an architecture that could eliminate all those different round trips?
- If you didn't have to do those, you could reduce the latency by 90%.
- Instead of a 1.6 second transaction, you could have a 158 millisecond transaction.
- So we developed a whole new way to process transactions.
- We separated the transaction processing from the storage layer.
- So you don’t need every single one of those statements to go check at commit time.
- Instead, you do a single check on commit.
- We parallelize all of the writes at the same time across all of the regions, so you can get strong consistency across regions with super fast writes to the database.
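The idea of separating transaction processing from storage can be sketched as follows. This is an illustrative toy, not the actual Aurora DSQL implementation: statements are staged locally with no network cost, and only the commit crosses regions.

```python
# Illustrative sketch (not the real Aurora DSQL engine): buffer a
# transaction's statements locally, then ship the whole write set in one
# cross-region commit instead of paying one round trip per statement.

ROUND_TRIP_MS = 158  # approximate cross-region round trip from the example

class BufferedTransaction:
    def __init__(self):
        self.write_set = []   # statements staged locally; no network yet
        self.network_ms = 0   # cumulative WAN latency paid so far

    def execute(self, statement):
        # Local staging only: effectively zero cross-region latency.
        self.write_set.append(statement)

    def commit(self):
        # One broadcast of the full write set, parallelized to all regions;
        # conflicts are resolved once, at commit time, not per statement.
        self.network_ms += ROUND_TRIP_MS
        return True

txn = BufferedTransaction()
for i in range(10):
    txn.execute(f"UPDATE accounts SET balance = balance - 1 WHERE id = {i}")
txn.commit()
print(txn.network_ms)  # 158, versus 1580 for per-statement round trips
```

The design choice is that conflict detection moves from "every statement" to "once per transaction," which is what makes the single round trip possible, and also what creates the ordering problem the next paragraph raises.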
- However, those of you who are paying attention might have noticed this introduces a second major problem: as you’re writing all of those independently across various regions, how do you get all those transactions to commit in the order that they occurred?
- Because if that doesn’t happen, you get corruption and bad problems happen.
- You have to make sure all of those are ordered correctly.
- Now, again, in theory, this architecture would work great if your clocks were perfectly synced, because in a traditional database you just look at the timestamps and you can make sure those are all in order.
- But as you have these databases spread around the world, you have to deal with this problem that’s known as clock drift.
- What happens, and I’m sure many of you are aware of this, is you get times that are all slightly out of sync.
- And so it’s actually hard to know if the time over here is the same as the time over here.
- Having those perfectly synced is easier said than done, but fortunately we control the global infrastructure all the way down to the component level.
- And so we added this building block, the Amazon Time Sync Service, to EC2.
- What we did is we added a hardware reference clock to every single EC2 instance all around the world.
- And those hardware reference clocks sync with satellite connected atomic clocks.
- So that means that every EC2 instance now has microsecond precision, accurate time that's in sync with any instance anywhere in the world.
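Why microsecond precision matters can be shown with a toy ordering check. With bounded clock error, two commits can be confidently ordered only if their uncertainty intervals do not overlap; the epsilon values below are illustrative, not AWS specifications:

```python
# Toy sketch of why clock accuracy matters for ordering commits. Each
# timestamp is trusted only to +/- epsilon; two events are unambiguously
# ordered only when their uncertainty intervals do not overlap.

def definitely_before(t1_us, t2_us, epsilon_us):
    """True if event 1 provably happened before event 2 given clock error."""
    return t1_us + epsilon_us < t2_us - epsilon_us

# With millisecond-level drift (epsilon = 5000 us), two commits 2 ms apart
# cannot be safely ordered:
print(definitely_before(1_000, 3_000, epsilon_us=5_000))   # False

# With microsecond precision (epsilon = 50 us), the same two commits can:
print(definitely_before(1_000, 3_000, epsilon_us=50))      # True
```

Tighter clock bounds shrink the window in which transaction order is ambiguous, which is what lets independently written transactions across regions be committed in the order they occurred.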
- Now, that’s about as deep a tech dive as I’m going to go today.
- And Werner is going to go a lot deeper in his talk on Thursday.
- So if you’re interested, I encourage you to check it out.
- But the net is that now that we have microsecond-precision time and this redesigned transaction engine, all the pieces are there for us to avoid those “or” trade-offs and deliver on the “and.”
- So I am really excited to announce Amazon Aurora DSQL.
- [APPLAUSE]
- This is the next era of Aurora.
- Aurora DSQL is the fastest distributed SQL database anywhere, and it delivers the next generation of the “and”: virtually unlimited scale across regions, with zero infrastructure management for you and a fully serverless design that scales down to zero.
- Aurora DSQL delivers five nines of availability.
- It’s strongly consistent.
- You get low-latency reads and writes, and Aurora DSQL is Postgres compatible, so it’s really easy to start using today.
- So we wanted to see how this new offering would compare against Google Spanner, which is probably the closest offering out there today.
- So we did a multi-region setup and we benchmarked committing that same, just that same ten statement transaction that we saw earlier.
- And it turns out that Aurora DSQL delivers 4x faster reads and writes than Spanner.
- Pretty awesome.
- And we’re really excited to see how you’re going to leverage this in your applications.
- [APPLAUSE]
- But one more thing.
- It turns out that relational databases are not the only ones that benefit from multi-region, strongly consistent, low latency capabilities.
- So I’m also pleased to announce that we're adding the same multi-region, strong consistency to DynamoDB Global Tables.
- [APPLAUSE]
- So now whether you’re running SQL or NoSQL, you get the best of all worlds active, active, multi-region databases with strong consistency, low latency, and high availability.
- This type of core innovation in these fundamental building blocks is why some of the biggest enterprises in the world trust AWS with their workloads.
JPMC Cloud Journey
- One of those companies is JPMorgan Chase.
- In 2020, we had JPMC CIO Lori Beer on stage to talk about how they were starting their cloud migration to AWS.
- Now, over the past four years, the team at JPMC has been doing a ton of work to modernize their infrastructure, and I’m really excited to welcome back Lori to share where they are in their journey.
- Please welcome Lori Beer.
- [MUSIC]
- Good morning.
- JPMorgan Chase is a 225-year-old institution that serves customers, clients, businesses, and governments across the globe.
- Our stated purpose is to make dreams possible for everyone, everywhere, every day.
- And we do this at tremendous scale.
- We serve 82 million customers in the US, financing home ownership, education and other family milestones.
- We bank more than 90% of Fortune 500 companies and every day we process $10 trillion of payments.
- All of this is why we invest $17 billion in technology and have an ambitious modernization agenda to drive growth.
- We have 44,000 software engineers who run more than 6000 applications and manage nearly an exabyte of data from markets, customers, products, risk and compliance, and more.
- Some of you may remember I spoke at re:Invent four years ago and it’s great to be back to update you on our progress and while the industry has evolved dramatically over the past four years, the core principles of our cloud program have not.
- We are still focused on establishing a strong security foundation that is resilient and represents our robust regulatory framework.
- Prioritizing modernization across both the business and technology, enabling innovative services like AI and serverless to drive new product development and accelerate our go to market.
- Being thoughtful and prioritizing migration for the most impactful use cases.
- These principles have enabled us to support and drive business growth.
- Today, we process more than 50% of e-commerce transactions in the United States.
- Think about the volume from this past weekend, which kicked off the busiest shopping season of the year.
- When our business is that critical to the world economy, resiliency is core in everything we do.
- This is why we have been reinventing the way we build global banking and payments infrastructure, and why we’ve pushed the art of the possible in the cloud.
- We’ve been on our cloud journey with AWS for several years.
- Our first apps leveraged key services like EC2, S3, and EKS.
- Fast forward a few years: in 2020, we reached our milestone of 100 applications on the cloud.
- In 2021, we doubled that and expanded into Europe, including launching our consumer bank, Chase in the UK, built from the ground up on AWS.
- In 2022, we started using AWS Graviton chips and saw increased performance benefits, and in 2023 we started leveraging GPUs, and by then we had nearly a thousand applications running on AWS, including core services like deposits and payments.
- Today, we’re actively unlocking GenAI use cases and working with AWS on their Bedrock roadmap.
- The way we have architected on the cloud has allowed us to modernize our business platforms and build brand new ones, enabling us to continuously innovate.
- Three examples of modernization can be seen in our markets and payments businesses, as well as Chase.com for our flagship platforms and markets and payments.
- Massive amounts of elastic compute and modern cloud services have helped us analyze risk and market volatility.
- It also enables us to be one of the largest payment processors in the world.
- Two years ago, we successfully migrated Chase.com, our flagship consumer app that also powers our mobile experience.
- We migrated it to AWS, reducing costs and uplifting our resiliency.
- How did we do it?
- We used an active-active-active configuration across multiple AWS regions.
- This strong resiliency posture allows two out of three regions to fail without customer impact, and geo-optimized routing also improved our overall customer experience.
- Our strong partnership with AWS ensures that the infrastructure stack is frequently and automatically refreshed to improve our risk and resiliency posture and meet our security controls.
- We also launched a new business, Fusion, which leverages AWS. Fusion offers a data management platform and analytics across investment life cycles, improving interoperability across multiple data sources and providing our institutional clients access to extensive foundations of data at scale.
- It is just one example of a data management solution we have at the firm.
- I mentioned earlier that we have nearly an exabyte of data, which makes it one of our most critical assets, especially as we increasingly embed AI in the way we modernize and build our technology.
- With the help of AWS data management tools like Glue, our data is discoverable, accessible, interoperable, and reusable on our secure end-to-end data and AI platform.
- This platform is empowering us to build the next wave of AI applications at the firm.
- SageMaker is helping us simplify the model development life cycle, from experimentation to model deployment in production.
- It is the foundation of our firmwide AI platform, which we designed to be repeatable with an extensible architecture so that our data scientists can leverage a range of best in class solutions from the ecosystem.
- Over 5000 employees use SageMaker every month, and we’re now starting to explore Bedrock, with the goal of providing data scientists with seamless access to more models that can be fine tuned on our data.
- Our goal is to use GenAI at scale, and we continue to learn how to best leverage these new innovative capabilities within the regulated enterprise.
- We rolled out LLM Suite, our internal AI assistant, to about 200,000 employees, and we’re already seeing value from some of our other GenAI use cases.
- Our bankers and advisors receive AI generated ideas to better engage with clients.
- Our travel agents leverage LLMs to help build and book trip itineraries for our customers.
- Our contact center reps summarize call transcripts and surface insights at scale, and our developers are using AI code generation tools.
- We’re currently exploring ways to leverage developer AI agents in parallel.
- There’s never been a more exciting time to be leading an enterprise through a transformation, given all the technology that providers like AWS are bringing to market.
- I’m excited for the next wave of technology evolution, and I’m proud that JPMorgan Chase is well positioned to deliver the future of financial services.
- Thank you.
- [APPLAUSE]
- [MUSIC]
- Thanks a lot for coming back.
- It’s great to see how far your teams have come in the last four years, and it’s awesome to see how the modernization efforts are actually paying dividends in increased agility.
- But also it’s really setting the team up at JPMC to take advantage of new technologies like generative AI.
AI and Developer Experience
Bedrock and Model Announcements
- All right.
- So we talked about wanting this set of building blocks that builders could use to invent anything that they could imagine.
- And we also talked about how in many of the cases that we’ve walked through today, that we’ve redefined how people thought about these as applications changed.
- Now people’s expectations are actually changing for applications.
- Again, with generative AI.
- And increasingly, my view is generative AI inference is going to be a core building block for every single application.
- In fact, I think generative AI actually has the potential to transform every single industry, every single company out there, every single workflow out there, every single user experience out there.
- And look at what’s already happening.
- If you look at the finserv world, they’re already using generative AI to detect market manipulation.
- You see drug companies like EvolutionaryScale, who talked earlier.
- They’re using this to discover new drugs faster than you ever could before.
- Personally, I love watching football, and on Thursday Night Football I can see, before a play happens, a Next Gen Stats prediction of when a blitzer is maybe going to rush the quarterback.
- It’s pretty fun to watch that and see how you can change those user experiences.
- And we’re just at the beginning.
- Now, today, when you hear a lot of customers talk about it, they’ll talk about applications and generative AI applications.
- But increasingly I think inference is going to be part of every single application.
- There’s not going to be this divide.
- Every application is going to use inference in some way to enhance or build or really change an application.
- And if you’re going to really do that, it means you need a platform that can deliver inference at scale.
- It means you’re going to need tools to help you integrate that into your data.
- And you’re going to have to have all the right performance, all of the right security, and all of the right cost.
- Now, this is why we built Bedrock.
- Bedrock is by far the easiest way to build and scale generative AI applications, but one of the things that Bedrock is particularly good at, and where it’s really resonated with customers, is it gives you everything you need to actually integrate generative AI into production applications, not just proof of concepts.
- And customers are starting to see real impact from this.
- Let’s take, for example, Genentech.
- They’re a leading biotech and pharmaceutical company, and they were looking at how they could accelerate drug discovery and development using a bunch of scientific data and AI to rapidly identify and target new medicine and biomarkers for their trials.
- But finding all this data requires scientists to scour through a huge number of sources, like the PubMed library of 35 million different biomedical journals, public repositories like the Human Protein Atlas, and their own internal data source that has data on hundreds of millions of different cells.
- With Amazon Bedrock, Genentech designed a GenAI system where scientists can actually ask the data detailed questions.
- They can ask the data what cell surface receptors are enriched in specific cells in inflammatory bowel disease, a question I’m sure many of you are very interested in asking frequently.
- But for them, it’s really critical, because this system can identify the appropriate papers and data from this huge library.
- And it synthesizes all the insights and data sources.
- It summarizes where it gets the information and cites the sources, which is incredibly important for scientific reasons and traceability.
- And they have that data that they can go and do their work.
- This was a process that used to take Genentech scientists many weeks just to do one of these lookups, and now it can be done in a matter of minutes.
- Genentech is expecting to automate nearly five years of manual effort and ultimately deliver new medications to customers more quickly.
- Now, tens of thousands of customers every day are using Bedrock for production applications.
- That is nearly 5x growth in the last year alone.
- And it’s not just AWS customers directly.
- Many of the world’s leading ISVs like Salesforce and SAP and Workday, are integrating Bedrock deep into their customer experiences to deliver GenAI applications to all of their end customers.
- So why exactly is everybody using Bedrock?
- Part of it is we had this observation that it’s not just one model that everyone was going to want to use.
- There are a lot of different models that people want to take advantage of.
- Some customers wanted open weights models like a Llama or a Mistral that they could customize.
- Some customer applications needed image models like those from Stability or from Titan, and many customers are really enjoying the latest Anthropic models, which many consider to be the best-performing models in the market today for general intelligence and reasoning.
- But this is a space where innovation is happening really fast.
- There are new releases almost every single week: new capabilities, new models, new updates, and new costs.
- But actually, with all that innovation and all those models, it's still actually surprisingly hard to find a perfect model for your use case.
- A lot of times what you want is you want the right mix of expertise for what you’re trying to accomplish, with the right mix of latency and the right cost.
- But getting some of those is hard, right?
- Sometimes you’ll have a model that has the right expertise, right?
- It’s this really smart model that’s really good, but it's more expensive than you'd like, and it's probably a little slower than you need for your application.
- Other times, you’ll find a model that’s faster and cheaper, but it’s not as capable as what you need.
- Now, one way that people are solving for this is called model distillation.
- And what model distillation does is you take this large frontier model.
- In this example, it’s a Llama 405B model, and you take this highly capable model and you send it all your prompts, right?
- All of the questions that you might want to ask it.
- And then you take all of the data and the answers that come out of that.
- And together with the questions, you use that to train a smaller model, in this case a Llama 8B model, to be an expert at that one particular thing.
- So you get this smaller, faster model that is now the expert on how to answer that one particular set of questions in the right way.
- This actually works quite well to deliver an expert model, but it requires ML experts to do.
- It’s actually pretty hard.
- You have to manage all of these data workflows.
- You have to manage that training data.
- You have to tune model parameters and think about model weights.
- And it’s pretty challenging.
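The manual workflow just described can be sketched in a few lines. The function names and prompts below are hypothetical stand-ins, not a real API; `ask_teacher` represents calls to the large frontier model (the Llama 405B in the example):

```python
# Illustrative sketch of the manual distillation workflow described above.
# `ask_teacher` stands in for an expensive call to a large frontier model;
# the collected pairs become fine-tuning data for a smaller model.

def ask_teacher(prompt):
    # Placeholder for a highly capable (but slow, costly) model's answer.
    return f"expert answer to: {prompt}"

def build_distillation_dataset(prompts):
    """Run every application prompt through the teacher and keep the pairs."""
    return [{"prompt": p, "completion": ask_teacher(p)} for p in prompts]

prompts = [
    "Summarize this insurance claim.",
    "Classify this support ticket.",
]
dataset = build_distillation_dataset(prompts)

# The dataset would then be used to fine-tune a smaller model (the Llama 8B
# in the example) so it answers this narrow set of questions like the
# teacher does, at a fraction of the cost and latency.
print(len(dataset))  # 2
```

What makes this hard in practice, as noted above, is everything around this loop: managing the data workflows at scale, curating the training data, and tuning the student model's parameters, which is the part Bedrock's managed feature takes on.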
- We wanted to make all of that easier.
- So today I’m happy to announce Model Distillation in Bedrock.
- [APPLAUSE]
- And distilled models can run 500% faster and 75% cheaper than the model they were distilled from.
- This is a massive difference, and Bedrock does it completely for you.
- This difference in cost actually has the potential to completely turn around the ROI.
- As you’re thinking about if a generative AI application works for you or not, right?
- It changes it from being too expensive to roll out in production to actually being really valuable for you.
- And Bedrock does all of this work for you.
- You simply send Bedrock your sample prompts from your application, and it does all of the work.
- All right, so at the end of the day, you now have a custom distilled model with the right mix of expertise, latency and cost.
- But getting the right model is just the first step.
- The real value in generative AI applications is when you take your enterprise data and bring it together with a smart model; that’s when you get really differentiated and interesting results that matter to your customers.
- Your data and your IP really make the difference.
- And one of the most popular ways of bringing your data and a model together is a technique called retrieval-augmented generation, or RAG.
- And what this does is it helps your models deliver more relevant, accurate and customized responses that are based on your enterprise data.
- Now, earlier this year, we launched Knowledge Bases, which is a managed RAG index as part of Bedrock.
- What it does is automate all of your data ingestion, retrieval, and augmentation workflows so that you don’t have to put these pieces together yourself; it’s fully managed.
- All you do is point Knowledge Bases at your data source, and we automatically convert it to text embeddings and then store them in a vector database for you.
- So you’re ready to go.
- And all the retrievals will also automatically include citations.
- So you know where the information came from and increases your level of understandability.
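The RAG flow just described can be sketched minimally. This is a toy: word overlap stands in for vector similarity, and the documents and filenames are invented, but the shape (retrieve, augment, cite) is the same:

```python
# Minimal RAG sketch: retrieve the most relevant document for a question,
# then augment the prompt with it and carry the citation along. Real
# systems use vector embeddings; word overlap stands in for similarity.

DOCS = {
    "policy-water.md": "Sudden water damage from a burst pipe is covered.",
    "policy-flood.md": "Flood damage from external water is not covered.",
}

def retrieve(question):
    """Return (source, text) of the best-overlapping document."""
    q_words = set(question.lower().split())
    def score(text):
        return len(q_words & set(text.lower().split()))
    source = max(DOCS, key=lambda s: score(DOCS[s]))
    return source, DOCS[source]

def augmented_prompt(question):
    source, text = retrieve(question)
    # The citation travels with the context, so the final answer can say
    # exactly where its information came from.
    return f"Context [{source}]: {text}\nQuestion: {question}"

print(augmented_prompt("Is damage from a burst pipe covered?"))
```

The model then answers from the supplied context rather than from its training data alone, which is what makes the responses more accurate and traceable.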
- Now, Knowledge Bases is one of the most popular features in Bedrock, and we’ve been adding a ton of new features.
- We’ve expanded the support for a wide range of formats, and we’ve added support for new vector databases like OpenSearch and Pinecone.
- Okay, so you can see Bedrock is building out these tools, right?
- It allows you to get the right model.
- It allows you to bring your own enterprise data.
- Next, you’re going to want to be able to set boundaries on what your applications can do and what the responses look like.
- And for that, we launched Bedrock Guardrails.
- Bedrock Guardrails make it really easy for you to define the safety of your application and for you to implement responsible AI checks.
- They’re basically guides for your models, so that your generative AI applications only talk about the relevant topics.
- Let’s say, for instance, you have an insurance application and you have customers coming and asking about various insurance products you have.
- You’re happy for it to answer questions about policy, but you don’t really want it to answer questions about politics or give health care advice.
- Right?
- You want to have these guardrails that say, I only want you to answer questions in this particular area.
- And this is a huge capability as you think about building, again, production applications.
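The control flow of a denied-topic guardrail like the insurance example can be sketched as follows. Bedrock Guardrails uses model-based classification under the hood; the keyword check here is only a toy stand-in to show where the check sits relative to the model call:

```python
# Toy sketch of a denied-topic guardrail for the insurance example above.
# A real guardrail classifies topics with a model; simple keyword matching
# stands in here just to illustrate the control flow.

DENIED_TOPICS = {
    "politics": ["election", "candidate", "vote"],
    "medical advice": ["diagnosis", "dosage", "prescribe"],
}

def check_guardrail(text):
    """Return (allowed, blocked_topic) for a user request or model reply."""
    lowered = text.lower()
    for topic, keywords in DENIED_TOPICS.items():
        if any(k in lowered for k in keywords):
            return False, topic
    return True, None

print(check_guardrail("What does my policy cover for water damage?"))
# On-topic insurance question: allowed through to the model.
print(check_guardrail("Which candidate should I vote for?"))
# Political question: blocked before it ever reaches the model.
```

In a real deployment the same check runs on both the incoming request and the outgoing response, so the application only ever talks about the topics you defined.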
- And this is why Bedrock is so popular.
- If you remember, last year, lots of people were building proof of concepts where these things weren’t that important, right?
- It was okay to have models just do cool things.
- Now that you're really integrating these deeply into your enterprise applications, you need to have a lot of these capabilities as you move to production applications.
- But one of the things that actually stops people from moving generative AI into real production, into the things that are mission critical, is one more problem that lots of people worry about, and that is hallucinations, because in reality, as good as the models are today, sometimes they get things wrong.
- And so when you did a proof of concept last year or the year before, it was okay; 90% was okay.
- But when you really get down into the details in a production application, that’s not okay.
- Let’s take the insurance example.
- Let’s say you walk into your bathroom in the morning and you see that you’ve sprung a leak and there’s water all over the floor.
- So you go to your insurance website and you want to know if it’s covered by your insurance.
- Right.
- You need as the insurance company, if somebody is asking you if an incident is covered by their insurance, you kind of need to get that right.
- You need to answer it correctly.
- That’s one where you can’t sometimes get it wrong.
- And so we asked a group of people at Amazon to think: do we have any technologies that could be applied in new or different ways to help us solve this problem?
- And so the team looked at a variety of different techniques, one of which is called automated reasoning.
- Now, automated reasoning is actually a form of AI that can prove something is mathematically correct.
- And it’s typically used to prove that a system is working as specified.
- Right?
- So automated reasoning works really well when you’ve got something with a really, really large surface area that’s too big to manually check, when you have a corpus of knowledge of how the system is supposed to work, and when it’s really, really important that you get the answer right.
- Now, it turns out that at AWS, we have some of the most capable and deepest experts in automated reasoning anywhere in the world, and we use it behind the scenes in a number of our services.
- We use automated reasoning to prove, as an example that the permissions and access that you all define in your IAM policies are actually implemented in the way that you intend them.
- We call this approach provable security.
- In S3, we actually use automated reasoning as well.
- What we do is automatically check scenarios in the software that makes up a big chunk of the S3 storage system, and we check those with automated reasoning before deployment, which includes validating things like correct behavior in response to unexpected events.
- We do this to ensure that we aren’t introducing risks to availability or durability, and that all of those remain protected.
- And we use automated reasoning in a number of other areas as well.
- And so we thought, is there a chance that this technology could help us with model correctness?
- Spoiler.
- Since I’m talking about this on stage right now, the answer is obviously yes.
- So today I’m happy to announce Amazon Bedrock Automated Reasoning Checks.
- [APPLAUSE]
- Automated reasoning checks prevent factual errors due to model hallucinations.
- So when you implement one of these automated reasoning checks, what happens is Bedrock can actually check that the factual statements made by models are accurate.
- And this is all based on sound mathematical verifications.
- And it’ll show you exactly how it reached that conclusion.
- So let’s take this insurance example one more time as the insurance company, you decide to implement automated reasoning checks.
- So what you do is you upload all your policies, right.
- And then the automated reasoning system inside of Bedrock automatically develops rules.
- Then you go through an iterative process, which usually takes maybe 20 or 30 minutes, to tune the right responses, and the system asks you questions so that it really understands how the policies work.
- Now go back to my bathroom example and I have this leak.
- Automated reasoning checks see the result come in.
- And if the checks can’t verify that the answer is right, they’ll send it back, suggest other prompts, or give you as the customer ideas for how to send the question back to the model.
- And only once the automated reasoning checks are assured that the answer is right do you send it back to the customer, so you can be 100% sure that you’re sending accurate results to your customers.
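As a rough mental model of that gate, here is a purely illustrative Python sketch: claims extracted from a model’s draft answer are checked against rules distilled from policy documents, and anything unverified goes back for revision rather than out to the customer. Every name and data shape here is invented; Bedrock’s actual automated reasoning derives rules from your uploaded documents and verifies answers with formal mathematical proofs, not hand-written predicates.

```python
from dataclasses import dataclass
from typing import Callable

# Toy verification gate in the spirit of automated reasoning checks.
# Rules here are hand-written predicates; all names are hypothetical.

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # True if the claims satisfy the rule

def verify_answer(claims: dict, rules: list[Rule]) -> tuple[bool, list[str]]:
    """Return (verified, names of failed rules); only a verified answer
    should ever be sent on to the customer."""
    failed = [r.name for r in rules if not r.check(claims)]
    return (not failed, failed)

# Rules hypothetically distilled from the insurer's uploaded policies.
rules = [
    Rule("water_damage_covered_only_if_sudden",
         lambda c: not c.get("covered") or c.get("cause") == "sudden"),
    Rule("deductible_stated", lambda c: "deductible" in c),
]

# Claims extracted from the model's draft answer about the bathroom leak.
draft = {"covered": True, "cause": "gradual"}
ok, failures = verify_answer(draft, rules)
# ok is False here, so the draft would go back to the model with the
# failed rules as feedback instead of out to the customer.
```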
- This is a capability you cannot get anywhere else, and we think it's going to really help customers as they start building inference into mission critical applications.
- Now, there is a ton of value that customers get from GenAI use cases today, and we think a bunch of these capabilities will help them bring GenAI to more and more applications.
- But there’s a buzz out there today, and we agree with it, that the next big leap in value is not just about getting great answers; it’s about taking actions and doing something.
- And for that, we have Bedrock Agents.
- Bedrock makes it really easy to build agents that can execute tasks across all of your company systems and your data.
- By using Bedrock, you can quickly build agents simply by describing in natural language what you want them to do, and the agents can then handle things like sales orders, compile financial reports, or analyze customer retention.
- Now, what we do is use model reasoning behind the scenes: it breaks down the workflow, and then the agents are able to call the right APIs and execute the actions that you want.
- Today, these agents work quite well for simple, isolated tasks that they can go off and accomplish.
- And actually it’s quite valuable.
- And customers are getting a lot of value out of Bedrock Agents already.
- But the feedback we get is customers want more.
- They want to be able to do complex tasks across maybe hundreds of different agents and doing them in parallel.
- But that’s super hard and almost impossible to coordinate today.
- Let’s use an example again.
- Let’s say you run a global coffee chain and you want to create a number of agents to help you go analyze the risk of opening up a new location.
- So you’re going to create a bunch of agents.
- You might create one that analyzes global economic factors, maybe looks at relevant market dynamics, maybe even builds a financial projection for an independent store.
- And all said, maybe you go create like a dozen agents that can go look at a location and come back with these individual pieces of information.
- It’s actually quite valuable.
- But when they come back, you still have to compile them together, look at how they might interact with each other and then figure out how you compare against a whole bunch of different regions.
- All said, though, it’s manageable.
- However, you’re probably not looking at one location in isolation.
- You probably want to look at hundreds of locations for your potential coffee chain and across these various different geos.
- And when you do that, it turns out all of these agents probably aren’t working in isolation.
- Agent A probably has information that could be relevant to the second agent, and so you actually want them to interact and share information back and forth.
- That gets very complicated.
- If you think about hundreds and hundreds of agents all having to interact, come back, share data, and go back out, suddenly the complexity of managing the system balloons into something completely unmanageable.
- It’s hugely valuable if you could get it to work.
- But really hard.
- So today I’m announcing Bedrock Agents support for multi-agent collaboration.
- [APPLAUSE]
- Now, Bedrock Agents can support complex workflows.
- What happens is, just like in the earlier example, you create a series of individual agents that are designed for your specialized, individual tasks.
- Then you create a supervisor agent, which acts as the brain for your complex workflow.
- It configures which agents have access to confidential information.
- It can determine if tasks need to be fired off sequentially, or if they can be done in parallel.
- If multiple agents come back with information, it can actually break ties between the multiple of them and send them off to do different tasks.
- It orchestrates all of this collaboration across your specialized agents.
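To make the supervisor pattern concrete, here is a toy sketch of the coordination it performs. This is not the Bedrock API; the agent functions, their outputs, and the location are all invented, and real agents would of course call models and external systems rather than return canned numbers. The point is just the shape: independent specialists dispatched in parallel, findings merged into one report.

```python
from concurrent.futures import ThreadPoolExecutor

# Invented specialist "agents" for the coffee-chain example; each one
# analyzes one aspect of a candidate location and returns its findings.
def economic_agent(location):   return {"economic_risk": 0.3}
def market_agent(location):     return {"market_fit": 0.8}
def financial_agent(location):  return {"projected_margin": 0.12}

SPECIALISTS = [economic_agent, market_agent, financial_agent]

def supervise(location):
    """Run independent specialist agents in parallel, then merge their
    findings into a single per-location report."""
    report = {"location": location}
    with ThreadPoolExecutor() as pool:
        for result in pool.map(lambda agent: agent(location), SPECIALISTS):
            report.update(result)
    return report

report = supervise("Seattle")
```

A real supervisor would also sequence dependent tasks, gate access to confidential data, and break ties between conflicting findings; this sketch shows only the parallel dispatch-and-aggregate step.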
- Let’s take an example.
- We actually worked with Moody’s to use an early beta version of this.
- Moody’s is a leading provider of financial analysis and risk management services.
- When they were testing this multi-agent collaboration in Bedrock, they used it to deliver a proof of concept for an application very similar to our coffee chain example: an application that could generate comprehensive financial risk reports for their customers.
- Now, before this proof of concept, this was a workflow that would take one of their employees about a week, but when they ran the proof of concept with multi-agent collaboration, they were able to accomplish the same task in one hour, with the ability to seamlessly scale it across any number of companies in parallel.
- That is a fantastic efficiency gain.
- Bedrock takes what was going to be an almost impossible coordination engineering task, and makes it simple to do, and that’s what we’re doing.
- Look, we are still in the very earliest days of generative AI, right?
- We’re already starting to see some incredible experiences.
- We’ve seen some of them here today built using inference.
- And you see that they’re being built as this core part of these applications.
- And it’s all powered by Bedrock. Why? Because Bedrock gives you all of the best models.
- It gives you the right tools and capabilities, and many of these capabilities you cannot get anywhere else.
- Bedrock is the only place where you can get these game changing results.
- And of course, everything that you use is built from the ground up to have privacy and security built in.
- Because remember, your data and your IP are really what differentiate you.
- And so it’s super critical to keep that secure and that access private.
- And that’s one of the things that Bedrock was built to support from the ground up from day one.
- And we’re not done.
- I will tell you, this is just a sampling of the new capabilities that we’re announcing this week.
- One of the hardest parts of putting this keynote together was figuring out which of the many Bedrock announcements I was going to be able to fit in.
- Fortunately, Swami is going to be talking about a ton more during his keynote, and so I encourage you to check it out tomorrow.
- All right.
Future Vision and Innovation
Andy Jassy’s Perspective
- Customers around the world are taking advantage of AWS to build incredible things with inference as this core new building block.
- But I will tell you, there is one company who probably takes more advantage of AWS building blocks than anyone else, and that’s Amazon. AWS has been a critical part of allowing Amazon to innovate and scale over the years.
- Now, to talk more about that, I’m excited to welcome back to the AWS keynote stage a good friend, the original godfather of cloud computing, and Amazon CEO, Andy Jassy.
- [APPLAUSE]
- [MUSIC]
- [APPLAUSE]
- [MUSIC]
- Thank you Matt.
- It is great to be back with all of you.
- Thank you for having me.
- So I’m going to share a little bit about how we’re thinking about AI across Amazon.
- We have been using AI expansively across the company for the last 25 years, but the way that we think about technology and this goes for AI as well, is that we’re not using it because we think it’s cool, we’re using it because we’re trying to solve customer problems.
- And that’s why when we talk about AI, it’s typically less to announce that we beat the world’s best chess player.
- And more to allow you to have better, personalized recommendations in our retail business, or to equip the pickers in our fulfillment centers with the optimal path so we can get items to you faster, or to put it in our Prime Air drones, where we hope to deliver items to you in less than an hour in a couple of years, or for our Just Walk Out technology in our Amazon Go stores, or to fuel Alexa, or to provide you the 25-plus AWS AI services so you can build great applications on top of our services.
- We prioritize technology that we think is going to really matter for customers.
- And with the explosion of generative AI in the last couple of years, we’ve taken that same approach.
- There is a ton of innovation, but what we’re trying to do is solve problems for you.
- What we think of as practical AI.
- And so what have we seen so far?
- The most success that we've seen from companies everywhere in the world is in cost avoidance and productivity.
- And you see lots of companies having gains there, but you also are starting to see completely reimagined and reinvented customer experiences.
- And we see these same trends when we look at the applications that we’re building inside of Amazon around generative AI.
- So I’m going to give you a few examples.
- So take customer service.
- We have a retail business with a few hundred million customers.
- They occasionally need to contact customer service.
- The vast majority of them prefer to do it in a self-service way, so they can do it quickly and take care of it themselves.
- And we had built a chatbot many years ago, and it of course used machine learning, but it had static decision trees, and customers had to endure a lot of words before they got answers.
- So a couple of years ago, we rebuilt this using generative AI and so now it’s much easier for customers.
- So imagine I ordered an item a couple of days ago.
- I get on the new chatbot.
- We know who you are, that you ordered a couple of days ago, what you ordered, and where you live.
- And the model can predict that if you’re contacting us just a couple of days later, you might be contacting us about a return.
- And so when you start to tell us that, we can quickly tell you where the nearest physical location is, at a Whole Foods or somewhere else, where you can return that item, and the model is also smart enough to predict when you’re getting frustrated and might need to be connected to a human for resolution.
- Now, this chatbot already had very high customer satisfaction before we re-engineered it, but since we added the generative AI brain to it, customer satisfaction is 500 basis points better.
- That’s practical AI.
- Or take sellers.
- We have about 2 million sellers who sell in our retail store.
- Worldwide, it’s over 60% of the units that we now sell.
- And the way they get a product onto the website is they have to fill out this very long form.
- And the reason there are so many fields is we’re trying to make it easy for our customers to navigate and understand what the products are, but it’s a lot of work for sellers.
- And so we rebuilt the tool.
- We basically built a brand new tool using generative AI such that now sellers only have to enter a few words, or they can take a picture, or they can point to a URL and then the tool fills in a lot of those attributes.
- It’s much, much easier for sellers.
- And we have over 500,000 sellers now using our generative AI tools.
- Or look at inventory management.
- So think about the scale of the problem we have to solve in our retail business.
- We have over 1000 different buildings or nodes as we call them.
- And everything we do is optimized to get the right product in a fulfillment center or building close to the end customer to save on transportation time, which means we get it to you faster and we do it for lower cost.
- And so that means that at any one point we have to understand what’s in that fulfillment center, what the inventory levels of each item are, which items are being ordered, and at what rate. Do we have more capacity in that fulfillment center?
- Do we need to move inventory around to other fulfillment centers to balance the network?
- And so we’ve used transformer models to solve these problems and make forecasts.
- And already our long term demand forecasting transformer models have improved that accuracy by 10%.
- And then we've also improved the regional prediction accuracy by over 20%.
- Those are big gains at our scale.
- Or think about robotics.
- We have over 750,000 robots roaming our various fulfillment centers, and they have all sorts of AI in them.
- But I’ll give you the example of Sparrow, which is a robotic arm that does re-sorting.
- And so if you get a chance to zoom out on our fulfillment centers, it’s really an operation that’s constantly taking items from lots of different, disparate places and aggregating them into containers.
- So we optimize the capacity we have and the conveyance that we have.
- And so what Sparrow is doing is it’s taking items from one bin and it’s aggregating them into another bin.
- And so what the generative AI in Sparrow needs to do is tell it what’s in the first bin and which item we want it to go pick up.
- It has to discern which item is which.
- It has to know how to grasp that item, given the size of it and the materials and the flexibility of that material.
- And then it has to know where in the receiving bin it can put it, and these are all inventions that are critical to us changing the processing time and the cost to serve our customers.
- So we have about five of these brand-new robotics inventions that we pulled together in our Shreveport, Louisiana fulfillment center, which we launched just a couple of months ago.
- And already we're seeing 25% faster processing times, and we believe we're going to have 25% lower cost to serve during the holidays because of these AI inventions and our robotics.
- So these are all examples inside of Amazon of cost avoidance and productivity that are having a real impact.
- But we’re also seeing altogether brand-new shopping experiences that we’re able to generate and invent with generative AI.
- So a few examples.
- I’ll start with some agents.
- Let’s start with Rufus, which is our shopping agent.
- So if you’re going to buy an item and you know what you want, I would argue that there isn’t a better experience than ordering it on Amazon.
- Having it shipped very quickly to your home.
- However, if you don’t know what you want and you’re trying to decide, you can obviously do it at Amazon.
- Many of you do.
- Bless you and thank you for doing that.
- However, today you do it through browse nodes, through recommendations that we make, and through customer reviews.
- But when you don’t know what you want, there’s something nice about going into a physical store and asking a salesperson: telling them what you’re thinking about, having them ask narrowing questions, and then having them point you to maybe the couple of items that you want to consider.
- And then you look at those items and you don’t have all the data in front of you, and you ask that salesperson, well, what about this?
- What about that?
- They can answer it quickly, if they don’t walk away, and then you have the ability to make those decisions on what you want quickly.
- And what we’re trying to do with Rufus is make that experience even better.
- So with Rufus, you can go to any product detail page, and instead of going through the plethora of information on that detail page, you can ask any question and Rufus will answer it really quickly.
- Rufus will make comparisons for you across products and categories.
- It’ll make recommendations.
- You can ask really broad questions for recommendations, and it’ll ask narrowing questions so it gets at what your intent really is.
- If you say to Rufus, hey, I want that same golf glove that I always get, the one I’ve ordered before, can you find that for me?
- Rufus will find it for you.
- You can say to Rufus, give me the order status of the items that haven’t been delivered yet, and you’ll get that, too.
Amazon Q and Rufus Updates
- And one of the nice things about Rufus, relative to a physical salesperson, is that Rufus is not going to take a job at another retailer or start working in another profession.
- Rufus is going to be there with you all this time, getting to know your intent and your interests and what you want better and better.
- Or take another agent and Alexa.
- So when we started Alexa and we shared that our goal and our mission was to build the world’s best personal assistant, a lot of people scoffed at that, because it’s actually a really broad surface area and it’s hard to do.
- I think with the advent of large language models and generative AI, it's quite clear that this is going to exist.
- And if you look at Alexa, which has over 500 million active endpoints across all of the devices that we’ve sold, and the way people use it for entertainment, shopping, information, and smart home, we have a real chance with Alexa to be the leader here.
- Now we are in the process right now of rearchitecting the brains of Alexa with multiple foundation models.
- And it’s going to not only help Alexa answer your questions even better, but it’s going to do what very few generative AI applications do today, which is to understand and anticipate your needs and actually take action for you. You can expect to see this in the coming months.
AI Features and Applications
- Now, in addition to agents, there’s a whole bunch of new features that we’re able to build with generative AI, which are leading to very different customer experiences.
- I’ll give you a few of these.
- There’s a feature we have called Amazon Lens.
- So let’s say that you’re at a friend’s house and you see a planter they have that you admire.
- Because this happens to me very often.
- And you want to know where that planter is from.
- And you ask your friend, and your friend doesn’t know. What you can do today is plug into a search engine, on Amazon or somewhere else, “planter, hanging, macrame.”
- Maybe you’ll get a decent answer.
- Probably not.
- Instead, you can use Amazon Lens and you can take a picture of that item.
- And what Amazon Lens is doing is using computer vision, and then a multimodal model underneath it, to build a search query that leads you right to the right search result on Amazon, where you can buy it easily.
- It’s really magical and cool.
- Or take sizing.
- We’ve all had this experience where, you know, we’re buying a shirt and you don’t really know if that brand runs large or small, whether you’d be a medium or large in that shirt.
- What we’ve built is a large language model that takes all the sizing relationships between the many brands that we carry and compares which ones run like each other.
- And when you’re at a new brand, we can make the right recommendation for what size you really should order.
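The core idea, stripped of the language model, is a translation between brand-specific size scales. The following is a toy illustration only, not Amazon’s actual model: the brands and offsets are invented, and in practice the relationships would be learned from purchase and fit data rather than hand-coded.

```python
# Toy cross-brand size mapping. An offset of +1 means the brand runs
# small, so a customer should order one size up; -1 means it runs large.
SIZES = ["XS", "S", "M", "L", "XL"]
BRAND_OFFSET = {"BrandA": 0, "BrandB": +1, "BrandC": -1}  # invented values

def recommend(known_brand: str, known_size: str, new_brand: str) -> str:
    """Translate the size that fits you in one brand into another brand's size."""
    # Map the known size back to a brand-neutral reference point...
    reference = SIZES.index(known_size) - BRAND_OFFSET[known_brand]
    # ...then forward into the new brand's scale, clamped to valid sizes.
    idx = min(max(reference + BRAND_OFFSET[new_brand], 0), len(SIZES) - 1)
    return SIZES[idx]

print(recommend("BrandA", "M", "BrandB"))  # BrandB runs small -> "L"
print(recommend("BrandA", "M", "BrandC"))  # BrandC runs large -> "S"
```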
- Very handy, very practical.
NFL Partnership and AI Integration
- And then if you look at what we’re doing in Prime Video, we have a very deep partnership with the NFL.
- We’ve built something together over the years called Next Gen Stats, and we collect 500 million data points every season.
- And then we’ve built AI models on top of that.
- And so you can see some of the features we’ve built.
- We’ve built something called Defensive Alerts, which shows you which defensive player might blitz the quarterback, puts a circle around them, and changes the viewing experience.
- Or we can look at different formations and sets and detect where the defense may be vulnerable.
- And so we have a defensive vulnerability feature where we can highlight for viewers where the offense should attack. These change the experience for fans.
- And so these are by the way, just a few of the almost 1000 generative AI applications that we are either building or have built inside of Amazon.
Generative AI Lessons
- And we have obviously learned a lot of lessons along the way.
- I thought I’d just share a few of those with you.
- Just three right now.
- The first is that as you get to scale in generative AI applications, the cost of compute really matters. Nearly all generative AI applications around the world have primarily been built with one chip for that compute, and people are very hungry for better price performance. That’s why people are so excited about Trainium2.
- The second is that it’s actually quite difficult to build a really good generative AI application.
- You need a good model, but it’s not just the model: you also have to have the right guardrails, the right fluency of the messaging, the right UI, the right latency, or it’s a really slow, laggy experience, and the right cost structure.
- And I think a lot of times what happens as you build these apps is you use a great model, you do a little bit of work, and you think, I have a great generative AI app, and it turns out you’re really only about 70% of the way there.
- And the reality is, customers don't take kindly to apps that have 30% wonkiness.
Model Selection Strategy
- And then the third thing I would say is that I have been surprised, with all of the internal building inside of Amazon, by the diversity of the models being used. We gave our builders freedom to pick what they want, and I figured that almost everybody would end up using Anthropic’s Claude models, because they’re the very best performing models in the world and have been for the last year or so.
- And by the way, we have a lot of our internal builders using Claude, but they’re also using Llama models, some of our own models, and homegrown models of their own.
- And so this kind of surprised us.
- But in some ways, as I think about it, it doesn’t surprise us because we keep learning the same lesson over and over and over again, which is that there is never going to be one tool to rule the world.
- It’s not the case in databases.
- We’ve been talking about this for ten years.
- People use lots of different relational databases or non-relational databases.
- It’s not the case in analytics.
- You know, I remember 6 or 7 years ago being on stage and talking about how everybody thought TensorFlow was going to be the one AI framework. We kept saying there were going to be a lot of them.
- And there were.
- And it turns out that PyTorch ended up being the most popular one, and the same is going to be true for models.
- And we see this internally.
Nova Models Introduction
- What we’ve noticed as we’ve been building all these applications is that our internal builders have been asking for all sorts of things from our teams that are building models.
- They want better latency, they want lower cost, they want the ability to do fine tuning.
- They want the ability to better orchestrate across their different knowledge bases to be able to ground their data.
- They want to take lots of automated, orchestrated actions, or what people call agentic behavior.
- They want better image and video generation; they want a whole bunch of things.
- And we share that feedback with our model provider partners.
- And they’re very receptive, but they’re busy.
- I mean, you guys want a lot; there’s a lot to do.
- And so it’s one of the reasons why we have continued to work on our own frontier models.
- And those frontier models have made a tremendous amount of progress over the last 4 to 5 months.
- And we figured if we were finding value out of them, you would probably find value out of them.
- So I’m excited to share and announce the launch of Amazon Nova, which are our new state of the art foundation models that deliver frontier intelligence and industry leading price performance.
- [APPLAUSE]
Nova Model Features
- So in this intelligent set of models, there are four flavors.
- The first is micro, which is a text only model, which means you feed it text and it outputs text.
- It’s laser fast, very cost effective, and our internal builders are really enjoying it for a lot of their simple tasks.
- And then we have three flavors of multimodal models.
- And with multimodal models, you can input text, images, or video, and output text.
- And each of these comes in ascending order of size and intelligence.
- The Micro, Lite, and Pro models are generally available today.
- The Premier model will be available in the Q1 time frame.
- So I’m going to share a few benchmarks.
- I’ll just say that we used external published benchmarks whenever we could and when they weren’t available, we did it ourselves.
- We published a methodology on our website.
- So you can try and replicate it if you like.
- So I’ll just share some of the benchmarks.
- So on the micro model you can see it is a very competitive model.
- If you look at the raw numbers relative to the leading models in this class, Llama and Google’s Gemini, Micro benchmarks better on all the variables versus Llama, and on 12 of 13 versus Gemini.
- But if you do statistical significance testing, which we did, we took all the numbers whose 95% confidence intervals overlapped and called those equal.
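That tie rule is simple to state in code. A minimal sketch, where the scores and interval widths are made up for illustration and are not actual benchmark numbers:

```python
# Two benchmark scores are treated as "equal" when their 95% confidence
# intervals overlap; otherwise the higher score wins.

def intervals_overlap(lo_a, hi_a, lo_b, hi_b):
    """True if [lo_a, hi_a] and [lo_b, hi_b] share at least one point."""
    return lo_a <= hi_b and lo_b <= hi_a

def compare(score_a, half_width_a, score_b, half_width_b):
    """Return 'equal' when the CIs overlap, else which score is higher."""
    if intervals_overlap(score_a - half_width_a, score_a + half_width_a,
                         score_b - half_width_b, score_b + half_width_b):
        return "equal"
    return "a" if score_a > score_b else "b"

print(compare(82.0, 1.5, 80.9, 1.0))  # [80.5, 83.5] vs [79.9, 81.9] -> "equal"
print(compare(82.0, 0.5, 80.0, 0.5))  # [81.5, 82.5] vs [79.5, 80.5] -> "a"
```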
- So if you look at it that way, which I will moving forward, you can see that we are equal or better on all the benchmarks compared to Llama and Gemini in this class of models.
- If I look at the Lite model, it’s a very similar story: very competitive.
- If you compare Nova Lite to OpenAI’s GPT-4o mini, you can see that we’re equal or better on 17 of the 19 benchmarks, equal or better on 17 of the 21 benchmarks versus Gemini, and equal or better on 10 of the 12 benchmarks versus Haiku 3.5.
- Haiku isn’t doing images or video yet, so we couldn’t benchmark on as many dimensions.
- But again, a very competitive model.
- And then if you look at Pro, it’s the same story: compared to GPT-4o, it’s equal or better on 17 of the 20 benchmarks, and equal or better on 16 of the 21 benchmarks versus Gemini.
- The very best model in this class is Claude Sonnet 3.5 v2, but even here you can see that our Pro model is equal or better on about half of those, and on the ones where it’s not, it’s very competitive.
- And you’re going to like the cost and the latency characteristics here.
- And then our Premier model, which will be our largest multimodal model, will be available in the Q1 time frame.
- So those are four very competitive, compelling intelligence models.
- But there are some other things that I think you’re going to really like about these models.
- First, they’re really cost effective: they’re about 75% less expensive than the other leading models in Bedrock.
- Second, they are fast: they’re the fastest models that you’ll see with respect to latency.
- We’ll also make the Nova models available in the latency-optimized inference SKU that Peter was talking about last night, so they’re very fast.
- And they’re not just in Bedrock; they’re deeply integrated with all the features in Bedrock that any model provider can use.
- It’s just that this team took the time to do them.
Nova Model Integration
- And so that means that you get fine-tuning. Increasingly, a lot of our app builders for generative AI want to do fine-tuning with labeled examples to make their applications perform better.
- The Nova models are also integrated with the distillation feature that Matt talked about, so you can infuse the intelligence of bigger models into smaller models that are more cost effective and lower latency.
- They’re deeply integrated with Bedrock Knowledge Bases, so that you can use RAG to ground your answers in your own data.
- And we have optimized these models to work with your proprietary systems and APIs, so that you can do multiple orchestrated, automated steps, agentic behavior, much more easily with these models.
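For developers, invoking a Nova model looks like any other Bedrock model call. Below is a minimal sketch of the request shape for boto3’s Converse API; the model identifier and field values shown are assumptions to verify against current Bedrock documentation, and the network call itself is commented out so the sketch runs without AWS credentials.

```python
# Sketch of a Bedrock Converse request for a Nova model. The model ID
# below is an assumed identifier; check the Bedrock docs for the real one.
request = {
    "modelId": "amazon.nova-lite-v1:0",  # assumption, not verified here
    "messages": [
        {"role": "user",
         "content": [{"text": "Summarize the key points of this return policy."}]},
    ],
    "inferenceConfig": {"maxTokens": 256, "temperature": 0.2},
}

# With credentials configured, the call would look like:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```

Because Converse takes the same request shape regardless of model, swapping between Micro, Lite, and Pro is, in principle, just a change of `modelId`.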
- So I think these are very compelling.
- I’m looking forward to you taking a shot at them and using them.
- Now, customers want to do more with generative AI than just text outputs.
- They also have a lot of needs around images and video; simple examples are advertising, marketing, or training materials.
- It’s expensive, there aren’t a lot of options out there, and they’re not easy to do yourself.
- And so we’ve worked hard on this problem.
Nova Canvas and Reel
- And I’m excited to announce two more models for you.
- First is Amazon Nova Canvas, which is our state of the art image generation model.
- [APPLAUSE]
- And so Canvas allows you to input natural language text and get images back, and they’re beautiful, studio-quality images.
- It allows you to edit images using natural language or text inputs, and it gives you controls for color scheme and layout.
- It has a number of built in controls for responsible use of AI, including watermarking for traceability, as well as content moderation to limit the generation of harmful content.
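To give a feel for what a text-to-image call might look like programmatically, here is a hypothetical request body in the style of Bedrock’s InvokeModel. The field names (`taskType`, `textToImageParams`, and so on) are illustrative assumptions, not the documented schema, so consult the official model documentation before relying on them; the call itself is commented out.

```python
import json

# Hypothetical image-generation request body; field names are assumed.
body = json.dumps({
    "taskType": "TEXT_IMAGE",  # assumed task selector
    "textToImageParams": {"text": "a macrame hanging planter, studio photo"},
    "imageGenerationConfig": {"numberOfImages": 1, "height": 1024, "width": 1024},
})

# With credentials configured, this would be sent via InvokeModel:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(modelId="amazon.nova-canvas-v1:0", body=body)
```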
- And we benchmark this as well.
- We tried to benchmark it versus some of the other state of the art players in this space.
- In this case, we picked what people typically consider the two leaders here, which are DALL-E 3 and Stable Diffusion 3.5.
- And we benchmarked on the two variables that matter most, which is image quality and instruction following.
- And you can see that Canvas outperforms both of them on both of those dimensions.
- We also did a human evaluation, where we saw similar types of results.
- So this is a compelling model.
- And then of course we also want to allow you to have it be easy to generate video.
- And so we’re excited to announce the launch of Amazon Nova Reel, which is our state of the art video generation model.
- [APPLAUSE]
- So again, with Reel, it’s studio-quality video; they’re really stunning videos that you can create.
- It gives you full control of the camera: it lets you have motion control, do panning, and do 360-degree rotation and zoom.
- It also has built in AI controls for safe AI, including watermarking and content moderation.
- We'll launch it with the ability to do six-second videos, which works really well for a lot of marketing and advertising, on its way up to two-minute videos in the next few months.
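Since video generation runs as a longer job, a call to a model like this would typically be asynchronous. The sketch below builds a plausible request body for such a job; the field names and values are assumptions for illustration, not the documented Reel schema, and no AWS call is made.

```python
import json

def build_reel_request(prompt: str, duration_seconds: int = 6) -> dict:
    # Assumed/illustrative schema; check the official documentation before use.
    return {
        "taskType": "TEXT_VIDEO",
        "textToVideoParams": {"text": prompt},
        "videoGenerationConfig": {
            "durationSeconds": duration_seconds,  # launch supports six-second clips
            "fps": 24,
            "dimension": "1280x720",
        },
    }

body = build_reel_request("A drone shot panning over a coastline at sunset")
print(json.dumps(body, indent=2))
```

You would submit a body like this to an asynchronous invoke API and poll for the finished video in an output location such as S3.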
- We benchmark this as well.
- There aren't really many video generation services that have an API, and none of them have automated benchmarks.
- So we benchmarked with human evaluation versus one of the leaders here, Runway.
- And you can see again that Reel benchmarks very favorably relative to others.
- So that’s six new frontier models for you.
Future Nova Developments
- What’s going to be next for us in Nova?
- Well, the first thing is the team is going to be working really hard over the next year on the second generation of these models.
- But I also have a couple of things that I thought I’d give you a sneak peek into.
- The first is that in the Q1 time frame, we anticipate giving you a speech to speech model, which will allow you to input speech and get speech back very fluent, very fast.
- [APPLAUSE]
- And then around mid-year, we're going to give you an any-to-any model, which is really multimodal to multimodal.
- [APPLAUSE]
- So you'll be able to input text, speech, images, or video, and output text, speech, images, and video.
- This is the future of how frontier models are going to be built and consumed.
- And we’re really looking forward to giving this to you.
AWS Model Strategy
- So you may be asking yourself, how should I think about AWS’s model strategy?
- They have very deep partnerships with a number of model providers.
- They have some of their own models now.
- And the way I would tell you to think about it is the way that we always provide you selection in everything we do, which is that we are going to give you the broadest and best functionality you can find anywhere.
- And what that’s going to mean is it’s going to mean choice.
- The reality is that all of you are going to use different models for different reasons at different times, which by the way, is the way the real world works.
- Human beings don't go to one person for expertise in every single area.
- You have different human beings who are great at different things.
- You're going to sometimes be optimizing for coding, sometimes for math, sometimes for integration with RAG, sometimes for agentic needs, sometimes for lower latency, sometimes for cost, and most of the time for some combination of these.
- And at AWS, we are going to give you the very best combination of all of these.
- As we always do, and we think we’ve added some pretty interesting models to the mix today.
- But the great thing is that all of these models are available for you in Bedrock, and you can use them in whatever combination you want; you can experiment, and you can change over time.
- And we will give you that selection and that choice today, as well as in the future.
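The practical point here is that in Bedrock the request shape stays the same across models, so switching is a one-line change of the model ID. A minimal sketch, where the model IDs are placeholders (assumptions, not real identifiers) and only the request parameters are built, no API call is made:

```python
def build_converse_params(model_id: str, user_text: str) -> dict:
    # Converse-style request: the same structure works regardless of which
    # model is behind the model ID, which is what makes experimentation cheap.
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
        "inferenceConfig": {"maxTokens": 512, "temperature": 0.2},
    }

# Swapping models for the same task is just a different ID:
for model_id in ["model-a-placeholder", "model-b-placeholder"]:
    params = build_converse_params(model_id, "Summarize our Q3 sales notes.")
    print(params["modelId"])
```

In real code these params would be passed to the Bedrock runtime's conversation API with whichever models you have enabled.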
- So with that, I will say have at it.
- Giddy up and back to Matt.
- Thank you.
- [MUSIC]
- That was awesome.
- Thank you Andy.
- It’s so fun to have him back at re:Invent.
- Thanks so much.
- And it's fun to share a number of the things that Amazon is doing, and also the Nova models.
- And I’m sure many of you are excited to get out there and try many of those Nova models.
Developer Innovation Focus
- All right.
- Our goal at AWS is to help every builder be able to innovate.
- We want to free you from the undifferentiated heavy lifting so you can really focus on those creative things that make what you're building the most unique.
- Now, generative AI is a huge accelerator of this: it allows you to focus on those pieces and push off some of that undifferentiated heavy lifting.
- Now, how many developers do we have out there in the audience?
- Show of hands.
- We’ve got some developers out there.
- Excellent.
- I know we have a bunch of developers here today, so I thought I'd spend a couple of minutes talking about how we help make developers more efficient.
- Last year we introduced Amazon Q Developer, which is your AWS expert, and it’s the most capable generative AI assistant for software development.
- Now, customers have achieved up to 70% efficiency improvements by using Q Developer.
- They reduce their time to deploy new features.
- They completed tasks faster, and they minimized a bunch of repetitive actions.
- But it’s not just about efficiency.
- FINRA, as an example, has witnessed a remarkable 20% boost in the code quality and integrity by using Q Developer, helping them create better performing, and more secure software.
- Now, when we first launched, our goal was to deliver a great coding assistant with Q Developer, and we did just that.
- In fact, Q has the highest reported acceptance rate of any multi-line coding assistant out there in the market today.
- But it turns out that a coding assistant is just really a small part of what most developers need.
- We talked to developers about their day, and it turns out most developers spend an average of just one hour a day coding.
- That’s it.
- The rest of the time they spend on other end to end development tasks.
- So we thought we’d look at that full cycle to see if there’s anywhere else that we could help.
- And it turns out there's a bunch of tasks that take up developer time, but they're parts of the job that most developers don't really love doing, right?
- Things like writing unit tests or managing code reviews.
- And look, I used to run a large development team.
- I’m pretty sure I never met a developer who loved spending their time writing great documentation for their code, but it is important.
- It’s not super engaging, but it is actually really important.
- It’s tedious.
- It’s time consuming, but it’s critical.
- And it's one of those things you don't want to skip, but unfortunately, because it's not that fun, sometimes people paper it over and don't do a great job.
- So today I'm really excited to announce three new autonomous agents as part of Q Developer that can help: Q autonomous agents for generating unit tests, documentation, and code reviews.
- [APPLAUSE]
- Q can now automatically generate end-to-end unit tests.
- You just type in slash test and Q uses advanced agents as well as knowledge of your entire project to create full test coverage for you.
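For a sense of what "full test coverage" means in practice, here is an illustrative example of the kind of unit tests an agent like this might produce for a simple function. The function and the cases are hypothetical, chosen only to show the typical happy-path, boundary, and error coverage.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent, never below zero."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Agent-style generated tests: happy path, boundaries, and the error case.
assert apply_discount(100.0, 25) == 75.0    # ordinary discount
assert apply_discount(100.0, 0) == 100.0    # boundary: no discount
assert apply_discount(100.0, 100) == 0.0    # boundary: full discount
try:
    apply_discount(100.0, 120)              # invalid input must raise
    raise AssertionError("expected ValueError")
except ValueError:
    pass

print("all tests passed")
```

The value of the agent is generating this kind of exhaustive-but-tedious coverage from knowledge of your whole project, rather than you writing each case by hand.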
- The second agent can automatically create accurate documentation for you.
- And the interesting thing is, it's not just new code: the Q agent can actually apply to legacy code as well.
- So if you come across a code base that maybe wasn’t perfectly documented by one of your coworkers.
- Q can help you understand what that code is doing as well.
- You can now use Q to automatically do code reviews for you.
- It'll scan for vulnerabilities, flag suspicious coding patterns, even identify potential open source package risks that you might have.
- Actually, one of the other cool things it does is it’ll even identify where it views a deployment risk and suggest mitigations for you of how you can make a safer deployment.
- We think that these agents can materially reduce a lot of the time that’s spent on these really important, but maybe undifferentiated tasks and allow your developers to spend more time on those value added activities.
- But again, it’s not just capabilities.
- It also matters where you access Q: you want Q to be available where you need it.
- And so we’ve added Q in the console.
- We have Q available in Slack, and it's available in all the popular IDEs like Visual Studio, VS Code, and IntelliJ, and starting today, I'm excited to announce a new deep integration between Q Developer and GitLab.
- [APPLAUSE]
- With this new partnership, Q Developer functionality is deeply embedded in GitLab's popular platform.
- It’s going to help power many of the popular aspects of their duo assistant.
- Now you’re going to be able to access Q Developer capabilities, and they’re going to be natively available in the GitLab workflows.
- And we’re going to be adding more and more over time.
- We trialed this concept with a couple of early customers like Southwest Airlines and Mercedes-Benz, and they told us that they were incredibly excited to take advantage of the combination of Amazon Q Developer and GitLab together.
- Now, if we want to help with the full developer lifecycle, though, you can’t just stop with new applications.
- As many of you know, a lot of development time is not spent on new applications but on managing existing ones.
- Actually, a ton of developer time is spent maintaining, modernizing, and patching existing applications, and these take a lot of effort; in fact, legacy application upgrades, as an example, are huge multi-month efforts that are often super complicated and can run really long.
- One of Q developers most powerful capabilities we already have is automating Java version upgrades.
- What it can do is it can transform a Java application from an old version of Java to a new version in a fraction of the time it would take to do manually.
- This is work that no developer loves to do, but is critically important.
- Earlier this year, Amazon integrated this capability into our own internal systems, where we have a lot of older Java code that needed to be upgraded.
- Using Q Developer, we migrated literally tens of thousands of production applications up to Java 17, and we did it in a small fraction of the time.
- The estimate by the teams is this saved us 4500 developer years.
- This is a mind-blowing amount of time that was saved by upgrading, and because we're now running on modern Java, we actually can use less hardware too.
- And so we actually saved $260 million a year through this process.
- This got us thinking: that's fantastic, so what else could we help transform?
- I actually love asking customers how we can help them and what their biggest pain points are.
- I think you hear some interesting things coming out of that, and one of the things that will quickly bubble to the top is Windows.
- Customers would love an easy button to get off of Windows.
- They're tired of the constant security issues, the constant patching, and all the scalability challenges they have to deal with, and they definitely hate the onerous licensing costs.
- But we do recognize that this is hard; actually modernizing away from Windows is not easy today. So I'm happy to announce Q Transformation for Windows .NET Applications.
- [APPLAUSE]
- Now, with Q Developer, modernizing off of Windows just got a lot easier.
- Q Developer helps you transform .NET applications that are running on Windows to Linux in a fraction of the time.
- What happens is Q Dev launches agents that can automatically discover incompatibilities, generate a transformation plan, and refactor your source code.
- And it can do this across hundreds or even thousands of applications in parallel.
- It turns out Q Dev can help you modernize .NET applications four times faster than doing it manually.
- And once you're done, the good news is you save 40% by saving on all those licensing costs.
- One customer, Signaturit, a European leader in digital transactions, was really focused on modernizing their legacy .NET applications away from Windows and moving them to Linux.
- We worked with them on an early beta of Q Developer, and a project that they estimated was going to take 6 to 8 months was actually completed in just a few days.
- That is a game changing amount of time.
- Fantastic.
- But it turns out Windows is not the only legacy platform in a data center that’s slowing down all your modernization efforts.
- Actually, more and more, as we talk to customers, they're really wanting to get out of data centers entirely.
- Customers like Itau Unibanco and Experian and Booking.com and really thousands more of them have partnered with us to fully exit out of their data centers, lower their costs and just focus their teams on innovation as opposed to running infrastructure.
- It’s cool to see these full data center migrations, but we know that a lot of on premise workloads today run on VMware.
- Now, it turns out many customers are actually happy for a portion of their existing VMware workloads to stay running on VMware, but they don't want them to stay running in their data centers.
- They’d like to migrate those to the cloud.
- And for these workloads, last week we announced our new Elastic VMware Service, which makes it easy for you to move your VMware subscriptions to AWS and easily run the full VCF stack of VMware natively on top of EC2.
- However, there are a lot of workloads currently running on VMware that customers would really love to modernize to cloud native services.
- Now, we know VMware is deeply entrenched in your data centers and has been for a really long time.
- And what happens in this VMware environment, because it's been there for a long time, is that there ends up being this kind of spaghetti mess of interconnected applications.
- So the hardest part about modernizing is actually finding out what the dependencies of those applications are.
- And the migrations are error prone, because it's hard to know whether moving something is going to break something else.
- And again, of course, licensing is expensive.
- We're pleased to announce today Q Transformation for VMware workloads.
- [APPLAUSE]
- Now, Q is able to help you easily modernize workloads that are running on VMware and move them to a cloud native solution.
- The biggest value here is that Q automatically identifies all of your application dependencies and generates a migration plan for you, which cuts a ton of the migration time and significantly reduces your risk.
- It then also launches agents that can convert your on-premises VMware network configurations into modern AWS equivalents.
- This turns what used to be months and months of work into hours to weeks.
- Now, we did Windows, and we did VMware.
- But there’s one complex system that’s by far the most difficult to migrate to the cloud, and that’s the mainframe.
- Actually, it turns out when we talk to customers, even just the effort of trying to analyze, document, and plan a mainframe modernization is often too much.
- People give up; it's too hard, and it can just be overwhelming.
- It turns out Q is really good and can help with this too.
- Today, we're announcing Q Transformation for mainframe.
- [APPLAUSE]
- Q has a number of agents that can help you streamline this complex workflow.
- It can do code analysis for you, help with planning, and refactor your applications.
- And actually, just like I talked about before, most mainframe code is not very well documented.
- People have millions of lines of COBOL code and they have no idea what it does.
- Q can actually take that legacy code and build documentation in real time that lets you know what it does.
- Really cool and super helpful in understanding which of those applications you want to modernize.
- Now, most customers you talk to will tell you that their mainframe migration will, by their estimate, probably take 3 to 5 years.
- I don't know about you, but planning a project for 3 to 5 years is nearly impossible, and a lot of the time they just don't get done.
- Now, I wish I could stand up here and tell you that I'm going to make mainframe migrations one click, but we're not quite there yet; that's not really possible yet.
- But based on early customer feedback and internal testing, we think that Q can actually turn what was going to be a multi-year effort into a multi-quarter effort, cutting the time to migrate mainframes by more than 50%.
- If you can take a multi-year effort and bring it down to a couple of quarters, that's something people can really get their heads around, and customers are incredibly excited about this.
Q Developer Operations
- The last part of the dev cycle that we haven’t talked about yet is operations.
- Now, we all know that operations are a critical part of running software services today.
- And we hear from customers that when they’re managing their AWS environment, they spend a lot of time sifting through CloudWatch graphs and logs, trying to understand what’s happening in that AWS environment.
- So today, I’m happy to announce a new capability in Q Developer, where Q can now make it easy to help investigate issues inside of your AWS environment.
- [APPLAUSE]
- What Q can do is it actually looks at your entire environment.
- It knows your entire setup, everything that you have running, and then it looks at CloudWatch data, it looks at CloudTrail logs.
- And it can help you uncover where you may be having an issue.
- It deeply understands your AWS environment, and it looks for anomalies.
- You may say I’m having an issue and it’ll trace everything down to where there’s a broken set of permissions that might have gotten changed, and it’ll suggest how you can fix those.
- And maybe suggest even best practices on how you don’t break them the next time.
- Because once you settle on those root causes, Q also has access to possible remediations from runbooks and curated documentation that you provide.
- Now, CloudWatch has integrations across many of the most popular incident management and ticketing systems, to help you manage incidents across your entire landscape.
Pagerduty Integration
- One partner, who I’m sure many of us use that’s integrated with Q is Pagerduty.
- They've been helping customers prevent and resolve operational events for many years, and AWS has been proud to partner with them throughout that entire journey.
- Here to tell us more about how they're innovating for the future and embracing new technologies like AI is the CEO of Pagerduty, Jen Tejada.
- [MUSIC]
- How are we doing?
- [APPLAUSE]
- Thanks, Matt.
- It is so great to be here.
- So, who is Pagerduty? No, we don't make any pagers.
- I usually launch into this history of DevOps and it’s a culture of accountability and ownership.
- I’m a real hit at cocktail parties, but this is one party where no explanations are needed.
- In fact, I bet many of you are on Pagerduty now.
- That said, you may think you know Pagerduty, but you might not really know us.
- We were founded 15 years ago by three Amazon software engineers seeking to solve a single painful problem: automating on-call, which at the time was the last mile in DevOps.
- That was just the first step; today, getting the right expertise orchestrated to the highest-priority problem reliably and quickly remains critical.
- It can be the difference between happy customers or millions in lost revenues.
- Pagerduty operations cloud is agnostic, independent and central to the technology ecosystem.
- It’s like the brain for your digital operations.
- It connects with over 700 industry leading applications, serving as a modern operations hub.
- Our AI first suite of products liberates developers from manual toil.
- Our platform enables teams to innovate faster, automatically detecting events, filtering noise, and intelligently orchestrating resolution among people, machines and increasingly, agents.
- For nearly a decade, we've used AWS, AI, and automation to deliver resilience and security at scale.
- Our SLAs go beyond app availability to data transit, and we've never, ever had a maintenance window.
- Incident management and operations more broadly are time sensitive, unstructured, and mission critical.
- Making them fertile ground for AI and automation.
- We’ve applied that insight to operations, turning every Pagerduty interaction into an opportunity for smarter decisions, faster resolution, and more resilient services while shifting left.
- You can also scale up and to the right of the operations maturity curve.
- Our foundational data model ingests billions of events and millions of incident workflows, underpinned by 15 years of experience serving responders to increasingly take that high cost toil off your hands.
- Today, over two thirds of the Fortune 100 and half of the Fortune 500 rely on Pagerduty as well as a number of the most innovative generative AI natives.
- Our commitment to reliability, fidelity and security at scale has earned us the trust of over 15,000 customers, nearly 6000 of which we share with AWS.
- When I was a kid, my dad’s advice was choose your partners wisely.
- After all, we’re a product of the people who surround us.
- Building Pagerduty on AWS and partnering with them to co-innovate was one of the best business decisions we’ve ever made, not the least of which because the AWS community is our community.
- You’re our kind of people.
- With AWS, we’ve instantiated high availability and resilience without sacrificing efficiency and effectiveness, maintaining high gross margins above 80% as we scale.
- And we’ve done so with the help of many AWS building blocks like Amazon S3, EventBridge, and Lambda.
- So it was natural to work with AWS in harnessing generative AI to supercharge your resilient infrastructure and applications.
- Here’s how it works.
- Imagine you build consumer apps for a global bank.
- It’s late Sunday when you receive a Pagerduty alert.
- Customers are reporting login issues.
- Your manager texts you a photo of it going viral on social media.
- You know from experience that this could quickly devolve into chaos.
- Multiple Slack threads, emails, calls, way too many people on bridges, and maybe a few war rooms across different continents.
- Hours of stress, trial and error, a sleepless night.
- Many of us have been there, but with the Pagerduty operations cloud, things can be different.
- You open a single integrated Slack chat with your small team of experts.
- Pagerduty Advance, our generative AI assistant built using Amazon Bedrock and Anthropic's Claude, answers diagnostic questions like: what's the customer impact?
- What’s changed?
- The Pagerduty operations cloud provides real time visibility across all types of telemetry, including cloud application, infrastructure and security events, and it instantly identifies a routine third party update as the likely culprit.
- But it doesn’t stop there.
- Pagerduty advance proactively suggests the best next step.
- With a click of a button, you deploy an automated runbook that rolls back the update and restores the service.
- Whew. In seconds, our AI assistant also drafts an exec-ready status update to keep your internal and external stakeholders informed, saving time when time is of the essence.
- Your customer service team is in lockstep and they’re able to follow and engage with Pagerduty within their Salesforce environment and users are automatically updated that the issue has been resolved.
- Now customers are able to log in, but when they review their banking accounts, many of them are startled to find zeros.
- This is a more complex issue than it initially seemed, so it’s clear you need a lot more context, and it’s also clear that time is not on your side.
- Let’s face it, the math is not on your side either.
- Major incidents are up 43% over last year, costing nearly $1 million on average, so you don't have a lot of time to search around for people and information, send emails, and raise tickets, and support cases are now skyrocketing.
- Your boss is texting again now about compliance, contractual obligations, and then the issue hits the news.
- Customer loyalty, reputation, even your company's right to do business are now all at stake.
- High priority incidents often impact several distributed development teams and a variety of owned and third party applications and infrastructure services, so it’s very time consuming to find the right resources to fix the issue.
- That’s why today, we're proud to announce the combined power of Pagerduty advance and Amazon Q together for the first time.
- [APPLAUSE]
- With Pagerduty Advance integrated with Amazon Q, everything you need is right at your fingertips, from identifying contributing factors to diagnostics; the data shared with you is available in a safe and secure environment the moment you need it.
- The Pagerduty Advance unified user experience, built on Bedrock and Claude and integrated into Q, means less time lost to incidents and more time for building.
- Trust is earned over years, but it can be lost in seconds.
- Generative AI and automation are powerful accelerators, but we do need safeguards, and that’s why Pagerduty leverages Amazon Bedrock guardrails.
- Guardrails helps protect against hallucinations, blocks undesired topics, and minimizes harmful content.
- And in addition to helping you manage incidents caused by AI and agents, we can also help you reinforce your responsible AI policies.
- The potential of AI is undeniable.
- With disruption comes both opportunity and responsibility.
- AWS and Pagerduty are dedicated to putting AI to work in innovative ways, while never losing sight of what got us here.
- Earning your trust through innovation, scalability, and reliability.
- Together, we’re paving the way for a more secure and more resilient and bright future.
- Thank you.
- [APPLAUSE]
- [MUSIC]
- Thanks, Jen.
- I'm sure there are many developers out there who sleep better at night knowing they have Pagerduty.
- All right.
Amazon Q Updates
- We’ve spent a bunch of time talking about how we can help developers, but it turns out developers are not the only people that can see incredible gains from generative AI.
- We thought we could also improve the efficiency of other roles in a company, roles like finance or sales or operations, by helping automate repetitive time consuming and frankly, unfulfilling tasks.
- And one thread across all of those different roles, when we talk to people was that people spend a lot of time looking for data, right?
- They want to find data to make decisions, but it’s in different apps and they go back and forth and it’s really hard to get it all together.
- And that’s the reason why we launched Q Business.
- Q Business is the most capable generative AI assistant for leveraging your company's internal data and accelerating tasks.
- What Q does is connect all your different business systems and sources of enterprise data, whether they come from AWS, from third-party apps, or from internal sources like wikis; all of this data can then be used to do better searches.
- It can do summarization of your data, and it lets you engage in a conversation with all of your enterprise data across your various data silos.
- And it's all done with security and privacy.
- All of the permissioning around that data stays with your data.
- Customers like Nasdaq, Principal Financial and Accenture use Q Business to empower their workforce to be much more productive with generative AI.
- Now, the power of Q Business is that it creates this index of all of your enterprise data.
- So it indexes data from Adobe, Atlassian, Microsoft Office, SharePoint, Gmail, Salesforce, ServiceNow, and more.
- And Q keeps this index always up to date, and it's highly secure, maintaining user-level permissions on all of that data.
- So if you don't have permission to access data outside of Q, you won't be able to access it inside of Q; it's all controlled and compliant.
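The permission model described above can be pictured as a simple filter: results only include documents the user's existing permissions already allow. This is a toy illustration of the concept, not AWS code; the documents, groups, and ACL are invented for the example.

```python
def search_index(index, user, allowed_groups):
    # Return only documents whose access groups intersect the user's groups,
    # mirroring the "same permissions inside Q as outside Q" behavior.
    return [
        doc["title"]
        for doc in index
        if doc["groups"] & allowed_groups.get(user, set())
    ]

index = [
    {"title": "HR handbook", "groups": {"hr", "all-staff"}},
    {"title": "M&A deal memo", "groups": {"execs"}},
]
acl = {"alice": {"all-staff"}, "bob": {"execs", "all-staff"}}

print(search_index(index, "alice", acl))  # only the handbook
print(search_index(index, "bob", acl))    # both documents
```

The real index enforces this per user across every connected source, so an application built on it can never leak a document its caller couldn't already open.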
- Now, by bringing all of this storage, these apps, and these productivity and corporate knowledge bases together, we're doing something no one has done before, and we're rapidly expanding the number of data types we can support, integrating lots of new information like metadata and new file types like audio and images, which are coming soon.
- But it turns out that it’s also not just about integrating data from all your productivity and your SaaS apps.
- Customers also are interested in querying their structured data, and many companies have this structured data today in databases and data warehouses.
- And today, as you know, the vast majority of this information is accessed through business intelligence systems.
- Over 100,000 customers today are using QuickSight for their analytics needs, using QuickSight interactive dashboards, pixel-perfect reports, and embedded analytics.
- Last year, we put Q and QuickSight to make it much easier for you to get fast generative AI powered insights.
- You can ask questions directly about your data and get graphs and information back.
- Now what if we could bring this all together, right?
- What if we could take all the information that's in your BI systems, the structured data store, and bring it together with all the information stored in the Q index, the documents and purchase orders and email?
- You could probably make better decisions if we unified all of that data.
- So today I’m excited to announce that we're bringing together QuickSight Q and the Q Business data all together.
- [APPLAUSE]
- So really, what does this mean?
- Imagine you're looking at a QuickSight dashboard that you built to show your sales.
- Now, that's pretty interesting and useful, and you can see what your sales are doing.
- But maybe you also want to add information from your sales pipeline.
- And you know what?
- It actually would be interesting, as you're looking at that, to refer back to that monthly business review you did last week, where the team was talking about some of the interesting trends they saw in the industry.
- Now, with QuickSight and Q Business, you can pull Salesforce data from your pipeline into your QuickSight report.
- You can pull the latest details from your Adobe Experience Manager email campaign, or you can grab that last business review from SharePoint, and Q will use all of that data to summarize it together and show it to you all in one view.
- All inside of QuickSight, making QuickSight much more powerful by pulling information in from other sources of enterprise data that you have.
- It’s really cool, and it makes QuickSight much, much more powerful as a BI tool.
- So we did that, and it got us thinking: this Q index that we have is actually an incredibly powerful concept for you as a company.
- It can act as a canonical data source for all sources of your enterprise data.
- And because you have Q, in theory other applications, like QuickSight, can get more powerful like we did: they can pull that data in, and you get much more value out of the other applications you're using.
- And because this index is managed by AWS, you know that we've thought deeply about how to give you fine-grained control over that data.
- So applications won’t be able to access that data without your permissions.
- And the right people will only have access to the right sets of data.
- And so we asked ourselves, why stop with AWS applications?
- So we went and talked to a couple of customers, as well as some of our ISV partners, and we said: what if we could give you the same access to this Q index that we gave QuickSight? And you could just see their eyes light up.
- They could see that the value they could get out of those third-party applications, and the value those application developers could provide to their customers, was immense if they could pull in these other sources of data.
- So I'm pleased to announce a new set of APIs that let ISVs access the Q index that we use in Q Business.
- [APPLAUSE]
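As a hypothetical sketch of what an ISV integration might look like, the function below builds a query against the Q index. The operation shape, parameter names, and IDs here are illustrative assumptions, not a documented API contract, and the code constructs the request only; it makes no AWS call.

```python
def build_index_query(application_id: str, user_id: str, query: str) -> dict:
    # Assumed/illustrative parameters: an ISV would scope its query to a
    # Q Business application and a specific user, so the user-level
    # permissions described above are enforced on every result.
    return {
        "applicationId": application_id,   # the customer's Q Business application
        "userId": user_id,                 # results filtered to this user's access
        "queryText": query,
        "maxResults": 5,
    }

params = build_index_query(
    "app-1234",                            # hypothetical application ID
    "alice@example.com",
    "latest committed timeline for the tools project",
)
print(sorted(params))
```

An ISV app would send a request like this and rank or summarize the returned passages inside its own UI, the way the Asana example below surfaces chat and email context.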
- What that means for you all is when you use Q Business, it makes the potential for all the rest of your applications to also get better.
- Now, it comes with fine-grained permissions, so you can control when a third-party application has access.
- And when you grant that access, those applications get better and more useful: ISV apps get more powerful and more personalized, which saves you time and money.
- You don’t want to have to let a bunch of other ISVs all hold copies of that data and all try to handle permissions themselves.
- You know that AWS has that covered.
- And so, you know, security is right there when you need it.
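To make that permission model concrete, here is a minimal, hypothetical sketch of permission-aware retrieval over an enterprise index. This is not the actual Q index API; every class and field name here is illustrative. The point is simply that access control is enforced inside the index at query time, so a calling application can never see documents its user isn't entitled to.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    source: str                  # e.g. "sharepoint", "gmail" (illustrative)
    text: str
    allowed_users: set = field(default_factory=set)

class EnterpriseIndex:
    """A toy index that enforces per-user permissions at query time."""

    def __init__(self):
        self._docs = []

    def add(self, doc):
        self._docs.append(doc)

    def search(self, user, term):
        # Only return documents the calling user is entitled to see.
        return [d for d in self._docs
                if user in d.allowed_users and term.lower() in d.text.lower()]

index = EnterpriseIndex()
index.add(Document("1", "sharepoint", "Q3 business review", {"alice", "bob"}))
index.add(Document("2", "gmail", "Q3 budget approval", {"alice"}))

# Bob only sees the document he is entitled to, even though both match "q3".
print([d.doc_id for d in index.search("bob", "q3")])  # → ['1']
```

Because the filtering lives in the index rather than in each calling application, every ISV integration inherits the same access controls automatically.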
- Companies like Asana, Miro, PagerDuty, Smartsheet, and Zoom are all already building integrations to this Q index.
- Let’s take a quick look and see how this might look.
- Take Asana, a work management software provider.
- And they’ve integrated with the Q index to surface content from other applications in their project management application.
- Let’s pretend for a second that you’re an IT leader, and you’re building a new set of tools using Asana.
- Unfortunately, you can see that your project is off track.
- What happens is Asana AI leverages the Q index to identify the problem.
- It pulls from your Teams chat transcript and sees that you don’t have a committed timeline from your CTO.
- It also found, from an unread Gmail message, that a key meeting where you were supposed to make decisions related to this project actually got pushed to next week, and now you’re late.
- Now, because you’re able to integrate all this information, you know where to focus your time, and you can try to get things back on track.
- Pretty cool, pulling all of this information together.
- Awesome.
Analytics and AI Integration
- This past year, we also launched a new feature in Q called Q Apps.
- Q Apps makes it easy for you to quickly automate small little tasks that make your daily life easier.
- All you have to do is go into Q Apps and describe the app you want in natural language, and it quickly generates it for you.
- And by the way, these are not meant for big, major applications like an Asana or a PagerDuty.
- These are just small little apps that might be able to save you 15 minutes a day.
- Let’s say, for example, you’re a social media marketer and you build a quick app to tune your copy to different sites’ text limits and audience preferences.
- So today, what happens is you have to come up with text and iterate on it a dozen, two dozen, three dozen times for the different websites.
- Instead, you can create a little app where you enter the text one time, and it’ll create all of the different copies for you.
- It's a simple app, but it saves you a ton of time, and you can imagine that if you use it over and over and over again on a daily basis, that can add up.
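The core logic of a copy-tuning app like that is tiny. Here is a rough sketch of what it might do; the platform names and character limits are illustrative assumptions, not anything Q Apps actually generates:

```python
# Illustrative per-platform character limits (assumed values, not real API data).
PLATFORM_LIMITS = {"x": 280, "linkedin": 3000, "instagram": 2200}

def tune_copy(text, platform):
    """Trim marketing copy to a platform's character limit, adding an ellipsis."""
    limit = PLATFORM_LIMITS[platform]
    if len(text) <= limit:
        return text
    return text[: limit - 3].rstrip() + "..."

def fan_out(text):
    """Produce one tuned variant per platform from a single piece of copy."""
    return {platform: tune_copy(text, platform) for platform in PLATFORM_LIMITS}

variants = fan_out("Launch day! " * 40)  # 480 characters, over X's limit
print(len(variants["x"]))  # → 280
```

You enter the copy once and get every variant back, which is exactly the 15-minutes-a-day kind of saving these apps target.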
- We see customers making hundreds and hundreds of these apps all of the time, and sometimes they actually can have a pretty big impact.
- Volkswagen is currently upgrading to a new HR information system, and like many large corporations, they saw that they had over 4,000 different job codes in North America alone, and they were trying to consolidate those down to a few dozen standardized job codes.
- Someone on the team built a Q App to do this, and it was so helpful.
- They actually rolled it out to the whole team.
- And now they expect to save over 100,000 hours worldwide by using this capability next year.
- What a win.
- All from an app that someone on the team created at their desktop in about 5 to 10 minutes.
- So if Q can help automate just these simple little workflows, we thought maybe it could help with complex ones too.
- And every company has these really complex workflows, right?
- These are workflows that involve multiple applications, approvals, and manual entry.
- And these things are painful; they’re really hard to automate.
- And actually this is not a new concept.
- People have been trying to automate this style of workflow for a long time.
- But it's hard because usually they require human interaction.
- There's a UI, and if any element of that UI breaks, the whole workflow breaks, and you're down for weeks while you try to repair it.
- These systems are brittle, they don’t really work, and they’re expensive to build.
- Q has a better way.
- And I’m very pleased to announce that, coming soon, Q Business automate can help automate tasks across multiple teams and applications.
- [APPLAUSE]
- What happens is Q Business will use a series of advanced agents to create, edit, and maintain workflows, making them much more resilient to change and they'll do all of this automatically.
- Let’s say, for example, you tell Q Business that you want to automate a complex, say, auto claim processing workflow.
- You simply tell it what your standard operating procedure is.
- In fact, you can even install a browser plugin that will follow along, or make a video, as you manually work through the steps.
- It’ll then automatically create that workflow, ask you a couple of questions about anything it didn’t understand, and then it’s done.
- Now you have an automated workflow, and after you launch it, there's a Q agent that will constantly monitor it, detect any UI changes that happen, and fix them on the fly so you don't have any downtime.
- Previously, this would have taken weeks or months to do, and it's something you can now do in minutes.
- We think that we're only at the beginning of the automation that Q Business is going to help you all do, and we think it's going to be really impactful for saving time for a lot of these really tedious tasks.
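One way to picture how an agent can make a UI-driven workflow resilient to change is to give each step a list of fallback locators, so a single renamed element doesn't take the whole flow down. This is a purely illustrative sketch, not how Q Business implements its agents; the page structure and locator names are invented:

```python
class StepFailed(Exception):
    """Raised when no candidate locator matches the current UI."""

def run_step(page, candidates):
    """Try each candidate UI locator in order; act on the first one that exists.

    `page` stands in for the live UI: a mapping of element ids to click handlers.
    """
    for locator in candidates:
        if locator in page:
            return page[locator]()  # "click" the matched element
    raise StepFailed(f"no locator matched: {candidates}")

# The app renamed its submit button, but a fallback locator keeps the step alive.
page = {"btn-submit-v2": lambda: "claim submitted"}
result = run_step(page, ["btn-submit", "btn-submit-v2"])
print(result)  # → claim submitted
```

A monitoring agent could go further and append newly discovered locators to the candidate list over time, which is the "fix it on the fly" behavior described above.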
- Now let’s hear from another startup company, a company that’s building on AWS and is taking automation to a whole new level.
- [MUSIC]
- Our preconceived notion of robots is that they lack some empathy.
- They lack some interactivity.
- I don’t think it has to be that way.
- [MUSIC]
- At Cobot, we see a future where robots are helping in every sector.
- [MUSIC]
- The robots have to behave in a very predictable way.
- The sensing, the perception, the software.
- So it’s connected all the way from the cloud down to the physical actuation of the wheels.
- [MUSIC]
- Proxie is designed to take on these material movement tasks: moving boxes, totes, and carts, and it can move carts of up to 1,500 pounds.
- And yet it’s a friendly robot.
- It’s not an industrial machine.
- It’s not a forklift.
- With the advances in AI, with the advances in large language models, we have that ability to have that much more empathetic interaction.
- We didn’t initially think about how to make it easy to plug and play with different models.
- [MUSIC]
- But AWS anticipated that for us; with Bedrock, AWS had the insight that it wouldn’t come from one source.
- [MUSIC]
- As these models get updated, we get access to them almost immediately to make our robot better. As technologists, as inventors, we get the ability to decide what the future looks like.
- AI is not only going to change the nature of cognitive work, but it's going to change physical work as well.
- [MUSIC]
SageMaker Evolution
- All right, that is super impressive, Brad.
- Those robots are awesome.
- Thank you for sharing.
- All right.
- There’s one area that we haven’t touched on yet today.
- And that’s analytics.
- In AWS, we have the broadest and deepest set of purpose-built analytics services.
- We have services for data warehousing, data processing, search analytics, streaming analytics, business intelligence and more.
- And just to highlight a few of these, Redshift is our cloud native data warehouse.
- It gives you industry leading price performance.
- It's used by tens of thousands of customers today, and Redshift processes exabytes of data every single day.
- We have EMR, which is our petabyte scale big data processing service, and it delivers 3.9 times the performance that you get from open source Spark.
- We have OpenSearch, which gives you everything from log analytics to vector search.
- It's a very flexible service and there's tens of thousands of active customers using OpenSearch with hundreds of thousands of clusters supported and trillions of requests handled every month.
- AWS Glue is our serverless data integration service, which does hundreds of millions of data integration jobs every single month, and increasingly, we’re seeing more and more customers start their analytics workflows with SageMaker.
- You heard this from Laurie about JPMC.
- We’re seeing customers use SageMaker to process and prepare their data for ML workloads and ML-based analysis.
- And actually, this last point is super interesting.
- A lot of customers tell us that their analytics and their AI workloads are increasingly converging around a lot of the same data and a lot of the same workflows.
- It's actually changing how customers think about their analytics services, because it turns out you don't just use analytics and AI tools in isolation anymore.
- In fact, a lot of customers are using their historical analytics data to train machine learning models, and increasingly they then incorporate that same data into their genAI applications.
- So many people are already using these services together.
- There’s a finserv customer that I was talking to recently that was building a real-time fraud detection system, and as they walked me through their architecture, they said, look, here’s what we do.
- They start with a homegrown data catalog that analysts use to access the data they need, and then they take that data and load it into Redshift, their data warehouse, where they store it.
- And then they run SQL analytics on top of that data.
- Then, using a combination of SageMaker and EMR, they pass this off to their data scientists, who use those systems to prepare the training data and then train and deploy the fraud detection models.
- It's actually a pretty cool system that they've built, and it turns out without these cloud systems, this is an outcome they never could have achieved, right?
- It’s something they couldn’t have done because they got to use the absolute best service for each one of those steps.
- The best tool at the right time.
- But as we thought about this example and others like it, we kept thinking, how could we make this even easier?
- What if we started from a blank piece of paper?
- What would an ideal AI and analytics system look like?
- So we have four rings here of capabilities that we need.
- You’d want all of the best analytics services.
- You’d want an easy way to understand and access all of your data, whether it’s structured or unstructured, across S3, Redshift, Aurora, and third-party data sources.
- You’d want data management capabilities, like DataZone, to do things like cataloging, catalog sharing, and governance of your data.
- And you’d want to have AI front and center of everything that you do.
- It’s almost like we need a fifth ring just to pull it all together into an integrated experience.
- Now, today, SageMaker is the most popular service in the world for building, training and deploying ML models.
- Now, as we talked about earlier, the secret to building great AI applications is customizing them with your own data.
- And SageMaker is the center of how most customers do that today.
- And increasingly, SageMaker is actually at the center of a lot of data processing and data analysis as well.
- In fact, it’s the de facto way that we see customers dealing with their data and AI models together.
- And hundreds of thousands of customers are already using SageMaker today.
- Now we’re putting all of these pieces together, and I’m super excited now to announce the next generation of Amazon SageMaker.
- [APPLAUSE]
- SageMaker is now the center of all of your data analytics and AI needs.
- What we’re doing is we’re taking the SageMaker that you all know and love and expanding it by integrating the most comprehensive set of data analytics and AI tools.
- Everything you need for fast analytics, data processing, search, data prep, AI model development, and generative AI, all with a single view of your enterprise data.
- As part of this, I’m also excited to announce the SageMaker unified studio.
- [APPLAUSE]
- It's a single data and AI development environment available in preview today.
- It allows you to access all the data in your organization and act on it with the best tool for the job.
- It consolidates the functionality that analysts and data scientists use today across a wide range of standalone studios in AWS, standalone query editors, and a variety of visual tools: things like EMR, Glue, Redshift, Bedrock, and all of the existing SageMaker Studio capabilities.
- It allows you to create shared projects that include AI or analytics artifacts like notebooks or models, and allows your data scientists and your analysts and ML experts to all easily collaborate in the same integrated space.
- It also includes an integrated data catalog and governance capabilities, which gives you an easy way to apply security controls so that different users across your organization can access the artifacts they need, but only the ones they have permission to.
- So how do we do this?
- We first had to break down the data silos that we have.
- And as you know, this is multiple years in the making; we’ve been on a journey toward a zero-ETL future with this moment in mind.
- Zero-ETL integrates all of your data across Aurora, RDS, DynamoDB, and Redshift.
- And we also know, though, that a lot of your critical data, as we’ve talked about throughout the day, lives in powerful third party applications, which makes it hard to use all of this data together because it’s siloed.
- We wanted to make that easier as well.
- So today we’re introducing zero-ETL for applications, a new capability that lets you analyze data stored across many of the most popular third party SaaS applications, all without having to build and manage data pipelines.
- [APPLAUSE]
- So customers want to do analytics across all of this data.
- And as we announced earlier today, we have this new Iceberg capability in S3 that makes it much easier to query your S3 data lake.
- But you want a single interface across all of your data, whether it’s Redshift, structured data sources, or S3 data lakes or other federated data sources.
- So I’m very excited today to announce the Amazon SageMaker Lakehouse.
- [APPLAUSE]
- It's a new Apache Iceberg-compatible lakehouse in SageMaker that provides simple, unified access across all of these sources of data.
- You can easily work with all of your data right in the unified studio, or you can actually access your SageMaker Lakehouse directly from any third party AI or analytics tool or query engine that supports Apache Iceberg APIs.
- You can query data no matter how or where it’s physically stored, and for all of those use cases, whether they’re analytics or ad hoc querying, processing or data science or whatever, it’s all through this consistent interface.
- And also don’t worry, SageMaker that you know and love still includes all of the existing AI capabilities that you’ve come to use.
- Things like data wrangling, human-in-the-loop with Ground Truth, experiments, MLOps, and of course HyperPod managed distributed training.
- We’re now referring to this as SageMaker AI, and there’s a ton of new launches happening in SageMaker AI that you’ll hear about throughout the week.
Closing Remarks
- So let’s bring it all back together.
- Remember that fraud detection use case that we started with?
- Let’s see how we can improve that experience with the new unified studio and the new SageMaker.
- Now your analysts and data scientists have access to a consolidated set of business data catalogs.
- They can now search for the relevant data that they need for their use case.
- In this example, let’s say the data analyst wants to look for transactions.
- They can view details about the data, and they can subscribe to add it to their project.
- Next, they can open a query editor, which is natively embedded in the unified UI.
- This query editor enables them to write SQL queries that they can run against Redshift or Athena, all without having to leave SageMaker, and they can then easily share this dataset with the data scientists on their team to continue the work.
- All of these teams can then collaborate on this project, and the data scientists can jump in and start processing and preparing the data.
- They can use Spark to prepare it, then train machine learning models to detect fraud and deploy them using SageMaker AI.
- All from this single notebook.
- Everything in a single unified studio, all working together.
- And we’re just getting started.
- Over the next year, we’re going to be adding a ton of new capabilities to the new SageMaker, capabilities like:
- AutoML
- new low code experiences
- specialized AI service integrations
- stream processing and search
- access to more and more services and data with zero-ETL all in this single unified UI.
- And most importantly, you can get started today.
- Just sign in to SageMaker and you can start building.
- All right.
- We’ve covered a ton today, from the very best building blocks to applications that allow you to multiply your impact and make it easier to work with all of your data in the ways that you need to.
- And we’ve been innovating faster than ever before to give you our customers, everything you need to build whatever you can imagine.
- And I’ll tell you, I have never been as excited as I am today about the future.
- We are at a seminal point.
- The amount of innovation that we’re seeing out there in the world is really incredible, and I don’t just mean the innovation that’s coming from AWS.
- I’m really excited about the innovation that we're seeing from customers, from partners, from enterprises, from incredible startups like the ones we heard from today.
- There has never been a better time to be innovating, and you’ve never had access to such a rich set of capable tools to help you do it.
- Please take the opportunity this week to learn from everyone that’s here.
- There’s a ton of great content and great people to learn from.
- Thanks so much and enjoy re:Invent.
- [MUSIC]