In this keynote speech at Computex, NVIDIA CEO Jensen Huang focused on the company’s latest advancements in accelerated computing and artificial intelligence (AI), and their profound impact across industries. He emphasized the underlying infrastructure required for generative AI, which will necessitate and drive a complete reformation of the entire computing industry, and NVIDIA has already accumulated a sizeable installed base to facilitate this transformation.

Huang highlighted NVIDIA’s pivotal role in this technological shift, having developed numerous groundbreaking technologies such as the Omniverse simulation platform, CUDA accelerated computing, and NVIDIA Inference Microservices (NIMs). This allows researchers across domains to focus on building domain-specific models and applications without worrying about the underlying technology. Huang painted a vision of a future where AI will be ubiquitous, from customer service agents and digital humans, to digital twins and robotics models that understand the laws of physics. He also discussed NVIDIA’s GPU roadmap, previewing upcoming larger and more energy-efficient GPU products.

tl;dr

NVIDIA has developed an “AI Generator” comparable to Tesla’s AC generator, capable of generating tokens (text, images, videos, etc.) that can serve industries worth trillions of dollars.
NVIDIA CUDA technology can accelerate various tasks, providing extraordinary performance boosts while reducing power consumption and costs, effectively addressing computation inflation.
NVIDIA Omniverse platform leverages accelerated computing and AI, and NVIDIA already possesses 350 domain-specific libraries, allowing them to support various industries and markets.
NIMs are a new software packaging approach. NIMs can build and organize AI teams to handle complex tasks, and can run both in the cloud and on personal computers.
Future AI models will need to understand the laws of physics, requiring more compute power and larger GPUs. NVIDIA is enhancing reliability, continuing to improve data compression/decompression efficiency, and data transfer efficiency.

Ernest’s Notes:

Recalling the historical experiences of “Apple Mac vs Windows + Intel” and “Apple iOS vs Android”, the market is likely to form two or three dominant camps. One camp may have the advantage of highly integration, while the others (in the short term) will provide flexible solutions with constraints.

After roughly ten iteration cycles, the major camps will have penetrated various customers and industries. The services and features they can offer will gradually converge (moving closer to essential needs). At the same time, all of us will also accumulate new problems, setting the stage for the next situation.

Previously, Apple’s hardware came first, followed by Apple’s software, transitioning from packaged software to Apps and SaaS. Now, it is NVIDIA’s hardware and software integration. In the future, it is believed that there will also be a rethinking of the infrastructure. It is worth continuously observing where there is redundancy, operational inefficiency, as those areas present opportunities.

Think in Context: Google I/O 2024 Keynote With Google CEO Sundar Pichai

Published: 2024-05-16

by Ernest Chiang

Think in Context

tl;dr

To cater to both individual (B2C) and enterprise (B2B) users, Google’s product lineup is more fragmented and scattered compared to Amazon, AWS. This isn’t necessarily a bad thing, but it’s not necessarily good either. Synchronization between products could be more challenging.
Google’s models Gemini and Gemma aim to handle long-form context, multimodal inputs, and information across file formats. However, maintaining technical leadership (facing developers, partners, and the ecosystem), multimodal outputs (facing customer needs), and controlling costs (facing investors) will be urgent priorities for Google as it integrates Gemini into hundreds of Google products and features.
Amidst this AI wave and technological iterations, the part that left the deepest impression on me from the entire keynote was this quote from DONALD GLOVER: “Everybody’s going to become a director and everybody should be a director. Because at the heart of all of this is just storytelling. The closer we are to being able to tell each other our stories, the more we will understand each other.”
While we’re at it, can’t Google leverage Gemini’s long context advantage to allow Google Translate to properly translate “LLM” as “large language model” instead of “Master of Laws” in the appropriate context?

Think in Context: NVIDIA GTC 2024 Keynote with NVIDIA CEO Jensen Huang

Published: 2024-04-07

by Ernest Chiang

Think in Context

tl;dr

At GTC 2024 representing over $100 trillion in global industries, the focus shifted from reducing computing costs to exponentially increasing computing scales. This paradigm shift is dubbed “generation” rather than “inference,” signaling a move away from traditional data retrieval methods towards generating intelligent outputs. The discussion highlighted the ongoing industrial revolution in artificial intelligence, where even complex entities like proteins, genes, and brain waves are being digitized and understood through AI, leading to the creation of their digital twins.

The keynote stressed the transformation of AI applications, citing the AI Foundry’s three main components: NIM, NeMo Microservices, and DGX Cloud. These tools underscore a new era where both structured and unstructured data is converted into a dynamic AI database. This database not only stores information but interacts intelligently with users, marking a significant evolution from traditional semantic encoding to a world where meaning is embedded in digitally generated scenes.

PHP Performance Benchmark (2024Q1) - PHP8/PHP7 Debian/Alpine nginx-php-fpm

Published: 2024-03-31

by Ernest Chiang

Decomposition & Comparison

(Image source: Photo by Jason Dent on Unsplash)

This PHP container was developed, adjusted, and maintained together with a friend and colleague. The original design purpose was to combine php-fpm with nginx, creating a simplified environment that facilitates the operation of Laravel in cloud-native environments, AWS Fargate, Amazon ECS, etc. We use it as a base for teaching and implementing various projects, comparative testing, including comparisons of x86/ARM computing architectures, and more.

With the update and release of new PHP versions, we plan to share the performance tests comparisons we’ve conducted every quarter (if not forgotten) with fellow PHP enthusiasts.

If you find this container project helpful, we look forward to you applying it to your projects, sharing it with friends, or giving this project a star ⭐ :)
- The released Docker image is available at Docker Hub: dwchiang/nginx-php-fpm
- The original Dockerfiles are placed at GitHub: dwchiang/nginx-php-fpm
- If you are using this container in any project, we welcome you to share your application with us, and let us know how many resources should be planned for this container project.
If you’re also interested in the topic Running Laravel on Amazon ECS, you might find these resources useful:
- This self-paced online workshop with complete architecture diagrams: GitHub: dwchiang/laravel-on-aws-ecs-workshops,
- Or my presentation, which includes slides and a recording: Running Laravel/PHP Container Applications on AWS (AWS Builders Day Taiwan 2022)

Preview

Simple Guide to Using Anthropic Claude 3 With Amazon Bedrock

Published: 2024-03-07

by Ernest Chiang

(Caption: A sonnet situated on a bedrock connecting knowledge graphs. Image source: generated by DALL-E.)

On 2024-03-04, Anthropic released the Anthropic Claude 3 model family. On the same day, Amazon also announced Amazon Bedrock adds Claude 3 Anthropic AI models.

This article will provide a simple guide to getting started with Claude 3 on Amazon Bedrock and exploring its capabilities. After playing around with it, you might feel more inclined to integrate these LLM models into your workflow to some extent.

Quick glossary:

Anthropic: A company focused on AI safety and research.
Anthropic Claude 3: A family of models including several different sizes suitable for various application scenarios.
Amazon Bedrock: An AI one-stop shopping portal provided by Amazon that offers the simplest way to start using, building, and scaling generative AI applications, incorporating responsible AI.

Read With Me: How to Do Great Work

Published: 2023-10-07

by Ernest Chiang

Read With Me

(Caption: First attempt at creating a thumbnail, likely only in 720P resolution, let’s go with it for now XDD. Image source: Ernest.)

The originally planned holiday trip was canceled at the last minute, so why not get up early and do some reading? I had been holding onto Paul Graham’s long essay How to Do Great Work. Coincidentally, I came across a high-quality Chinese translation 【如何做出偉大的成就】 shared by Michael Chou, the founder and chief editor of Daodu.tech (科技島讀). So, my reading for the morning was decided on this piece.

AWS Learning Path and Strategy (2023)

Published: 2023-09-12

by Ernest Chiang

Learning

(Illustration: A cat is required for happy learning, right? Let’s cats lead the way XDD Image source: Photo by hp koch on Unsplash。)

Volunteers of AWSUG Taiwan have always enthusiastically discussed how to lead newbies to get started with AWS more smoothly. Recently I organized all the AWS learning resources that you often discuss and some (unpopular?!) learning ideas that I found into an single article to share with you.

When Meteor-Grade AI Comes Knocking, Do We Still Need to Learn AWS? 👉 A Traveler's Perspective on a Decade of Decomposition and Integration (AWS Community Day Taiwan 2023)

Published: 2023-08-26

by Ernest Chiang

Learning | Ernest PKM

(Illustration: Slide title cover. Image source: Ernest.)

At the AWS Community Day Taiwan 2023 conference organized by the community, I aimed to address two key aspects. Firstly, to break down the complex and vast amount of AWS product information for attendees who might be overwhelmed by the information or struggling to find direction. Secondly, I adapted the content from a 10-minute presentation at the 4/19 Generative AI Meetup to a 20-minute version. Through examples, we introduced the importance of Frameworks and Foundations, and further presented five tools for organizing knowledge. These tools were integrated into the Ernest PKM IIDEE or Amazon Working Backwards frameworks. Lastly, I took this opportunity to record the session.

I extend my gratitude to Amy Lee, the organizer of AWS Community Day Taiwan 2023, and to Eric and Wyne from Track A for their invitations. I also thank the enthusiastic volunteers and sponsors at the event, which made the second edition of AWS Community Day Taiwan vibrant and energetic. It truly serves as a blessing for AWS enthusiasts and developers alike.

For more details about the expansion of the personal knowledge system, you can read more in-depth in my article 👉 Ernest PKM. In addition to organizing human thoughts, this version includes integrating human-computer collaboration and breaking down historical context.

The significance of historical context is emphasized, echoing the words of Werner Vogels (Amazon.com CTO) at AWS re:Invent 2020:

“Technology always moves forwards, but sometimes it’s good to look back, to look at where we came from, to understand our foundation.” – Werner Vogels, Amazon.com CTO at AWS re:Invent 2020

Decomposition & Comparison: Why sometimes AWS CLI create-invalidation cannot clear Amazon CloudFront cache

Published: 2023-06-05

by Ernest Chiang

Decomposition & Comparison

(Photo by Wilhelm Gunkel on Unsplash)

Introduction

Since moving my blog from GitHub Pages to Amazon S3 + Amazon CloudFront, the deployment time and webpage loading time have been greatly reduced.

Even for long articles like Ernest PKM (Personal Knowledge Management) Workflow, the webpage loading time can still achieve DOMContentLoaded < 650 ms and Load < 2.5 s. The deployment time has been significantly reduced from an average of 10-13 minutes on GitHub Pages build to less than 1 minute when there are no image changes.

However, this led to another issue of managing and deciding the CDN cache expiration period.

Ernest PKM Personal Knowledge System Workflow (2023.25)

Published: 2023-04-20

by Ernest Chiang

Ernest PKM

(Photo by Ash Edmonds on Unsplash)

✳️ Introduction

(Initial version on 2023-02-16)

Since childhood, probably elementary school, I’ve been fascinated by notebooks, enjoying recording numbers, temperatures, times, the number of pigeons on the neighboring building outside the window, and how long it takes for them to fly around and return to their pigeon house, etc.

As I grew older, I’ve always regarded notebooks as tools for recording, memorizing, and quick review. However, I felt that they didn’t match my workflow, whether at work or at home. In recent years, I’ve been better at integrating them into my workflow and documenting my current Personal Knowledge Management (PKM) process. This workflow is derived from the various shares of many people who are smarter and more focused on PKM, Smart Notes, and other areas than me.

This workflow may not be suitable for everyone, but I hope that by sharing these ideas, we can spark conversations and discussions that help us, and even our next generation, reduce the time spent exploring and improve learning or work efficiency, ultimately contributing to society or humanity.

AI, Machine Learning, ChatGPT, Google Bard, Amazon Titan, Auto-GPT, and other models and artificial intelligence tools can handle some tasks for us, but they also operate following a certain process. If we can master the idea of process design, I believe we can still make a difference in society and organizations. Let’s encourage each other.

(Updated to 2023.25 version on 2023-04-19)

Originally, I thought the Ernest PKM series of articles would be updated once a year, but in just a quarter of a year (0.25 year), various AI thoughts and products have descended like meteorites, prompting a forced industry upgrade.

Since the GPT version numbers have evolved from GPT-1, GPT-2, GPT-3, GPT-3.5, to GPT-4, Ernest PKM has also undergone a minor upgrade, integrating both human brain and AI (MLOps) workflows, resulting in the Ernest PKM 2023.25 version. I chose to publish it on 4/19 Generative AI Meetup to share and interact with old friends and new acquaintances whom I haven’t seen for a long time.