Introduction

Scope

  • Vector and vector search concepts review
  • Concepts and Setup of Milvus DB
  • Milvus DB data manipulation and search
  • Vector DB as a LLM cache
  • Vector DB for Retrieval Augmented Generation (RAG)

Prerequisites

  • NLP for Machine learning
  • LLM and embeddings
  • Python, jupyter notebooks, docker
  • LangChain

Introduction to Vector Databases

What is a vector?

A vector is an object that has both magnitude (size, quantity) and direction (line, angle, trend)

Vectors in Programming:

Read more »

1. Exploring Concepts and Costs of Fine-Tuning

Review the fine-tuning process

  • Why Fine-Tune?
    • Fine tuning improves the overall performance of a model for your particular use case. The responses to your prompts are improved when using a fine tuned model.
    • You can reduce token costs because your prompts are shorter and require fewer examples than few shot prompting or retrieval augmented generation or RAG
    • The model responds to your queries (prompts) faster with lower latency.
  • Fine-tuning process
    • Obtain the data: a minimum for 10 examples is required. 50-100 training examples will clearly improve the fine-tuning
    • Prepare the data: Format your data in the JSONL format.
    • Upload the data to the OpenAI server: Use the Files API to upload your data.
    • Create a fine-tuning job: Use the OpenAI API, SDK, or option in the Playground.
    • Evaluate the model: Use metrics like training loss and training token accuracy.
    • Use the model: Use in the model parameter of your ChatCompletions call

Understand the costs of the fine-tuning

OpenAI charges you based on tokens, the input and output tokens will be charged.

01

  • Fine-tuned prompt allows you send short tokens.
  • Getting to know the number of tokens helps 1. estimate the costs; 2. reduce the latency
  • Pricing
    • Training the fine-tuned model (base price * token * epoch)
    • Using the fine-tuned model during inference

2. Setting up your environment for fine-tuning

Explore the OpenAI API for fine-tuning

  • Fine-Tuning Endpoints
    1. Create fine-tuning job
    2. List fine-tuning job
    3. List fine-tuning events
    4. Retrieve fine-tuning job
    5. Cancel fine-tuning
Read more »

1
2
3
4
5
6
7
8
9
Status: Finished
Author: Terezija Semenski
Publishing/Release Date: April 15, 2024
Publisher: Linkedin
Link: https://www.linkedin.com/learning/pytorch-essential-training-deep-learning-23753149/deep-learning-with-pytorch?resume=false&u=3322
Type: Courses
Tags: AI
Start Date: June 1, 2024
End Date: June 2, 2024

Use Google Colab: https://colab.research.google.com/

https://colab.research.google.com/drive/1VHaPSHXGrLlJ5dzC628OVogfY4f8w2mZ

Tensors

Introduction to Tensors

We can think Tensor is generalizations of scalars, vectors, and matrices to any dimension

01

  • Tensor vs ndarray

    Advantages of Tensors

    • Tensor operations are performed significantly faster using GPUs
    • Tensors can be stored and manipulated at scale using distributed processing on multiple CPUs and GPUs and across multiple servers
    • Tensors keep track of the graph of computations that created them

Creating a tensor CPU example

1
2
3
4
5
6
7
8
9
import torch

first_tens = torch.tensor([[12, 10, 11, 9],[13, 15, 14, 16]])
second_tens = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])

add_tens = first_tens + second_tens

print(add_tens)
print(add_tens.size())
Read more »

昨天去了趟杭州,算是对自己多年想去杭州的心一个交代。早上开车,一路南下,高速上车一直很多,在从嘉兴到杭州的路段,前面两辆车忽然追尾,自己开始半刹还有二三十米发现可能也会撞上去,一脚踩死,立即打开双闪。所幸后车没有追上来。此时心里已经有些烦躁,为什么人这么多。

到了杭州市区,发现杭州的交通确实有些堵,快速路只能三四十的速度,下了快速路后有一段300米的路堵了十五分钟。路边的花不知道是桃花还是梨花,跟苏州的赛雪欺霜的满树缤纷不同,多了一些青青的绿色,甚是好看。

路边的树开了花

在南山路上转了一圈,除了堵还是堵,没有找到停车位,路边时不时有中年人说要停车不?转了一圈找到一个叫涌金广场的停车场。这个涌金广场不知是不是有些年头了,给人一些陈旧之感。一路按照导航,穿过过街通道,走过银泰广场。想去洗手间,去了二楼,在洗手间指示牌旁有家店,让我大受震撼,一个穿着可能已经比内裤多一些布料的短裤,上身可能有些类似泳装清凉,中间不知道纹的什么的一个挺好看的女生在整理店里的衣服,也不知道是老板还是服务员。杭州这么前卫的嘛。当然别人穿什么是否纹身跟我一毛钱关系也没有,更是不能对此人做评判。只是感叹世界之大,自己眼界太窄罢了。

来到湖边,远处是青山隐隐,湖光潋滟,但更多的是前面游人如织。没想到不是赶在节前来还是这么多人。湖边有长廊凉亭,聚集了很多退休大爷大妈,放生高歌,带音响的那种,你方唱罢我登场,我自己也喜欢听一些老歌,但唱的这么难听,还放着么大声,头都要被轰炸了。不得不感叹,浙江养老金高就是好啊。

湖心亭

湖边柳树掩映白墙

走了半天,发现这条路之前走过,那次是跟同事出差,夜里十点以后走在湖边。夏季的凉风伴着虫鸣,哪能想到白天喧嚣如此。

湖边人很多,我也是其中一个

近景是不能看的,毕竟全是人,只能远眺。如果在无人的晨间,必定很美。

Read more »

2024年了,我已经36岁了,我还是一个程序员,我还是在微软工作,似乎还是没有什么变化。但是仔细想来,周边发生了很多的变化。

去年举家从南京搬往苏州,女儿也上了小学。每天上班下班带娃写作业,周末带娃去骑自行车。

每年得学习一些新的知识来刷新自己,去年有几天学习了深度学习相关的知识,觉得似乎没有想象的那么难。今年打算继续在NLP和LLM方向深入学习。

之前一直在做微服务相关的工作,今年打算在微服务的基础上,深入学习一下Service Mesh相关的知识。把自己在istio上的一些实践总结一下。

新的一年,也要继续骑车,保持健康的身体。

新的一年,也要继续学习,保持年轻的心态。最近发现自己心态老了,不再愿意去深入了解一些新的技术,总是觉得自己已经够用了,不需要再去学习新的东西了。这是不对的。

新的一年,也要继续写作,保持自己的思考。最近发现自己的写作能力下降了,提笔时经常不知道自己在想什么。

2023-11-01-huzhou.jpeg

Deploying Applications the DevOps Way

[TOC]

1. Using the Helm Package Manager

  • Helm is used to streamline installing and managing Kubernetes applications.
  • Helm consists of the helm tool, which needs to be installed, and a chart.
  • A chart is a Helm package, which contains the following:
    • A description of the package
    • One or more templates containing Kubernetes manifest files
  • Charts can be stored locally, or accessed from remote Helm repositories.

Demo: Installing the Helm Binary

  • Fetch the binary from https://github.com/helm/helm/releases ; check for the latest release!
  • tar xvf helm-xxx.tar.gz
  • sudo mv linux-amd64/helm /usr/local/bin/
  • helm version

Getting Access to Helm Charts

The main site for finding Helm charts, is through https://artifacthub.io

This is a major way for finding repository names. We can search for specific software here, and run the commands to install it; for instance, to run the kubernetes dashboard:

1
2
3
# helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard
Read more »

[TOC]


1. 什么是分布式系统

计算机科学家Andrew S. Tenenbaum对分布式定义为:

A collection of independent computers that appear to its users as one computer

他认为,分布式系统必须要有的三个特征是:

  1. The computers operate concurrently.
  2. The computers fail independently.
  3. The computers don’t share a global clock.

本篇文章包含以下分布式相关的内容:

  1. Storage(存储): Relational/Mongo, Cassandra, HDFS
  2. Computation(计算): Hadoop, Spark, Storm
  3. Synchroniztion(同步): NTP, vector clock(向量时钟)
  4. Consensus(共识): Paxos, Zookeeper
  5. Messaging(消息): Kafka

文中会以一个咖啡店业务的发展过程作为例子来进行分布式概念的引入,该咖啡店提供在线业务,从小到大的发展中遇到了各个技术问题,我们一点点进行说明讲解。

Read more »

[TOC]

1. 编写消息并编译为go代码

1.1 安装

  1. 安装go vscode
  2. vscode安装protobuf插件
  3. 安装protoc,编写makefile生成go代码

1.2 protocol message规则

  1. message使用驼峰

  2. 字段使用lower_snake_case

  3. 内置的类型

    1. string, bool, bytes

    2. float, double

    3. int32, int64, uint32, uint64, sint32, sint64

  4. 也可以使用其他message作为字段类型

  5. tag很重要:

    1. tag整数,从1到2^29-1
    2. 19000-19999保留
    3. 第1-15个tag使用1个byte
    4. 16-2047个tag占用2个byte
    5. tag不需要有顺序或者递增tag必须唯一定义消息

1.3 消息定义

定义文件proto/processor_message.proto

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
syntax = "proto3";
option go_package="io.liux/pcbook/pb";

package pb;
message CPU {
// comment 1
string brand = 1;
/*
* comment 2
*/
string name = 2;
uint32 number_cores = 3;
uint32 number_threads = 4;
double min_ghz = 5;
double max_ghz = 6;
}

1.4 编译为Go

Read more »

[TOC]

1. Opertimizing Data Access

MySQL提供了Sakila示例数据库供学习

  1. 只返回应用需要的:恰当地使用WHERE语句
  2. 只返回应用需要的:避免使用SELECT *,尽量指定列名
  3. 避免对同样的数据多次查询:使用应用缓存
  4. order the data only if you are not ordering them in application: 使用ORDER BY,避免在应用中排序
  5. SELECT DISTINCT name, last_name去重

2. MySQL Query Optimization

查询语句执行时经历的组件:

mysql-1

0%