Why I Ultimately Gave Up on Go's sync.Pool

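Note: this article is not a rejection of sync.Pool. It shares the reasoning behind a technical choice so that you can use sync.Pool more accurately.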

1. Use cases

In one sentence: store and reuse temporary objects to reduce memory allocations and lower GC pressure.

1.1 Introduction

A simple example:

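package main

import (
	"encoding/json"
	"fmt"
)

type User struct {
	ID       int64     `json:"id"`
	Username string    `json:"username"`
	Email    string    `json:"email"`
	Profile  [512]byte `json:"profile_data"` // profile data
}

func main() {
	// Serialize a sample user once.
	buf, _ := json.Marshal(User{
		ID:       1,
		Username: "john_doe",
		Email:    "john@example.com",
	})

	// Every decode allocates a fresh User on the heap.
	user := &User{}
	if err := json.Unmarshal(buf, user); err != nil {
		panic(err)
	}
	fmt.Println(user.Username)
}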

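JSON deserialization is very common in data parsing and network communication. When a program runs at high concurrency,
it has to create a large number of temporary objects in a short time. These objects are allocated on the heap, which puts heavy pressure on the GC and can seriously hurt performance.
sync.Pool is one way to address this.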

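1.2 What is sync.Pool?

Since version 1.3, Go has provided an object reuse mechanism: sync.Pool.
sync.Pool is a component of the sync package that acts as a "pool" for temporary objects that are borrowed and returned.
sync.Pool is scalable and safe for concurrent use, and its size is limited only by available memory. It stores values that have been allocated but are not currently in use, and that will be needed again later.

Such values can be reused directly instead of being allocated again, which reduces GC pressure and improves performance.

Personally, I think the name can be misleading: objects in a Pool may be reclaimed without notification, so sync.Cache (a temporary cache) might have been a more fitting name.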

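2. How to use it

Using sync.Pool is very simple:

2.1 Declaring the pool

You only need to provide the New function. When the pool (sync.Pool) has no object available, Get automatically calls New to create one.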

var userPool = sync.Pool{
    New: func() interface{} { 
        return new(User) 
    },
}

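2.2 Get and Put

// Take an object out of the pool
user := userPool.Get().(*User)
json.Unmarshal(buf, user)
// Put it back
userPool.Put(user)

Get() takes an object from the pool. Its return value is interface{}, so a type assertion is required.
Put() returns the object to the pool once you are done with it.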
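
To make the Get/Put cycle concrete, here is a minimal, self-contained sketch that pools bytes.Buffer values instead of User. The names bufPool and render are made up for this illustration; the pattern (assert the type, reset, defer Put) is the part that matters.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool is a hypothetical pool of *bytes.Buffer used only in this sketch.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render borrows a buffer, uses it, and returns it to the pool.
func render(name string) string {
	b := bufPool.Get().(*bytes.Buffer) // Get returns interface{}, so assert the type
	b.Reset()                          // a reused buffer may still hold old bytes
	defer bufPool.Put(b)               // hand the buffer back when done

	b.WriteString("hello, ")
	b.WriteString(name)
	return b.String() // String copies the bytes, so the result outlives the buffer
}

func main() {
	fmt.Println(render("gopher"))
	fmt.Println(render("pool"))
}
```

Using defer for Put keeps the borrow and the return next to each other, which makes it harder to forget the Put on an early return.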

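3. Examples

3.1 Usage in the standard library

3.1.1 fmt.Printf

The Go standard library makes heavy use of sync.Pool, for example in fmt and encoding/json.
Below is the source of fmt.Printf (go/src/fmt/print.go). You can also read it in your local Go source tree.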

// go 1.13.6

// pp is used to store a printer's state and is reused with sync.Pool to avoid allocations.
type pp struct {
    buf buffer
    ...
}

var ppFree = sync.Pool{
	New: func() interface{} { return new(pp) },
}

// newPrinter allocates a new pp struct or grabs a cached one.
func newPrinter() *pp {
	p := ppFree.Get().(*pp)
	p.panicking = false
	p.erroring = false
	p.wrapErrs = false
	p.fmt.init(&p.buf)
	return p
}

// free saves used pp structs in ppFree; avoids an allocation per invocation.
func (p *pp) free() {
	if cap(p.buf) > 64<<10 {
		return
	}

	p.buf = p.buf[:0]
	p.arg = nil
	p.value = reflect.Value{}
	p.wrappedErr = nil
	ppFree.Put(p)
}

func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error) {
	p := newPrinter()
	p.doPrintf(format, a)
	n, err = w.Write(p.buf)
	p.free()
	return
}

// Printf formats according to a format specifier and writes to standard output.
// It returns the number of bytes written and any write error encountered.
func Printf(format string, a ...interface{}) (n int, err error) {
	return Fprintf(os.Stdout, format, a...)
}

fmt.Printf is called very frequently. Reusing pp objects through sync.Pool avoids one allocation per call, which reduces memory usage and GC pressure.

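3.2 Usage in the Gin framework (Context)

In Gin, a Context object represents the context of a single HTTP request. Every request needs a Context, and once the request has been handled, the Context's lifetime is over.

Frequent creation and destruction: under high concurrency, a large number of Context objects are created and destroyed every second.
Fixed lifetime: a Context lives from the arrival of a request until the request has been handled, which is very short.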

3.2.1 Defining the pool

In the definition of the gin.Engine struct, the pool field is a sync.Pool, as follows:

type Engine struct {
    // ... other fields
    pool sync.Pool // pool of Context objects
}

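3.2.2 Initializing the pool

When a Gin engine instance is created, the sync.Pool is initialized and its New function is set.
When the pool has no object available, this function is called to create a new Context.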

func New() *Engine {
    // ...
    engine.pool.New = func() any {
        return engine.allocateContext(engine.maxParams)
    }
    return engine
}

func (engine *Engine) allocateContext(maxParams uint16) *Context {
    // Allocate and initialize a Context
    v := make(Params, 0, maxParams)
    skippedNodes := make([]skippedNode, 0, engine.maxSections)
    return &Context{engine: engine, params: &v, skippedNodes: &skippedNodes}
}

3.2.3 Getting a Context from the pool

When an HTTP request arrives, Gin takes a Context object from the sync.Pool.

func (engine *Engine) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    // Take a Context from the pool
    c := engine.pool.Get().(*Context)
    c.writermem.reset(w)
    c.Request = req
    c.reset()
    // ... handle the HTTP request
    engine.handleHTTPRequest(c)
    // Put the Context back into the pool
    engine.pool.Put(c)
}

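3.2.4 Putting the Context back after the request

Once the request has been handled, Gin puts the Context back into the sync.Pool for later reuse.

func (engine *Engine) ServeHTTP(w http.ResponseWriter, req *http.Request) {
    c := engine.pool.Get().(*Context)
    // ... handle the HTTP request
    engine.handleHTTPRequest(c)
    // After the request is handled, put the Context back into the pool
    engine.pool.Put(c)
}

Remember that resetting is the key point. Gin calls c.reset() right after Get; resetting before Put works too. Either way, a reused Context must be clean.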
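
To see why the reset matters, consider what happens when a pooled struct is decoded into twice. json.Unmarshal only writes the fields present in the payload, so anything left over from the previous use survives. The sketch below uses a hypothetical reqCtx type (not Gin's Context) and reuses one object directly, which is what the pool does whenever Get returns a previously Put value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// reqCtx is a hypothetical per-request object, loosely modeled on a web context.
type reqCtx struct {
	UserID int64  `json:"user_id"`
	Token  string `json:"token"`
}

// reset clears every field so nothing leaks into the next request.
func (c *reqCtx) reset() { *c = reqCtx{} }

var ctxPool = sync.Pool{
	New: func() interface{} { return new(reqCtx) },
}

// decode fills c from a JSON payload; fields missing from the payload are left untouched.
func decode(c *reqCtx, payload string) {
	_ = json.Unmarshal([]byte(payload), c)
}

func main() {
	c := ctxPool.Get().(*reqCtx)
	decode(c, `{"user_id":1,"token":"secret-of-user-1"}`)

	// The second request carries no token. Without reset, the old token survives.
	decode(c, `{"user_id":2}`)
	fmt.Printf("without reset: %+v\n", *c)

	// With reset, the second request sees only its own data.
	c.reset()
	decode(c, `{"user_id":2}`)
	fmt.Printf("with reset:    %+v\n", *c)

	ctxPool.Put(c)
}
```

In a real service this kind of leak is a correctness and security problem, not just untidiness, which is why frameworks reset on every reuse.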

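4. My experience in a real project

4.1 Why I chose sync.Pool at first

I will come back to this project in later posts, so here is just a brief summary.
My goal:

Design an image upload module that supports multiple storage drivers, with a focus on concurrency performance, resource management, and switching drivers at runtime.

To deal with what I took to be a high-concurrency, instance-reuse problem, sync.Pool came to mind naturally, for two reasons:

Object reuse: avoid creating and destroying objects frequently
Concurrency safety: multiple users can use different drivers at the same time

That is where the trouble started, but at the time I was quite pleased with myself and even sketched a design: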

// Multi-driver object pool manager
type MultiDriverPool struct {
    pools   map[string]ObjectPool
    mu      sync.RWMutex
    current string // current default driver
}
// Object pool interface
type ObjectPool interface {
    Get() (Driver, error)
    Put(Driver) error
    Close()
    Size() int
    Available() int
}

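4.2 Why I then gave up on sync.Pool

4.2.1 Storage drivers are usually stateless

For example, the Qiniu Cloud driver always uses the same AccessKey and SecretKey, and every instance performs the same operations, so there is no need to maintain multiple instances. A single driver instance can handle all requests, and drivers are usually safe for concurrent use (or can be made so by allocating per-call resources inside each method).

In other words, I had assumed that driver instances would be created and destroyed frequently. In reality a driver instance can be shared, it is cheap to create, and a storage driver is stateless.

So the design I ended up with is: singleton plus multiple drivers.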
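
The article does not show the final code, so the following is only a sketch of what a "singleton plus multiple drivers" layout could look like. Every name here (Driver, Registry, qiniuDriver, localDriver, Upload) is an assumption made for illustration, and the Upload methods return placeholder URLs instead of talking to a real storage service.

```go
package main

import (
	"fmt"
	"sync"
)

// Driver is a hypothetical storage-driver interface.
type Driver interface {
	Upload(name string, data []byte) (url string, err error)
}

// qiniuDriver stands in for a real driver. It holds only read-only
// configuration, so one instance can serve all goroutines.
type qiniuDriver struct{ bucket string }

func (d *qiniuDriver) Upload(name string, data []byte) (string, error) {
	return "qiniu://" + d.bucket + "/" + name, nil // placeholder, no network call
}

type localDriver struct{ root string }

func (d *localDriver) Upload(name string, data []byte) (string, error) {
	return "file://" + d.root + "/" + name, nil // placeholder, no disk write
}

// Registry keeps exactly one shared instance per driver name.
type Registry struct {
	mu      sync.RWMutex
	drivers map[string]Driver
	current string
}

func NewRegistry() *Registry { return &Registry{drivers: make(map[string]Driver)} }

// Register adds a driver; the first one registered becomes the default.
func (r *Registry) Register(name string, d Driver) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.drivers[name] = d
	if r.current == "" {
		r.current = name
	}
}

// Use switches the default driver at runtime.
func (r *Registry) Use(name string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.drivers[name]; !ok {
		return fmt.Errorf("unknown driver %q", name)
	}
	r.current = name
	return nil
}

// Current returns the shared instance of the default driver.
func (r *Registry) Current() Driver {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.drivers[r.current]
}

func main() {
	reg := NewRegistry()
	reg.Register("qiniu", &qiniuDriver{bucket: "images"})
	reg.Register("local", &localDriver{root: "/tmp/uploads"})

	url, _ := reg.Current().Upload("a.png", nil)
	fmt.Println(url)

	_ = reg.Use("local")
	url, _ = reg.Current().Upload("a.png", nil)
	fmt.Println(url)
}
```

Compared with the MultiDriverPool draft, there is nothing to Get or Put: the RWMutex only protects the map and the default name, and callers share the driver instances themselves.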

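5. Summary

Scenarios that suit sync.Pool:

High creation cost: object initialization has significant overhead
Short lifetime: the object is no longer needed soon after use
High usage frequency: many objects are created and destroyed concurrently
Safe to reset: previous state can be cleared completely

Scenarios that do not: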

// 1. Storage driver: cheap to create, long-lived
var driverPool = sync.Pool{
    New: func() interface{} { return &QiniuDriver{} },
}

// 2. Database connection: needs a connection pool, not an object pool
var dbPool = sync.Pool{
    New: func() interface{} { return sql.Open(...) },
}

// 3. Config object: long-lived, no need to create it frequently
var configPool = sync.Pool{
    New: func() interface{} { return loadConfig() },
}

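To close, let me state it once more:
The core purpose of sync.Pool is not resource management.
It is to reduce memory allocations and lower GC pressure by storing and reusing temporary objects.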

6. Inside sync.Pool

6.1 The underlying struct

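// [the Go memory model]: https://go.dev/ref/mem
type Pool struct {
	noCopy noCopy

	local     unsafe.Pointer // local fixed-size per-P pool, actual type is [P]poolLocal
	localSize uintptr        // size of the local array

	victim     unsafe.Pointer // local from previous cycle
	victimSize uintptr        // size of victims array

	// New optionally specifies a function to generate
	// a value when Get would otherwise return nil.
	// It may not be changed concurrently with calls to Get.
	New func() any
}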

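6.2 Key points

At the core of Pool are two fields: local and victim.

6.2.1 local unsafe.Pointer

local is an array of per-P pools, one poolLocal for each P (the number of Ps equals GOMAXPROCS).
Because each P has its own poolLocal, Get and Put go to the current P's pool first and need no mutex on that fast path, which makes them very fast.

6.2.2 victim

Go treats objects in a Pool as disposable, so every GC cycle empties pool.local.
To soften the impact (for example, needing many objects right after the pool was emptied), Go keeps a backup of pool.local from the previous GC cycle: victim.
This avoids the performance jitter that would occur if every object disappeared at each GC.
The rough flow of victim is: at the start of each GC, the old victim is dropped, local becomes the new victim, and local starts out empty. Get looks in local first, then in victim, and calls New only if both are empty. An object that is never retrieved is therefore freed after two GC cycles.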
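
One way to observe the victim cache is to force GC cycles by hand and count how often New has to run. The sketch below does that with a made-up helper, missesAfterGC. It relies on how the current runtime implements sync.Pool, not on anything the API guarantees, so the first two numbers are typical rather than certain (the race detector or a goroutine moving between Ps can change them).

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// newCalls counts how often the pool had to fall back to New.
var newCalls int64

var pool = sync.Pool{
	New: func() interface{} {
		atomic.AddInt64(&newCalls, 1)
		return new([1024]byte)
	},
}

// missesAfterGC puts one object into the pool, forces the given number of
// GC cycles, then reports how many times New ran during the following Get.
func missesAfterGC(cycles int) int64 {
	pool.Put(new([1024]byte))
	for i := 0; i < cycles; i++ {
		runtime.GC()
	}
	before := atomic.LoadInt64(&newCalls)
	_ = pool.Get()
	return atomic.LoadInt64(&newCalls) - before
}

func main() {
	// Usually 0: the object is still in local.
	fmt.Println("New calls after 0 GCs:", missesAfterGC(0))
	// Usually 0: the object moved to victim and is found there.
	fmt.Println("New calls after 1 GC: ", missesAfterGC(1))
	// After two cycles the victim has been dropped too, so New must run.
	fmt.Println("New calls after 2 GCs:", missesAfterGC(2))
}
```

This is also the practical meaning of "objects may be reclaimed without notification": anything that sits unused in the pool across two GC cycles is gone.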

7. Performance tests

7.1 The test program


package main

import (
	"bytes"
	"encoding/json"
	"io"
	"net/http"
	"strconv"
	"sync"
	"sync/atomic"
)

type User struct {
	ID       int64     `json:"id"`
	Username string    `json:"username"`
	Email    string    `json:"email"`
	Profile  [512]byte `json:"profile_data"`
}

// User pool; New runs only when Get finds no reusable object
var userPool = sync.Pool{New: func() interface{} {
	atomic.AddUint64(&poolMisses, 1)
	return new(User)
}}

// Buffer pool
var bufPool = sync.Pool{New: func() interface{} {
	return new(bytes.Buffer)
}}

// Number of Get calls
var totalGets uint64

// Number of times a new object had to be created
var poolMisses uint64

// getUser takes a User from the pool
func getUser() *User {
	atomic.AddUint64(&totalGets, 1)
	return userPool.Get().(*User)
}

// putUser clears a User and returns it to the pool
func putUser(u *User) {
	// 1. Clear the data
	u.ID = 0
	u.Username = ""
	u.Email = ""
	for i := range u.Profile {
		u.Profile[i] = 0
	}
	// 2. Put it back
	userPool.Put(u)
}

// processUser decodes data into a pooled User
func processUser(data []byte) *User {
	u := getUser()
	_ = json.Unmarshal(data, u)
	return u
}

// handleProcess handles an HTTP request
func handleProcess(w http.ResponseWriter, r *http.Request) {
	// 1. Get a Buffer from the pool and return it when done
	b := bufPool.Get().(*bytes.Buffer)
	b.Reset()
	defer bufPool.Put(b)

	// 2. Read the request body into the Buffer
	_, _ = io.Copy(b, r.Body)

	// 3. Decode into a pooled User and return it when done
	u := processUser(b.Bytes())
	defer putUser(u)
	// ...
}

// handleMetrics reports pool statistics as plain text
func handleMetrics(w http.ResponseWriter, _ *http.Request) {
	hits := atomic.LoadUint64(&totalGets) - atomic.LoadUint64(&poolMisses)
	_, _ = w.Write([]byte("sync_pool_gets " + strconv.FormatUint(atomic.LoadUint64(&totalGets), 10) + "\n"))
	_, _ = w.Write([]byte("sync_pool_misses " + strconv.FormatUint(atomic.LoadUint64(&poolMisses), 10) + "\n"))
	_, _ = w.Write([]byte("sync_pool_hits " + strconv.FormatUint(hits, 10) + "\n"))
}

func main() {
	http.HandleFunc("/process", handleProcess)
	http.HandleFunc("/metrics", handleMetrics)
	_ = http.ListenAndServe(":8080", nil)
}

7.2 Object reuse rate

// TestHTTPConcurrent
// Result from one run: totalGets=500, poolMisses=28
func TestHTTPConcurrent(t *testing.T) {
	data := []byte(`{"id":10,"username":"concurrent","email":"c@example.com"}`)

	n := 500
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			// Each goroutine builds its own request: a shared body can only be read once
			req := httptest.NewRequest(http.MethodPost, "/process", bytes.NewReader(data))
			w := httptest.NewRecorder()
			handleProcess(w, req)
			if w.Code != http.StatusOK {
				t.Errorf("bad status")
			}
		}()
	}
	wg.Wait()

	t.Logf("totalGets=%d, poolMisses=%d", totalGets, poolMisses)
}

=== RUN   TestHTTPConcurrent
    pool_test.go:65: totalGets=500 (total number of Get calls), poolMisses=4 (number of objects newly created)
--- PASS: TestHTTPConcurrent (0.00s)
PASS

If you run the test yourself, the numbers will vary somewhat with the environment.

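7.3 Benchmarking object reuse

This uses benchmarks: tests that measure performance, including time per operation, memory allocations, and GC pressure.

// b.N is the number of loop iterations (adjusted automatically by Go)
// b.ReportAllocs(): report memory allocation statistics
// b.ResetTimer(): reset the timer (excludes the setup time before it)

// ------------------------
// Benchmark - without a pool
// ------------------------
func BenchmarkWithoutPool(b *testing.B) {
	data := []byte(`{"id":123,"username":"user123","email":"user123@example.com"}`)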
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		u := &User{}
		_ = json.Unmarshal(data, u)
	}
}

// ------------------------
// Benchmark - with a pool
// ------------------------
func BenchmarkWithPool(b *testing.B) {
	data := []byte(`{"id":123,"username":"user123","email":"user123@example.com"}`)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		u := getUser()
		_ = json.Unmarshal(data, u)
		putUser(u)
	}
}

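Benchmark	ns/op (time per operation)	B/op (bytes allocated per operation)	allocs/op (allocations per operation)
WithoutPool	721 ns	816 B	7 allocs
WithPool	664 ns	240 B	6 allocs

The clearest difference is in B/op: 816 B versus 240 B, roughly 3.4 times less memory allocated per operation. Time per operation drops by about 8% (721 ns to 664 ns).
Your own numbers will differ, so use the gap between WithoutPool and WithPool as the basis for comparison rather than the absolute values.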
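
If you only care about the allocation count and do not want to set up a benchmark file, testing.AllocsPerRun can be called from an ordinary program. The sketch below repeats the two decode paths under that helper. The function names allocsWithoutPool and allocsWithPool are made up, and resetting with *u = User{} is a shortcut used here instead of the field-by-field clear in putUser.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
	"testing"
)

type User struct {
	ID       int64     `json:"id"`
	Username string    `json:"username"`
	Email    string    `json:"email"`
	Profile  [512]byte `json:"profile_data"`
}

var userPool = sync.Pool{New: func() interface{} { return new(User) }}

var data = []byte(`{"id":123,"username":"user123","email":"user123@example.com"}`)

// allocsWithoutPool measures average heap allocations per decode
// when every decode uses a fresh User.
func allocsWithoutPool() float64 {
	return testing.AllocsPerRun(1000, func() {
		u := &User{}
		_ = json.Unmarshal(data, u)
	})
}

// allocsWithPool measures the same decode, reusing User values from the pool.
func allocsWithPool() float64 {
	return testing.AllocsPerRun(1000, func() {
		u := userPool.Get().(*User)
		*u = User{} // reset before reuse
		_ = json.Unmarshal(data, u)
		userPool.Put(u)
	})
}

func main() {
	fmt.Println("allocs/op without pool:", allocsWithoutPool())
	fmt.Println("allocs/op with pool:   ", allocsWithPool())
}
```

The absolute counts depend on the Go version, since most of the remaining allocations happen inside encoding/json; what the pool removes is the allocation of the User itself.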

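8. Self-check

What is the main purpose of sync.Pool, and why does it reduce GC pressure?
When is sync.Pool.New called?
Why must an object taken from the pool be reset?
In the code above, why must putUser() clear every field of the struct (for example, setting strings to "")?
Why is Reset() still needed even after the struct fields have been cleared?
Why is sync.Pool not an ordinary cache? What lifetime characteristics does it have?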

References:
1. Go 语言高性能编程 (High Performance Go), sync.Pool chapter
2. 深度解密 Go 语言之 sync.Pool (A Deep Dive into Go's sync.Pool)
