Principles of map type in GO

The map programming interface in GO described in GO blog. We just need to recall, that map is key-value storage and it should retrieve values by key as fast as possible.

Any comparable type can be a key — all simple scalar types, channels, arrays.
Non-comparable types — slices, maps, functions.

Map keys and values should be stored in memory allocated for map, consequentially. To deal with memory addresses we should use hash-function.

We have following requirements for hash-function:

Deterministic — should return same value for same key;
Uniform — values should be somehow equally related to all buckets;
Fast — should run quickly to be used many times;

Now to implementation. Simplified map structure is below (sources):

// A header for a Go map.
type hmap struct {
	count      int // # live cells == size of map.  Must be first (used by len() builtin)
	B          uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
	buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
	oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
}

Data addresses are stored divided into parts — in buckets array.

unsafe.Pointer — can be pointer of any variable type — generics issue solution (GO developers use it to make keys and values of any type).

Buckets is nil if there is no data in map. On first key 8 buckets are added.

Getting data from map

We need to find memory address of key and value.

First to find bucket. It is selected by a comparison of first 8 bits of key hash with corresponding data store in bucket struct.

Next to find key value by its address.

k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
if t.indirectkey() {
	k = *((*unsafe.Pointer)(k))
}

Next to find value same way.

if t.key.equal(key, k) {
	e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize))
	if t.indirectelem() {
		e = *((*unsafe.Pointer)(e))
	}
	return e
}

Map growing

We work with oldbuckets map struct property on data migration.

Migration starts when there is to much data in buckets. We save current value of buckets to oldbuckets, then create doubled amount of buckets in buckets property. Map data is copied from oldbuckets to buckets.

Amount of buckets is always power of 2.
That's why there is B property — it is power of 2 that shows current number of buckets.

On migration map is accessible. That's why there a lot of work both with buckets and oldbuckets in same functions of source code. After migration oldbuckets is set to nil.

On migration address of data can be changes, so GO doesn't allow to get pointer of map value:

mymap := map[string]string{"1": "1"}
fmt.Println(&mymap["1"])

// cannot take the address of mymap["1"]

The great GopherCon 2016 related lecture:

Principles of map type in GO

Getting data from map

Map growing

Related articles