The map
programming interface in GO described in GO blog.
We just need to recall, that map is key-value storage and
it should retrieve values by key as fast as possible.
Any comparable type can be a key — all simple scalar types, channels, arrays.
Non-comparable types — slices, maps, functions.
Map keys and values should be stored in memory allocated for map, consequentially. To deal with memory addresses we should use hash-function.
We have following requirements for hash-function:
Now to implementation. Simplified map structure is below (sources):
// A header for a Go map. type hmap struct { count int // # live cells == size of map. Must be first (used by len() builtin) B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items) buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0. oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing }
Data addresses are stored divided into parts — in buckets
array.
unsafe.Pointer
— can be pointer of any variable type — generics issue solution
(GO developers use it to make keys and values of any type).
Buckets is nil if there is no data in map. On first key 8 buckets are added.
We need to find memory address of key and value.
First to find bucket. It is selected by a comparison of first 8 bits of key hash with corresponding data store in bucket struct.
Next to find key value by its address.
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) if t.indirectkey() { k = *((*unsafe.Pointer)(k)) }
Next to find value same way.
if t.key.equal(key, k) { e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) if t.indirectelem() { e = *((*unsafe.Pointer)(e)) } return e }
We work with oldbuckets
map struct property on data migration.
Migration starts when there is to much data in buckets.
We save current value of buckets
to oldbuckets
,
then create doubled amount of buckets in buckets
property.
Map data is copied from oldbuckets
to buckets
.
Amount of buckets is always power of 2.
That's why there is B
property — it is power of 2 that shows current number of buckets.
On migration map is accessible. That's why there a lot of work
both with buckets
and oldbuckets
in same functions of source code.
After migration oldbuckets
is set to nil.
On migration address of data can be changes, so GO doesn't allow to get pointer of map value:
mymap := map[string]string{"1": "1"} fmt.Println(&mymap["1"]) // cannot take the address of mymap["1"]
The great GopherCon 2016 related lecture: