likes
comments
collection
share

从头开始构建数据库:08. Rows and Columns

作者站长头像
站长
· 阅读数 5

8.1 Introduction

The first step in building a relational DB on top of a KV store is to add tables. A table is just a bunch of rows and columns. A subset of the columns is defined as the “primary key”; primary keys are unique, so they can be used to refer to a row (in queries and secondary indexes).

在 KV 存储之上构建关系数据库的第一步是添加表。表只是一系列的行和列。列的某个子集被定义为“主键”;主键是唯一的,因此它们可用于引用行(在查询和二级索引中)。

How does a table fit into a KV store? We can split a row into two parts:

在KV 存储如何放置一张表?我们可以将一行分成两部分:

  1. Primary key columns go into the “key” part of the KV store.

主键列进入 KV 存储的“键”部分。

  1. Non-primary key columns go into the “value” part.

非主键列进入“值”部分。

This allows us to do both point queries and range queries on the primary key. For now, we’ll only consider queries on the primary key, the use of secondary indexes is deferred to a later chapter.

这使得我们可以对主键进行点查询和范围查询。目前,我们只考虑对主键的查询,二级索引的使用推迟到后面的章节。

8.2 Data Structures

Below is the definition of rows and cells. For now, we only support two data types (int64 and bytes).

以下是行和单元格的定义。目前,我们仅支持两种数据类型(int64 和 bytes)。

const (
   TYPE_ERROR = 0
   TYPE_BYTES = 1
   TYPE_INT64 = 2
)
// table cell
type Value struct {
   Type uint32
   I64 int64
   Str []byte
}
// table row

type Record struct {
   Cols []string
   Vals []Value
}
func (rec *Record) AddStr(key string, val []byte) *Record
func (rec *Record) AddInt64(key string, val int64) *Record
func (rec *Record) Get(key string) *Value

The definition for the DB and tables:

DB和表的定义:

type DB struct {
   Path string
   // internals
   kv KV
   tables map[string]*TableDef // cached table definition
}
// table definition
type TableDef struct {
   // user defined
   Name string
   Types []uint32 // column types
   Cols []string // column names
   PKeys int // the first `PKeys` columns are the primary key
   // auto-assigned B-tree key prefixes for different tables
   Prefix uint32
}

To support multiple tables, the keys in the KV store are prefixed with a unique 32-bit number.

为了支持多个表,KV 存储中的键以唯一的 32 位数字为前缀。

Table definitions have to be stored somewhere, we’ll use an internal table to store them. And we’ll also add an internal table to store the metadata used by the DB itself.

表定义必须存储在某个地方,我们将使用内部表来存储它们。我们还将添加一个内部表来存储数据库本身使用的元数据。

// internal table: metadata
var TDEF_META = &TableDef{
   Prefix: 1,
   Name: "@meta",
   Types: []uint32{TYPE_BYTES, TYPE_BYTES},
   Cols: []string{"key", "val"},
   PKeys: 1,
}
// internal table: table schemas
var TDEF_TABLE = &TableDef{
   Prefix: 2,
   Name: "@table",
   Types: []uint32{TYPE_BYTES, TYPE_BYTES},
   Cols: []string{"name", "def"},
   PKeys: 1,
}

8.3 Point Query

Let’s implement the point query by the primary key, range queries will be added in the next chapter.

我们先来实现一下主键的点查询,范围查询将在下一章中添加。

// get a single row by the primary key
func dbGet(db *DB, tdef *TableDef, rec *Record) (bool, error) {
     values, err := checkRecord(tdef, *rec, tdef.PKeys)
     if err != nil {
        return false, err
     }
     key := encodeKey(nil, tdef.Prefix, values[:tdef.PKeys])
     val, ok := db.kv.Get(key)
     if !ok {
        return false, nil
     }

     for i := tdef.PKeys; i < len(tdef.Cols); i++ {
        values[i].Type = tdef.Types[i]
     }
     decodeValues(val, values[tdef.PKeys:])

     rec.Cols = append(rec.Cols, tdef.Cols[tdef.PKeys:]...)
     rec.Vals = append(rec.Vals, values[tdef.PKeys:]...)
     return true, nil
}

The procedure is:

  1. Verify that the input is a complete primary key. (checkRecord)

验证输入是否是完整的主键。 (查看记录)

  1. Encode the primary key. (encodeKey)

  2. Query the KV store. (db.kv.Get)

查询KV存储。 (db.kv.Get)

  1. Decode the value. (decodeValues)

解码该值。 (解码值)

// reorder a record and check for missing columns.
// n == tdef.PKeys: record is exactly a primary key
// n == len(tdef.Cols): record contains all columns
func checkRecord(tdef *TableDef, rec Record, n int) ([]Value, error) {
   // omitted...
}

The method for encoding data into bytes and decoding from bytes will be explained in the next chapter. For now, any serialization scheme will do for this chapter.

将数据编码为字节以及从字节解码的方法将在下一章中解释。目前,任何序列化方案都适用于本章。

func encodeValues(out []byte, vals []Value) []byte
func decodeValues(in []byte, out []Value)
// for primary keys
func encodeKey(out []byte, prefix uint32, vals []Value) []byte {
    var buf [4]byte
    binary.BigEndian.PutUint32(buf[:], prefix)
    out = append(out, buf[:]...)
    out = encodeValues(out, vals)
    return out
}

To query a table we must get its definition first.

要查询表,我们必须首先获取它的定义。

// get a single row by the primary key
func (db *DB) Get(table string, rec *Record) (bool, error) {
    tdef := getTableDef(db, table)
    if tdef == nil {
        return false, fmt.Errorf("table not found: %s", table)
    }
    return dbGet(db, tdef, rec)
}

The definition is stored as a JSON in the internal table TDEF_TABLE.

该定义以 JSON 形式存储在内部表 TDEF_TABLE 中。

// get the table definition by name
func getTableDef(db *DB, name string) *TableDef {
      tdef, ok := db.tables[name]
      if !ok {
           if db.tables == nil {
               db.tables = map[string]*TableDef{}
           }
           tdef = getTableDefDB(db, name)
           if tdef != nil {
               db.tables[name] = tdef
           }
      }
      return tdef
}
func getTableDefDB(db *DB, name string) *TableDef {
      rec := (&Record{}).AddStr("name", []byte(name))
      ok, err := dbGet(db, TDEF_TABLE, rec)
      assert(err == nil)

      if !ok {
         return nil
      }
      tdef := &TableDef{}
      err = json.Unmarshal(rec.Get("def").Str, tdef)
      assert(err == nil)
      return tdef
}

8.4 Updates

An update can be either insert a new row or replace an existing row. The B-tree interface is modified to support different update modes.

更新可以是插入新行或替换现有行。 B树接口被修改以支持不同的更新模式。

// modes of the updates
const (
    MODE_UPSERT = 0 // insert or replace
    MODE_UPDATE_ONLY = 1 // update existing keys
    MODE_INSERT_ONLY = 2 // only add new keys
)
type InsertReq struct {
    tree *BTree
    // out
    Added bool // added a new key
    // in
    Key []byte
    Val []byte
    Mode int
}
func (tree *BTree) InsertEx(req *InsertReq)
func (db *KV) Update(key []byte, val []byte, mode int) (bool, error)

The function for updating a record via the primary key:

通过主键更新记录的函数:

// add a row to the table
func dbUpdate(db *DB, tdef *TableDef, rec Record, mode int) (bool, error) {
   values, err := checkRecord(tdef, rec, len(tdef.Cols))
   if err != nil {
      return false, err
   }
   key := encodeKey(nil, tdef.Prefix, values[:tdef.PKeys])
   val := encodeValues(nil, values[tdef.PKeys:])
   return db.kv.Update(key, val, mode)
}

Different update modes:

不同的更新模式:

// add a record
func (db *DB) Set(table string, rec Record, mode int) (bool, error) {
    tdef := getTableDef(db, table)
    if tdef == nil {
       return false, fmt.Errorf("table not found: %s", table)
    }
    return dbUpdate(db, tdef, rec, mode)
}
func (db *DB) Insert(table string, rec Record) (bool, error) {
    return db.Set(table, rec, MODE_INSERT_ONLY)
}
func (db *DB) Update(table string, rec Record) (bool, error) {
    return db.Set(table, rec, MODE_UPDATE_ONLY)
}
func (db *DB) Upsert(table string, rec Record) (bool, error) {
    return db.Set(table, rec, MODE_UPSERT)
}

Deleting a row is similar:

删除一行类似:

// delete a record by its primary key
func dbDelete(db *DB, tdef *TableDef, rec Record) (bool, error) {
    values, err := checkRecord(tdef, rec, tdef.PKeys)
    if err != nil {
        return false, err
    }
    key := encodeKey(nil, tdef.Prefix, values[:tdef.PKeys])
    return db.kv.Del(key)
}
func (db *DB) Delete(table string, rec Record) (bool, error) {
    tdef := getTableDef(db, table)
    if tdef == nil {
        return false, fmt.Errorf("table not found: %s", table)
    }
    return dbDelete(db, tdef, rec)
}

8.5 Create New Tables

Three steps:

  1. Check the table definition.

检查表定义。

  1. Allocate the table key prefix.

  2. Store the next table prefix and the table definitions.

存储下一个表前缀和表定义。

func (db *DB) TableNew(tdef *TableDef) error {
     if err := tableDefCheck(tdef); err != nil {
         return err
     }
     // check the existing table
     table := (&Record{}).AddStr("name", []byte(tdef.Name))
     ok, err := dbGet(db, TDEF_TABLE, table)

     assert(err == nil)
     if ok {
         return fmt.Errorf("table exists: %s", tdef.Name)
     }
     // allocate a new prefix
     assert(tdef.Prefix == 0)
     tdef.Prefix = TABLE_PREFIX_MIN
     meta := (&Record{}).AddStr("key", []byte("next_prefix"))
     ok, err = dbGet(db, TDEF_META, meta)
     assert(err == nil)
     if ok {
         tdef.Prefix = binary.LittleEndian.Uint32(meta.Get("val").Str)
         assert(tdef.Prefix > TABLE_PREFIX_MIN)
     } else {
         meta.AddStr("val", make([]byte, 4))
     }
     // update the next prefix
     binary.LittleEndian.PutUint32(meta.Get("val").Str, tdef.Prefix+1)
     _, err = dbUpdate(db, TDEF_META, *meta, 0)
     if err != nil {
         return err
     }
     // store the definition
     val, err := json.Marshal(tdef)
     assert(err == nil)
     table.AddStr("def", val)
     _, err = dbUpdate(db, TDEF_TABLE, *table, 0)
     return err
}

The prefix numbers are allocated incrementally from the next_prefix key of the TDEF_META internal table. The table definitions are stored as a JSON in the TDEF_TABLE table.

前缀编号是从 TDEF_META 内部表的 next_prefix 键增量分配的。表定义以 JSON 形式存储在 TDEF_TABLE 表中。

Although we have added table structures, the result is still pretty much a KV store. Some important aspects are missing:

虽然我们添加了表结构,但结果仍然是一个 KV 存储。缺少一些重要的方面:

  1. Range queries in the next chapter.

范围查询将在下一章中介绍。

  1. Secondary indexes in the next next chapter.

二级索引在下一章。

转载自:https://juejin.cn/post/7310034974424449076
评论
请登录