Understanding the SnowFlake Distributed ID Generation Algorithm

harriszh, published 2019-08-16 10:48 / 1350 reads

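Summary: There are many distributed id generation algorithms, and Twitter's SnowFlake is one of the classics. Binary representation of negative numbers: in computers, negative numbers are represented in binary using two's complement.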

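There are many distributed id generation algorithms, and Twitter's SnowFlake is one of the classics.

Overview

An id generated by the SnowFlake algorithm is a 64-bit integer, laid out field by field as follows: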

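1 bit, unused. In binary, any number whose highest bit is 1 is negative, but the ids we generate are normally positive integers, so this highest bit is fixed at 0.

41 bits, used to record the timestamp (in milliseconds).

41 bits can represent $2^{41}$ distinct values.

When used only for non-negative integers (in computing, the positive range is taken to include 0), the representable range is 0 to $2^{41}-1$; the 1 is subtracted because the range starts from 0 rather than 1.

In other words, 41 bits can hold millisecond offsets up to $2^{41}-1$, which converted to years is $(2^{41}-1) / (1000 * 60 * 60 * 24 * 365) \approx 69.7$, that is, 69 full years.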
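The arithmetic above can be checked with a few lines of Java. This is only a sketch: the class and method names (`TimestampSpan`, `capacity`, `wholeYears`) are made up for illustration, and it assumes 365-day years exactly as the formula does.

```java
class TimestampSpan {
    // Number of distinct values that fit in `bits` bits (valid for 0 <= bits < 63).
    static long capacity(int bits) {
        return 1L << bits;
    }

    // Whole years covered by `millis` milliseconds, assuming 365-day years.
    static long wholeYears(long millis) {
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        return millis / millisPerYear;
    }

    public static void main(String[] args) {
        long maxOffset = capacity(41) - 1;          // largest 41-bit value
        System.out.println(maxOffset);              // 2199023255551
        System.out.println(wholeYears(maxOffset));  // 69
    }
}
```

Because the timestamp is stored as an offset from a chosen start time, the roughly 69 years are counted from that start time, not from 1970.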

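10 bits, used to record the worker machine id.

This allows deployment on $2^{10} = 1024$ nodes. The 10 bits consist of a 5-bit datacenterId and a 5-bit workerId.

The largest positive integer 5 bits can represent is $2^{5}-1 = 31$, so the 32 numbers 0, 1, 2, 3, ..., 31 are available to identify different datacenterId or workerId values.

12 bits, the sequence number, used to distinguish ids generated within the same millisecond.

The largest positive integer 12 bits can represent is $2^{12}-1 = 4095$, so the 4096 numbers 0, 1, 2, 3, ..., 4095 are available to number up to 4096 ids generated on the same machine within the same timestamp (millisecond).

Since a 64-bit integer in Java is the long type, an id generated by the SnowFlake algorithm is stored in a long in Java.

SnowFlake guarantees that:

All generated ids increase over time (they are roughly time-ordered).

No duplicate ids are produced anywhere in the distributed system (because datacenterId and workerId distinguish the generators).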

Talk is cheap, show you the code

Below is Twitter's original official version, written in Scala (I don't know Scala either; just read it as if it were Java):

/** Copyright 2010-2012 Twitter, Inc.*/
package com.twitter.service.snowflake

import com.twitter.ostrich.stats.Stats
import com.twitter.service.snowflake.gen._
import java.util.Random
import com.twitter.logging.Logger

/**
 * An object that generates IDs.
 * This is broken into a separate class in case
 * we ever want to support multiple worker threads
 * per process
 */
class IdWorker(
    val workerId: Long, 
    val datacenterId: Long, 
    private val reporter: Reporter, 
    var sequence: Long = 0L) extends Snowflake.Iface {
    
  private[this] def genCounter(agent: String) = {
    Stats.incr("ids_generated")
    Stats.incr("ids_generated_%s".format(agent))
  }
  private[this] val exceptionCounter = Stats.getCounter("exceptions")
  private[this] val log = Logger.get
  private[this] val rand = new Random

  val twepoch = 1288834974657L

  private[this] val workerIdBits = 5L
  private[this] val datacenterIdBits = 5L
  private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
  private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
  private[this] val sequenceBits = 12L

  private[this] val workerIdShift = sequenceBits
  private[this] val datacenterIdShift = sequenceBits + workerIdBits
  private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
  private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

  private[this] var lastTimestamp = -1L

  // sanity check for workerId
  if (workerId > maxWorkerId || workerId < 0) {
    exceptionCounter.incr(1)
    throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
  }

  if (datacenterId > maxDatacenterId || datacenterId < 0) {
    exceptionCounter.incr(1)
    throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
  }

  log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
    timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)

  def get_id(useragent: String): Long = {
    if (!validUseragent(useragent)) {
      exceptionCounter.incr(1)
      throw new InvalidUserAgentError
    }

    val id = nextId()
    genCounter(useragent)

    reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))
    id
  }

  def get_worker_id(): Long = workerId
  def get_datacenter_id(): Long = datacenterId
  def get_timestamp() = System.currentTimeMillis

  protected[snowflake] def nextId(): Long = synchronized {
    var timestamp = timeGen()

    if (timestamp < lastTimestamp) {
      exceptionCounter.incr(1)
      log.error("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp);
      throw new InvalidSystemClock("Clock moved backwards.  Refusing to generate id for %d milliseconds".format(
        lastTimestamp - timestamp))
    }

    if (lastTimestamp == timestamp) {
      sequence = (sequence + 1) & sequenceMask
      if (sequence == 0) {
        timestamp = tilNextMillis(lastTimestamp)
      }
    } else {
      sequence = 0
    }

    lastTimestamp = timestamp
    ((timestamp - twepoch) << timestampLeftShift) |
      (datacenterId << datacenterIdShift) |
      (workerId << workerIdShift) | 
      sequence
  }

  protected def tilNextMillis(lastTimestamp: Long): Long = {
    var timestamp = timeGen()
    while (timestamp <= lastTimestamp) {
      timestamp = timeGen()
    }
    timestamp
  }

  protected def timeGen(): Long = System.currentTimeMillis()

  val AgentParser = """([a-zA-Z][a-zA-Z-0-9]*)""".r

  def validUseragent(useragent: String): Boolean = useragent match {
    case AgentParser(_) => true
    case _ => false
  }
}

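Scala is a language that compiles to bytecode. Loosely speaking, you can think of it as Java syntax with a lot of syntactic sugar added: for example, statements do not need a trailing semicolon, and type inference lets you omit many type declarations. In a give-it-a-try spirit, I "translated" the Scala code into Java. The places where I changed the Scala code are annotated below: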

/** Copyright 2010-2012 Twitter, Inc.*/
package com.twitter.service.snowflake

import com.twitter.ostrich.stats.Stats 
import com.twitter.service.snowflake.gen._
import java.util.Random
import com.twitter.logging.Logger

/**
 * An object that generates IDs.
 * This is broken into a separate class in case
 * we ever want to support multiple worker threads
 * per process
 */
class IdWorker(                                        // |
    val workerId: Long,                                // |
    val datacenterId: Long,                            // |<-- rewrite this part as a Java constructor
    private val reporter: Reporter, // logging-related, removed      // |
    var sequence: Long = 0L)                           // |
       extends Snowflake.Iface { // interface not available, removed // |
    
  private[this] def genCounter(agent: String) = {                     // |
    Stats.incr("ids_generated")                                       // |
    Stats.incr("ids_generated_%s".format(agent))                      // |<-- error handling and logging, removed
  }                                                                   // |
  private[this] val exceptionCounter = Stats.getCounter("exceptions") // |
  private[this] val log = Logger.get                                  // |
  private[this] val rand = new Random                                 // |

  val twepoch = 1288834974657L

  private[this] val workerIdBits = 5L
  private[this] val datacenterIdBits = 5L
  private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
  private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
  private[this] val sequenceBits = 12L

  private[this] val workerIdShift = sequenceBits
  private[this] val datacenterIdShift = sequenceBits + workerIdBits
  private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
  private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

  private[this] var lastTimestamp = -1L

  //----------------------------------------------------------------------------------------------------------------------------//
  // This whole section moves into the Java constructor.
  // sanity check for workerId
  if (workerId > maxWorkerId || workerId < 0) {
    exceptionCounter.incr(1) // <-- error handling, removed
    throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
    // |--> becomes: throw new IllegalArgumentException
    //            (String.format("worker Id can't be greater than %d or less than 0",maxWorkerId))
  }

  if (datacenterId > maxDatacenterId || datacenterId < 0) {
    exceptionCounter.incr(1) // <-- error handling, removed
    throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
    // |--> becomes: throw new IllegalArgumentException
    //             (String.format("datacenter Id can't be greater than %d or less than 0",maxDatacenterId))
  }

  log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
    timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)
  // |--> becomes: System.out.printf("worker...%d...",timestampLeftShift,...);
  //----------------------------------------------------------------------------------------------------------------------------//

  //-------------------------------------------------------------------//
  // Once the error-handling code is removed, only one line of this
  // function remains: val id = nextId()
  // We can simply call nextId() directly, so the whole function is
  // removed in the "translation".
  def get_id(useragent: String): Long = {                              //
    if (!validUseragent(useragent)) {                                  //
      exceptionCounter.incr(1)                                         //
      throw new InvalidUserAgentError                                  // removed
    }                                                                  //
                                                                       //
    val id = nextId()                                                  //
    genCounter(useragent)                                              //
                                                                       //
    reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))   //
    id                                                                 //
  }                                                                    //
  //-------------------------------------------------------------------//

  def get_worker_id(): Long = workerId           // |
  def get_datacenter_id(): Long = datacenterId   // |<-- rewrite as Java methods
  def get_timestamp() = System.currentTimeMillis // |

  protected[snowflake] def nextId(): Long = synchronized { // rewrite as a Java method
    var timestamp = timeGen()

    if (timestamp < lastTimestamp) {
      exceptionCounter.incr(1) // error handling, removed
      log.error("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp); // becomes System.err.printf(...)
      throw new InvalidSystemClock("Clock moved backwards.  Refusing to generate id for %d milliseconds".format(
        lastTimestamp - timestamp)) // becomes RuntimeException
    }
    }

    if (lastTimestamp == timestamp) {
      sequence = (sequence + 1) & sequenceMask
      if (sequence == 0) {
        timestamp = tilNextMillis(lastTimestamp)
      }
    } else {
      sequence = 0
    }

    lastTimestamp = timestamp
    ((timestamp - twepoch) << timestampLeftShift) | // |<-- add the return keyword
      (datacenterId << datacenterIdShift) |         // |
      (workerId << workerIdShift) |                 // |
      sequence                                      // |
  }

  protected def tilNextMillis(lastTimestamp: Long): Long = { // rewrite as a Java method
    var timestamp = timeGen()
    while (timestamp <= lastTimestamp) {
      timestamp = timeGen()
    }
    timestamp // add the return keyword
  }

  protected def timeGen(): Long = System.currentTimeMillis() // rewrite as a Java method

  val AgentParser = """([a-zA-Z][a-zA-Z-0-9]*)""".r                  // |
                                                                      // |
  def validUseragent(useragent: String): Boolean = useragent match {  // |<-- logging-related, removed
    case AgentParser(_) => true                                       // |
    case _ => false                                                   // |
  }                                                                   // |
}

The resulting Java version:

public class IdWorker{

    private long workerId;
    private long datacenterId;
    private long sequence;

    public IdWorker(long workerId, long datacenterId, long sequence){
        // sanity check for workerId
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0",maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0",maxDatacenterId));
        }
        System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
                timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

        this.workerId = workerId;
        this.datacenterId = datacenterId;
        this.sequence = sequence;
    }

    private long twepoch = 1288834974657L;

    private long workerIdBits = 5L;
    private long datacenterIdBits = 5L;
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    private long sequenceBits = 12L;

    private long workerIdShift = sequenceBits;
    private long datacenterIdShift = sequenceBits + workerIdBits;
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private long sequenceMask = -1L ^ (-1L << sequenceBits);

    private long lastTimestamp = -1L;

    public long getWorkerId(){
        return workerId;
    }

    public long getDatacenterId(){
        return datacenterId;
    }

    public long getTimestamp(){
        return System.currentTimeMillis();
    }

    public synchronized long nextId() {
        long timestamp = timeGen();

        if (timestamp < lastTimestamp) {
            System.err.printf("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp);
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds",
                    lastTimestamp - timestamp));
        }

        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0;
        }

        lastTimestamp = timestamp;
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    private long timeGen(){
        return System.currentTimeMillis();
    }

    //--------------- test ---------------
    public static void main(String[] args) {
        IdWorker worker = new IdWorker(1,1,1);
        for (int i = 0; i < 30; i++) {
            System.out.println(worker.nextId());
        }
    }

}

Understanding the code

The code above contains some bit-manipulation expressions, such as:

sequence = (sequence + 1) & sequenceMask;

private long maxWorkerId = -1L ^ (-1L << workerIdBits);

return ((timestamp - twepoch) << timestampLeftShift) |
        (datacenterId << datacenterIdShift) |
        (workerId << workerIdShift) |
        sequence;

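To understand them better, I looked into the relevant background.

Binary representation of negative numbers

In computers, negative numbers are represented in binary using two's complement.
Suppose I use Java's int type to store numbers.
An int is 32 binary digits (bits) wide, that is, 4 bytes (1 byte = 8 bits).
The decimal number 3 is then represented in binary like this:

00000000 00000000 00000000 00000011
// binary representation of 3, i.e. its original code (plain binary)

How, then, should the number -3 be represented in binary?
We can work backwards: since -3 + 3 = 0,
treat the binary form of -3 as an unknown x in binary arithmetic and solve for it.
The equation in binary looks like this:

   00000000 00000000 00000000 00000011 // 3, original code
+  xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx // -3, two's complement
-----------------------------------------------
   00000000 00000000 00000000 00000000

Solving for x: what value must be added to the binary form of 3 so that the result becomes 00000000 00000000 00000000 00000000?

   00000000 00000000 00000000 00000011 // 3, original code
+  11111111 11111111 11111111 11111101 // -3, two's complement
-----------------------------------------------
 1 00000000 00000000 00000000 00000000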
-----------------------------------------------
 1 00000000 00000000 00000000 00000000

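The idea behind working backwards is this: starting from the lowest bit, choose each bit of x so that the sum in that position is 0 with a carry of 1, and let the carry keep propagating to higher bits until it reaches the 33rd bit. Since an int can hold at most 32 binary digits, that highest 1 overflows and is lost, and the remaining 32 bits are (decimal) 0.

The point of two's complement is that adding it to the original code (the binary form of 3) produces an "overflowed 0".

That is the reasoning; in practice, remembering the formula makes the result easy to compute (the original code here is the binary form of the absolute value):

two's complement = ones' complement + 1

two's complement = ones' complement of (original code - 1)

So the binary form of -1 is computed like this:

00000000 00000000 00000000 00000001 // original code: binary of 1
11111111 11111111 11111111 11111110 // ones' complement: every bit of 1 inverted
11111111 11111111 11111111 11111111 // add 1: binary representation of -1 (two's complement)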
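Both formulas can be tried out directly in Java, where `~` inverts every bit of an int. The following is a small sketch; the class and helper names (`TwosComplementDemo`, `negate`, `negateAlt`, `bits32`) are invented for this illustration.

```java
class TwosComplementDemo {
    // Negate by hand: invert every bit (ones' complement), then add 1.
    static int negate(int x) {
        return ~x + 1;
    }

    // The second formula: subtract 1 first, then invert every bit.
    static int negateAlt(int x) {
        return ~(x - 1);
    }

    // 32-character binary string, left-padded with zeros.
    static String bits32(int x) {
        return String.format("%32s", Integer.toBinaryString(x)).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(bits32(3));          // 00000000000000000000000000000011
        System.out.println(bits32(negate(3)));  // 11111111111111111111111111111101
        System.out.println(negate(3));          // -3
        System.out.println(3 + negate(3));      // 0 (the carry out of bit 32 is discarded)
        System.out.println(bits32(-1));         // 11111111111111111111111111111111
    }
}
```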

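Using bit operations to compute the largest value n bits can represent

Take this piece of code:

    private long workerIdBits = 5L;
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);

It is easier to read when written like this:
long maxWorkerId = -1L ^ (-1L << 5L)

At first glance it is hard to tell which part is evaluated first, so I checked the Java operator precedence table: the shift operator << binds more tightly than the bitwise XOR operator ^, and here the parentheses make the order explicit anyway.

So in that line of code, the evaluation order is:

-1 shifted left by 5, giving result a

-1 XOR a

The binary computation of long maxWorkerId = -1L ^ (-1L << 5L) goes as follows (shown with 32 bits for brevity; a long has 64 bits, and the extra high bits behave the same way):

-1 shifted left by 5, giving result a:

        11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
  11111 11111111 11111111 11111111 11100000 // bits overflowing on the high side are dropped, low bits are filled with 0
        11111111 11111111 11111111 11100000 // result a

-1 XOR a:

        11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
    ^   11111111 11111111 11111111 11100000 // per bit position: equal bits give 0, different bits give 1
---------------------------------------------------------------------------
        00000000 00000000 00000000 00011111 // final result: 31

The final result is 31. The binary value 00000000 00000000 00000000 00011111 converts to decimal like this:
$$ 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 16 + 8 + 4 + 2 + 1 = 31 $$

Now that we know maxWorkerId = 31 in long maxWorkerId = -1L ^ (-1L << 5L), what does that mean? Why compute it with a left shift by 5? If you read the overview section, look at this passage again:

The largest positive integer 5 bits can represent is $2^{5}-1 = 31$, so the 32 numbers 0, 1, 2, 3, ..., 31 are available to identify different datacenterId or workerId values.

-1L ^ (-1L << 5L) evaluates to 31, and $2^{5}-1$ is also 31, so in the code the expression -1L ^ (-1L << 5L) uses bit operations to compute the largest positive integer that 5 bits can represent.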
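The same trick works for any field width. The sketch below wraps it in a helper and compares it with the more familiar `(1L << bits) - 1`; the names `BitFieldMax`, `maxValue`, and `maxValueAlt` are made up for this example, and both helpers are only meant for widths between 1 and 63.

```java
class BitFieldMax {
    // Largest value that fits in `bits` bits, written the way the Snowflake code does it.
    // Intended for 0 < bits < 64 (Java masks the shift count of a long to 6 bits).
    static long maxValue(int bits) {
        return -1L ^ (-1L << bits);
    }

    // The same value written as 2^bits - 1; intended for 0 < bits < 63.
    static long maxValueAlt(int bits) {
        return (1L << bits) - 1;
    }

    public static void main(String[] args) {
        System.out.println(maxValue(5));   // 31            (workerId, datacenterId)
        System.out.println(maxValue(12));  // 4095          (sequence)
        System.out.println(maxValue(41));  // 2199023255551 (timestamp offset)
        System.out.println(maxValue(12) == maxValueAlt(12)); // true
    }
}
```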

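Using a mask to prevent overflow

Here is an interesting piece of code:

sequence = (sequence + 1) & sequenceMask;

Test it with a few different values and you will see what makes it interesting:

        long seqMask = -1L ^ (-1L << 12L); // largest positive integer that fits in 12 bits, i.e. 2^12-1 = 4095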
        System.out.println("seqMask: "+seqMask);
        System.out.println(1L & seqMask);
        System.out.println(2L & seqMask);
        System.out.println(3L & seqMask);
        System.out.println(4L & seqMask);
        System.out.println(4095L & seqMask);
        System.out.println(4096L & seqMask);
        System.out.println(4097L & seqMask);
        System.out.println(4098L & seqMask);

        
        /**
        seqMask: 4095
        1
        2
        3
        4
        4095
        0
        1
        2
        */

Through the bitwise AND, this code guarantees that the result always stays in the range 0 to 4095!
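To see the wrap-around in the form it takes inside nextId(), here is a short sketch that steps the sequence across the 4095 boundary. The class name `SequenceWrap` and the method `next` are illustrative only. In the generator itself, a sequence that wraps back to 0 within the same millisecond is the signal to wait for the next millisecond (the `tilNextMillis` call).

```java
class SequenceWrap {
    static final long SEQUENCE_BITS = 12L;
    static final long SEQUENCE_MASK = -1L ^ (-1L << SEQUENCE_BITS); // 4095

    // Next sequence number within the same millisecond; wraps from 4095 back to 0.
    static long next(long sequence) {
        return (sequence + 1) & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        long seq = 4093L;
        for (int i = 0; i < 4; i++) {
            seq = next(seq);
            System.out.println(seq); // prints 4094, 4095, 0, 1 on successive iterations
        }
        // For non-negative values the mask behaves like a modulo by 4096.
        System.out.println(next(4095L) == (4095L + 1) % 4096L); // true
    }
}
```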

Combining the result with bit operations

There is another puzzling piece of code:

return ((timestamp - twepoch) << timestampLeftShift) |
        (datacenterId << datacenterIdShift) |
        (workerId << workerIdShift) |
        sequence;

To make sense of this code,

we first need to work out the values involved:

    private long twepoch = 1288834974657L; // start timestamp; subtracted from the current timestamp to get the offset

    private long workerIdBits = 5L; // number of bits used by workerId: 5
    private long datacenterIdBits = 5L; // number of bits used by datacenterId: 5
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);  // largest value workerId can take: 31
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // largest value datacenterId can take: 31
    private long sequenceBits = 12L; // number of bits used by the sequence number: 12

    private long workerIdShift = sequenceBits; // 12
    private long datacenterIdShift = sequenceBits + workerIdBits; // 12+5 = 17
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 12+5+5 = 22
    private long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095

    private long lastTimestamp = -1L;

Next, write a test with all parameters hard-coded and print the intermediate values, so the calculation can be checked later:

    //--------------- test ---------------
    public static void main(String[] args) {
        long timestamp = 1505914988849L;
        long twepoch = 1288834974657L;
        long datacenterId = 17L;
        long workerId = 25L;
        long sequence = 0L;

        System.out.printf("\ntimestamp: %d \n",timestamp);
        System.out.printf("twepoch: %d \n",twepoch);
        System.out.printf("datacenterId: %d \n",datacenterId);
        System.out.printf("workerId: %d \n",workerId);
        System.out.printf("sequence: %d \n",sequence);
        System.out.println();
        System.out.printf("(timestamp - twepoch): %d \n",(timestamp - twepoch));
        System.out.printf("((timestamp - twepoch) << 22L): %d \n",((timestamp - twepoch) << 22L));
        System.out.printf("(datacenterId << 17L): %d \n" ,(datacenterId << 17L));
        System.out.printf("(workerId << 12L): %d \n",(workerId << 12L));
        System.out.printf("sequence: %d \n",sequence);

        long result = ((timestamp - twepoch) << 22L) |
                (datacenterId << 17L) |
                (workerId << 12L) |
                sequence;
        System.out.println(result);

    }

    /** Printed output:
        timestamp: 1505914988849 
        twepoch: 1288834974657 
        datacenterId: 17 
        workerId: 25 
        sequence: 0 
        
        (timestamp - twepoch): 217080014192 
        ((timestamp - twepoch) << 22L): 910499571845562368 
        (datacenterId << 17L): 2228224 
        (workerId << 12L): 102400 
        sequence: 0 
        910499571847892992
    */

After substituting the shift amounts, the code looks like this:

return ((timestamp - 1288834974657) << 22) |
        (datacenterId << 17) |
        (workerId << 12) |
        sequence;

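For the values that are still unknown, we can first look at the explanation of the SnowFlake structure in the overview and then plug in values within the valid ranges (on Windows, the built-in calculator is a convenient way to get the binary form of these values) to follow the computation.
Of course, since my test code already hard-codes these values, we can simply use them to verify the result by hand: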

        long timestamp = 1505914988849L;
        long twepoch = 1288834974657L;
        long datacenterId = 17L;
        long workerId = 25L;
        long sequence = 0L;

Let timestamp = 1505914988849 and twepoch = 1288834974657.
1505914988849 - 1288834974657 = 217080014192 (the offset of timestamp from the start time, in milliseconds). Shifting its binary form (a) left by 22 bits goes as follows:

                        |<-- the left shift by 22 bits starts here
00000000 00000000 000000|00 00110010 10001010 11111010 00100101 01110000 // a = 217080014192
00001100 10100010 10111110 10001001 01011100 00|000000 00000000 00000000 // a shifted left by 22 bits (la)
                                               |<-- bits from here on are filled with 0

Let datacenterId = 17. Shifting its binary form (b) left by 17 bits goes as follows:

                   |<-- the left shift by 17 bits starts here
00000000 00000000 0|0000000 00000000 00000000 00000000 00000000 00010001 // b = 17
00000000 00000000 00000000 00000000 00000000 0010001|0 00000000 00000000 // b shifted left by 17 bits (lb)
                                                    |<-- bits from here on are filled with 0

Let workerId = 25. Shifting its binary form (c) left by 12 bits goes as follows:

             |<-- the left shift by 12 bits starts here
00000000 0000|0000 00000000 00000000 00000000 00000000 00000000 00011001 // c = 25
00000000 00000000 00000000 00000000 00000000 00000001 1001|0000 00000000 // c shifted left by 12 bits (lc)
                                                          |<-- bits from here on are filled with 0

Let sequence = 0. Its binary form is:

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 // sequence = 0

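Now that we know each part's value after shifting (la, lb, lc), the code can be simplified as follows to make it easier to understand:

return ((timestamp - 1288834974657) << 22) |
        (datacenterId << 17) |
        (workerId << 12) |
        sequence;
-----------------------------
           |
           | simplify
          \|/
-----------------------------
return (la) |
        (lb) |
        (lc) |
        sequence;

The pipe symbol | above is also a bitwise operator in Java. It means:
if the nth bit of x or the nth bit of y is 1, the nth bit of the result is 1; otherwise it is 0. The bitwise OR of our four numbers therefore looks like this: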

 1  |                    41                        |  5  |   5  |     12      
    
   0|0001100 10100010 10111110 10001001 01011100 00|00000|0 0000|0000 00000000 //la
   0|0000000 00000000 00000000 00000000 00000000 00|10001|0 0000|0000 00000000 //lb
   0|0000000 00000000 00000000 00000000 00000000 00|00000|1 1001|0000 00000000 //lc
or 0|0000000 00000000 00000000 00000000 00000000 00|00000|0 0000|0000 00000000 //sequence
------------------------------------------------------------------------------------------
   0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 // result: 910499571847892992

Computing the result:
1) List the index of every bit that is 1, counting positions from the right and starting at 0:

0000 1  1  00 1  0 1  000 1  0 1  0 1  1  1  1  1  0 1  000 1  00 1  0 1  0 1  1  1  0000 1  000 1  1  1  00 1  0000 0000 0000
     59 58    55   53     49   47   45 44 43 42 41   39     35    32   30   28 27 26      21     17 16 15    12

2) Use each index as a power of 2 and add them all up:

$ 2^{59}+2^{58}+2^{55}+2^{53}+2^{49}+2^{47}+2^{45}+2^{44}+2^{43}+
2^{42}+2^{41}+2^{39}+2^{35}+2^{32}+2^{30}+2^{28}+2^{27}+2^{26}+
2^{21}+2^{17}+2^{16}+2^{15}+2^{12} $

    2^{59}  : 576460752303423488
    2^{58}  : 288230376151711744
    2^{55}  :  36028797018963968
    2^{53}  :   9007199254740992
    2^{49}  :    562949953421312
    2^{47}  :    140737488355328
    2^{45}  :     35184372088832
    2^{44}  :     17592186044416
    2^{43}  :      8796093022208
    2^{42}  :      4398046511104
    2^{41}  :      2199023255552
    2^{39}  :       549755813888
    2^{35}  :        34359738368
    2^{32}  :         4294967296
    2^{30}  :         1073741824
    2^{28}  :          268435456
    2^{27}  :          134217728
    2^{26}  :           67108864
    2^{21}  :            2097152
    2^{17}  :             131072
    2^{16}  :              65536
    2^{15}  :              32768
+   2^{12}  :               4096
----------------------------------------
              910499571847892992

This matches the result printed by the test program, so the manual verification is complete!
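The manual check can also be repeated in code. The sketch below lists the indices of the 1 bits of the id and adds the corresponding powers of two back up; the class and method names (`SetBitsCheck`, `setBits`, `sumOfPowers`) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

class SetBitsCheck {
    // Indices (counted from the right, starting at 0) of all bits that are 1,
    // listed from the highest index to the lowest.
    static List<Integer> setBits(long value) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 63; i >= 0; i--) {
            if (((value >>> i) & 1L) == 1L) {
                indices.add(i);
            }
        }
        return indices;
    }

    // Sum of 2^i over all given indices.
    static long sumOfPowers(List<Integer> indices) {
        long sum = 0L;
        for (int i : indices) {
            sum += 1L << i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L;
        List<Integer> indices = setBits(id);
        System.out.println(indices);
        // [59, 58, 55, 53, 49, 47, 45, 44, 43, 42, 41, 39, 35, 32, 30, 28, 27, 26, 21, 17, 16, 15, 12]
        System.out.println(sumOfPowers(indices) == id); // true
    }
}
```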

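Observations

 1  |                    41                        |  5  |   5  |     12      
    
   0|0001100 10100010 10111110 10001001 01011100 00|     |      |              //la
   0|                                              |10001|      |              //lb
   0|                                              |     |1 1001|              //lc
or 0|                                              |     |      |0000 00000000 //sequence
------------------------------------------------------------------------------------------
   0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 // result: 910499571847892992

I split the 64 bits above into groups of 1, 41, 5, 5, and 12 bits to make them easier to inspect.

Looking column by column:

In the 41-bit segment, only the la row has a value; the other rows (lb, lc, sequence) are all 0.

In the first 5-bit segment from the left, only the lb row has a value; the other rows are all 0.

In the second 5-bit segment from the left, only the lc row has a value; the other rows are all 0.

Following this pattern, if sequence were anything other than 0, the 12-bit segment would hold a value too, with all other rows 0 in that segment.

Looking row by row:

In the la row, the left shift by 5+5+12 bits fills the 5, 5, and 12 segments with 0, so everything in the la row outside the 41-bit segment is certain to be 0.

The same reasoning applies to the lb, lc, and sequence rows.

It is precisely the left shifts that move the four values to their proper positions in the SnowFlake layout. OR-ing the four rows together (a 1 in any row gives 1) then merges the four binary segments into a single binary number.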
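One way to convince yourself of this observation is to check that the four segments never share a bit, in which case OR-ing the shifted values gives the same number as adding them. The sketch below does that with the hard-coded test values; the class name, mask constants, and the `disjoint` helper are hypothetical names introduced for this example.

```java
class DisjointFields {
    // One mask per segment of the 64-bit layout (the top bit is unused).
    static final long TIMESTAMP_MASK  = ((1L << 41) - 1) << 22; // bits 22..62
    static final long DATACENTER_MASK = 31L << 17;              // bits 17..21
    static final long WORKER_MASK     = 31L << 12;              // bits 12..16
    static final long SEQUENCE_MASK   = 4095L;                  // bits 0..11

    // True if no two masks have a 1 in the same bit position.
    static boolean disjoint(long... masks) {
        long seen = 0L;
        for (long m : masks) {
            if ((seen & m) != 0L) {
                return false;
            }
            seen |= m;
        }
        return true;
    }

    public static void main(String[] args) {
        long la = 217080014192L << 22;
        long lb = 17L << 17;
        long lc = 25L << 12;
        long sequence = 0L;

        System.out.println(disjoint(TIMESTAMP_MASK, DATACENTER_MASK, WORKER_MASK, SEQUENCE_MASK)); // true
        // With non-overlapping segments, OR and plain addition agree.
        System.out.println((la | lb | lc | sequence) == (la + lb + lc + sequence)); // true
        System.out.println(la | lb | lc | sequence); // 910499571847892992
    }
}
```

The equality with addition only holds while every value stays inside its own segment, which is what the range checks on workerId and datacenterId and the sequence mask are there to ensure.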

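Conclusion:
So, in this piece of code

return ((timestamp - 1288834974657) << 22) |
        (datacenterId << 17) |
        (workerId << 12) |
        sequence;

the left shifts move each value into its own segment (41, 5, 5; the 12-bit segment is already at the far right, so it needs no shift).

The bitwise OR of the shifted values (la, lb, lc, sequence) then merges the segments into a single binary number.

Finally, converted to decimal, that number is the generated id.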

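Extensions

Once you understand the algorithm, there are some further things you can do with it:

Change what each bit segment stores to suit your own business. The algorithm is general, so you can adjust the size of each segment and the information it holds as needed.

Decode an id. Since each segment of an id holds specific information, given an id you should be able to recover the original value of each segment. The recovered information can help with analysis: for an order id, for example, you can tell when the order was created and which data center handled it.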
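As a rough sketch of the decoding idea, the shifts can be reversed with unsigned right shifts and masks. This assumes the default layout and start timestamp used in this article (22/17/12-bit shifts, twepoch = 1288834974657); the class and method names are invented for the example, and an id produced with a different layout or epoch would need different constants.

```java
import java.time.Instant;

class SnowflakeDecode {
    static final long TWEPOCH = 1288834974657L; // start timestamp used by the generator above

    // Upper 41 bits hold the millisecond offset; add the epoch back to get wall-clock time.
    static long timestampOf(long id) {
        return (id >>> 22) + TWEPOCH;
    }

    // 5 bits starting at bit 17.
    static long datacenterIdOf(long id) {
        return (id >>> 17) & 31L;
    }

    // 5 bits starting at bit 12.
    static long workerIdOf(long id) {
        return (id >>> 12) & 31L;
    }

    // Lowest 12 bits.
    static long sequenceOf(long id) {
        return id & 4095L;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L;
        System.out.println(timestampOf(id));    // 1505914988849
        System.out.println(datacenterIdOf(id)); // 17
        System.out.println(workerIdOf(id));     // 25
        System.out.println(sequenceOf(id));     // 0
        // Human-readable creation time (UTC) of the id.
        System.out.println(Instant.ofEpochMilli(timestampOf(id)));
    }
}
```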


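Copyright of this article belongs to the author. Do not reproduce it without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reproducing, please cite the address of this article: http://m.specialneedsforspecialkids.com/yun/70491.html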
