Private data

What is private data?

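In cases where a group of organizations on a channel needs to keep data private from other organizations on that channel, they have the option to create a new channel comprising just the organizations that need access to the data. However, creating a separate channel in each of these cases adds administrative overhead (maintaining chaincode versions, policies, MSPs, etc.), and does not allow for use cases in which all channel participants should see a transaction while a portion of the data stays private.

That is why, starting in v1.2, Fabric offers the ability to create private data collections, which allow a defined subset of organizations on a channel to endorse, commit, or query private data without having to create a separate channel.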

What is a private data collection?

A private data collection consists of two elements:

  1. The actual private data, sent peer-to-peer via gossip protocol to only the organization(s) authorized to see it. This data is stored in a private state database on the peers of authorized organizations (sometimes called a “side” database, or “SideDB”), which can be accessed from chaincode on these authorized peers. The ordering service is not involved here and does not see the private data. Note that because gossip distributes the private data peer-to-peer across authorized organizations, it is required to set up anchor peers on the channel, and configure CORE_PEER_GOSSIP_EXTERNALENDPOINT on each peer, in order to bootstrap cross-organization communication.

  2. A hash of that data, which is endorsed, ordered, and written to the ledgers of every peer on the channel. The hash serves as evidence of the transaction, is used for state validation, and can be used for audit purposes.

The following diagram illustrates the ledger contents of a peer authorized to have private data and one which is not.

[Figure: ../_images/PrivateDataConcept-2.png]

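Collection members may decide to share the private data with other parties if they get into a dispute or if they want to transfer the asset to a third party. The third party can then compute the hash of the private data and check whether it matches the state on the channel ledger, proving that the state existed between the collection members at a certain point in time.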
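The check a third party performs can be sketched as follows. This is a minimal illustration, assuming the on-chain hash is a SHA-256 digest over the raw private value bytes; the function names and the sample asset are hypothetical and not part of any Fabric API.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_ledger_hash(shared_value: bytes, on_chain_hash_hex: str) -> bool:
    """Check private data handed over off-chain against the hash on the channel ledger."""
    # compare_digest gives a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(shared_value), on_chain_hash_hex)


# Hypothetical private value; in practice the hash would be read from the channel ledger.
private_value = b'{"asset":"crate-17","price":120}'
on_chain_hash = sha256_hex(private_value)

print(matches_ledger_hash(private_value, on_chain_hash))
print(matches_ledger_hash(b'{"asset":"crate-17","price":90}', on_chain_hash))
```

A match shows only that this exact value was recorded. The party sharing the data must hand over the bytes unchanged, since any difference in encoding produces a different hash.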

When to use a collection within a channel vs. a separate channel

  • Use channels when entire transactions (and ledgers) must be kept confidential within a set of organizations that are members of the channel.

  • Use collections when transactions (and ledgers) must be shared among a set of organizations, but when only a subset of those organizations should have access to some (or all) of the data within a transaction. Additionally, since private data is disseminated peer-to-peer rather than via blocks, use private data collections when transaction data must be kept confidential from ordering service nodes.

A use case to explain collections

Consider a group of five organizations on a channel who trade produce:

  • A Farmer selling his goods abroad

  • A Distributor moving goods abroad

  • A Shipper moving goods between parties

  • A Wholesaler purchasing goods from distributors

  • A Retailer purchasing goods from shippers and wholesalers

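The Distributor may want to make private transactions with the Farmer and the Shipper to keep the terms of the trade confidential from the Wholesaler and the Retailer (so as not to expose the markup they are charging).

The Distributor may also want to have a separate private data relationship with the Wholesaler because it charges them a lower price than it does the Retailer.

The Wholesaler may also want to have a private data relationship with the Retailer and the Shipper.

Rather than defining many small channels for each of these relationships, multiple private data collections (PDC) can be defined to share private data between: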

  1. PDC1: Distributor, Farmer and Shipper

  2. PDC2: Distributor and Wholesaler

  3. PDC3: Wholesaler, Retailer and Shipper

[Figure: ../_images/PrivateDataConcept-1.png]

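Using this example, peers owned by the Distributor will have multiple private databases inside their ledger, which include the private data from the Distributor, Farmer and Shipper relationship and from the Distributor and Wholesaler relationship. Because these databases are kept separate from the database that holds the channel ledger, private data is sometimes referred to as “SideDB”.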

[Figure: ../_images/PrivateDataConcept-3.png]
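To make the membership in this use case concrete, the following sketch models the three collections as plain sets and lists which private databases a given organization's peers would hold. The data structure and the function name are illustrative assumptions, not how Fabric stores collection membership.

```python
from typing import Dict, List, Set

# Membership taken from the use case: PDC1, PDC2 and PDC3.
COLLECTIONS: Dict[str, Set[str]] = {
    "PDC1": {"Distributor", "Farmer", "Shipper"},
    "PDC2": {"Distributor", "Wholesaler"},
    "PDC3": {"Wholesaler", "Retailer", "Shipper"},
}


def collections_for(org: str) -> List[str]:
    """Return the collections whose private data a peer of this organization stores."""
    return sorted(name for name, members in COLLECTIONS.items() if org in members)


for org in ("Farmer", "Distributor", "Shipper", "Wholesaler", "Retailer"):
    print(org, collections_for(org))
```

The Distributor appears in PDC1 and PDC2, which is why its peers hold two separate private databases next to the channel ledger.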

Transaction flow with private data

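When private data collections are referenced in chaincode, the transaction flow is slightly different in order to protect the confidentiality of the private data as transactions are proposed, endorsed, and committed to the ledger.

For details on transaction flows that do not use private data, refer to our documentation on transaction flow.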

  1. The client application submits a proposal request to invoke a chaincode function (reading or writing private data) to endorsing peers which are part of authorized organizations of the collection. The private data, or data used to generate private data in chaincode, is sent in a transient field of the proposal.

  2. The endorsing peers simulate the transaction and store the private data in a transient data store (a temporary storage local to the peer). They distribute the private data, based on the collection policy, to authorized peers via gossip.

  3. The endorsing peer sends the proposal response back to the client. The proposal response includes the endorsed read/write set, which includes public data, as well as a hash of any private data keys and values. No private data is sent back to the client. For more information on how endorsement works with private data, see the private data endorsement documentation.

  4. The client application submits the transaction (which includes the proposal response with the private data hashes) to the ordering service. The transactions with the private data hashes get included in blocks as normal. The block with the private data hashes is distributed to all the peers. In this way, all peers on the channel can validate transactions with the hashes of the private data in a consistent way, without knowing the actual private data.

  5. At block commit time, peers use the collection policy to determine if they are authorized to have access to the private data. If they are, they first check their local transient data store to determine if they already received the private data at chaincode endorsement time. If not, they attempt to pull the private data from another authorized peer. Then they validate the private data against the hashes in the public block and commit the transaction and the block. Upon validation/commit, the private data is moved to their copy of the private state database and private writeset storage. The private data is then deleted from the transient data store.
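The flow above can be approximated with a small in-memory simulation. This is a sketch under stated assumptions, not Fabric's implementation: the `Peer` class and the `endorse`, `pull_from_peers` and `commit` functions are hypothetical, SHA-256 is assumed as the hash, the ordering service is omitted, and gossip is reduced to copying the data to every authorized peer.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class Peer:
    org: str
    transient_store: Dict[str, bytes] = field(default_factory=dict)  # temporary, local to the peer
    private_state: Dict[str, bytes] = field(default_factory=dict)    # the "side" database
    public_ledger: Dict[str, str] = field(default_factory=dict)      # key hash -> value hash


def endorse(key: str, value: bytes, members: Set[str], peers: List[Peer]) -> Dict[str, str]:
    """Steps 2-3: stage private data on authorized peers and return only hashes."""
    for peer in peers:
        if peer.org in members:
            peer.transient_store[key] = value
    return {"key_hash": sha256_hex(key.encode()), "value_hash": sha256_hex(value)}


def pull_from_peers(peer: Peer, key: str, members: Set[str], peers: List[Peer]) -> Optional[bytes]:
    """Step 5 fallback: ask other authorized peers for data missed at endorsement time."""
    for other in peers:
        if other is peer or other.org not in members:
            continue
        value = other.private_state.get(key, other.transient_store.get(key))
        if value is not None:
            return value
    return None


def commit(peer: Peer, key: str, hashes: Dict[str, str], members: Set[str], peers: List[Peer]) -> None:
    """Step 5: every peer records the hashes; authorized peers also commit the data."""
    peer.public_ledger[hashes["key_hash"]] = hashes["value_hash"]
    if peer.org not in members:
        return
    value = peer.transient_store.pop(key, None)  # removed from the transient store on commit
    if value is None:
        value = pull_from_peers(peer, key, members, peers)
    # Validate the private data against the hash carried in the public block.
    if value is not None and sha256_hex(value) == hashes["value_hash"]:
        peer.private_state[key] = value


members = {"Distributor", "Farmer", "Shipper"}
peers = [Peer("Distributor"), Peer("Farmer"), Peer("Retailer")]
hashes = endorse("price", b"120", members, peers)
for p in peers:
    commit(p, "price", hashes, members, peers)
```

After the loop, the Distributor and Farmer peers hold the value in `private_state`, while the Retailer peer holds only the hashes in `public_ledger`, the same hashes every other peer recorded.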

Purging private data

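For very sensitive data, even the parties sharing the private data might want, or might be required by government regulations, to periodically “purge” the data on their peers, leaving behind a hash of the data on the blockchain to serve as immutable evidence of the private data.

In some of these cases, the private data only needs to exist in the peer's private database until it can be replicated into a database external to the blockchain. The data might also only need to exist on the peers until a chaincode business process is done with it (trade settled, contract fulfilled, etc.).

To support these use cases, private data can be purged if it has not been modified for N blocks, where N is configurable. Purged private data cannot be queried from chaincode and is not available to other requesting peers.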
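The purge rule can be expressed as a small predicate. This is a sketch, assuming the block-count setting corresponds to the `blockToLive` property of a collection definition, that a value of 0 means the data is never purged, and that data becomes eligible once more than N blocks have passed since its last modification; the exact boundary is an assumption, and `is_purged` is a hypothetical helper.

```python
def is_purged(last_modified_block: int, current_block: int, block_to_live: int) -> bool:
    """Return True if private data last written at last_modified_block is past its lifetime."""
    if block_to_live == 0:
        # Assumed convention: 0 disables purging, so the data is kept indefinitely.
        return False
    return current_block - last_modified_block > block_to_live


# Data written in block 10 with a lifetime of 3 blocks.
print(is_purged(10, 13, 3))  # still within its lifetime
print(is_purged(10, 14, 3))  # eligible for purging
```

Modifying the private data resets the count, because `last_modified_block` then moves forward to the block that carried the update.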

How a private data collection is defined

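For more details on collection definitions, and other low-level information regarding private data and collections, refer to the private data reference topic.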
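As a rough illustration of what such a definition might contain for the produce use case, the sketch below builds a JSON document for PDC1 to PDC3. The MSP IDs (`DistributorMSP` and so on), the peer counts and the `blockToLive` values are hypothetical, and the property names should be confirmed against the private data reference topic for the Fabric version in use.

```python
import json
from typing import Dict, List


def collection(name: str, msp_ids: List[str], block_to_live: int = 0) -> Dict[str, object]:
    """Build one collection definition entry; all values here are illustrative."""
    policy = "OR(" + ", ".join(f"'{msp}.member'" for msp in msp_ids) + ")"
    return {
        "name": name,
        "policy": policy,            # which organizations may hold the private data
        "requiredPeerCount": 0,      # assumed value for the sketch
        "maxPeerCount": 3,           # assumed value for the sketch
        "blockToLive": block_to_live,
    }


collections_config = [
    collection("PDC1", ["DistributorMSP", "FarmerMSP", "ShipperMSP"]),
    collection("PDC2", ["DistributorMSP", "WholesalerMSP"], block_to_live=100),
    collection("PDC3", ["WholesalerMSP", "RetailerMSP", "ShipperMSP"]),
]

print(json.dumps(collections_config, indent=2))
```

Here PDC2 is given a finite lifetime to reflect pricing data that the Distributor and Wholesaler may not want to keep on their peers indefinitely.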